In my survey, I'm asking people to rate a bunch of stuff on a scale from feminine to masculine. Three of the items are 'dildos', 'butts', and 'unconditional love'.
Which one will be voted, on average, most feminine?
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ571 |
2 | | Ṁ309 |
3 | | Ṁ301 |
4 | | Ṁ242 |
5 | | Ṁ180 |
Very important question: how much of this poll will you be doing in person? (Is it all online? All in person? Or a mix?)
If it's like Reddit/Twitter, I'll go with GPT's guess (otherwise I'll try to do some other research), which is:
Dildos: 30%
Butts: 40%
Unconditional love: 30%
@Aella I don't have a strong argument either way, but I think I'd prefer "straight up what was the average score". Seems like the most straightforward interpretation of the market description. (Although I myself did not think as far as considering the gender ratio of your respondents before betting.)
In any case, I'd be interested in how men and women differ in the end.
@Aella Calculate the male and female averages separately and give the final average weighted by the proportion of males and females worldwide.
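A minimal sketch of that weighting, assuming per-group average ratings are already computed. The function name `poststratified_mean` and all numbers below are hypothetical placeholders, not survey results or real population figures.

```python
def poststratified_mean(group_means, population_shares):
    """Weighted average of per-group means, weighted by population share."""
    total = sum(population_shares.values())
    return sum(
        group_means[g] * population_shares[g] / total for g in population_shares
    )

# Hypothetical per-group average ratings on the feminine-to-masculine scale.
group_means = {"male": 2.0, "female": 4.0}

# Assumed population shares (placeholder values, not real statistics).
population_shares = {"male": 0.5, "female": 0.5}

# Shares of respondents in a hypothetical, male-skewed sample.
sample_shares = {"male": 0.8, "female": 0.2}

print("population-weighted:", poststratified_mean(group_means, population_shares))
print("raw sample average: ", poststratified_mean(group_means, sample_shares))
```

The two printed numbers differ whenever the sample's gender mix differs from the population's and the groups answer differently, which is the whole point of the reweighting.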
More than that, I recommend stratifying by [age] × [socioeconomic status/income] × [geographic location] × [birth assignment] × [gender ∈ {M,F,X}]
(if you have SES/income & location in your study) since those are also likely confounding variables where your sample won't be representative. Further, since some resulting bins are likely to be underpopulated, use a Bayesian framework with Beta(½,½) (Jeffreys) priors. Then, sample from the posteriors of the bins according to the world population for the bins, so the high variance in the small bins will be reflected as more uncertainty in the result. IIRC, the means will be the same if you take a weighted average of the bin means (as above), but the uncertainty is a crucial consideration for interpretation. Mainland China and India are likely to be severely under-represented when making inferences about "people."
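One way to sketch the binned posterior sampling, under a loud assumption: a Beta prior fits a yes/no outcome, so each response is reduced here to a binary "rated this item as feminine". The bin names, counts, and population shares are hypothetical, and combining the bins by a share-weighted sum of posterior draws is one reading of the procedure, not necessarily the exact one intended.

```python
import random

def poststratified_posterior(bins, draws=10000, seed=0):
    """Return (mean, lower, upper) for the population-level proportion.

    bins maps a bin name to (yes_count, n_respondents, population_share).
    Each bin gets a Beta(1/2, 1/2) Jeffreys prior, so its posterior is
    Beta(1/2 + yes, 1/2 + n - yes).
    """
    rng = random.Random(seed)
    total_share = sum(share for _, _, share in bins.values())
    samples = []
    for _ in range(draws):
        value = 0.0
        for yes, n, share in bins.values():
            p = rng.betavariate(0.5 + yes, 0.5 + n - yes)
            value += p * share / total_share
        samples.append(value)
    samples.sort()
    mean = sum(samples) / draws
    lower = samples[int(0.025 * draws)]       # approximate 2.5th percentile
    upper = samples[int(0.975 * draws) - 1]   # approximate 97.5th percentile
    return mean, lower, upper

# Hypothetical bins: (yes_count, n_respondents, assumed population share).
bins = {
    "well_sampled_bin": (300, 500, 0.2),
    "tiny_bin": (2, 3, 0.8),  # few respondents but a large population share
}
print(poststratified_posterior(bins))
```

A sparsely sampled bin with a large population share dominates the width of the interval, which is the uncertainty the weighted point estimate alone would hide.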
Since it's a priori likely that there will be correlations between answers not explained by the demographic variables, you can reduce the uncertainty more and get better averages by making a graphical model for the whole study, but that's an advanced technique. A first pass would be to add one more "person type" latent variable between the demographics and the answers. The most rigorous way to determine the number of "types" is to calculate the Bayes factor for each number of types. Once you've chosen the number of types, use the EM algorithm to assign types to each answer. The rigorous procedure is quite slow, so instead you can use the EM algorithm for each number of types to find the most likely assignment of types to data. Then use the BIC to choose the number of types. Either way, make the maximum number of types a significant fraction of your population so you know your upper bound is sure to overfit. Your final tables would be P(answer|person type) and P(person type|demographic vars), each represented by Beta(½,½) (Jeffreys) priors. Then you get your final results by sampling the "person type" according to the world distribution of the demographic variables and then sampling from answers according to the "person type." Besides lowering uncertainty, the types may be interpretable.
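A rough sketch of the faster EM-plus-BIC route, with simplifications that are assumptions of this example: answers are binary, the demographic layer P(person type|demographic vars) is left out, and the "person type" is fitted as a plain mixture of independent Bernoullis. The function names and toy data are hypothetical.

```python
import math
import random

def fit_latent_classes(data, k, iters=200, seed=0):
    """EM for a k-type mixture of independent Bernoullis.

    Returns (type_weights, answer_probs, loglik), where loglik is the
    log-likelihood measured in the last E-step.
    """
    rng = random.Random(seed)
    n, j = len(data), len(data[0])
    weights = [1.0 / k] * k
    probs = [[rng.uniform(0.25, 0.75) for _ in range(j)] for _ in range(k)]
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: posterior probability of each type for each respondent.
        resp, loglik = [], 0.0
        for row in data:
            joint = [
                weights[c] * math.prod(
                    probs[c][q] if row[q] else 1.0 - probs[c][q] for q in range(j)
                )
                for c in range(k)
            ]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([v / total for v in joint])
        # M-step: re-estimate type weights and per-type answer probabilities.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            weights[c] = mass / n
            for q in range(j):
                hits = sum(r[c] * row[q] for r, row in zip(resp, data))
                # Posterior mean under a Beta(1/2, 1/2) prior; this also keeps
                # probabilities strictly inside (0, 1).
                probs[c][q] = (hits + 0.5) / (mass + 1.0)
    return weights, probs, loglik

def bic(loglik, k, n, j):
    """BIC with (k - 1) free type weights and k * j answer probabilities."""
    n_params = (k - 1) + k * j
    return n_params * math.log(n) - 2.0 * loglik

def choose_num_types(data, max_k):
    """Fit 1..max_k types and return (best_k, {k: BIC}); lower BIC is better."""
    n, j = len(data), len(data[0])
    scores = {
        k: bic(fit_latent_classes(data, k)[2], k, n, j)
        for k in range(1, max_k + 1)
    }
    return min(scores, key=scores.get), scores

# Toy data: two made-up respondent patterns, 50 copies of each.
toy = [[1, 1, 1, 1, 0, 0, 0, 0]] * 50 + [[0, 0, 0, 0, 1, 1, 1, 1]] * 50
best_k, scores = choose_num_types(toy, max_k=3)
print(best_k, scores)
```

Because of the smoothing in the M-step, the log-likelihood fed to the BIC is only approximately the maximized one, so treat the scores as a rough guide. EM can also land in local optima, so in practice several random restarts per number of types would be sensible.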
You, in particular, may want to take this approach because you run many surveys like this. If you include sufficient identical questions on each survey to impute the same "person type" variable across surveys, you can make a unified "person type" with broad explanatory power. And each survey can make the results of the previous one more accurate.
@EricMoyer The length of this response seems to me disproportionate to the perceived silliness of the question.
@EricMoyer Makes sense if this were supposed to be representative of the average person. But it presumably isn’t.
@ShadowyZephyr I took the word "people" extremely generally and ran (as JRP pointed out) way too far with it. I need to change my portrait to a lance pointed at a windmill.
@EricMoyer If the creator @Aella has a serious purpose/motivation behind this question (I ask seriously & out of curiosity), I'll take back what I said.
@parhizj I don't know Aella's motivation for the survey outside of curiosity and continuing to interact with and provide engaging content for her audience. However, every bit of truth we wrestle from the universe makes us a tiny bit freer. So, I applaud even the silliest science.
@tailcalled I think she should create a Dirty Wordle game where you have to guess the dirty word, and after each guess it tells you how far you are in vector space or whatever. Like how many words are between your word and the dirty word. Make sense?
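A toy sketch of the "how many words between" feedback, assuming every word has an embedding vector and closeness is measured by cosine similarity. The two-dimensional vectors and the vocabulary are made up for illustration; a real game would presumably use pretrained word embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def words_between(guess, target, vectors):
    """Count vocabulary words that are closer to the target than the guess is."""
    guess_sim = cosine(vectors[guess], vectors[target])
    return sum(
        1
        for word, vec in vectors.items()
        if word not in (guess, target) and cosine(vec, vectors[target]) > guess_sim
    )

# Made-up 2-D "embeddings", purely illustrative.
vectors = {
    "apple": (1.0, 0.0),
    "pear": (0.9, 0.1),
    "plum": (0.8, 0.3),
    "car": (0.0, 1.0),
}

for guess in ("car", "plum", "pear"):
    print(guess, words_between(guess, "apple", vectors))
```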
@EvanDaniel Oh yes that was what gave me the idea and I had forgotten the name. Thanks for sharing! Good fun!