Will anxiety correlate with any sexual fetish among females in my data?
resolved Dec 4
Resolved
NO

My Big Kink Survey has around 350k female responses. In this I gave people a list of mental illnesses as checkboxes. One of these was anxiety.
I had a second question which asked "Of these that you checked, which one is the most severe for you?"

If people checked anxiety but didn't mark it as most severe, their answer counted as 1. If they also marked anxiety as the most severe, their answer counted as 2.

Will 'having anxiety' in females correlate more than r=0.12 with any of the fetishes (~500 or so) in my data?
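A hedged sketch of what the resolution check might look like. Everything here is a hypothetical stand-in (synthetic data, a smaller sample, made-up column layout), not the real survey or Aella's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data, NOT the real survey: a smaller synthetic
# sample with anxiety coded 0 / 1 / 2 as described above, and 500
# made-up fetish columns.
n, m = 20_000, 500
anxiety = rng.choice([0, 1, 2], size=n, p=[0.6, 0.25, 0.15])
fetishes = rng.integers(0, 5, size=(n, m))

# Pearson r of the anxiety score against every fetish column at once.
a = (anxiety - anxiety.mean()) / anxiety.std()
f = (fetishes - fetishes.mean(axis=0)) / fetishes.std(axis=0)
r = a @ f / n

resolves_yes = bool((r > 0.12).any())  # YES iff any correlation exceeds 0.12
```

With independent synthetic data like this, every r lands within sampling noise of zero, so `resolves_yes` comes out False; the real data could of course differ.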


🏅 Top traders

#   Name   Total profit
1          Ṁ687
2          Ṁ682
3          Ṁ546
4          Ṁ489
5          Ṁ419
predicted NO

yay! thanks for the resolution


predicted NO

The base rate of null results is so high

predicted YES

@Aella When resolve?


Status update? I wanna see the results so I can find girls with the mental illness that matches my fetish.

predicted NO

how's this going? ready to resolve?

predicted YES

@Stralor @Aella Is resolution of this market based upon the data collected from first 350k respondents to have self identified as female in The Big Kink Survey? Is this data set publicly available (i.e. just awaiting statistical analysis)?

Feel like it's pretty certain if your floor is r>=0.12 and you have 500 (!!) fetishes

bought Ṁ10 of NO

What if someone checked only anxiety? Do they have to check it as the most severe then?

bought Ṁ10 of YES

Fingernails raked down her back,
She screams and moans as she begs for more crack.
The addiction to pain and pleasure is a dangerous blend,
But nothing will stop her until her wits meet their end.

@Mason Damn, what model is this?

Does the correlation have to be positive, or do negative correlations count as well?

bought Ṁ50 of NO

I'm inclined towards yes in the general case ("does anxiety correlate with having fetishes"), but not in the specific case ("does anxiety correlate with any specific fetish"), which is what this question asks, so 60% feels like a good time to get in on no. Looking forward to the result!

@PatS Yes in the general case implies Yes in at least some specific cases. If you are no more likely to have any given fetish if you have anxiety than if you don't, then you are also no more likely to have a fetish in general, unless having anxiety makes you less likely to have multiple fetishes at the same time without affecting the probability of any specific fetish. The latter situation would be extremely bizarre, and I can't imagine any reasonable explanation for why it would be true, so my credence that the general case implies the disjunction of specific cases is extremely high.

predicted NO

I previously did studies of sexuality and personality and I don't remember any correlations between anxiety and fetishes in women but I don't remember it very well so I might totally be wrong.

predicted NO

I used to strongly expect that every psychological trait would correlate with some sexual fetish or other because you're doing a zillion comparisons, but then aella found a bunch of negative results so I updated downward. It is really weird that you can pick some trait at random and try to correlate it with 500 different fetishes and not even find one out of 500 with r>0.12. I still kind of suspect a bug in her code.

bought Ṁ15 of NO

@JonathanRay In a high-dimensional vector space, the exponential majority of pairs of vectors will be ~independent.
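A quick numpy sketch of that concentration effect (purely illustrative numbers): cosine similarities of random direction pairs in high dimension pile up near zero, with spread about 1/sqrt(dim).

```python
import numpy as np

rng = np.random.default_rng(0)

# Cosine similarity of random direction pairs in a high-dimensional space.
dim, pairs = 10_000, 500
u = rng.standard_normal((pairs, dim))
v = rng.standard_normal((pairs, dim))
cos = np.einsum("ij,ij->i", u, v) / (
    np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
)

# The similarities concentrate near 0 with spread ~ 1/sqrt(dim) = 0.01,
# i.e. almost every pair of random directions is nearly orthogonal.
```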

@JonathanRay There's such a thing as adjustment for multiple comparisons for p-values; @tailcalled's comment suggests r does this somewhat automatically instead?

predicted NO

@b575 The traditional adjustments for multiple comparisons are done because people only work with small samples and therefore have some random noise in their computed correlations. But Aella has a sample size of like half a million, so the random noise in her correlations will only be +/- 0.004 or something like that. So basically the adjustment is not needed.
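That "+/- 0.004 or something" can be back-of-enveloped. Under the null, the sampling noise of a Pearson r is roughly 1/sqrt(n) (via the standard Fisher-z approximation; the numbers below are illustrative, not from the actual dataset):

```python
import math

# Under the null (true r = 0), the Fisher-z standard error of a sample
# correlation is approximately 1 / sqrt(n - 3).
n = 460_000
se = 1 / math.sqrt(n - 3)                         # ~0.0015

# Even the *largest* of 500 null correlations only reaches a few SEs:
# the expected max of m standard normals is roughly sqrt(2 * ln(m)) SDs.
m = 500
expected_max_r = math.sqrt(2 * math.log(m)) * se  # ~0.005
```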

bought Ṁ6 of YES

@tailcalled Wait, what? At least for p-values, that's not why they are done; they are done because that is quite literally how the probability shifts for independent comparisons (if you use the Šidák formula of 1-(1-p)^n - or, reversed, 1-(1-p)^(1/n); the Bonferroni correction is a little different, but they at least share the intuition). It's not true that you can omit the adjustment if you increase your sample, because the noise in your sample may well be persistent rather than truly random.
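The Šidák and Bonferroni thresholds being argued about are easy to compute; for 500 comparisons at a family-wise alpha of 0.05:

```python
# Šidák and Bonferroni per-test thresholds for m = 500 comparisons
# at a family-wise alpha of 0.05.
m, alpha = 500, 0.05

sidak = 1 - (1 - alpha) ** (1 / m)   # ~1.026e-4 per test
bonferroni = alpha / m               # 1.0e-4 per test

# Without any correction, the chance of at least one false positive
# across m independent tests each run at raw alpha is near-certain:
fwer_uncorrected = 1 - (1 - alpha) ** m
```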

predicted NO

@b575 When I was talking about the correlations, I was talking about r-values, not p-values. What I'm saying is that there is absolutely no reason to use p-values in the regime Aella is working in, and therefore no reason to use multiple comparison corrections.

predicted NO

I guess I should further clarify. When doing a study, there are various ways you can quantify your estimates, such as raw differences or slopes, d values or r values, percentage differences, etc. p-values do not go into this category, as they do not quantify an estimate of anything.

Instead, the purpose of p-values is this: quantities such as r have an uncertainty due to the sampling process, and this uncertainty means that a seemingly-interesting value could have happened by chance. p-values are a way of quantifying the plausibility of chance making them interesting-looking.

Aella has a sample size of <unreasonably big>, which means that her results are not going to be due to chance, so probably all of her p-values are 0. However the prediction market is about the r-value rather than the p-value.

predicted YES

@tailcalled "Aella has a sample size of <unreasonably big>, which means that her results are not going to be due to chance" - I think we just disagree on how many supposedly interesting things can be due to chance.

predicted NO

There's not much room for disagreement here. You can mathematically compute how much it can be due to chance. That's where my claim of "+/- 0.004 or something like that" is coming from.

predicted YES

@tailcalled OK, let me rephrase: I think you use an overly-strict definition of chance, where only truly-random-noise counts.

predicted NO

@b575 What kinds of chance do you have in mind?

predicted YES

@tailcalled Sorry, missed your answer! If there's a slight but consistent bias due to a factor not controlled for explicitly, sample size ain't going to do anything about it, but I would say that this bias is chance in terms of the controlled-for-explicitly factors.

predicted NO

@b575 What I don't understand is what kinds of chance you might have in mind where my points about p-values don't apply but your case for Bonferroni correction applies. AFAIK Bonferroni correction inherently assumes that the notion of chance of interest is the kind I am talking about.

Could you give an example of the sort of factor you have in mind?

predicted YES

@tailcalled There is no case where Bonferroni/Šidák corrections don't apply to p-values of multiple comparisons. There are cases where you may consider p-values themselves too uninteresting to bother, but it doesn't stop its applicability.

OK, everything is correlated with everything, just the r is different, right? Right. And a number of weak correlations can make up a spurious correlation - whose p-value will be informative for its spuriousness. (Something-something less pirates - more global warming)

@b575 "whose p-value will be informative for its spuriousness": It seems like by "spurious" you just mean "not causal." I don't think p-values are a very good tool for figuring out which correlations correspond to causal relationships. How do you think the p-value for pirates vs global warming looks?

predicted YES

@placebo_username No, not just "not causal". If A and B are both caused by C, it is not (directly) causal but is non-spurious.
(Also, I didn't say I particularly like p-values - just that to the extent you use them, you must apply the correction.)

predicted NO

@b575 1. I don't think the fewer pirates - more global warming correlation is spurious under this definition. I think they are both caused by more societal development:

less pirates <- more arrests of pirates <- societal development -> more use of fossil fuels -> more global warming

2. I still find your argument confusing, so let me try again.

AFAICT, there might be two notions of "due to chance" that you might be referring to:

a. Spurious correlations due to random sampling,

b. Uninteresting but systematic correlations due to confounding, collider bias, correlated measurement errors, etc.

My basic argument is this: if you are referring to problem a, then it is true that p-values and corrections for multiple comparisons are a somewhat valid and commonly used strategy. However, it is false that problem a is likely to drive the results, because problem a only occurs with small sample sizes, and Aella has a sample size of <unreasonably big>.

On the other hand, if you are referring to problem b, then it is true that problem b may persist even with a sample size of <unreasonably big>. However, p-values and multiple-comparisons corrections are not designed for, or appropriate to use for, problem b.

There does not seem to be any problem which persists at large sample sizes and which is addressed by adjusting for multiple comparisons.

predicted YES

@tailcalled It is not really true that a and b are fully distinct problems. "Random", unless we're in deep quantum mechanics, is a fancy name for "deterministic things we don't account for". It is also not true that problem a only happens in small sample sizes - for instance, it doesn't matter how big your overall sample is if one specific bin is ~twenty-thirty people, which has happened in Aella's polls before - and, due to the above, it's not true that p-values only help with problem a. Sampling can be (and usually is) off in multiple uninteresting ways, and the systematicity of this being off is a scale. If it's slightly off, it can generate a p-value of, say, 0.03 (or whatevs). And the chance is higher the more comparisons you draw, for the usual reasons.

As for pirate example, it's a good question whether this is spurious or not. An empirical question, if you will - albeit we can't really make the needed experiments on the needed scale.

predicted NO

> "Random", unless we're in deep quantum mechanics, is a fancy name for "deterministic things we don't account for".

@b575 This doesn't make a difference. If the deterministic things we don't account for have independent causes, then they would not induce correlations beyond what is expected by chance.

> It is also not true that problem a only happens in small sample sizes - for instance, it doesn't matter how big your overall sample is if one specific bin is ~twenty-thirty people, which has happened in Aella's polls before

Holding the bin proportions constant, the number of people in each bin scales with the sample size. As such, if you increase sample size, you also increase the number of people in each bin.

> Sampling can be (and usually is) off in multiple uninteresting ways, and the systematicity of this being off is a scale. If it's slightly off, it can generate a p-value of, say, 0.03 (or whatevs). And the chance is higher the more comparisons you draw, for the usual reasons.

Not sure what you are referring to here. Sampling being off isn't going to generate any specific p-value.

predicted YES

@tailcalled
> Holding the bin proportions constant, the number of people in each bin scales with the sample size. As such, if you increase sample size, you also increase the number of people in each bin.

Aha, I think this is the core problem: this is patently untrue of Aella's polls, where some bins are small just… just because, in strong disproportion to the overall sample size (e.g. age is rather obviously skewed).


> If the deterministic things we don't account for have independent causes, then they would not induce correlations beyond what is expected by chance.
And how do you measure your "what's expected by chance"? You make a prediction that all p-values for correlations with high r-values will be very small. If that's true, their being corrected won't make it worse - but if that's, as I argue, not quite true, corrections will help. So there is literally no reason not to apply the correction to p-values.

predicted NO

> Aha, I think this is the core problem: this is patently untrue of Aella's polls, where some bins are small just… just because, in strong disproportion to the overall sample size (e.g. age is rather obviously skewed).

@b575 No, this is because Aella's polls are unrepresentative, rather than because the sample size is too low. Two different problems.

> And how do you measure your "what's expected by chance"? You make a prediction that all p-values for correlations with high r-values will be very small. If that's true, their being corrected won't make it worse - but if that's, as I argue, not quite true, corrections will help. So there is literally no reason not to apply the correction to p-values.

The simplest way of measuring what's expected by chance is called a simulation study. In such a study, you would generate independent data points for two variables, and then compute their correlation. While the data points have been generated by independent means and so there are no systematic factors causing them to correlate, their correlation will not be exactly 0, due to random chance.

If you repeat the simulation study a bajillion times, you get a distribution of correlations due to random chance. From this distribution, you can then see how often the correlation is as big as your observed value in your real dataset. This gives you the p-value.

Usually in practice people don't do this with simulation studies, but instead with math that gives the same results as simulation studies. The key point still applies though: these methods only test against correlations that arise due to random chance in the sampling process, rather than correlations that arise due to systematic factors.
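The simulation-study recipe described above is a few lines of numpy (illustrative sizes; a real analysis would use the survey's actual n):

```python
import numpy as np

rng = np.random.default_rng(0)

# Null distribution of r: correlate two *independent* variables many
# times and record the correlation from each run.
n, reps = 1_000, 5_000
rs = np.array([
    np.corrcoef(rng.standard_normal(n), rng.standard_normal(n))[0, 1]
    for _ in range(reps)
])

# p-value of an observed correlation = fraction of null runs at least
# as extreme as the observed value.
observed = 0.08
p_value = float(np.mean(np.abs(rs) >= observed))
```

At n = 1,000 the null r's have spread of about 0.03, so an observed r of 0.08 is already rare by chance; at n in the hundreds of thousands the null spread shrinks toward the 0.001-0.002 range, which is the point being made here.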

predicted YES

@tailcalled ...Yes. These are what p-values are. That doesn't mean they aren't a safeguard against the things you say they can't guard against. Again, low-level systematic factors are basically indistinguishable from noise.

predicted NO

> Again, low-level systematic factors are basically indistinguishable from noise.

@b575 False. Low-level systematic factors are often in the r=0.01 to r=0.15 range, but with a sample size of 460000, noise would be in the r=-0.005 to r=0.005 range. Generally in social science people don't even think of factors that induce correlations of less than 0.01 as being relevant, but that's what the noise would be.

predicted YES

@tailcalled "but with a sample size of 460000, noise would be in the r=-0.005 to r=0.005 range" - for a non-representative, known-to-be-skewed sample?

predicted NO

@b575 Non-representative, known-to-be-skewed refers to noise notion b. p-values and multiple comparisons correction and so on assume noise of notion a.

predicted YES

@tailcalled Again, this is a false dichotomy because (nearly) everything is correlated to (nearly) everything.

predicted NO

@b575 Have you ever done any serious statistical analysis?

predicted YES

@tailcalled Yes. Never on Aella's level of sample sizes though, to be fair. (Actually, that's false, I did analyze Zaliznyak's noun sets, but for fairly simple things.)

predicted NO

@b575 There's lots of cases in serious statistical analysis where the dichotomy is useful, such as:

  1. Power analysis: power analysis only applies to noise of kind a.

  2. Sensitivity analysis: errors of kind a can be easily bounded with math/simulations, making the sensitivity analysis simple, while errors of kind b can take many different sizes.

  3. Path tracing: if you have a specific model of the data-generating process then that tells you what to expect in the case of kind b, but noise of kind a will still introduce deviations from the expectation.

Pretty much every decision you'd make about how to handle noise depends on whether it is of kind a or kind b. As such I find it very frustrating that you don't want to use the dichotomy.
