I'm rerunning / updating my [Manifold calibration analysis code](https://github.com/vluzko/manifold-markets-python).
Considering all resolved binary markets, will Manifold be correctly calibrated halfway between the start and close of the market?
Procedure: All markets get bucketed into 10% intervals (5% at the edges). This gives us 11 buckets: 2.5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 97.5%. Then we compute what fraction of each bucket actually resolved YES. Perfect calibration would be if 2.5% of the 2.5% bucket resolved YES, 10% of the 10% bucket, etc.
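A minimal sketch of the bucketing step, not the code from the linked repo. It assumes each market has already been reduced to a `(probability, resolved_yes)` pair, and that a probability sitting exactly on a boundary (e.g. 15%) goes to the higher bucket; the names `bucket_center` and `calibration_table` are made up for illustration.

```python
from bisect import bisect_right

# The 11 bucket labels from the procedure, and the 10 boundaries between them.
BUCKET_CENTERS = [0.025, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.975]
BUCKET_EDGES = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]


def bucket_center(prob):
    """Return the label of the bucket that a probability in [0, 1] falls into."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"probability out of range: {prob}")
    # Assumption: a value exactly on an edge is placed in the higher bucket.
    return BUCKET_CENTERS[bisect_right(BUCKET_EDGES, prob)]


def calibration_table(markets):
    """Map bucket label -> (number of markets, fraction that resolved YES).

    `markets` is an iterable of (probability, resolved_yes) pairs.
    Buckets with no markets are left out of the result.
    """
    counts = {}
    for prob, resolved_yes in markets:
        center = bucket_center(prob)
        total, yes = counts.get(center, (0, 0))
        counts[center] = (total + 1, yes + (1 if resolved_yes else 0))
    return {c: (total, yes / total) for c, (total, yes) in counts.items()}
```

For example, `calibration_table([(0.12, True), (0.12, False)])` puts both markets in the 10% bucket with half of them resolving YES.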
I will consider Manifold well-calibrated if 7 out of the 11 buckets are within 5 percentage points of perfect calibration. For example, if 5% to 15% of the markets in the 10% bucket resolve YES, then the 10% bucket is well-calibrated; if 7 buckets are well-calibrated, this market resolves YES. (This is not an ideal measure of calibration, but I want something simple so that the market resolution criteria are easily understandable.)
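The resolution rule can be sketched as below. This is an illustration under assumptions, not the actual resolution code: it reads "7 out of the 11" as "at least 7", treats the 5 percentage point band as inclusive, and counts a bucket with no markets as not well-calibrated. The function names are hypothetical.

```python
BUCKET_CENTERS = [0.025, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.975]


def count_calibrated(observed, tolerance=0.05):
    """Count buckets whose observed YES fraction is within `tolerance` of the label.

    `observed` maps bucket label -> fraction of markets in that bucket that resolved YES.
    Assumption: a bucket missing from `observed` (no markets) does not count.
    """
    return sum(
        1
        for center in BUCKET_CENTERS
        if center in observed and abs(observed[center] - center) <= tolerance
    )


def resolves_yes(observed, required=7, tolerance=0.05):
    """True if at least `required` buckets are well-calibrated (assumed reading of the rule)."""
    return count_calibrated(observed, tolerance) >= required
```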
@Yev Was this market midpoint tho?
Hmm, this seems to be pretty consistently undershooting. I wonder if someone could bot everything up by 1% and be profitable (excluding the 1-5% range ofc, where we expect a higher % due to ppl not bidding down as returns diminish; botting everything at 4-5% down by a percent or two might also be a reasonable idea in expectation).
@GeorgeVii I think I accidentally measured halfway between start and resolve instead of halfway between start and close. I don't expect it to make much of a difference, but I should redo the calculations.
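To make the two midpoints concrete, here is a rough sketch. It assumes timestamps in milliseconds and a bet history given as time-sorted `(timestamp_ms, probability_after_bet)` pairs; those shapes and the function names are illustrative assumptions, not the repo's actual data model.

```python
def midpoint(start_ms, end_ms):
    """Timestamp halfway between two timestamps (integer milliseconds)."""
    return start_ms + (end_ms - start_ms) // 2


def probability_at(history, t_ms, initial_prob):
    """Market probability at time `t_ms`.

    `history` is a time-sorted list of (timestamp_ms, probability_after_bet).
    Returns the probability after the last bet at or before `t_ms`,
    or `initial_prob` if no bet had been placed yet.
    """
    prob = initial_prob
    for ts, p in history:
        if ts > t_ms:
            break
        prob = p
    return prob
```

With a market created at 0, closed at 100, and resolved at 200, the start-to-close midpoint is 50 and the start-to-resolve midpoint is 100, so a bet landing between those two times would show up in one measurement but not the other.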
@SG I am certainly going to run that as a reference class but the purpose of this question isn't to ask about "good" Manifold markets or anything like that. It is just asking "if you see a Manifold market and all you know is that it's a real market intended to be bet on, how much information does that give you?"
(If all markets are included, markets like https://manifold.markets/Marketeer1?tab=markets are going to distort the results.)
@Yev I was originally planning on all markets (thinking the number of markets like this would be small), but this may actually be enough markets to distort the results, particularly for the 50% bucket. I will filter for markets with at least 5 trades. If that does not work / someone else points out other things that need to be filtered I will update the criteria further.
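A sketch of what that filter might look like. The `Market` class and its field names are hypothetical stand-ins rather than the Manifold API or the repo's types, and dropping markets that resolved to anything other than YES or NO is an extra assumption needed to compute a YES fraction.

```python
from dataclasses import dataclass


@dataclass
class Market:
    # All field names here are hypothetical, chosen for illustration only.
    outcome_type: str   # e.g. "BINARY"
    is_resolved: bool
    resolution: str     # e.g. "YES", "NO", or something else
    num_trades: int


def is_eligible(market, min_trades=5):
    """Keep resolved binary markets with at least `min_trades` trades."""
    return (
        market.outcome_type == "BINARY"
        and market.is_resolved
        # Assumption: only YES/NO resolutions are counted.
        and market.resolution in ("YES", "NO")
        and market.num_trades >= min_trades
    )


def filter_markets(markets, min_trades=5):
    """Return the markets that pass `is_eligible`, preserving order."""
    return [m for m in markets if is_eligible(m, min_trades)]
```

Keeping the threshold as a parameter makes it easy to rerun the analysis if the criteria get updated further.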