[M$100-1000] Provide Feedback on Calibration City!
No bounty left

As part of the Manifold Community Fund I submitted a project which included building out various improvements to Calibration City, a prediction market evaluation site. The project has grown tremendously, and I've just finished a new round of features that I'm super excited about!

I'm still working on things, but I've accomplished most of the features on my list. That's where you come in: I've been buried in this thing for so long I can't see its faults. I need fresh sets of eyes who can take a look and see what needs improving or explaining.

Some feedback I'm interested in, and the amount I will pay for each:

  • M$100: General feedback on how the site feels to use, if you would ever use it on your own in the future, and under what circumstances you find it useful

  • M$100: Things you found confusing, things that could be explained better, pain points when using the site

  • M$100: Typos, visual bugs, etc.

  • M$100-500: Feature suggestions, especially those close to the core product

  • M$100-500: Good external resources I can link to about prediction markets (research papers, news articles, etc.)

  • M$100-500: Text or documentation that is factually incorrect or misleading

  • M$1000+: Significant bugs, errors, or mistakes

I may pay out less for:

  • Issues I am already aware of, such as poor usability on mobile or the general need for more documentation

  • New features falling very far outside the scope of the project, such as individual user calibration

  • Things that are very infeasible to implement

I'll be focused on polish, usability, and documentation before the next evaluation, but I want to keep a healthy roadmap of future improvements that I can continue working on over time. I plan to refill this bounty pool if I continue getting good suggestions!

+Ṁ1,000

Damn, this is pretty good 👍

Some feedback:

  • I would mostly use this as a handy way to show others that prediction markets are pretty well calibrated and accurate.

  • On mobile, the page navigation is revealed via tapping the 3 dots on the right, and the graph settings via the 3 lines on the left. It seems it would be much more intuitive for page navigation to be via 3 lines on the left, and graph settings under a cog on the right.

  • Why are the prediction markets listed at the end of the introduction not the same ones included in the later graphs?

  • Would be nice to include some additional error-statistic options besides Brier score (equivalent to mean squared error, based on the description), e.g. mean error (keeps +/- signs, so shows bias), mean absolute error, root mean squared error, standard deviation of error, etc.

  • Accuracy graphs page would be much cleaner if by default it had average lines with ribbons of either standard deviation (measure of spread) or standard error (measure of uncertainty in the average) rather than all the component data points.

  • When setting x-axis to number of unique traders, platforms that don't release that info should be a horizontal line across all values rather than a single data point at 0 traders.

  • Also regarding traders/volume x-axis, it would be nice to be able to set it to a logarithmic scale, with sparser data points at higher trader/volume numbers grouped into bins if needed, so we can see the effects of very high trader/volume numbers, instead of being restricted to lower values.

  • Would be nice to have the baseline expected score for a well-calibrated but poorly-informed predictor (e.g. 0.25 Brier score) shown on the graphs as a reference.
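The statistics suggested above are each a one-liner. A rough sketch (purely illustrative, not the site's implementation; assumes NumPy) that also checks the 0.25 baseline, which comes from a predictor that always forecasts 50% on binary outcomes:

```python
import numpy as np

def error_stats(p, y):
    """Summary error statistics for probabilistic forecasts.

    p: predicted probabilities in [0, 1]
    y: resolved outcomes, 0 or 1
    """
    err = p - y
    return {
        "brier": np.mean(err ** 2),   # mean squared error
        "mean_error": np.mean(err),   # keeps sign, so shows bias
        "mae": np.mean(np.abs(err)),  # mean absolute error
        "rmse": np.sqrt(np.mean(err ** 2)),
        "std_error": np.std(err),     # spread of errors
    }

# A well-calibrated but uninformed predictor (always 50%) has
# error +/-0.5 on every binary outcome, so its Brier score is
# exactly 0.25 -- a natural reference line for the graphs.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=10_000)
uninformed = error_stats(np.full(y.shape, 0.5), y)
```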

Also just something interesting I noticed: Manifold's accuracy increases with trader volume, but it actually decreases for some other platforms lol

+Ṁ500

Problems detected on my first pass through:

  • The meaning of the "Probability at Specified Percent" option for x-axis binning on the calibration plot was initially unclear to me (percent of what?), and it had no explanatory mouseover text; its meaning only became clear when I switched from that option at 50% to "Probability at Market Midpoint". I'd suggest rephrasing it and/or adding hover text.

  • On the Introduction page, there's an empty frame in place of the calibration plot/reliability diagram, rather than any sort of visible chart. (The same problem isn't present elsewhere on the page.)

  • The list of prediction market platforms at the bottom of the Introduction includes various markets not aggregated into the data on the site, while not including various other markets which are aggregated into the data on the site. I predict that this will produce confusion for newcomers.

Despite these problems, I am overall impressed! This seems like a good resource, and in particular I'm feeling very tempted to send my mother a link to its Introduction page, since I've recently been trying-and-largely-failing to get her to understand how to think in probabilities other than "yes" and "no" and "50/50 coinflip".

+Ṁ400
  • I might put Kalshi lower down on the list of markets when you load in. It's less recognizable (I think so anyway, this is the first I've heard of it), so it could be good to hook people with a more popular one first.

  • The bubbles getting cut off at the top looks a little weird visually

  • A colorblind mode could be good, I don't know off the top of my head if the current site accounts for that.

  • When you sort by platform, the effect of the bubbles flying up is neat when you bring them back. It seems like it should have the same effect when they go away though, for consistency.

+Ṁ200

Might only be on mobile, but the filters don't let you choose multiple options at the same time, i.e. I can't view only Manifold and Kalshi.

The Brier plot function is super cool!

Would be nice if I could click the key and highlight just one series (although this again is maybe a mobile issue)

+Ṁ200

The accuracy tracker page takes about 6 seconds to load (in Europe), mainly because the JavaScript takes 4 seconds to transfer (even though it's only about 120 kB). Try adding Cloudflare into the loop? I don't see anything wrong with the source.

+Ṁ100

Coming to the site for the first time, I did not understand what the circle size stands for. It looks cluttered; a line plot would be better.

When I clicked the circles, the number of predictions did not correlate with their sizes.

+Ṁ100

I recommend having an option to only count markets that are present on all (selected) platforms. It's not such a fair comparison to measure a score on a set of questions against a score on an entirely different set of questions.
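The matched-question option amounts to a set intersection over each platform's question pool. A minimal sketch, assuming markets can be keyed by a shared question identifier (the `question_id` field and the records here are hypothetical, not the site's actual schema):

```python
# Hypothetical market records; the real data model may differ.
markets = [
    {"platform": "Manifold", "question_id": "q1", "brier": 0.18},
    {"platform": "Kalshi",   "question_id": "q1", "brier": 0.21},
    {"platform": "Manifold", "question_id": "q2", "brier": 0.10},
]

def common_questions(markets, platforms):
    """Return the question ids present on every selected platform."""
    per_platform = {
        p: {m["question_id"] for m in markets if m["platform"] == p}
        for p in platforms
    }
    return set.intersection(*per_platform.values())

# Only questions on both platforms are kept, so each platform's
# score is computed over the same set of questions.
shared = common_questions(markets, ["Manifold", "Kalshi"])
```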
