Will a bug be discovered in 538's 2024 presidential forecast that affects it by at least 3% by election day?
8% chance

538's presidential forecast this year differs quite a bit from Nate Silver's forecast (paywalled) and from much of the media/pundit consensus on the race. Some have pointed out what they view as internal inconsistencies, and more broadly, the top line as of this writing - Biden as a very, very slight favorite - is surprising (at least to me) given recent polling.


Is this the result of modeling differences, resolvable only by evaluating long-term performance, or is it a straightforward bug? 538 went against the grain and was vindicated in 2016 (but that was back when they had Nate Silver and his model) - is this time different?

Fine Print:

  • Resolves based on statements from 538 or ABC News (from anyone who credibly speaks for them, on any platform)

  • "Presidential Forecast" here refers only to the topline prediction percentage.

  • "Bug" is subjective - I reserve the right to make a judgment call. I will not bet on this market.

    • Some examples of bugs:

      • Math error in the code that results in incorrect output

      • Unintentional misconfiguration of parameters

      • Data input cleaning/parsing issues

    • Some examples that are not bugs:

      • Intentional, but in retrospect overly or insufficiently aggressive parameter choices (weighting of fundamentals, time decay, etc.)


538 published an explainer on their Biden vs Trump model: https://abcnews.go.com/538/538-adjusting-election-model-harris-versus-trump/story?id=112563822

As of now, I don't believe I would call anything they say a "bug", but I do think it's iffy. See conversation here: https://manifold.markets/Sketchy/why-will-538-say-their-model-delaye

I'm very open to discussion here.

I mean, it depends on your definition of bug. I point to these quotes:

  1. "the rigidity of this matrix, contrasted with the uncertainty about the polls and overall sparseness of our problem, ended up forcing the model's overall estimates back toward the fundamentals more than intended", and

  2. "Because it was very easy to sample simulations that matched the fundamentals model, and less so to pick numbers that matched the polls given the constraints and uncertainty above, the model hewed closer to the fundamentals than we'd have liked."

That is, they're claiming they didn't intentionally weigh the fundamentals as heavily as the code they implemented actually weighed them, and they only figured out it was overweighing them in retrospect.

Could that be a cover for "actually, we weighted the fundamentals heavily deliberately, but in retrospect that was too much"? Sure. But the weirdness of things like the Wisconsin numbers inclines me toward "the code we wrote didn't do in practice what we thought in prospect it would", which is a class of bug.
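
To make that class concrete, here's a minimal toy sketch (not 538's actual code; the normal-normal setup and every number are made up for illustration) of how a prior that ends up tighter than intended quietly forces estimates back toward the fundamentals, even though every individual line "works":

```python
# Toy illustration only, NOT 538's model. The fundamentals act as a normal prior,
# a state poll acts as the data; all numbers are hypothetical.

fundamentals_mean = 0.52   # fundamentals-based Dem two-party share (assumed)
poll_mean = 0.47           # polling average in the same state (assumed)
poll_sd = 0.03             # sparse, noisy polling -> large uncertainty

def posterior_mean(prior_sd):
    """Precision-weighted average of the fundamentals (prior) and the poll (data)."""
    w_prior, w_poll = prior_sd ** -2, poll_sd ** -2
    return (w_prior * fundamentals_mean + w_poll * poll_mean) / (w_prior + w_poll)

# What the modeler intends: a loose prior that lets the polls speak.
print(posterior_mean(prior_sd=0.05))   # ~0.483, pulled most of the way to the poll

# What an unintentionally rigid prior does instead: estimates hew to the
# fundamentals, and the mismatch only shows up when the outputs look odd.
print(posterior_mean(prior_sd=0.01))   # ~0.515, stuck near the fundamentals
```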

I agree it's definitely iffy and depends on your definition of bug. The way I'm parsing this in my brain is basically 538 going:

1. We have a sense for what a good model would look like
2. We plan out a model in the abstract with high level ideas
3. We concretely plan out how that model will work mathematically
4. We implement that plan in code

A disconnect between 3 & 4 is a bug. A disconnect between 1 and 2, or between 2 and 3, I would not consider a bug, and to me it seems like this is a disconnect between 2 & 3 - i.e., the way they concretely instantiated their high-level ideas about intra-state correlation turned out not to match their high-level ideas of how it should look. As I said in the other market, I would call this "technical issues", but not a bug. But I do think it's pretty open to interpretation, which makes this market hard to resolve.

That said, it seems like 3 & 4 were heavily intertwined for 538, which IMO is not a great sign for their methodological rigor. But it also makes it hard to call any particular thing a "bug" - if the way they built their model was "we plugged and chugged R library functions until it spat out something that looked kinda OK", what even is a bug?

Could that be a cover for "actually, we weighted the fundamentals heavily deliberately, but in retrospect that was too much"?

A model having surprising/unintended results != a bug. Often assumptions have surprising consequences.

That is, they're claiming they didn't intentionally weigh the fundamentals as heavily as the code they implemented actually weighed them, and they only figured out it was overweighing them in retrospect.

They didn't say this was a result of the code implementation. And when they decided they ought to give more weight to the polls, they didn't just fix the code, because the output wasn't a fixable code issue but was downstream of fundamental assumptions. E.g.

"the rigidity of this matrix, contrasted with the uncertainty about the polls and overall sparseness of our problem, ended up forcing the model's overall estimates back toward the fundamentals more than intended"

isn't describing a coding mistake, but a consequence of the relative certainty of their inputs. (They gave the model uncertain polls and a more certain covariance structure, so inevitably the model places more weight on the certain part. The extent of that may have been surprising/unintended, but it's not a bug - that's just stats, and sometimes you give a model bad inputs/assumptions.)
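
To spell out the "that's just stats" point: in a precision-weighted estimate, the share of weight the fundamentals get is fully determined by the relative uncertainties you feed in. A toy sketch (made-up numbers, not 538's code):

```python
# Toy illustration, not 538's model: how much weight the fundamentals receive
# as a function of poll uncertainty, given a confident ("rigid") prior.
prior_sd = 0.02   # assumed tight fundamentals/covariance structure

for poll_sd in (0.01, 0.03, 0.06):   # increasingly uncertain / sparse polls
    w_prior, w_poll = prior_sd ** -2, poll_sd ** -2
    weight_on_fundamentals = w_prior / (w_prior + w_poll)
    print(f"poll_sd={poll_sd:.2f} -> weight on fundamentals {weight_on_fundamentals:.0%}")

# poll_sd=0.01 -> 20%, poll_sd=0.03 -> 69%, poll_sd=0.06 -> 90%:
# the more uncertain the polls, the more the estimate leans on the
# fundamentals -- no code error required.
```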

FYI: Biden dropping out does not change the resolution criteria for this market - 538 is planning on publishing a new forecast when the eventual Dem nominee is announced, and any bugs in that new forecast would resolve this YES (as would bugs found retrospectively in the old forecast).

bought Ṁ150 NO

👀