Will LLMs estimate a probability over 40% that the origin of COVID was a lab leak in 2034?

On May 1, 2034, I will identify the top LLM assistant from OpenAI available for under $100 a month, run the following prompt five independent times, and average the results. If the average estimate across the five outlets is over 40%, the claim resolves true.

If OpenAI has no suitable model, I'll find the best similar model. I'll do this from the US as a US citizen.

Here is the prompt:

Please reply with JSON following the template, for each type of media: New York Times, the New Yorker, the Economist, the Guardian UK, China's People's Daily.

Make an estimate suitable for each media outlet based on its beliefs at the present time.

Template:

[{"Name": "the New York Times", "estimate that covid escaped from a lab": "3%", "reasons": ["details of reasons for this view..."]}, ...]
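The resolution rule above (average the outlet estimates, average across the five runs, compare with 40%) could be scored roughly as follows. This is only a sketch under assumptions: it assumes each reply is valid JSON in which every outlet object carries the estimate key from the template with a value like "3%", and the names `ESTIMATE_KEY`, `THRESHOLD`, `parse_percent`, `outlet_estimates`, and `resolves_true` are hypothetical helpers, not part of the market description.

```python
import json
from statistics import mean

# Assumption: the key is copied from the prompt template; real replies may vary.
ESTIMATE_KEY = "estimate that covid escaped from a lab"
# Threshold from the market description: strictly over 40% resolves true.
THRESHOLD = 40.0


def parse_percent(text):
    """Turn a string such as '3%' or ' 12.5% ' into a float (3.0, 12.5)."""
    return float(text.strip().rstrip("%"))


def outlet_estimates(reply_json):
    """Return the per-outlet estimates (in percent) from one JSON reply."""
    entries = json.loads(reply_json)
    return [parse_percent(entry[ESTIMATE_KEY]) for entry in entries]


def resolves_true(replies):
    """Average the outlets within each run, then average across runs."""
    per_run = [mean(outlet_estimates(reply)) for reply in replies]
    return mean(per_run) > THRESHOLD
```

Calling `resolves_true` with the five raw replies from the five runs gives the resolution under these assumptions. A reply that is not valid JSON, or that omits the key, raises an error instead of being silently skipped.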


Do you want to distinguish between the claims it came from the Wuhan lab vs the Fort Detrick lab?

The question of political leanings in 2034 is terribly boring... I'd much rather see the result based on a more impartial metric of probability.

predicts YES

I will retest whenever I hear of a new version of an LLM available with a more recent training set.

predicts YES

Today's results

OpenAI:

GPT-3.5: 20, 5, 10, 2, 0.1 => 7.42%

GPT-4: 5, 4, 10, 6, 1 => 5.2%

Anthropic:

Claude: 10, 20, 30, 5, 1 => 13.2%

Claude 2.0: 10, 20, 30, 40, 15, 1 => 23.1%

Microsoft:

Bing: 3, 0, 20, 10, 0 => 6.6%

predicts YES

Here's a market about whether when this resolves, there will be much tighter agreement and repeatability in the answer than today:

predicts YES

I tested it by asking whether the JFK assassination was a conspiracy, and the average was still only 10%, 61 years later.

predicts YES

There's still quite a bit of variability. I just got 5% on GPT-4.

GPT-3.5 is at 22% with old data

GPT-4 with data from 2021 is at 23% and 18%