Will OpenAI publish a paper accompanying GPT-4?
Resolved YES (Mar 14)

I will only resolve this if there is a model that is explicitly "GPT-4", or OpenAI announces that they are switching their naming scheme and the model released is the successor to GPT-3, but won't be named GPT-4.

Blog posts, press releases, etc do not count as papers. An arxiv preprint counts.

After checking with other academic types, I have decided to count the technical report.

As a side note: I do think the market updated too quickly. It was quite close, and if one or two people I talked to had switched their opinions I would have resolved NO. Also, I think this is correct - most academics would definitely not count this, and even in CS, which tends to have much looser rules around what counts, this is borderline. I'm only counting it because it does share some interesting findings beyond just the benchmark performance. If I make markets in the future asking about "papers", bear in mind that I am using a definition much closer to what academia uses.

bought Ṁ1,000 of YES

@vluzko Then you should clearly state that definition in the description, because that is not how most people use the term "paper".

@IsaacKing I think that weighted by people who actually use the term it is. Also, like, I'm tailoring these markets specifically for academics / people close to academia. If other people want to bet on them that's good, but I am mainly trying to draw in domain experts and I'm not going to go out of my way explaining all the jargon in the field unless people explicitly ask for it.

predicted YES

@vluzko The word "paper" isn't something that outsiders will recognize as academic jargon, so I do think you have a responsibility to clarify it preemptively. Or at least include a general disclaimer in your markets about words not meaning what people will expect them to mean.

bought Ṁ1,000 of YES

If this resolves to NO I'll consider that an improper resolution. The description didn't state any requirements as to the contents of the paper. All it said was

Blog posts, press releases, etc do not count as papers. An arxiv preprint counts.

I think this makes it pretty clear that it's talking about the formatting and publishing style, not whether it contains specific data.

bought Ṁ20 of NO

@IsaacKing The first 20 pages read like a "technical white paper" (for example https://www.ti.com/lit/wp/sloa190b/sloa190b.pdf, but more product-focused than that one). I learned only a couple of high-level technical things about it that I did not already know (and none that were outside the realm of possibility), and then lots about what it can do. Great for someone who wants to investigate buying API access.

I'm not saying that there's nothing like this on arxiv, but if you select a random arxiv paper you're going to get more technical content than that.

Also, I don't know why there would be a NO resolution now - the market closes in 2026.

predicted NO

@Imuli There would be a NO resolution because of the word "accompanying" in the title. Logically, if any paper is to be released it must be soon.

bought Ṁ20 of NO

Oooh, I interpreted that as associated with rather than at the same time as. :)

@Imuli I did mean "roughly concurrent with". If the technical report didn't resolve it I would have waited ~1 week and then resolved NO if nothing was published.

bought Ṁ20 of NO

They released a "paper" but:

Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.
We are committed to independent auditing of our technologies, and shared some initial steps and ideas in this area in the system card accompanying this release. We plan to make further technical details available to additional third parties who can advise us on how to weigh the competitive and safety considerations above against the scientific value of further transparency.

@Mira This is the main reason why I have not already resolved YES. Is the technical report responsible scientists releasing everything they can without releasing dangerous information, or is it a dressed up bit of marketing? I think very likely the former, but I won't know for sure until I've at least finished skimming the thing.

sold Ṁ66 of NO
predicted YES

Not too much detail on how the model was trained, but ~100 pages of detailed results on its behavior, in an academic-style PDF. I think this counts?

@Hedgehog It might

bought Ṁ5,000 of YES

@vluzko What do you mean "might"? This is clearly a paper. It even says "paper" in the URL.

sold Ṁ13 of NO

@IsaacKing "Paper" doesn't mean "thing formatted with LaTeX"; it means "thing intended to communicate with the research community". That is why I gave the example of "arxiv preprint" and not "pdf on a website". I am going to read/skim the technical report and decide if I think this thing is actually communicating with the broader research community.

bought Ṁ10 of NO

https://openai.com/blog/planning-for-agi-and-beyond

As another example, we now believe we were wrong in our original thinking about openness, and have pivoted from thinking we should release everything (though we open source some things, and expect to open source more exciting things in the future!) to thinking that we should figure out how to safely share access to and benefits of the systems. We still believe the benefits of society understanding what is happening are huge and that enabling such understanding is the best way to make sure that what gets built is what society collectively wants (obviously there’s a lot of nuance and conflict here).