Plausible open source implementation of the Q* algorithm before Feb 2024
Resolved NO on Feb 2

Recent reporting has described a purported breakthrough at OpenAI involving a “Q*” algorithm, which reportedly allowed a model to perfectly solve grade-school math problems.

The name tentatively suggests some combination of Q-learning and A* search. There has been significant curiosity in recent months about integrating search-based RL methods into LM training. Public speculation has already begun about whether Q* implements this vision, and there will no doubt be interest in inferring or reverse-engineering the algorithm.
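
For readers unfamiliar with the two textbook techniques the name hints at, here is a minimal Python sketch of each: one tabular Q-learning update and a small A* search. It only illustrates the standard algorithms. Nothing here is known to reflect what Q* actually does, and the function names and default values (`alpha`, `gamma`) are assumptions chosen for illustration.

```python
import heapq


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """One tabular Q-learning step (textbook form, not OpenAI's Q*).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    `q` is a dict keyed by (state, action); missing entries count as 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def a_star(start, goal, neighbors, heuristic):
    """Textbook A* search: expand nodes in order of f = g + h.

    `neighbors(node)` yields (next_node, step_cost) pairs and
    `heuristic(node)` estimates the remaining cost to `goal`.
    Returns the path as a list of nodes, or None if no path exists.
    """
    frontier = [(heuristic(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(nxt), new_g, nxt, path + [nxt]),
                )
    return None
```

How (or whether) a learned value function like `q` would serve as the heuristic inside a search like `a_star` for language models is exactly the open question this market is about; the sketch deliberately keeps the two pieces separate.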

This question resolves YES if an algorithm is open-sourced which is explicitly inspired by, and/or directly purports to implement, Q*. This algorithm’s details do not necessarily need to match my interpretation of the name (Q-learning + A*) as long as it is plausibly related to whatever is publicly known about Q* by Feb 1 2024.

Early experiments published along with the algorithm must show that the algorithm demonstrably and significantly improves some component of LM capabilities, for instance math problem-solving. As a rough baseline, I’ll define a “significant” improvement as one that is at least as subjectively impressive to me as the improvement Minerva represented over the previous SOTA in LM math problem-solving, though the improvement could involve a different domain than math. I reserve the right to modify or clarify this criterion—I’d like to capture the idea that this technique should be an unusually effective improvement over previous methods. I won’t bet in this market.

Confirmation that the open-sourced algorithm matches or imitates the actual internal Q* algorithm is not required, as long as the open-sourced algorithm plausibly relates to whatever we publicly know about Q* by market closure, though a hint to this effect from an OpenAI employee would significantly influence a YES resolution. If OpenAI directly open-sources the algorithm, this question resolves YES.


It’s funny because this would have resolved YES if the cutoff had been in April (due to Quiet-STaR). Given more sober assumptions about the time involved in polishing a research project, I should’ve allowed for a longer deadline.

@AdamK given Quiet-STaR doesn't plausibly have much to do with Q*, I would have been quite upset if you had resolved this market YES based on it!

"explicitly inspired by, and/or directly purports to implement, Q*."

Where is this made explicit?

We know nothing about Q*, if it exists, so how could this paper be inspired by it or implement it? And even if it somehow is, it's not explicit as required by your criteria.

@chrisjbillington Fair point, and I certainly wouldn't have resolved in that case without time for traders to argue whether the criterion had been satisfied. However, I think the authors of Quiet-STaR were pretty conspicuous in their choice of name to suggest that their method was inspired by Q*.

We know AI labs are interested in search-based techniques with corresponding reward signals for text prediction, and the name Q* caused wide speculation that this is what OpenAI had achieved. I think you can make a strong case that Quiet-STaR fits this mold.

On the other hand, I think the biggest barrier to a YES resolution had the cutoff been in April would have been the magnitude of improvement. Quiet-STaR's results are intriguing, but do not quite reach the level of "as subjectively impressive to me as the improvement Minerva represented over the previous SOTA in LM math problem-solving." I suspect published work improving on this direction in the coming months will eventually reach this point, though.

I think my biggest takeaway from thinking about all this is that I could have done a better job of operationalizing this question, since Quiet-STaR pretty much matches the schema I had in mind when formulating this question.

@AdamK

"I think you can make a strong case that Quiet-STaR fits this mold."

I don't really think so. The fact that search-based techniques are a hot topic means papers like this were fairly inevitable, and I agree with your second point that unless there's something more than that, it doesn't really fit the mould. This was an active area of research anyway, so papers like this would exist with no causal connection to a model called Q* being researched within OpenAI.

Q* was supposed to be not so much clever as scarily efficient. So far it sounds like these tree-of-thought approaches are expensive, and the authors have figured out some way to make it not too expensive, but this doesn't sound like the kind of efficiency improvement that made people scared about Q* (if those reports are true).

"I think my biggest takeaway from thinking about all this is that I could have done a better job of operationalizing this question, since Quiet-STaR pretty much matches the schema I had in mind when formulating this question."

The problem seems to be the causal link to OpenAI that I, at least, inferred from the question. Is that your impression too? If you wanted to ask whether there'd be a paper combining AlphaGo-style self-improvement and tree-of-thought approaches, that is a valid question, but it also seems likely to have happened even if the leaks about Q* from OpenAI were totally fabricated, or even if the leaks never happened. That would make it not "an implementation of" or "inspired by" Q*, since you can't be inspired by or implement something that doesn't exist.

I guess you can be inspired by speculation about something whether it exists or not. But I would guess that probably didn't happen either; I suspect the beginnings of Quiet-STaR predate Nov 2023. Maybe ML is different, but my experience is that research very rarely moves from conception to publication in that short a time.

AlphaGeometry counts, right?

@Sss19971997 No, no one is claiming that AlphaGeometry is related to Q*.

Would a leak of the real deal count?

@firstuserhere Yeah, it would resolve YES. This market gauges whether something like Q* is open-sourced, and the real deal would certainly count.
