The o1 model uses an internal CoT process, hidden from the user.
This market resolves YES if anyone can get o1 to leak all of a hidden CoT process. The reproduced parts of the CoT must be verbatim, rather than being summarized.
This market is identical to /singer/will-anyone-get-o1-to-leak-its-hidd except that the entire hidden CoT text must be leaked.
I won't bet.
I trust this user enough to resolve the market now:
https://manifold.markets/Soli/will-anyone-be-able-to-get-openais#fd1yi1dyzrd
If nobody objects, I'll resolve at the end of the week.
This certainly looks like an entire CoT leaked:
https://chatgpt.com/share/67521821-025c-8010-9c24-f9144865fa3e
@chrisjbillington I'm nearly convinced. Is this yours, or where did you get it from?
The only other explanation I can imagine would be if the custom instructions had told the model to talk like this.
@singer it was linked here, I don't know its origin:
https://manifold.markets/Soli/will-anyone-be-able-to-get-openais#fd1yi1dyzrd
It being a chatgpt.com share link instead of just a copy-paste at least means that aspect of its authenticity doesn't need further verification.
@mathvc it's reproducible, I just tried, see link below.
Give it a go yourself.
https://chatgpt.com/share/675a7e74-60d8-8013-9d06-27a8b41853cb
https://www.reddit.com/r/ChatGPT/comments/1fussvn/o1_preview_accidentally_gave_me_its_entire/
Supposedly an entire chain of thought? Looks plausible?
@OP Have you tried? I have not even gotten it to admit that the letter 'e' appears in the CoT, much less output the whole thing. OpenAI is also very explicit that they don't want anyone to ever see the CoT.
@singer No, I don’t have access to it. I’m just assuming, based on what I’ve read from OpenAI, that parts of the CoT can be extracted, and for a particularly simple and uncontroversial question, all of it.
Edit: changed my mind. They’re probably hiding it to prevent other models training on it.
@OP It would also be awkward if the CoT were shared sometimes but not others (due to IP laws or controversial material). The CoT text isn't very useful to the user on its own, so it makes sense that they'll never share it intentionally.
@AlekseyVoropaev I'm not sure. I think it would be a mixture of many people replicating it, some popular experts saying it happened, and OAI failing to deny it. If I understand right, system prompt leaks in the past have followed similar lines.