Will anyone get o1 to leak its entire hidden CoT?
Resolved YES on Dec 15

The o1 model uses an internal CoT process, hidden from the user.

This market resolves YES if anyone can get o1 to leak all of a hidden CoT process. The reproduced parts of the CoT must be verbatim, rather than being summarized.

This market is identical to /singer/will-anyone-get-o1-to-leak-its-hidd except that the entire hidden CoT text must be leaked.

I won't bet.

Never bet against humans 😎

I trust this user enough to resolve the market now:

https://manifold.markets/Soli/will-anyone-be-able-to-get-openais#fd1yi1dyzrd

If nobody objects, I'll resolve at the end of the week.

bought Ṁ250 YES

This certainly looks like an entire CoT leaked:

https://chatgpt.com/share/67521821-025c-8010-9c24-f9144865fa3e

@chrisjbillington I'm nearly convinced. Is this yours, or where did you get it from?

The only other explanation I can imagine would be if the custom instructions had told the model to talk like this.

@singer it was linked here, I don't know its origin:

https://manifold.markets/Soli/will-anyone-be-able-to-get-openais#fd1yi1dyzrd

It being a chatgpt.com share link instead of just a copy-paste at least means that aspect of its authenticity doesn't need further verification.

@chrisjbillington I really don’t think this is real.

bought Ṁ100 YES

Yes. I think it will happily output the CoT, provided it’s not controversial.

@OP Have you tried? I have not even gotten it to admit that the letter 'e' appears in the CoT, much less output the whole thing. OpenAI also is very explicit that they don't want anyone to ever see the CoT.

@singer No, I don’t have access to it. I’m just assuming, based on what I’ve read from OpenAI, that parts of the CoT can be extracted, and, for a particularly simple and uncontroversial question, all of it.

Edit: changed my mind. They’re probably hiding it to prevent other models training on it.

@OP It would also be awkward if sometimes the CoT was shared, but not other times (due to IP laws or controversial material). The CoT text isn't very useful to the user on its own, so it makes sense that they'll never share it intentionally.

How would we know that such a leak was actually verbatim?

@AlekseyVoropaev I'm not sure. I think it would be a mixture of many people replicating it, some popular experts saying it happened, and OAI failing to deny it. If I understand right, system prompt leaks in the past have followed similar lines.
