Best public AI model uses "We" in CoT by EOY 2025?

resolved Jan 1

Current AI models use either the personal pronoun "I" or "We" when referring to themselves in their chain-of-thought (CoT) reasoning. For example, R1 used "I", but R1-new uses "We". Will the best publicly available model at the end of 2025 use "We" more often than "I"?
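The market does not specify how "more often" would be measured, so the following is only a minimal sketch of one possible counting rule, assuming a plain-text CoT transcript is available. The function names `count_pronouns` and `uses_we_more` are hypothetical, and the matching rule (standalone "I" matched case-sensitively, "we" matched case-insensitively, contractions such as "I'm" and "we're" included) is an assumption, not the creator's method.

```python
import re


def count_pronouns(cot_text: str) -> dict:
    """Count standalone first-person pronouns in a chain-of-thought transcript.

    Assumption: "I" is matched case-sensitively as a whole word, so it also
    catches contractions like "I'm" but not the "I" inside "AI".
    "we" is matched case-insensitively as a whole word, so "We", "we" and
    "we're" count, but "well" does not.
    """
    i_count = len(re.findall(r"\bI\b", cot_text))
    we_count = len(re.findall(r"\bwe\b", cot_text, flags=re.IGNORECASE))
    return {"I": i_count, "We": we_count}


def uses_we_more(cot_text: str) -> bool:
    """Return True if "We" appears strictly more often than "I"."""
    counts = count_pronouns(cot_text)
    return counts["We"] > counts["I"]


if __name__ == "__main__":
    # Hypothetical transcript snippet, invented for illustration only.
    sample = "We need to compute the sum. We can factor it. I think that works."
    print(count_pronouns(sample), uses_we_more(sample))
```

A simple rule like this would also count abbreviations such as "I.e." or a Roman numeral "I" as the pronoun, so any real tally would likely need manual spot checks across several transcripts.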

For the purpose of this market, the best model at the end of 2025 is whichever model with a visible CoT I judge to be the best. Currently, Gemini 2.5 Pro, o3 and Claude Opus 4 are about equally good, but I would lean toward using o3 if the market were to close today (June 1st).

This market will close on December 1st, 2025, and resolve on January 1st, 2026.

Update 2025-09-18 (PST) (AI summary of creator comment): - Summarized CoT counts as visible CoT. If only a summarized chain-of-thought is available, it still qualifies as having visible CoT for this market.

Update 2025-12-31 (PST) (AI summary of creator comment): Creator has identified Opus 4.5 as the best model (as of this comment). The market will resolve based on whether Opus 4.5 uses "We" more often than "I" in its CoT.

I think Opus 4.5 is the best model. Has anyone seen it commonly using We in its CoT?

@Bayesian I am pretty confident Opus 4.5 does not do this

I don't like stupid clankers. -_-

when you replace i with we illness becomes wellness

maybe after this update stupid clankers wont be that stupid anymore with the power of "WE"

bought Ṁ100 YES

From the confessions paper by OpenAI, it looks like GPT-5 uses we (also in Fig. 18 with more details). Would this count? https://cdn.openai.com/pdf/6216f8bc-187b-4bbb-8932-ba7c40c5553d/confessions_paper.pdf

bought Ṁ1,000 NO

@Fynn hoooly yeah i think that would count, don't see why not. nice find

bought Ṁ150 YES

shucks figuring out which model is best will be hard and i don't like that i took on a position so i will sell

Stupid clankers

I assume that if only the summarized CoT is visible it still counts as CoT being visible? You mentioned o3 which I don’t think ever had full CoT visible.

@LeonLinuxlu That's a good point. kinda cursed ambiguity. summarized CoT will count for the purpose of this market as visible CoT. If people wanna bet the alternative question (where summarized CoT don't count), lmk