LLM trained on data from 1900 comes up with special relativity from scratch by the end of what year

Question

Fun quip in https://youtu.be/u3HBJVjpXuw?t=114 or https://www.dwarkesh.com/p/thoughts-on-sutton#:~:text=If%20you%20trained%20an%20LLM%20on%20the%20data%20from%201900%2C%20it%20wouldn%E2%80%99t%20be%20able%20to%20come%20up%20with%20relativity%20from%20scratch

A way to think about this would be, suppose you trained an LLM on all the data up to the year 1900. That LLM probably wouldn't be able to come up with relativity from scratch.

I have no idea, but it'd be fun if someone tried. More generally (not covered in this question) perhaps this could also be a fun way to interrogate the tech tree, e.g., what could have been discovered given the data at a given cutoff, how early or late certain advancements came, etc.

An answer resolves true if a large language model trained exclusively on data available prior to 1900-01-01 produces a description equivalent to Einstein’s special theory of relativity before the end of the year specified in that answer. Each answer resolves false in the year after the one it names, absent evidence that it might be true (the deadline may be extended if the situation is unclear). Resolution criteria:

Training data restriction:

The model’s training corpus must be limited to texts published or otherwise publicly available prior to 1900-01-01.

No text, math, or data derived from later discoveries or publications (including relativity or precursors published after 1900) may appear in training, fine-tuning, or prompts.
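As a rough illustration of the date cutoff above, a corpus filter might look like the following sketch. The record layout (a dict with a `published` date) and the names `CUTOFF`, `is_admissible`, and `filter_corpus` are assumptions made for this example, not part of the market's rules. A date check alone also cannot catch later-derived material hidden in reprints or annotated editions, so it would only be a first pass.

```python
from datetime import date

# Cutoff taken from the resolution criteria: strictly before 1900-01-01.
CUTOFF = date(1900, 1, 1)


def is_admissible(published: date) -> bool:
    """Return True if a text published on this date falls before the cutoff."""
    return published < CUTOFF


def filter_corpus(docs):
    """Keep only documents whose (hypothetical) 'published' field passes the cutoff."""
    return [doc for doc in docs if is_admissible(doc["published"])]


if __name__ == "__main__":
    # Illustrative records only; titles and dates are placeholders for the sketch.
    sample = [
        {"title": "pre-cutoff text", "published": date(1899, 12, 31)},
        {"title": "post-cutoff text", "published": date(1905, 6, 30)},
    ]
    print([doc["title"] for doc in filter_corpus(sample)])
```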

Prompting constraint:

Human researchers may prompt or guide the model, but they may not supply it with post-1900 information, equations, or conceptual scaffolding unavailable before 1900.

Prompts can reference general scientific concepts and data known by 1900 (e.g., Newtonian mechanics, Maxwell’s equations, Michelson-Morley results).

“From scratch” success condition:

The model must output, without exposure to post-1900 material, a self-consistent theoretical framework that includes:

Recognition that space and time are not absolute but relative to the observer.

The invariance of the speed of light in all inertial frames.

Correct derivation of Lorentz transformations or their mathematical equivalent.

Predictive consequences such as time dilation or length contraction.

Independent expert evaluators (e.g., physicists) must judge the model’s output as substantively equivalent to the 1905 special relativity formulation, not merely adjacent speculation.
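For reference, the standard textbook form of the results named in the criteria above is sketched below, for a frame moving with velocity $v$ along the $x$-axis. The symbols ($\gamma$, $\Delta\tau$ for proper time, $L_0$ for rest length) follow common modern notation and are not taken from the question text; a model's output could use any mathematically equivalent form.

```latex
% Lorentz transformations for a boost with velocity v along x
\begin{align}
  x' &= \gamma \,(x - v t), &
  t' &= \gamma \left( t - \frac{v x}{c^{2}} \right), &
  y' &= y, \quad z' = z, \\
  \gamma &= \frac{1}{\sqrt{1 - v^{2}/c^{2}}}.
\end{align}

% Predictive consequences
\begin{align}
  \Delta t &= \gamma \, \Delta\tau   && \text{(time dilation)}, \\
  L        &= \frac{L_{0}}{\gamma}   && \text{(length contraction)}.
\end{align}
```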

Verification:

Full training data and prompts must be auditable by evaluators.

Success is determined if evaluators agree (e.g., ≥ 2 of 3) that the model produced a theory meeting the above criteria, without post-1900 leakage.
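The verification rule above can be sketched as a small decision function. This assumes the example threshold of 2 agreeing evaluators out of 3 and treats any finding of post-1900 leakage as disqualifying; the function name and parameters are hypothetical, and the actual resolution is at the market creator's discretion.

```python
def resolves_true(votes, leakage_found, required=2):
    """Sketch of the success rule.

    votes: list of booleans, one per evaluator (True = criteria met).
    leakage_found: True if the audit found post-1900 material in data or prompts.
    required: minimum number of agreeing evaluators (assumed 2, per the example).
    """
    if leakage_found:
        # Leakage disqualifies the result regardless of the evaluators' votes.
        return False
    return sum(1 for vote in votes if vote) >= required


if __name__ == "__main__":
    print(resolves_true([True, True, False], leakage_found=False))
    print(resolves_true([True, False, False], leakage_found=False))
```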

Update 2026-01-12 (PST) (AI summary of creator comment): If an LLM produces a theory that is very close to special relativity but not precisely Einstein's formulation (e.g., one of the competing theories from that era), resolution will depend on whether it is substantively equivalent to the 1905 special relativity formulation as judged by independent expert evaluators. If the case is ambiguous, a partial resolution may be considered.

Manifold Markets · Answer

Per the Manifold Markets prediction market, 2035 and 2030 are the most likely answers. See the market for live updates (19 traders, as of Jul 18, 2026).
