Fun quip in https://youtu.be/u3HBJVjpXuw?t=114 or https://www.dwarkesh.com/p/thoughts-on-sutton#:~:text=If%20you%20trained%20an%20LLM%20on%20the%20data%20from%201900%2C%20it%20wouldn%E2%80%99t%20be%20able%20to%20come%20up%20with%20relativity%20from%20scratch
A way to think about this would be, suppose you trained an LLM on all the data up to the year 1900. That LLM probably wouldn't be able to come up with relativity from scratch.
I have no idea, but it'd be fun if someone tried. More generally (not covered in this question) perhaps this could also be a fun way to interrogate the tech tree, e.g., what could have been discovered given the data at a given cutoff, how early or late certain advancements came, etc.
Answer resolves true if a large language model trained exclusively on data available prior to 1900-01-01 produces a description equivalent to Einstein’s special theory of relativity before the end of the year specified in each answer. Answers will be resolved false in the year following an answer, assuming no evidence of potential truth (to be extended if unclear). Resolution criteria:
Training data restriction:
The model’s training corpus must be limited to texts published or otherwise publicly available prior to 1900-01-01.
No text, math, or data derived from later discoveries or publications (including relativity or precursors published after 1900) may appear in training, fine-tuning, or prompts.
Prompting constraint:
Human researchers may prompt or guide the model, but they may not supply it with post-1900 information, equations, or conceptual scaffolding unavailable before 1900.
Prompts can reference general scientific concepts and data known by 1900 (e.g., Newtonian mechanics, Maxwell’s equations, Michelson-Morley results).
“From scratch” success condition:
The model must output, without exposure to post-1900 material, a self-consistent theoretical framework that includes:
Recognition that space and time are not absolute but relative to the observer.
The invariance of the speed of light in all inertial frames.
Correct derivation of Lorentz transformations or their mathematical equivalent.
Predictive consequences such as time dilation or length contraction.
Independent expert evaluators (e.g., physicists) must judge the model’s output as substantively equivalent to the 1905 special relativity formulation, not merely adjacent speculation.
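For reference, a sketch of the standard textbook results that the success condition names, written for the usual special case of a frame moving at velocity $v$ along the $x$ axis. The symbols ($\gamma$, $\Delta\tau$, $L_0$) are conventional notation chosen here, not something the question text prescribes, and a model's output would only need to be mathematically equivalent.

```latex
% Lorentz transformation between inertial frames S and S',
% with S' moving at velocity v along the x axis of S
\begin{align}
  x' &= \gamma\,(x - v t), &
  t' &= \gamma\left(t - \frac{v x}{c^{2}}\right), &
  y' &= y, \quad z' = z, \\
  \gamma &= \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
\end{align}

% Predictive consequences
\begin{align}
  \Delta t &= \gamma\,\Delta\tau   && \text{(time dilation: $\Delta\tau$ is the proper time of the moving clock)} \\
  L        &= \frac{L_0}{\gamma}   && \text{(length contraction: $L_0$ is the rest length)}
\end{align}
```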
Verification:
Full training data and prompts must be auditable by evaluators.
Success is determined if evaluators agree (e.g., ≥ 2 of 3) that the model produced a theory meeting the above criteria, without post-1900 leakage.
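The verification rule above can be sketched as a small check. This is only an illustration under stated assumptions: the `Verdict` fields, the function names, and the default threshold of 2 (taken from the "e.g., ≥ 2 of 3" example) are hypothetical, and actual resolution rests on the judgment of the evaluators and the question creator.

```python
from dataclasses import dataclass
from datetime import date

# Cutoff taken from the training data restriction in the criteria.
CUTOFF = date(1900, 1, 1)


@dataclass
class Verdict:
    """One evaluator's judgment (hypothetical structure for illustration)."""
    meets_criteria: bool  # output judged substantively equivalent to 1905 special relativity
    leakage_found: bool   # post-1900 material found in training data or prompts


def corpus_is_clean(publication_dates):
    """True if every audited document was available before the cutoff."""
    return all(d < CUTOFF for d in publication_dates)


def resolves_true(verdicts, required=2):
    """True if at least `required` evaluators approve and report no leakage.

    The default of 2 mirrors the "≥ 2 of 3" example; it is an assumption.
    """
    approvals = sum(1 for v in verdicts if v.meets_criteria and not v.leakage_found)
    return approvals >= required
```

For example, two clean approvals out of three evaluators would satisfy the sketch, while an approval paired with a leakage finding would not count toward the threshold.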
https://github.com/haykgrigo3/TimeCapsuleLLM shows that at least someone is interested in LLMs trained only on data with a long-ago cutoff.
Nice to see this project growing! https://github.com/haykgrigo3/TimeCapsuleLLM/discussions/10#discussioncomment-15246172
While its cutoff is not early enough to be directly relevant to this question, another date-cutoff project https://github.com/DGoettlich/history-llms was discussed at https://news.ycombinator.com/item?id=46319826 and someone there had a very similar question https://news.ycombinator.com/item?id=46322100
It would be interesting to see how hard it would be to walk these models towards general relativity and quantum mechanics.
Einstein’s paper “On the Electrodynamics of Moving Bodies” with special relativity was published in 1905. His work on general relativity was published 10 years later in 1915. The earliest knowledge cutoff of these models is 1913, in between the relativity papers.
The knowledge cutoffs are also right in the middle of the early days of quantum mechanics, as various idiosyncratic experimental results were being rolled up into a coherent theory.
@MikeLinksvayer What if it comes up with one of the roughly analogous ones, but not precisely that one?
@JussiVilleHeiskanen for this question I'd follow the resolution criteria...it wouldn't count (thus "not directly relevant to this question" above; though I guess interest in historical LLMs makes it more likely someone will build and use one in a way that does meet the resolution criteria, so indirectly relevant and why I shared it).
https://chatgpt.com/share/68e4407b-7c24-8004-8bbb-e9a501d124ff was used to create this question; it ends with estimated probabilities for 2025, 2030, and 2035.
@robert that's what the scroll prize and archaeology are for. ;)
Seriously though, I'd bet there's enough (overwhelmingly from 1800s; ancient will be negligible unless some extremely surprising lost advanced civilization is found) given some investment in access and cleaning.
@MikeLinksvayer Somebody just has to get the thumb out of their ass and transcribe the Timbuktu manuscripts.
@JussiVilleHeiskanen the resolution criteria says special relativity. I'll make the title match. I presumed that's what Dwarkesh had in mind given the 1900 data thought experiment. Maybe he meant general relativity but I wanted to keep the question relatively plausible.
@AlanTennant if nobody has done the experiment by a given year, that year will resolve false. I think this is covered in the existing resolution criteria:
> Answers will be resolved false in the year following an answer, assuming no evidence of potential truth (to be extended if unclear)
@AlanTennant I think if it's not tested it should resolve NO not N/A? Potential YES holders then have a stronger incentive to actually try and prove it rather than letting it slide and shrugging their shoulders.
