In the spirit of what Gary Marcus says here:
https://twitter.com/GaryMarcus/status/1640029885040132096?s=20
Two weeks left on this. I would argue these two are relatively strong evidence of this? What do you think?
https://www.technologyreview.com/2023/12/14/1085318/google-deepmind-large-language-model-solve-unsolvable-math-problem-cap-set/
@Mag I don't think that counts as an LLM inferring scientific principles. It just means that LLMs can solve (with code) problems that were previously unsolved. It's not so different from getting AI to solve a previously unseen Sudoku. Inferring scientific principles is more than that.
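For anyone unfamiliar with the problem in that article: a cap set in F_3^n is a set of vectors with no three distinct members on a common line, which in F_3^n is the same as no three distinct members summing to the zero vector. Here's a minimal Python sketch of a verifier for that property, just to illustrate what the problem asks; this is not the FunSearch code, which (per the article) evolved heuristics for constructing large cap sets:

```python
from itertools import combinations

def is_cap_set(vectors, n):
    """Return True if no three distinct vectors in F_3^n are collinear.

    In F_3^n, three distinct points lie on a common line exactly when
    they sum to the zero vector mod 3, so a cap set is a set of vectors
    containing no such triple.
    """
    for a, b, c in combinations(vectors, 3):
        if all((a[i] + b[i] + c[i]) % 3 == 0 for i in range(n)):
            return False
    return True

# These four points form a maximum-size cap set in F_3^2.
print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)], 2))  # True
```

As I understand it, FunSearch's advance was having an LLM evolve programs that construct larger cap sets than previously known, with an automated evaluator along these lines scoring candidates, which is why I'd call it program search rather than inferring a principle.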
Does this count? I don't think so personally
https://www.deepmind.com/blog/alphadev-discovers-faster-sorting-algorithms
@Mag It's not a scientific principle (a fundamental truth, law, or assumption about the natural world).
They can solve a novel reverse-engineering problem (pg. 119), build model graphs of an environment they explore (pg. 51), and match human performance on a sample of LeetCode problems posted after GPT-4's pretraining period ended (pg. 21):
Sparks of Artificial General Intelligence: Early experiments with GPT-4 (arXiv:2303.12712): https://arxiv.org/abs/2303.12712
If none of the examples in that paper convince you that they can already form models of things, infer facts from those models, and solve novel (if relatively easy) problems, I'm not sure what would.
@Mira I'm with you in spirit, but I think what Gary Marcus is looking for is something that very clearly moves beyond the training data. I believe his reasoning for why those things aren't evidence is that the solutions could be the result of GPT-4 having effectively learned a hard-coded algorithm for that type of problem, one that activates whenever it recognizes the pattern. I don't believe this myself, but to disprove it we would need a problem that is truly novel in both content and structure and that was not seen in the training data.
A new scientific discovery should definitely count, imo.