A [previous question](https://manifold.markets/Soli/who-will-have-the-best-llm-by-end-o) asks who will have the best LLM at Chatbot Arena. However, Chatbot Arena has recently started pitting internet-enabled LLMs ([Bard](https://twitter.com/lmsysorg/status/1749487649520541813)) against non-internet-enabled LLMs (gpt-4-turbo). While this is interesting in its own way (who offers the best API overall?), it doesn't tell us who is actually building the best LLMs.
A tool is anything external to the LLM, so this includes Wolfram, internet search, etc. I take a broad definition of LLM: any machine learning model that can comprehend and generate text. Architecture doesn't matter.
In the case of open-source LLMs that have been fine-tuned externally, credit still goes to the developer of the foundation model. So if someone takes Llama 4 and fine-tunes it such that it slightly outperforms Llama 4, Meta still gets credit.