
e.g. WinoGrande >= 87.5%


Related markets
Will any LLM have roughly GPT-3-level losses with a context window of at least 50,000 tokens before April of 2024? 36%
Will an LLM be able to pass something equivalent to Yann LeCun's 7-gear test by the end of 2024? 65%
Will an LLM improve its own ability along some important metric well beyond the best trained LLMs before 2026? 57%