Will OpenAI release a tokenizer with vocab size > 150k by end of 2024?
42% chance · closes Dec 31
The GPT-2 model used the r50k_base tokenizer: vocab size ≈ 50k
The GPT-3 model used r50k_base: vocab size ≈ 50k
The GPT-3.5 model used cl100k_base: vocab size ≈ 100k
The GPT-4 model used cl100k_base: vocab size ≈ 100k
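The resolution criterion above is a simple threshold check over released tokenizer vocabularies. As a minimal sketch (the vocab counts below are the published sizes of the named tiktoken encodings; the 150k threshold comes from the question title):

```python
# Sketch of the question's resolution test: YES iff OpenAI releases a
# tokenizer whose vocabulary size exceeds 150,000 tokens.
# Sizes are the published n_vocab values of the encodings named above.
KNOWN_ENCODINGS = {
    "r50k_base": 50_257,     # GPT-2 / GPT-3
    "cl100k_base": 100_277,  # GPT-3.5 / GPT-4
}

THRESHOLD = 150_000


def resolves_yes(vocab_sizes):
    """Return True iff any listed tokenizer's vocab size exceeds the threshold."""
    return any(n > THRESHOLD for n in vocab_sizes.values())


print(resolves_yes(KNOWN_ENCODINGS))  # False for the encodings listed so far
```

With an actual install of the `tiktoken` package, the same sizes can be read from `tiktoken.get_encoding(name).n_vocab` rather than hard-coded.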
This question is managed and resolved by Manifold.
Related questions
Will OpenAI release a tokenizer with more than 210000 tokens before 2026?
24% chance
Will AI (large language models) collapse by may 2026?
8% chance
Will the next major LLM by OpenAI use a new tokenizer?
77% chance
Will a flagship (>60T training bytes) open-weights LLM from Meta which doesn't use a tokenizer be released in 2025?
20% chance
Before 2029, will OpenAI provide API access to a frontier LLM with 100,000,000+ context length?
53% chance
Will OpenAI release another open source LLM before end of 2026?
77% chance
Will OpenAI have a new name by the end of 2025?
4% chance
Will OpenAI announce a new model that EpochAI estimates is at least as large as GPT-4.5, before August 2026?
79% chance
When will OpenAI release their next open-source language model?