Will OpenAI release a tokenizer with vocab size > 150k by end of 2024?
Closes Dec 31 · 42% chance
  1. GPT-2 used r50k_base: vocab size ≈ 50k (50,257 tokens)

  2. GPT-3 used r50k_base: vocab size ≈ 50k (50,257 tokens)

  3. GPT-3.5 used cl100k_base: vocab size ≈ 100k (100,277 tokens)

  4. GPT-4 used cl100k_base: vocab size ≈ 100k (100,277 tokens)
