What is the main reason behind GPT-4o's speed improvement relative to the GPT-4 base model?
Smaller model size (hence, architecture/algorithm improvements): 66%
Something related to low-level computation efficiency (for example, optimized frameworks): 53%
More/better hardware allocated: 40%
Other: 39%
Better coarser-grained tokenizer: 27%
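
For context on the last option, here is a minimal sketch (assuming the tiktoken library and its published encodings, cl100k_base for GPT-4 and o200k_base for GPT-4o) showing what a coarser-grained tokenizer means in practice: the same text maps to fewer tokens, so there are fewer decoding steps per response. This illustrates the option, not a confirmed explanation of GPT-4o's speed.

```python
# Compare GPT-4's cl100k_base tokenizer with GPT-4o's coarser o200k_base
# tokenizer. Fewer tokens per prompt/response means fewer decoding steps.
import tiktoken

sample = (
    "GPT-4o responds noticeably faster than the original GPT-4, "
    "especially on non-English text where the larger vocabulary helps."
)

enc_gpt4 = tiktoken.get_encoding("cl100k_base")   # encoding used by GPT-4
enc_gpt4o = tiktoken.get_encoding("o200k_base")   # encoding used by GPT-4o

print(f"cl100k_base (GPT-4):  {len(enc_gpt4.encode(sample))} tokens")
print(f"o200k_base  (GPT-4o): {len(enc_gpt4o.encode(sample))} tokens")
```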


Did they answer this for gpt-4-turbo?

"Main reason" implies only one of these resolves YES. If the coarse tokens help but aren't the central boost, I assume that option resolves NO?

@MaxHarms Correct. In very ambiguous situations, two options may resolve YES, but I expect this to happen only in extreme cases.

bought Ṁ10 Something related to... YES

What does quantization count towards?

@Sss19971997 Quantization itself would be "something related to low-level computation efficiency".
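
As an illustration of why quantization falls under that option, here is a generic sketch of symmetric int8 weight quantization using NumPy. The weight matrix size and scheme are hypothetical; this is not a claim about how GPT-4o is actually served, only a demo of the memory/bandwidth savings such techniques target.

```python
# Symmetric per-tensor int8 weight quantization: store weights as int8 plus
# a single fp32 scale, cutting memory (and memory bandwidth) roughly 4x
# versus fp32, at the cost of a small reconstruction error.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((4096, 4096)).astype(np.float32)  # fake weight matrix

scale = np.abs(w).max() / 127.0                # per-tensor scale factor
w_int8 = np.round(w / scale).astype(np.int8)   # quantized weights
w_dequant = w_int8.astype(np.float32) * scale  # reconstruction for compute

print(f"fp32 size: {w.nbytes / 2**20:.1f} MiB")
print(f"int8 size: {w_int8.nbytes / 2**20:.1f} MiB")
print(f"max abs reconstruction error: {np.abs(w - w_dequant).max():.4f}")
```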

Which base model?

@StephenMWalkerII The first publicly available GPT-4 model released on March 14, 2023.