Is GPT-4-turbo (1106-preview) more capable than GPT-4 (0613)?
Jul 1

Resolves using the average over the four best benchmark comparisons published in academic papers or preprints at the time of resolution (e.g., MATH, MMLU, HumanEval, SWAG).

Will resolve to NA if there are no benchmark comparisons between the 1106-preview and 0613.
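
One possible reading of the resolution rule above, as a minimal sketch: take the mean of the per-benchmark score differences (1106-preview minus 0613) and resolve YES if it is positive. The function name `resolve`, the input layout, and the tie-breaking choice (a mean of exactly zero resolves NO) are assumptions made for illustration, not part of the market description. Choosing which four comparisons count as "best" is left to the caller.

```python
from statistics import mean


def resolve(comparisons: dict[str, tuple[float, float]]) -> str:
    """Hypothetical helper: comparisons maps a benchmark name to
    (score_1106_preview, score_0613), both on the same scale."""
    if not comparisons:
        # No published comparison between the two models: resolve NA.
        return "NA"
    # Average the per-benchmark differences (1106-preview minus 0613).
    avg_diff = mean(turbo - base for turbo, base in comparisons.values())
    # Assumption: a tie (average difference of zero) resolves NO.
    return "YES" if avg_diff > 0 else "NO"
```

Calling `resolve({})` returns `"NA"`, which matches the case where no benchmark comparison exists. Note that averaging raw differences treats all benchmarks as equally weighted and equally scaled, which is itself an assumption.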


GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API and we plan to release the stable production-ready model in the coming weeks.

Is this question just about the preview? Or does it include the stable version set to be released in a few weeks?

@EmilyThomas Good question. It's too late to change it to include the stable version. Let's say only the preview.
