FINAL UPDATE:
Everyone, I am deeply sorry for the way that I handled this market. When GPT-4 Turbo first released, I was inactive on Manifold and didn’t even realize that it could qualify to resolve the market. People took my not resolving it YES upon the release of GPT-4 Turbo as a sign that it wouldn’t count, and started betting the market back down. I was also given arguments for why it should or should not count. At that point, I still wasn’t interested in Manifold, so instead of delving deeper, I just responded to individual arguments. I saw points for both NO and YES, so I decided I would resolve N/A instead of doing more research, because I couldn’t decide with that limited evidence. If I had done more research like I should’ve, and looked at all of OpenAI’s relevant pages, I would have resolved YES. However, I felt that because of my indecision upon the initial release of GPT-4 Turbo, resolving YES would be unfair to those who bought NO a week after, thinking it wouldn’t count. That is why I felt compelled to resolve N/A.
Here is my solution: The market will resolve YES, but I’m going to personally refund all the people who bought at least M10 of NO after the release of GPT-4 Turbo, but before my update stating that I would likely resolve YES. This will be done tomorrow. Here is the list of people, with the amount of mana they spent:
ww - 20
JohnSmith39f9 - 20
ManuelSalazar - 20
AntonBogun - 25
ChrisMills - 25
rocknrollfinance - 25
Grinchtachu - 30
JouniSeppanen - 30
StavrosKyriakidis - 30
KamilStaszewski - 40
OlegEterevsky - 50
tozac - 50
diegocaples - 50
5bd4 - 50
CellVendetta - 50
MaximPaschke - 50
JonathanColeman - 55
Grease - 60
StanRunge - 65
ChunglamWoo - 70
ArthurBrussee - 94
Seasons - 100
Simon74fe - 100
SanchitAgrawal - 100
jack_gibb - 100
vgnsh - 100
AlanFoster - 100
Be - 100
Cephalopod - 100
NamNguyencaf0 - 100
RaphaelP - 100
FinnMcArthur - 100
Arch1e - 110
FlameWing - 120
AndersKallberg - 150
Roosevelt55 - 200
FranciscoLecumberri - 200
Apple_ - 221
poetr - 405
AidinAbedi - 500
MarcelAguilarGarcia - 503
eclair4151 - 800
I will also refund Byrne Hobart 500 mana, because he bought a lot of YES when Turbo was released, and then was forced to sell some off at a loss.
If you aren’t on this list and bought NO after the Turbo release, please let me know.
Once again, I apologize for my error.
Sam Altman reportedly stated that "Cheaper and faster GPT-4" is OpenAI's top priority (paraphrased). Will the price of the GPT-4 API decrease in 2023?
"GPT-4" is defined as: Any product commonly referred to as "gpt-4" or similar by OpenAI. Names like "gpt-4-0314," "gpt-4-multimodal-2," "gpt-4," or "gpt-4-oct," would count as long as they are commonly referred to as GPT-4 and build off the original GPT-4 models. "gpt-4-plus," "gpt-4.5," or "gpt-4.1" would not count unless OpenAI regularly calls them GPT-4 and the consensus is that they are newer versions of gpt-4. Must be accessible by API, waitlisted is fine.
This specifically refers to the API, services like ChatGPT don't count, even if they switch to per-token pricing. You must be able to use it in code with an API key.
Resolves YES if:
• The price per 1k prompt tokens of any version of GPT-4 goes below $0.03 at any point before close of this market
• The price per 1k output tokens of any version of GPT-4 goes below $0.06 at any point before close of this market
Else, resolves 50% if:
• The price per 1k prompt tokens of a version of GPT-4 with a max text context length of 32K tokens or higher goes below $0.06 at any point before close of this market
• The price per 1k output tokens of a version of GPT-4 with a max text context length of 32K tokens or higher goes below $0.12 at any point before close of this market
Resolves N/A if:
• I cannot tell whether a model that may have satisfied the above conditions counts as gpt-4 or not on close date.
• There is credible evidence that a gpt-4 model that may have satisfied the above conditions is being offered through an API to specific people, but OpenAI has not confirmed this on close date.
Resolves NO if:
• None of the above conditions are true.
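The resolution rules above can be read as a small decision procedure. The following is only a sketch of one possible reading, not the market creator's actual method. It assumes that either bullet under a heading is sufficient on its own, that the $0.12 threshold applies to output tokens, that "32K" means at least 32,000 tokens, and that YES and 50% take precedence over N/A. The `Offering` type and its fields are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offering:
    """Hypothetical record for one API model offering (illustrative only)."""
    name: str
    prompt_price: float            # USD per 1k prompt tokens
    output_price: float            # USD per 1k output tokens
    context_tokens: int            # max text context length
    counts_as_gpt4: Optional[bool] # None means "cannot tell"
    confirmed: bool = True         # False: credible evidence only, not confirmed by OpenAI


def meets_yes(o: Offering) -> bool:
    # Assumption: either price dropping below its threshold is enough.
    return o.prompt_price < 0.03 or o.output_price < 0.06


def meets_half(o: Offering) -> bool:
    # Assumption: the $0.12 threshold is read as the output-token price.
    long_context = o.context_tokens >= 32_000
    return long_context and (o.prompt_price < 0.06 or o.output_price < 0.12)


def resolve(offerings: Iterable[Offering]) -> str:
    offerings = list(offerings)
    definite = [o for o in offerings if o.counts_as_gpt4 is True and o.confirmed]
    if any(meets_yes(o) for o in definite):
        return "YES"
    if any(meets_half(o) for o in definite):
        return "50%"
    # N/A: a model may have satisfied a condition, but its status is unclear
    # or its availability is unconfirmed.
    unclear = [
        o for o in offerings
        if o.counts_as_gpt4 is None or (o.counts_as_gpt4 and not o.confirmed)
    ]
    if any(meets_yes(o) or meets_half(o) for o in unclear):
        return "N/A"
    return "NO"
```

Under this reading, the whole dispute reduces to the value of `counts_as_gpt4` for the Turbo offering: `True` gives YES, `None` gives N/A, and `False` gives NO.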
@ShadowyZephyr, copying this as a top-level comment (with improvements) for visibility. Curious to hear if this makes sense to you.
The argument for YES would be that they have two versions of GPT-4: GPT-4 and GPT-4 Turbo, and the text above is referring to the VERSION of GPT-4, which is also called GPT-4, but that is a pretty weird way to do it. The pricing page also contradicts this as well.
That's the way that I (and seemingly other YES voters) understand this. To introduce some (unofficial) terminology, the term "GPT-4" can refer to either the top-level Class of "GPT-4" models, or the Family of "GPT-4 (non-turbo)" models. Little diagram to explain:
I agree that it's a pretty weird and confusing way for OpenAI to name the models, but all official documentation is consistent with this model (no pun intended). Most directly, this docs page outlining the different models shows:
"GPT-4" (the Class) as the top-level section in the sidebar
"GPT-4 and GPT-4 Turbo" (the Families) as the section header on the page
The specific Models in the table
(Note that OpenAI refers to all three levels as "model")
You mention that "The pricing page also contradicts this" but it makes sense once you realize that they're grouping by Family, not Class. The "GPT-4" and "GPT-4 Turbo" Families are separate pricing groups within the "GPT-4" Class.
This explanation makes the most sense to me, and explains why OpenAI sometimes refers to "GPT-4 Turbo" as "GPT-4" (the Class), and in other cases compares "GPT-4 Turbo" against "GPT-4" (the Family).
Tying this all back to the market, your original description alludes to this same breakdown:
"GPT-4" is defined as: Any product commonly referred to as "gpt-4" or similar by OpenAI. Names like "gpt-4-0314," "gpt-4-multimodal-2," "gpt-4," or "gpt-4-oct," would count as long as they are commonly referred to as GPT-4 and build off the original GPT-4 models. "gpt-4-plus," "gpt-4.5," or "gpt-4.1" would not count unless OpenAI regularly calls them GPT-4 and the consensus is that they are newer versions of gpt-4. Must be accessible by API, waitlisted is fine.
I don't know exactly how OpenAI would categorize these made-up models, but I would wager that your example models that "would count" all fit under the "GPT-4" Class, whereas your example models that "would not count" would likely all form their own Class (e.g. "GPT-4 Plus" or "GPT-4.5", in the same way that "GPT-3.5" is its own Class separate from "GPT-3").
If that sounds correct, then this market defines "GPT-4" as the Class of "GPT-4" rather than the Family of "GPT-4 (non-turbo)", thereby including the Turbo models (whose price is below the threshold defined in the market).
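As a rough sketch of the Class / Family / Model reading described above, the hierarchy could be written as a nested mapping. The terminology is the unofficial one from this comment, and the model IDs and groupings here are illustrative assumptions rather than an official OpenAI listing.

```python
from typing import Optional

# Class -> Family -> Models (illustrative IDs, assumed for this sketch)
HIERARCHY = {
    "GPT-4": {
        "GPT-4": ["gpt-4", "gpt-4-0314", "gpt-4-32k"],
        "GPT-4 Turbo": ["gpt-4-1106-preview"],
    },
    "GPT-3.5": {
        "GPT-3.5 Turbo": ["gpt-3.5-turbo"],
    },
}


def class_of(model_id: str) -> Optional[str]:
    """Return the top-level Class a model belongs to, or None if unknown."""
    for class_name, families in HIERARCHY.items():
        for models in families.values():
            if model_id in models:
                return class_name
    return None


def family_of(model_id: str) -> Optional[str]:
    """Return the Family (pricing group) a model belongs to, or None if unknown."""
    for families in HIERARCHY.values():
        for family_name, models in families.items():
            if model_id in models:
                return family_name
    return None
```

With this structure, a Turbo model has Family "GPT-4 Turbo" but Class "GPT-4", which is the distinction the argument relies on: a pricing page grouped by `family_of` and a market defined by `class_of` do not contradict each other.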
@MaxMusing Hmmm, this is a pretty compelling argument. The way OpenAI groups these models is VERY confusing.
I am leaning YES pretty heavily now (as opposed to somewhat leaning YES before). If I resolve the market YES based on this, I’ll have to refund all traders who bought NO after I said I’d resolve N/A (I have 39k mana, which will hopefully cover that).
I’ve no position in this market but resolving n/a based on a claim that gpt4 turbo isn’t gpt4 when the market says resolves yes if the price for “any version of gpt4” drops is such an absurd position that I struggle to assume good faith.
It is literally listed with gpt4 and a version number on the OpenAI website.
If it’s not a version of gpt4 then what is it? It’s clearly not a version of gpt3.5. and it’s not a version of gpt5. So it’s… undefined?
@umnikos There is a far higher chance of OpenAI suddenly announcing a decrease of prices to the original GPT-4 model compared to the chance of OpenAI suddenly announcing an increase of prices, though both are still unlikely.
This comment is based on my feelings, but the squishy nature of humans is part of what makes prediction markets work. That said, @ShadowyZephyr I feel that you are being very stubborn on this. I don't know why, but you seem dead set on Turbo not counting even though it seems to most (by amount invested) other folks on this market that it should. Many of us are obviously biased based on the Mana we stand to gain, but that does not invalidate our arguments. I don't understand what is convincing you so strongly that Turbo should not count. When I ask GPT-4 a question in the web UI, it is using gpt-4-turbo. They explicitly refer to it as such in this case and others. You have shared your reasoning for disagreeing, but I remain unconvinced.
Of course, you decide based on what you believe, but this market has basically become "Will the YES buyers be able to convince 𝐒𝕙𝕒𝕕𝕠𝕨𝕪𝐙𝕖𝕡𝕙𝕪𝕣 that GPT-4-Turbo is a version of GPT-4?" instead of what it was originally, since at this point it is incredibly unlikely that OpenAI will release API changes before the new year.
Sorry if this comes off as harsh. My tone was intended to convey my lack of understanding of your reasoning, not any ill will towards you. Have a wonderful day regardless of your decision.
EDIT: Clarified what I meant by most. I did check the actual count of people before posting, but the amount invested reflects confidence. That is what I meant. Not the actual number of people on one side. I see that that was unclear.
@Jacobmpp There have been arguments for NO as well. I also asked a mod and they said they thought it probably shouldn’t count. If I had not written the N/A criteria into the market originally I would have resolved YES, because I think YES is more reasonable than NO, but I wrote those criteria in so I’m going to stick with them.
There are more NO holders than YES holders, so “most people think it should count” has no basis.
Also, my position is that I’d rather controversial markets resolve N/A (when it is reasonable), as it balances the interests of both parties. I understand that is an unpopular opinion to have on this site, even among admins/mods, but that is the way I wish to run my markets, and as you call it, the “squishy” nature of humans means that there will still be disagreement on what counts or not.
No matter how I resolve this, it will be controversial. So I’m going to stick with what I think most closely reflects the initial criteria, and what I wrote.
Responding to your edit: Total amount of mana does not reflect confidence, % of portfolio reflects confidence.
https://openai.com/blog/new-models-and-developer-products-announced-at-devday
It says that GPT-4 Turbo is the next generation of GPT-4.
Wouldn't that count as GPT-4 Turbo as a version of GPT-4?
Also, on the page about GPT-4 and GPT-4 Turbo:
https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
It says that GPT-4 Turbo is the latest GPT-4 model.
@Latte_Horse OpenAI seems to be contradictory about it. On their pricing page they then say "GPT-4 Turbo is more powerful than GPT-4."
That would be like Google saying "Gemini Ultra is more powerful than Gemini." That makes no sense if it IS a version of Gemini, the correct statement would be "Gemini Ultra is more powerful than Gemini Pro."
The argument for YES would be that they have two versions of GPT-4: GPT-4 and GPT-4 Turbo, and the text above is referring to the VERSION of GPT-4, which is also called GPT-4, but that is a pretty weird way to do it. The pricing page also contradicts this as well.
Here, if GPT-4 Turbo was a version of GPT-4, the Turbo models would be listed under the heading "GPT-4". But instead there are two headings, GPT-4 and GPT-4 Turbo, each with their own separate models, and it says "GPT-4 Turbo is more powerful than GPT-4." Clearly implying they are different.
So, you are saying that different models have different headings on the pricing page.
Contrapositively, you are saying that the same models will be in the same headings.
But, as you can see, there are different models in the same headings.
So, my argument is that headings do not distinguish the models.
The statement "GPT-4 Turbo is more powerful than GPT-4" implies that the most recent version of the model is more powerful than the previous one.
Yes, they are different, but clearly, they are both GPT-4.
@Latte_Horse
That is a scenario where they are grouped by "Image models" and "Audio models," the context there is quite different. DALLE3 Standard and HD are not the same exact model, but they are both referred to as DALLE3 by OpenAI.
OpenAI probably wouldn't say "DALLE3 HD is more powerful than DALLE3," they would say "DALLE3 HD is more powerful than DALLE3 Standard" (assuming that is true). If they said the first, then the meaning would be ambiguous, as is true in this case.
No, it doesn't. It implies that the models are different models, because they are using the base name GPT-4 and comparing it to GPT-4 Turbo.
Edit: posted as a top-level comment with improvements here.
The argument for YES would be that they have two versions of GPT-4: GPT-4 and GPT-4 Turbo, and the text above is referring to the VERSION of GPT-4, which is also called GPT-4, but that is a pretty weird way to do it. The pricing page also contradicts this as well.
That's the way that I (and seemingly other YES voters) understand this. "GPT-4" can either refer to the top-level grouping of models (which includes "GPT-4 Turbo"), or to the subset of "GPT-4 (non-turbo)" models. Little diagram to explain:
I agree that it's a pretty weird and confusing way for OpenAI to name the models, but all official documentation is consistent with this model (no pun intended). The most telling is this docs page outlining the different models which has:
"GPT-4" as the top-level section in the sidebar
"GPT-4 and GPT-4 Turbo" as the section header on the page
The specific models in the table
You mention that "The pricing page also contradicts this" but it makes sense once you realize that they're grouping by the second level in the diagram. "GPT-4 (non-turbo)" and "GPT-4 Turbo" are separate pricing groups within "GPT-4". Similarly, "GPT-3.5 Turbo" has its own pricing group but still falls within the greater category of "GPT-3.5" (as defined in that same docs page). By contrast, "GPT-3" is its own top-level group separate from "GPT-3.5".
This explanation makes the most sense to me, and explains why OpenAI sometimes refers to "GPT-4 Turbo" as "GPT-4", and in other cases compares one to the other. They're sometimes referring to the top-level group, and sometimes to the second-level group (i.e. "GPT-4 non-turbo").
Tying this all back to the market, your original description alludes to this same breakdown:
"GPT-4" is defined as: Any product commonly referred to as "gpt-4" or similar by OpenAI. Names like "gpt-4-0314," "gpt-4-multimodal-2," "gpt-4," or "gpt-4-oct," would count as long as they are commonly referred to as GPT-4 and build off the original GPT-4 models. "gpt-4-plus," "gpt-4.5," or "gpt-4.1" would not count unless OpenAI regularly calls them GPT-4 and the consensus is that they are newer versions of gpt-4. Must be accessible by API, waitlisted is fine.
The example models that would count all fit under the top-level "GPT-4" group. The example models that would not count would likely all form their own top-level groups (e.g. "GPT-4 Plus" or "GPT-4.5", in the same way that "GPT-3.5" is its own top-level group separate from "GPT-3").
EDIT: I just searched the Internet Archive and found out that the help center article in the comment linked below this was referring to the initial price change, NOT turbo at all.
https://web.archive.org/web/20230610052643/https://help.openai.com/en/articles/7127956-how-much-does-gpt-4-cost
This page has said those words about a price reduction since March, before I made this market.
They just added Turbo to it after it was released. So the "price reduction of GPT-4 tokens" is not that.
@ShadowyZephyr
I don't understand.
Why does that make GPT-4 Turbo not a version of GPT-4?
@ShadowyZephyr
They added Turbo to make a difference in the price and the performance of the previous version of GPT-4. Does that mean any update or price change on the model will not be counted?
@Latte_Horse If you read the resolution criteria, I said that something like "gpt-4-plus" would not count as it is a significantly different model, both performance wise and in terms of how OpenAI would refer to it. Turbo is in a grey area, because its performance IS better but still within the ballpark, and OpenAI also is contradictory in how they refer to it. See my reply above.