What will be announced by OpenAI during Monday's livestream? [ADD RESPONSES]
resolved May 13
Something related to Voice Engine: Resolved YES
ChatGPT UI changes: Resolved YES
Voice in voice out assistant: Resolved YES
Access to a GPT-4 level model is now free: Resolved YES
Details on GPT2-chatbot: Resolved YES
Announcement of model with a new "personality", less refusals, or different RLHF: Resolved YES
Streaming video understanding: Resolved YES
Live translation through voice input/output: Resolved YES
Video input: Resolved YES
Improved tool use: Resolved YES
Related to Sora / Video: Resolved 50%
Partnership with the government of a country: Resolved NO
Something Manifold finds disappointing (in a poll created after the announcement): Resolved NO
ChatGPT with a knowledge cutoff that is always recent (within 48h): Resolved NO
Something related to AI pornography: Resolved NO
GPT-4-lite, a cheaper version of GPT-4: Resolved NO
OpenAI branded hardware: Resolved NO
Chinese Language GPT announced: Resolved NO
A new physical product in the same niche as the Rabbit R1 or the Humane AI Pin: Resolved NO
Agents: Resolved NO

“GPT-4-lite, a cheaper version of GPT-4”

GPT-4o is cheaper than GPT-4 and GPT-4T…

Would it be considered lite?

Video input
bought Ṁ30 Video input YES

@RaulCavalcante Wow, could you send the link to where you got this from? Video input appears certain, then.

@Ricky30235 Confirmed this is real; I saw it on the main GPT-4o page.

Video input
bought Ṁ7 Video input NO

There’s no direct video input, i.e. drag-and-drop video the same way they have it for images and text files. They showed a live camera feed, which is not the same as being able to give it a .mov file and ask it questions about it.

@Ricky30235 Can someone N/A this? I get what you're saying, but other people seemed to interpret it differently; it was pretty vague. @shankypanky

@strutheo Hm, is there an argument about this one? It has the ability to use the camera, but as Ricky says, that's not what video input means. Are there traders who say otherwise?

@shankypanky I really think it matters what is being done with that camera feed. Is it pulling occasional frames and processing them as single images? Or is it keeping history over time, like one would with video?

In the demo, if he had panned the camera across the equation instead of framing the whole thing, and it had still gotten the equation, I would take that to mean it's processing video. We didn't get that in the demo, so it's ambiguous. It will be easy enough to clear up once someone gets access, though.
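
To make the single-frames reading concrete, that pipeline is easy to build yourself: grab stills from a clip and pass them to the model as ordinary image inputs. A rough sketch with the OpenAI Python SDK and OpenCV (the model name, sampling rate, file name, and prompt are my assumptions, not anything OpenAI has confirmed about how the app works):

```python
# Sketch of the "occasional frames" hypothesis: sample stills from a clip
# and send them to the model as ordinary image inputs. The model name,
# 1-frame-per-second rate, file name, and prompt are all assumptions.
import base64

import cv2  # pip install opencv-python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def sample_frames(path: str, every_n_seconds: float = 1.0) -> list[str]:
    """Return one base64-encoded JPEG frame every N seconds of video."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    step = max(1, int(fps * every_n_seconds))
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % step == 0:
            ok, jpeg = cv2.imencode(".jpg", frame)
            if ok:
                frames.append(base64.b64encode(jpeg.tobytes()).decode())
        i += 1
    cap.release()
    return frames


frames = sample_frames("demo.mov")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": (
            [{"type": "text", "text": "What happens over the course of this clip?"}]
            + [{"type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{f}"}}
               for f in frames]
        ),
    }],
)
print(response.choices[0].message.content)
```

If that's all the app is doing, it's image input in a loop, not video input in any native sense.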

@robm I'd argue that's not ambiguous for the purposes of a market about announcing a feature, though. As it was presented and demonstrated, there was no use case for saving and processing video, no mention of keeping video history over time, etc.

but I'm playing devil's advocate on that

@shankypanky I'm most curious about what the model is doing under the hood. Does it accept videos as direct input? Taking in .mp4s would be clear evidence; this phone camera stuff, IDK.

bought Ṁ7 Video input YES

After watching gdb's demo, I think this is a yes.

@ErikBjareholt No, I believe his demo fits “Streaming video understanding”, which was resolved YES; “video input” would be if I could give it a video file and ask it questions.

@Ricky30235 That is just a matter of UI. The model itself clearly shows it can be given a captured video and answer questions about the footage (as they did with the waving woman in the background). The streaming is just a bonus.

As soon as the API becomes public this should become clear.

@shankypanky

> it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs

That's on the announcement blog linked from that tweet. If it were native video, I think their marketing department would have made sure to say 'video' there.

(BTW I'm not deadset on this either way, genuinely just curious how this will resolve)

@robm yeah, I was just sharing the tweet because you said you were curious to know more. The market is framed as "what will be announced during the livestream", so I'm not sure how people were betting based on that. The demo seemed to rely on voice/image/video recognition, but I'm not sure that qualifies as video input.
Of course, people could also read it as "can I input video in some way to the tool", which you can with live video. So that's also a matter of interpretation.

Yeah, I think I'm resolving this YES based on how most people interpreted it.

@strutheo I believe this would be more likely to resolve YES if “Streaming video understanding” were not a separate option, but it was, and it clearly resolves YES. I thus think “Video input” is not met, since the separate option existed to cover that interpretation.

@Ricky30235 The options were never mutually exclusive; anyone could add an option.

@ErikBjareholt You can imagine streaming video resolving NO and video input resolving YES if you could drop in a video file but couldn't use the live camera. Invert that, and you get today's announced capabilities.

I have access to gpt-4o via the API already (you can see the model version in the images). As you can see, I tried uploading a .mov file and a .mp4 file, and both are “unsupported”, which shows that video input is not currently accepted.
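
For anyone who wants to reproduce that check, here is roughly what the probe looks like against the chat API (the file name and the data-URL packaging are assumptions on my part; no video content type is documented for this endpoint, which is rather the point):

```python
# Sketch of the probe described above: try to hand the chat endpoint a raw
# video payload and see whether it is rejected. The file name and the
# data-URL packaging are assumptions; there is no documented video content
# type for this endpoint.
import base64

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("demo.mov", "rb") as f:
    video_b64 = base64.b64encode(f.read()).decode()

try:
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this clip."},
                # Only "text" and "image_url" parts are documented, so the
                # clip is smuggled in as an image_url data URL.
                {"type": "image_url",
                 "image_url": {"url": f"data:video/quicktime;base64,{video_b64}"}},
            ],
        }],
    )
    print("Accepted a video payload")
except Exception as e:  # the API raises an error on unsupported payloads
    print("Rejected:", e)
```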

@Ricky30235 It only needs to be "announced", not "released". It does seem like it was announced.

@Mira As @robm said, if it were native video, they likely would have announced it clearly and explicitly.

@Ricky30235 Hmm, you might be right. Altman says "video"

https://twitter.com/sama/status/1790069224171348344

but the model page says "text or image and outputting text":

https://platform.openai.com/docs/models/continuous-model-upgrades

So it's possible the phone app just took a screenshot and the model can request screenshots. But it isn't receiving an entire video clip.

@Ricky30235 That's not proof. You do not have full access:

> We plan to launch support for GPT-4o's new audio and video capabilities to a small group of trusted partners in the API in the coming weeks.


https://openai.com/index/hello-gpt-4o/

@ErikBjareholt Oh yes, you might be right about the rollout; however, @Mira above showed the model page saying inputs are text or image. They would say video if there were video input.

@Ricky30235 If their own docs explicitly state text and image inputs but not “video input”, I feel that settles it.

@Ricky30235 No, they wouldn't, just as they don't with audio: you are reading the API docs, and the modality isn't offered there yet, even though the capability is clearly demonstrated.

@Mira https://openai.com/index/hello-gpt-4o/

> GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs.

It has audio and can "reason over video" if the app takes still images. But there's no explicit claim that the model itself supports video input, just the associated app. They would have mentioned that, since it doesn't come for free.

@Ricky30235 If they announced video input anywhere, it should still count, even if it's not in the API. But I don't see them claiming video input, so there's no announcement until someone finds a reference. Even if the actual model later turns out to have video input, it needs to have been "announced today", so someone needs to find a clear reference to video input from today.

I'm updating slightly toward yes. "Quickly" and the description indicate some understanding along the time axis. No comment on whether this counts as "announced".

Yeah, I'm having a hard time looking at this and thinking that it's not functionally video.

@robm Update, all: it's definitely yes. I just got access to gpt-4o in ChatGPT, and it accepted the video as input (even if the answer is not very good).

@Mira Do you think what I showed counts? It appears video input is accepted in ChatGPT-4o and was announced today, which would meet the criterion, right?

@Ricky30235 That doesn't prove video input for the model. But I guess the market says "video input" for some unspecified thing. So even a random unrelated app that doesn't even use an LLM could count...