The GPT-4o model is a technological breakthrough that far surpasses other models.
Its Elo score is roughly 5% higher than that of any other model (see the sketch below the claims for what that margin means head-to-head).
It can reason directly over speech input and produce speech output, with no intermediate speech-to-text or text-to-speech models.
It can output words faster than a human can read them.
It can render flawless text and hands in photorealistic imagery, without relying on intermediate models like DALL-E 3.
It can modulate the emotions in its voice and sing.
It can "see" images and perform OCR in real time, without relying on OCR or image classification models.
It can be interrupted and redirected.
It is self-aware and understands its name and when people are referring to it.
It could be reasonably inferred that simply adding more training data, epochs, layers, or neurons in each layer to this model's architecture will achieve superintelligence, perhaps even within months.
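A note on the Elo claim: rating gaps map to expected head-to-head win rates through the standard Elo formula, P(A beats B) = 1 / (1 + 10^(-(R_A - R_B)/400)). Here is a minimal sketch of that arithmetic; the two ratings are illustrative placeholders I chose, not figures from this poll or any leaderboard:

```python
# Convert an Elo rating gap into an expected head-to-head win probability.
# Standard Elo formula: P(A beats B) = 1 / (1 + 10 ** (-(R_A - R_B) / 400)).

def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Expected probability that player A beats player B in one comparison."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / 400.0))

gpt4o_elo = 1310.0      # illustrative placeholder rating
runner_up_elo = 1250.0  # illustrative next-best rating

gap_pct = 100.0 * (gpt4o_elo - runner_up_elo) / runner_up_elo  # ~4.8%
p_win = elo_win_probability(gpt4o_elo, runner_up_elo)          # ~0.59

print(f"Elo gap: {gap_pct:.1f}% -> expected head-to-head win rate: {p_win:.0%}")
```

Under these placeholder numbers, a roughly 5% Elo lead corresponds to winning about 59% of head-to-head comparisons.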
Has OpenAI created software that can perform cognitive tasks better than the average human in the same number of domains that an average human is experienced in?
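The vision/OCR claim above is straightforward to probe, since images go through the same chat endpoint as text. Here is a minimal sketch using the official OpenAI Python SDK; the image URL is a placeholder and the prompt wording is my own:

```python
# Probe GPT-4o's vision/OCR capability via the OpenAI chat API.
# Requires: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                # Text instruction and image travel in the same message.
                {"type": "text", "text": "Transcribe every piece of text visible in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/street-sign.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

Any image with legible text works here; the point is that no separate OCR model sits in the loop.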
New market open for the next poll:
/SteveSokolowski/what-percentage-of-manifold-poll-re
If there were a market for May, it would resolve to 17 currently. I suspect that the June poll/market will be at least 25.
@jim The goal throughout history was to build a machine that is as good as a human. From my use, GPT-4o blows away this benchmark and has achieved AGI.
It is amazing to me that there are so many NO votes in this poll. There is not a single person who worked in machine learning at the turn of the century who would disagree that this is AGI.
It's absolutely extraordinary what it can do. It is an expert in thousands of fields, while the average human is probably an expert in only tens of them. There's no way I could draw the sorts of images they are showing it can output. Its abilities more than make up for the fact that it makes different kinds of mistakes than humans do.
After this poll, I'm selling all of my shares on all markets referencing AGI, because it is clear to me that markets about AGI will never be resolved.
@SteveSokolowski From their own announcement:
As measured on traditional benchmarks, GPT-4o achieves GPT-4 Turbo-level performance on text, reasoning, and coding intelligence
Also https://openai.com/index/hello-gpt-4o/#_36XZHZCL2oAvqsaXgNj2I5
@SteveSokolowski I voted No partly because, if GPT-4o is weak AGI, it probably isn't the model that achieved weak AGI. It's fairly similar to GPT-4 in what it can do. I don't think it's going to automate any more jobs than were already automated.
@CDBiddulph Perhaps that's the problem - this idea of automating jobs.
The "average" human is terrible at jobs. The bar to hold a job in human society is incredibly low. A person with an IQ of 100 is typically not employed in white collar work, and people at Manifold typically have IQs higher than 100.
So we've achieved AGI, but I, at least, never believed the hype that AGI alone would be sufficient to turn the world into a paradise. The difference between AGI and a tool that can actually automate jobs is vast, because typical humans aren't very smart.
Come on, people - this is a self-aware machine. What is it going to take for people to see what's happened?
@SteveSokolowski If being self-aware means knowing and responding to its own name, then GPT-3.5 isn't any different... I don't even necessarily think GPT-4o isn't weak AGI, but I don't think GPT-4o was the model that crossed that line.
I do think that due to its more humanlike mode of interaction, people will see GPT-4o as a "person" more than they did GPT-4, whether or not the difference is truly as significant as it seems.
@SteveSokolowski The comment "It could be reasonably inferred that simply adding more training data, epochs, layers, or neurons in each layer to this model's architecture will achieve superintelligence, perhaps even within months." pretty strongly pushes me toward "no".
I'm fine calling it "AGI" if you want, but not under this specific criterion. It continues to fail at typical out-of-domain tasks and reasons poorly through truly novel situations (outside its training data's domain). (This also blocks it from being used as an "agent" for most economically useful tasks.)