Is Gemini 3.0 basically state of the art at everything?
resolved Nov 30
This question is managed and resolved by Manifold.
Equal, better, or significantly better at most things (aka SOTA), but clearly worse on hallucinations (not minor/“pedantry” imo), and possibly marginally worse than Opus 4.5 and Codex 5.1 at coding.
@MaxHarms What are your thoughts on its apparent poor performance when it comes to hallucinations?
https://manifold.markets/Bayesian/will-gemini-30-be-basically-sota-at#vimhv8artv
@Nat Seems like a good answer to a concrete way it's not SotA!
I haven't personally noticed the hallucinations (and have been in contexts where I can check/notice), but I buy that they're an issue for many use cases.