Intology’s Locus result on RE-Bench real?
1k · Ṁ9762 · resolved Nov 24
Resolved NO
Ways this would be real:
The result is independently replicated
The model is clearly found to be strong SOTA at SWE tasks similar to RE-Bench
Ways this would not be real:
They announce that the reported score was caused in part by an error in their setup or by extensive reward hacking by their model (it ‘cheated’)
The result is independently replicated and the score is nowhere near human level
Failing these, resolves to the consensus of credible people, let’s say in Feb 2025.
This question is managed and resolved by Manifold.
🏅 Top traders
| # | Name | Total profit |
|---|---|---|
| 1 | | Ṁ816 |
| 2 | | Ṁ88 |
| 3 | | Ṁ33 |
| 4 | | Ṁ9 |
| 5 | | Ṁ2 |
@AndrewImpellitteri I’m leaning that way, but I’ll keep it open for a bit and would prefer stronger evidence.