I will ask AI models with search functionality to resolve the market at the end of the year so that we get a relatively unbiased resolution.
Update 2025-01-19 (PST):
- Prompt: "Resolve this prediction market created at the beginning of 2024: <title>"
- AI models used: Grok, DeepSeek search, OpenAI search
- Resolution weights: each model's response contributes 33% to the final resolution.
- Final resolution: if two models answer YES, the market resolves at 66.6% in favor of YES.
- Note: efforts will be made to encourage the AI models to provide clear YES or NO answers.
(AI summary of creator comment)
With regard to the resolution criteria, you may want to consider sycophancy bias, wherein an LLM tends to tell you what it thinks you want to hear. If you insist on using LLMs as resolution criteria, you should use several that have access to search, like Perplexity, Gemini, or whatever OpenAI model. I would also suggest considering the difference between public opinion and the truth, which may influence how you prompt the LLM to respond.
@WilliamGunn I agree, the prompt will be: "Resolve this prediction market created at the beginning of 2024: <question title>"
I want the smartest search-enabled AI models to resolve the question, so that they don't get distracted by a random newspaper article and misresolve it.
By my testing, Grok is better than OpenAI. Perplexity didn't seem so smart and didn't give me a proper prediction when asked, so I have doubts about it. Gemini is censored and refuses to answer.
Conclusion: I will weight the resolution at 33 percent per the answer of
1) Grok
2) Deepseek search
3) OpenAI search
So it will resolve 66.6 percent if two answer YES. I will try to force them to say yes or no.
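For concreteness, here is a minimal sketch of that resolution rule as code (the model names are just labels here; nothing in this snippet actually calls the services):

```python
# Minimal sketch of the stated rule: each of the three search-enabled
# models contributes one third of the final resolution probability.
MODELS = ["Grok", "DeepSeek search", "OpenAI search"]

def resolution_probability(answers: dict[str, bool]) -> float:
    """answers maps model name -> True for YES, False for NO."""
    yes_votes = sum(1 for model in MODELS if answers.get(model, False))
    return yes_votes / len(MODELS)

# Two of three YES answers -> the market resolves around 66.6%.
print(resolution_probability({"Grok": True, "DeepSeek search": True, "OpenAI search": False}))
```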
@HannesLynchburg I appreciate the effort, but I would bet this attempt to get LLMs to do what you want will not come out as you hope. Note that I am not taking a position in this market due to uncertainty.
@WilliamGunn Ask them for a clear description of the current situation and you will see that they understand everything pretty well. You shouldn't criticise it before trying.
Especially when there are traders here that resolved "Was Trump shooter Crooks a right winger?" as YES, which none of the models would do. (I'll admit that I am still salty about that market)
@HannesLynchburg It sounds like you don't know I've just completed a consulting project involving weeks of detailed evaluation of how good various models are at answering questions accurately. tl;dr: trust but verify. They'll give you answers, but always check the references. They don't fail in predictable ways.
@WilliamGunn This is true when asking normal LLMs about things, but LLMs with search have a very low probability of hallucinating about current news and are reliable enough to resolve a market.
@HannesLynchburg Can you imagine the existence of any sort of evidence that would make you less certain about this?
@WilliamGunn Good question. This would actually be a really fun scientific project. You could take markets that recently closed on Manifold, send them to LLM APIs, and look at the percentage of correct resolutions. Only problem: there are currently no APIs for the search models...
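Roughly, the backtest could look like the sketch below. It assumes Manifold's public API at api.manifold.markets/v0/markets and the usual field names (question, outcomeType, isResolved, resolution); the search-model call is left as a placeholder because, as noted, those models don't expose an API yet.

```python
# Rough sketch of the proposed backtest: pull recently resolved YES/NO
# markets from Manifold and compare a model's answer to the actual resolution.
import requests

def recent_resolved_binary_markets(limit: int = 100) -> list[dict]:
    """Fetch recent markets and keep only resolved binary YES/NO ones."""
    markets = requests.get(
        "https://api.manifold.markets/v0/markets", params={"limit": limit}
    ).json()
    return [
        m for m in markets
        if m.get("outcomeType") == "BINARY"
        and m.get("isResolved")
        and m.get("resolution") in ("YES", "NO")
    ]

def ask_search_model(question: str) -> str:
    """Placeholder: would prompt a search-enabled model and force a YES/NO answer."""
    raise NotImplementedError

def resolution_accuracy(markets: list[dict]) -> float:
    """Fraction of markets where the model's answer matches the real resolution."""
    correct = sum(1 for m in markets if ask_search_model(m["question"]) == m["resolution"])
    return correct / len(markets)
```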
@AlexanderTheGreater People will lose a lot of money that they "invested", so it is more risky on his side than his other grifts.
@AlexanderTheGreater A lot of insider coins will unlock this year, so we will see what happens. I doubt that Trump will have enough restraint to hold everything for long.