🐕 Will A.I., "Hallucinate Significantly Less," by the End of 2023?
Resolved NO on Jan 10

Preface:

Please read the preface for this type of market and other similar third-party validated AI markets here.

Third-Party Validated, Predictive Markets: AI Theme

Market Description:

HALTT4LLM

This project is an attempt to create a common metric for testing LLMs' progress in eliminating hallucinations, the most serious current problem in the widespread adoption of LLMs for real-world purposes.

https://github.com/manyoso/haltt4llm

Market Resolution Threshold:

This market resolves YES if the top score on this benchmark is >=1.2*(average score).

The original average score is taken from the README commit that was current when this market was created.

Note that the current list covers only LLMs whose inference can actually be measured, so it does not include GPT4. To be fully evaluated, a model must be usable for inference, which GPT4 may not be by the end of the year.

Note: the resolution criterion was previously 1.3*(leftmost score), but it has been updated to 1.2*(average score).

In other words:

C=1.2

The current average score is 78.115%, so the top score must be at least 93.738% (1.2 * 78.115%). That score must come from an inferenceable language model that fits the above criteria, including the fully measurable inference criterion.
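As a quick check of the arithmetic above, here is a minimal sketch in Python. The function names `threshold` and `resolves_yes` are hypothetical helpers made up for illustration and are not part of HALTT4LLM. The comparison uses >=, following the resolution criteria as stated; the sample top scores in the usage lines are invented.

```python
# Values taken from the market description.
C = 1.2                    # multiplier in the resolution criteria
BASELINE_AVERAGE = 78.115  # average benchmark score (%) at market creation


def threshold(average: float = BASELINE_AVERAGE, c: float = C) -> float:
    """Return the score (%) a model must reach for the market to resolve YES."""
    return c * average


def resolves_yes(top_score: float,
                 average: float = BASELINE_AVERAGE,
                 c: float = C) -> bool:
    """Hypothetical helper: True if top_score meets the threshold.

    Assumes the >= comparison from the stated resolution criteria.
    """
    return top_score >= threshold(average, c)


print(f"{threshold():.3f}")  # 93.738

# Invented example scores, for illustration only.
print(resolves_yes(95.0))  # True
print(resolves_yes(90.0))  # False
```

Because the threshold is a multiple of the baseline average, it stays fixed at the value computed from the README commit at market creation, even if the benchmark's average changes later.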

Please update me in the comments if I am wrong in any of my assumptions and I will update the resolution criteria.
