Preface:
Please read the preface for this type of market and other similar third-party validated AI markets here.
Third-Party Validated, Predictive Markets: AI Theme
Attempting to improve upon this market, and make it a bit more interesting.
Some have pointed out in the below market that the threshold may be too high because it's above human performance.
This is instead an attempt to match human performance within a margin of error. Please read the threshold description at the bottom.
Preface / Inspiration:
There are a lot of questions on Manifold about whether or not we'll see sentience, general A.I., and a lot of other nonsense and faith-based questions which rely on the market maker's interpretation and often close at some far distant point in the future when a lot of us will be dead. This is an effort to create meaningful bets on important A.I. questions which are referenced by a third party.
Market Description
ProPara
ProPara aims to promote the research in natural language understanding in the context of procedural text. This requires identifying the actions described in the paragraph and tracking state changes happening to the entities involved.
Example Question
Given this five-sentence procedural paragraph (id 1167 from the training partition):
① The gravity of the sun pulls its mass inward. ② There is a lot of pressure on the Sun. ③ The pressure forces atoms of hydrogen to fuse together in nuclear reactions. ④ The energy from the reactions gives off different kinds of light. ⑤ The light travels to the Earth.
Consider the two participant entities:
atoms of hydrogen
sunlight or light
Predict answers to these four questions:
What are the Inputs?
That is, which participants existed before the procedure began, and don't exist after the procedure ended? Or, what participants were consumed?
Answer: The inputs are atoms of hydrogen.
What are the Outputs?
That is, which participants existed after the procedure ended, but didn't exist before the procedure began? Or, what participants were produced?
Answer: The outputs are light (or sunlight).
What are the Conversions?
That is, which participants were converted to which other participants?
Answer: The participant atoms of hydrogen is converted into light (or sunlight) in sentence 3.
What are the Moves?
That is, which participants moved from one location to another?
Answer: The participant light (or sunlight) moves from sun to earth in sentence 5.
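The four answers above can be read off a per-sentence record of where each participant is. The following is a minimal sketch of that idea, not the official ProPara evaluator: the `STATES` grid is a hand-written, hypothetical annotation of the example paragraph (one location before sentence 1 and one after each of the five sentences, with `None` meaning the participant does not exist), and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical hand-written annotation of the example paragraph.
# Index 0 is the state before sentence 1; index i is the state after sentence i.
# None means the participant does not exist at that point.
States = Dict[str, List[Optional[str]]]

STATES: States = {
    "atoms of hydrogen": ["sun", "sun", "sun", None, None, None],
    "light": [None, None, None, "sun", "sun", "earth"],
}


def inputs(states: States) -> List[str]:
    """Participants that exist before the procedure and not after it."""
    return [e for e, s in states.items() if s[0] is not None and s[-1] is None]


def outputs(states: States) -> List[str]:
    """Participants that exist after the procedure and not before it."""
    return [e for e, s in states.items() if s[0] is None and s[-1] is not None]


def created_at(s: List[Optional[str]]) -> List[int]:
    """Sentence numbers in which a participant comes into existence."""
    return [i for i in range(1, len(s)) if s[i - 1] is None and s[i] is not None]


def destroyed_at(s: List[Optional[str]]) -> List[int]:
    """Sentence numbers in which a participant stops existing."""
    return [i for i in range(1, len(s)) if s[i - 1] is not None and s[i] is None]


def conversions(states: States) -> List[Tuple[str, str, int]]:
    """(source, result, sentence) when one participant is destroyed in the
    same sentence in which another is created."""
    found = []
    for a, sa in states.items():
        for b, sb in states.items():
            if a == b:
                continue
            for step in destroyed_at(sa):
                if step in created_at(sb):
                    found.append((a, b, step))
    return found


def moves(states: States) -> List[Tuple[str, str, str, int]]:
    """(participant, from, to, sentence) when an existing participant
    changes location."""
    found = []
    for e, s in states.items():
        for i in range(1, len(s)):
            if s[i - 1] is not None and s[i] is not None and s[i - 1] != s[i]:
                found.append((e, s[i - 1], s[i], i))
    return found


if __name__ == "__main__":
    print("Inputs:", inputs(STATES))
    print("Outputs:", outputs(STATES))
    print("Conversions:", conversions(STATES))
    print("Moves:", moves(STATES))
```

With this grid, the sketch reproduces the four answers given above: hydrogen as the input, light as the output, a conversion in sentence 3, and a move from sun to earth in sentence 5. A real system has to predict the grid from the text, which is the hard part the benchmark measures.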
Market Resolution Criteria
https://leaderboard.allenai.org/propara/submissions/public
Top score on F1 is 0.731.
Human performance on F1 is shown as 0.839.
A margin of error could be accepted as 2% for the purposes of this market, to make it a bit more interesting.
0.839 * 2% = 0.01678, rounded to 0.017
0.839 - 0.017 = 0.822
Target to Beat = 0.822
Based upon a +/- 2% margin of error, we would need to see any public submission reach 0.822 or greater by the end of the year for this to resolve YES, otherwise NO.
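The threshold arithmetic can be checked with a short sketch. It follows the same steps as above (take 2% of the human score, round to three decimals, subtract), and the function names and the idea of passing in a single best public score are assumptions made for illustration. It does not query the leaderboard.

```python
HUMAN_F1 = 0.839   # human performance shown on the leaderboard
MARGIN = 0.02      # 2% margin of error accepted for this market


def target_to_beat(human_f1: float = HUMAN_F1, margin: float = MARGIN) -> float:
    """Human F1 minus the rounded margin, to three decimals."""
    margin_abs = round(human_f1 * margin, 3)   # 0.01678 rounds to 0.017
    return round(human_f1 - margin_abs, 3)


def resolves_yes(best_public_f1: float) -> bool:
    """True if the best public submission reaches the target or better."""
    return best_public_f1 >= target_to_beat()


if __name__ == "__main__":
    print("Target to beat:", target_to_beat())
    print("Top score 0.731 resolves YES?", resolves_yes(0.731))
```

Under these assumptions, the current top score of 0.731 falls short of the target, and a submission scoring exactly 0.822 would be enough.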
Market for next year on this topic:
https://manifold.markets/PatrickDelaney/-will-ai-be-able-to-meet-just-below