
Preface / Inspiration: Manifold has a lot of questions about whether or not we'll see sentience, general A.I., and a lot of other nonsense. These are faith-based questions that rely on the market maker's interpretation and often close at some far-distant point in the future, when a lot of us will be dead. This market is an effort to create meaningful bets on important A.I. questions that are resolved by reference to a third party.
In short, here's the leaderboard:
Market Description:
Resolved by submissions at:
https://leaderboard.allenai.org/visualcomet/submissions/public
Visual Comet
This leaderboard collects evaluations of current AI systems on visual commonsense tasks that measure both the knowledge these systems possess and their ability to reason with and use that knowledge in the context of an event in an image.
Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat in water, we can reason that the man fell into the water sometime in the past, the intent of that man at the moment is to stay alive, and he will need help in the near future or else he will get washed away.
An example input question contains the following fields in JSON format:
{
"img_fn": "lsmdc_3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER/3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER_00.27.43.141-00.27.45.534@0.jpg",
"movie": "3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER",
"metadata_fn": "lsmdc_3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER/3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER_00.27.43.141-00.27.45.534@0.json",
"place": "at a fancy party",
"event": "1 is trying to talk to the pretty woman in front of him"
}
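As a rough illustration of how a record like the one above could be read, here is a minimal Python sketch. It is not the leaderboard's own tooling. The helper names `missing_fields` and `same_stem` are hypothetical, and the idea that every record carries exactly these five fields is an assumption drawn from this single example.

```python
import json
import posixpath

# Fields seen in the example record above. Whether every record carries
# exactly these fields is an assumption based on that one example.
EXPECTED_FIELDS = ("img_fn", "movie", "metadata_fn", "place", "event")

RAW = """
{
  "img_fn": "lsmdc_3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER/3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER_00.27.43.141-00.27.45.534@0.jpg",
  "movie": "3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER",
  "metadata_fn": "lsmdc_3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER/3005_ABRAHAM_LINCOLN_VAMPIRE_HUNTER_00.27.43.141-00.27.45.534@0.json",
  "place": "at a fancy party",
  "event": "1 is trying to talk to the pretty woman in front of him"
}
"""


def missing_fields(record):
    """Return the expected field names that are absent from the record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def same_stem(record):
    """Check that the image and metadata paths differ only by extension."""
    img_stem, _ = posixpath.splitext(record["img_fn"])
    meta_stem, _ = posixpath.splitext(record["metadata_fn"])
    return img_stem == meta_stem


record = json.loads(RAW)
print("missing fields:", missing_fields(record))
print("paths share a stem:", same_stem(record))
print("place:", record["place"])
print("event:", record["event"])
```

In this example the image and metadata paths differ only in their extension, which is what `same_stem` checks.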
Example sets of images (not necessarily related to the above)


Market Resolution Threshold:

Note: at the time of authoring, human performance is 0.5.
The top-left score on the leaderboard, BLEU-1, is 0.3500 at the time of authoring. For this market to resolve YES, the leaderboard must show a BLEU-1 score of at least 1.3 × 0.3500 = 0.4550 by the end of the year. Otherwise, this market resolves NO.
Score to Beat -> 0.4550
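The threshold arithmetic above can be checked with a short Python sketch. The function name `resolves_yes` is hypothetical, and passing the score as a string is an illustrative choice. `decimal.Decimal` is used so that a leaderboard value of exactly 0.4550 is not misjudged because of binary floating-point rounding.

```python
from decimal import Decimal

# Values taken from the market description.
BASELINE_BLEU1 = Decimal("0.3500")  # top BLEU-1 score at time of authoring
MULTIPLIER = Decimal("1.3")         # required improvement factor

# 1.3 * 0.3500, rounded to the four decimal places used in the description.
THRESHOLD = (MULTIPLIER * BASELINE_BLEU1).quantize(Decimal("0.0001"))


def resolves_yes(top_bleu1: str) -> bool:
    """Return True if the given top BLEU-1 score meets the threshold.

    The score is passed as a string (e.g. "0.4550") so it converts to
    Decimal exactly. This is an illustrative helper, not part of any
    Manifold or leaderboard API.
    """
    return Decimal(top_bleu1) >= THRESHOLD


print("score to beat:", THRESHOLD)
print("0.3500 resolves YES?", resolves_yes("0.3500"))
print("0.4550 resolves YES?", resolves_yes("0.4550"))
```

A score exactly equal to the threshold counts as YES, matching the ">=" in the resolution rule.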