Preface:
Please read the preface for this type of market and other similar third-party validated AI markets here.
Third-Party Validated, Predictive Markets: AI Theme
Market Description
AI2-THOR Rearrangement Challenge
This question pertains to the following AI challenge:
https://github.com/allenai/ai2thor-rearrangement#----2022-ai2-thor-rearrangement-challenge
The goal of this challenge is to build a model/agent that moves objects in a room to restore them to a given initial configuration.
Example query:
The task involves moving and modifying (i.e. opening/closing) randomly placed objects within a room to obtain a goal configuration. There are 2 phases:
Walkthrough. The agent walks around the room and observes the objects in their ideal goal state.
Unshuffle. After the walkthrough phase, we randomly change between 1 and 5 objects in the room. The agent's goal is to identify which objects have changed and reset those objects to their state from the walkthrough phase. Changes to an object's state may include changes to its position, orientation, or openness.
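As a toy illustration of the unshuffle phase described above, the sketch below compares a walkthrough snapshot with an unshuffle snapshot and lists the objects whose position, orientation, or openness differ. Everything here is an assumption made for illustration: the `ObjectState` class, the `changed_objects` function, the tolerance, and the object names are hypothetical and are not part of AI2-THOR or the challenge code. A real agent would have to infer these differences from what it observes rather than from ground-truth state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectState:
    """Hypothetical snapshot of one object; not an AI2-THOR type."""
    position: tuple   # (x, y, z)
    rotation: tuple   # (x, y, z) angles in degrees
    openness: float   # 0.0 = closed, 1.0 = fully open


def differs(a, b, tol):
    """True if any pair of components differs by more than tol."""
    return any(abs(x - y) > tol for x, y in zip(a, b))


def changed_objects(walkthrough, unshuffle, tol=1e-3):
    """Names of objects whose position, rotation, or openness changed."""
    changed = []
    for name, goal in walkthrough.items():
        current = unshuffle[name]
        if (differs(goal.position, current.position, tol)
                or differs(goal.rotation, current.rotation, tol)
                or abs(goal.openness - current.openness) > tol):
            changed.append(name)
    return sorted(changed)


# Made-up example room: the mug was moved and the drawer was opened.
walkthrough = {
    "mug": ObjectState((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), 0.0),
    "drawer": ObjectState((1.0, 0.5, 2.0), (0.0, 90.0, 0.0), 0.0),
    "book": ObjectState((2.0, 1.0, 1.0), (0.0, 0.0, 0.0), 0.0),
}
unshuffle = {
    "mug": ObjectState((0.5, 1.0, 0.0), (0.0, 0.0, 0.0), 0.0),
    "drawer": ObjectState((1.0, 0.5, 2.0), (0.0, 90.0, 0.0), 0.6),
    "book": ObjectState((2.0, 1.0, 1.0), (0.0, 0.0, 0.0), 0.0),
}
print(changed_objects(walkthrough, unshuffle))  # ['drawer', 'mug']
```

Resetting each listed object to its walkthrough state is the part the agent is actually scored on; this sketch only covers the "identify which objects have changed" step.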
The resolution for this market will be here:
https://leaderboard.allenai.org/ithor_rearrangement_1phase_2022/submissions/public
Market Resolution Threshold:
If any 2022 AI2-THOR Rearrangement Challenge submission gets a % Fixed Strict (Test) score of >0.4 by the end of 2023, this resolves as YES, otherwise NO.
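The resolution rule above can be sketched as a small check, assuming the % Fixed Strict (Test) scores have already been copied from the leaderboard by hand. The function name `resolve_market` and the sample scores are hypothetical. The only things taken from the market text are the 0.4 threshold and the strict ">" comparison.

```python
THRESHOLD = 0.4  # from the market description: score must be > 0.4


def resolve_market(scores, threshold=THRESHOLD):
    """Return "YES" if any score is strictly above the threshold, else "NO"."""
    return "YES" if any(score > threshold for score in scores) else "NO"


# Hypothetical scores, not real leaderboard data.
print(resolve_market([0.12, 0.31]))   # NO
print(resolve_market([0.40]))         # NO (exactly 0.4 does not exceed 0.4)
print(resolve_market([0.12, 0.45]))   # YES
```

Note that a score of exactly 0.4 would not resolve YES under a strict reading of ">0.4".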
Mar 22, 12:25pm: Will Any 2022 AI2-THOR Rearrangement Challenge Submission Get a % Fixed Strict (Test) Score of >0.4 by end of 2023? → Will AI Achieve Significantly More, "Embodiment" by end of 2023?
Same question for 2024: https://manifold.markets/PatrickDelaney/-will-ai-achieve-significantly-more ... though I have not set the threshold yet.
The only other contest I could find with a quick search was this one: https://leaderboard.allenai.org/ithor_rearrangement_2phase_2022/submissions/public ... which has a top score of 0.2894 so far in 2023, so I'm resolving NO. Please let me know if you object.
@PatrickDelaney For this market it doesn't change the outcome, but in general I think you should only use the specific contest variant you were supposed to use for resolution.
@na_pewno I agree. I was just posting that as further evidence for anyone who might come by and point this contest out, but yes, I completely agree, and the purpose of this market was to stick strictly to the metric chosen above.
I also think it's interesting (though not relevant to market resolution) to just stay informed on this stuff. AllenAI now claims that there is going to be an "exponential improvement" in embodiment due to something called ProcTHOR (question posted here https://manifold.markets/PatrickDelaney/-will-ai-achieve-significantly-more)
Another embodiment leaderboard just dropped:
https://github.com/facebookresearch/eai-vc/tree/main/cortexbench