Context:
Gemini-1.5-Pro-Exp-0801 is currently the leading model on the LMSYS Arena leaderboard (https://arena.lmsys.org/).
This market is about its potential evaluation by Scale AI (https://scale.com/leaderboard).
The unit of the Adversarial Robustness category is "Number of Violations", and lower results are better.
Resolution Criteria:
The market resolves as "Yes" if the model is evaluated by Scale AI and it receives a score strictly less than 8 in the Adversarial Robustness
category.
The market resolves as "No" if the model is evaluated by Scale AI and it receives a score of at least 8 in the Adversarial Robustness
category.
The market resolves as "N/A" if either:
Scale AI does not evaluate the model and add it to the leaderboard before October 1, 2024, or
The evaluation methodology changes before the model is evaluated.
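
For illustration only, the resolution rules above can be sketched as a small Python function. The function name, its parameters, and the idea of passing in the date the model was added to the leaderboard are assumptions made for this sketch; they are not part of the market or of any Scale AI API. The threshold of 8 and the October 1, 2024 cutoff come from the criteria above.

```python
from datetime import date
from typing import Optional

THRESHOLD = 8                 # "Number of Violations"; lower is better
DEADLINE = date(2024, 10, 1)  # model must be added strictly before this date


def resolve_market(
    score: Optional[float],       # hypothetical: Adversarial Robustness score, None if not evaluated
    added_on: Optional[date],     # hypothetical: date the model appeared on the leaderboard
    methodology_changed: bool,    # hypothetical: True if the methodology changed before evaluation
) -> str:
    """Return "Yes", "No", or "N/A" according to the stated criteria."""
    # N/A: the model was not evaluated and listed before the deadline.
    if score is None or added_on is None or added_on >= DEADLINE:
        return "N/A"
    # N/A: the evaluation methodology changed before the model was evaluated.
    if methodology_changed:
        return "N/A"
    # Yes only for a score strictly below the threshold; exactly 8 resolves "No".
    return "Yes" if score < THRESHOLD else "No"
```

Under these assumptions, a score of 7 recorded on September 15, 2024 would resolve "Yes", a score of exactly 8 would resolve "No", and any result first listed on or after October 1, 2024 would resolve "N/A".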