Full question: "chance that an off-the-shelf AI system can, if provided the rules to an arbitrary turn-based board game, play roughly as well as a casual player of the game"
Resolution Criteria:
A "casual player" is defined as someone who has, e.g., played similar games frequently and has played this game for several hours, but has not played it competitively or studied its strategy. Think of a family game night playing Monopoly. I will consider only typical family-friendly board games that you might find in a non-specialty store. The AI is not allowed to be fine-tuned on this specific game by a human. The system takes in the rules and a statement that it's going to be playing this game with some other people, and then it should play. If the AI system makes obviously illegal moves, moves that are completely nonsensical, or otherwise plays in a way that just doesn't make sense, it does not count as playing as well as a casual player. Before each turn, the system will receive a video feed from the point of view of a player and an audio feed of the conversation at the table, and it must output actions to take. (A human can make the actual moves; we don't have to have solved robotics.)
Motivation and Context:
Today's systems are very good at skills they are trained on, and very bad at skills they aren't trained on. For example, models are excellent at programming because there are billions of examples of programs. It would be an important change to the world if future models could easily generalize to entirely new problem domains given just a description of what needs to be done. This question obviously doesn't measure that completely, but it gets at what I want to measure.
Question copied from: https://nicholas.carlini.com/writing/2024/forecasting-ai-future.html