Phillip is an AI that has been trained to play Super Smash Bros. Melee using imitation learning and deep reinforcement learning. Recently, the bot has been gaining some attention in the community after some top players played exhibition matches against it. Notably, Moky played it in a first-to-10 set in the Fox ditto (10-3 in Phillip's favour) and Zain played it in a first-to-5 set (5-3 in Zain's favour).
Phillip has a long list of agents, each trained to play certain matchups. Some of these agents are much stronger than others. I am interested in tracking the progress of agents that I believe are approaching superhuman level. At the time of writing (2024-11-19), I believe that the strongest agent is fox_d18_ditto_v3 (released 2024-11-18). I will maintain a list of "Plausibly Superhuman Agents" (PSAs), beginning with fox_d18_ditto_v3 and updating the list as the developer releases new agents of similar or greater proficiency in their specialty matchups. When the developer releases a stronger agent in a given matchup, it will replace its predecessor on the list of PSAs.
Plausibly Superhuman Agents
fox_d18_ditto_v3 (added 2024-11-19)
Unsportsmanlike Exploit Clause
If a player discovers an exploitable behaviour in a Plausibly Superhuman Agent which trivializes the matchup (e.g. a way of manipulating a Samus agent into getting hit by its own reflected missiles from respawn until death), then the intentional use of such an exploit will invalidate the result of the set.
Resolution Criteria
Resolves immediately to YES if, at any point during the 2025 calendar year, a human player defeats a Plausibly Superhuman Agent in a best-of-5 (or greater*) set in the agent's specialty matchup, without violating the Unsportsmanlike Exploit Clause. The set must be played with the Slippi Ranked ruleset (no wobbling, no transformations on Pokemon Stadium, etc.).
Resolves to NO one week after market close if no evidence has been provided to trigger a YES resolution.
* A "greater-than-Bo5" set is any pre-determined series of more than 5 games. The player must state the planned number of games at the outset, and this number cannot change in a way which would shift the result in the player's favour (e.g. "I was going to do best of 7 but I lost 3-4 so I changed it to best of 9 and then won 5-4").
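The criteria above can be sketched as a small check. This is only an illustrative reading of the rules, not an official resolution tool: the names `SetResult`, `wins_needed`, and `counts_for_yes` are hypothetical, and treating "defeats" as winning a strict majority of the pre-declared number of games is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SetResult:
    planned_games: int    # number of games declared before the set started
    human_wins: int
    agent_wins: int
    year: int             # calendar year in which the set was played
    ranked_ruleset: bool  # played under the Slippi Ranked ruleset
    used_exploit: bool    # violated the Unsportsmanlike Exploit Clause
    agent_is_psa: bool    # agent is on the PSA list, in its specialty matchup


def wins_needed(planned_games: int) -> int:
    # Assumption: a set is won by taking a strict majority of the planned games.
    return planned_games // 2 + 1


def counts_for_yes(s: SetResult) -> bool:
    if s.year != 2025 or s.planned_games < 5:
        return False
    if not (s.agent_is_psa and s.ranked_ruleset) or s.used_exploit:
        return False
    # The target is fixed by the number declared at the outset, so extending
    # a lost set afterwards does not change what was needed to win it.
    return s.human_wins >= wins_needed(s.planned_games)


# Hypothetical examples (not real results):
first_to_5 = SetResult(9, 5, 3, 2025, True, False, True)
lost_bo7 = SetResult(7, 3, 4, 2025, True, False, True)
print(counts_for_yes(first_to_5))  # True
print(counts_for_yes(lost_bo7))    # False
```

Under this reading, a first-to-5 set is treated as a best-of-9, and a set that was declared as a best-of-7 and lost 3-4 stays a loss even if more games are played afterwards.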
Possible clarification from creator (AI generated):
Examples of Unsportsmanlike Exploits that would invalidate a set:
Repeatedly causing Samus agents to shoot missiles from a specific distance that can be reflected back for reliable damage
Using only Jigglypuff's rollout move to build percent and kill, as the agent cannot respond to this move
Examples of Allowed Strategies:
Reacting to Fox's lasers with Marth's wavedash forward f-smash, even though this strategy would be insufficient against top human players
https://www.reddit.com/r/SSBM/comments/1hha2on/humanity_versus_the_machines_humanity_triumphs_in/
Only problem is it's not 2025 yet
@asmith Yeah, it'll need to be repeated in 2025 to resolve YES. If another bounty gets posted, I'll consider whether the agent should be added to the PSA list, in which case the bounty would overlap with this market.
@OP I'll try to be as judicious as possible in the application of the clause. For example, when Cody Schwab recently played Phillip, he claimed that the Fox bot can be reliably beaten by reacting to lasers with Marth's wavedash forward f-smash, which would never be a sufficient strategy for beating a top human player. Although I doubt that Cody could implement this strategy consistently enough to out-punish the bot, I would allow it and count it as a win.
Some behaviours I have discovered for which I would invoke the clause:
Some Samus agents can, in my experience, be reliably stunlocked into shooting missiles from a certain distance, which spacies can just reflect back at them until the missiles kill.
The Fox agent has absolutely no idea how to respond to Jigglypuff's rollout. It simply gets hit every time. I would not count a set in which a Jigglypuff player only uses rollout the entire time in order to build percent and kill.
In both of these cases, I simply do not find the resulting gameplay to be representative enough of human play to be within the spirit of the question (given that the underlying premise of Phillip is meant to be imitating human play).
@Tumbles I know it's not technically tournament-related, but do you think we could add this to the "Related Markets" section of the SSBM Tournament Dashboard?