Challenge me to a battle against an AI of your choice in one of the following formats:
Gen 6 Balanced Hackmons
Gen 7 Pure Hackmons
Gen 8 National Dex Anything Goes
Gen 9 OU
Gen 9 National Dex OU
Gen 9 National Dex Ubers
Gen 9 Balanced Hackmons
Teams must be submitted blind. We can discuss how to do this, but one easy way is to agree to correspond (via Discord, for example) at a set time, confirm the plan at that time, and then exchange teams within a few seconds or minutes of each other.
All battles will be conducted on Pokemon Showdown.
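If timing the exchange proves awkward, one alternative to simultaneous messaging is a hash commitment: each side first posts a hash of their team, and the teams are revealed only after both hashes are posted. This is not part of the rules above, just a possible option to discuss. The sketch below assumes teams are exchanged as Showdown export text; the function names `commit` and `verify` are made up for illustration.

```python
import hashlib
import secrets


def _digest(nonce: str, team_text: str) -> str:
    # Hash the nonce together with the team so the commitment cannot be
    # brute-forced by hashing guesses at likely teams.
    return hashlib.sha256((nonce + "\n" + team_text).encode("utf-8")).hexdigest()


def commit(team_text: str) -> tuple[str, str]:
    """Return (commitment, nonce). Post the commitment now; keep the nonce secret."""
    nonce = secrets.token_hex(16)
    return _digest(nonce, team_text), nonce


def verify(commitment: str, nonce: str, team_text: str) -> bool:
    """Check that a revealed team and nonce match the earlier commitment."""
    return secrets.compare_digest(_digest(nonce, team_text), commitment)
```

Each side posts their commitment, waits for the other's, and then reveals the team text together with the nonce. A mismatch on `verify` means the team was changed after the commitment was posted.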
I'm open to different kinds of AI:
If an LLM, either of us can relay the battle to it. If I do, the LLM must either be freely accessible or you must let me temporarily sign in to your paid account. If you do, we will need to schedule the battle in advance, and we must both stay online and correspond via messages throughout. In either case, the full conversation with the LLM must be shared afterward. Messages to the LLM may only describe the state of the battle, and each one must include a complete list of all available options (e.g. 4 moves and 1-5 available switches). Messages must not steer the LLM toward any particular option.
If an RL system, you may choose whether to train on my specific team. If you do, take as much time as you want, and the battle will be conducted afterward at my earliest convenience. If you do not, the battle can be conducted immediately after we exchange teams, so that I have no time to prepare. In either case, I need to be able to run the agent myself so that I can verify its outputs.
If a GOFAI system, you can run the AI and provide the outputs, but in the event that you win, you will need to immediately send the code in order for me to verify that the behaviour was predetermined.
I'm willing to discuss other AI solutions to determine how best to proceed.
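For the LLM case, the message rules (state information only, every available option listed, no nudging) could be enforced with a fixed template. The following is a minimal sketch under my own assumptions: the `TurnState` fields, the wording of the template, and the Pokemon and move names in the usage note are all illustrative, not a required format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TurnState:
    turn: int
    summary: str         # factual description of the battle state only
    moves: List[str]     # every move currently selectable
    switches: List[str]  # every teammate currently available to switch in


def format_turn_message(state: TurnState) -> str:
    """Build a message that lists all options in a fixed order, with no advice."""
    if not state.moves and not state.switches:
        raise ValueError("at least one available option must be listed")
    options = [f"move: {m}" for m in state.moves]
    options += [f"switch: {s}" for s in state.switches]
    lines = [f"Turn {state.turn}.", state.summary, "Available options:"]
    lines += [f"{i}. {opt}" for i, opt in enumerate(options, start=1)]
    lines.append("Reply with the number of exactly one option.")
    return "\n".join(lines)
```

For example, `format_turn_message(TurnState(1, "Your Rotom-Wash (100% HP) faces Garchomp (100% HP).", ["Hydro Pump", "Volt Switch", "Will-O-Wisp", "Pain Split"], ["Skarmory", "Clefable"]))` yields a ten-line message with options numbered 1 through 6. Because every turn uses the same template, the shared conversation log is easy to audit for steering.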
Market format and rules shamelessly stolen from here:
"Any Manifold user may challenge me to a match. Any given user may make one attempt if they hold 2000 YES shares in this market. The requirement doubles for every attempt after. So they need to hold 2000 shares before making their first attempt, 6000 total shares if they want to make a second attempt, 14000 shares if they want to make a third [attempt], etc. When you are placing a bet, your 'Max payout' is how many shares you are buying.
"[I am] obligated to accept any challenge from someone with enough shares. Also, [I] must be willing to buy NO shares at [80]% if [I] have the mana to spare to allow challengers to take their shot."
Resolves YES if a challenger beats me in a Bo1 match.
Otherwise, resolves NO at market close time.
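The share requirement quoted in the rules above doubles with each attempt (2000, then 4000, then 8000), which gives the cumulative totals of 2000, 6000, and 14000. A short sketch of that arithmetic, with function names chosen only for illustration:

```python
BASE_SHARES = 2000  # YES shares required for the first attempt


def shares_for_attempt(n: int) -> int:
    """Additional shares required for the n-th attempt (1-indexed)."""
    if n < 1:
        raise ValueError("attempt number must be at least 1")
    return BASE_SHARES * 2 ** (n - 1)


def total_shares_required(n: int) -> int:
    """Total YES shares a user must hold before making their n-th attempt."""
    if n < 1:
        raise ValueError("attempt number must be at least 1")
    # Sum of the geometric series 2000 + 4000 + ... + 2000 * 2**(n - 1).
    return BASE_SHARES * (2 ** n - 1)
```

By the same formula, a fourth attempt would require holding 30000 shares in total.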
For anyone who would like to trigger a YES resolution, I suspect that it might be possible for any decent programmer to use poke-env to train an RL agent to play a specific matchup (i.e. one specific team versus another specific team) at a superhuman level. I tried to do this myself but it was beyond my ability, and current LLMs couldn't get me there. Maybe GPT-5, we'll see.
If simple RL doesn't work, I suspect a relatively barebones implementation of AlphaStar-style league training (basically a league of historical agents and exploiter agents) would work rather well with Pokemon.
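To make the league idea a little more concrete, here is a rough sketch of the matchmaking part only: a pool of frozen historical snapshots and exploiters, from which the learner's next opponent is drawn with a bias toward opponents it currently loses to. The weighting rule is my assumption, loosely modeled on prioritized self-play; the `League` class and its methods are hypothetical, and the actual RL training and battle simulation (for example via poke-env) are deliberately left out.

```python
import random
from typing import Dict, List


class League:
    """Tracks the learner's results against each league member."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.members: List[str] = []
        self.wins: Dict[str, int] = {}   # learner wins against each member
        self.games: Dict[str, int] = {}  # games played against each member

    def add(self, name: str) -> None:
        """Register a frozen snapshot or exploiter as a possible opponent."""
        self.members.append(name)
        self.wins[name] = 0
        self.games[name] = 0

    def record(self, opponent: str, learner_won: bool) -> None:
        self.games[opponent] += 1
        self.wins[opponent] += int(learner_won)

    def win_rate(self, opponent: str) -> float:
        # Laplace smoothing keeps the estimate strictly between 0 and 1,
        # so unseen opponents start at 0.5 and no weight is ever zero.
        return (self.wins[opponent] + 1) / (self.games[opponent] + 2)

    def sample_opponent(self, power: float = 2.0) -> str:
        """Pick an opponent, favoring those the learner beats least often."""
        weights = [(1.0 - self.win_rate(m)) ** power for m in self.members]
        return self.rng.choices(self.members, weights=weights, k=1)[0]
```

In a full training loop, the learner would periodically be frozen and added via `add` as a new historical agent, while separately trained exploiters would be added the same way.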
I strongly believe that the problem is tractable, but regrettably, I don't think I'll be the one to solve it. If anyone thinks they've done it, feel free to comment here and I'll pump the market down before your attempt so you can make some good profit if successful.
Made a small change to the description. I copied the description from another market which referenced a Bo5 match, but I realized how tedious this would be with an LLM, so I've changed this market to Bo1. If this influences your position, please update now, and I'm happy to reimburse any resulting losses.