Requires the AI to use a phone. The market resolves YES if an AI is able to reach and talk with a human IRS representative (the conversation itself isn't important) by calling the IRS phone number. It's fine if a human types in the phone number for it and starts the call, but nothing more than that.
This is primarily a test of the AI's speech recognition, its ability to navigate the IRS automated voice answering system, and its ability to operate over long periods (like being put on hold for over an hour).
If nobody posts evidence of this happening before the close date, I'll test a few mainstream AIs.
For a system tailored to this task to count, it needs to retain its generality across other tasks, and not be e.g. a Python script that sends DTMF tones.