
What percent of the time will it be the best model out there?
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ99 |
2 | | Ṁ18 |
3 | | Ṁ8 |
4 | | Ṁ8 |
5 | | Ṁ7 |
Resolving YES. The only thing better is GPT-4-Turbo. AFAIK nothing else passed it in between, and nothing changed that would have allowed something else to pass it and then drop back below it.
If there's evidence to the contrary, please present it and we can discuss re-resolving.
Best at what, exactly?
Claude-v1.3 already beats gpt-4 in a couple of tasks (mostly writing tasks), and there are a very few writing tasks where gpt-3.5-turbo takes a slight edge over it.
@firstuserhere Code for the tournament: https://colab.research.google.com/drive/1iI_IszGAwSMkdfUrIDI6NfTG7tGDDRxZ?usp=sharing#scrollTo=hZ0G_G-sHwm3
@firstuserhere Tbh I think Claude is overvalued here because of the kind of tasks that are being represented. Catching up to gpt-4 will be difficult.