This market will resolve YES if, by the end of June 2024, Elon Musk's xAI announces that they have a language model at least as powerful as GPT-3.5 or Claude.
By default, I will use the Arena Elo rating to decide whether a model meets the bar. If there is no such rating, I will use other benchmarks (e.g., MMLU) or my subjective impression. If there is a lot of disagreement, I will resolve NA.
@JonasVollmer It's still not on Chatbot Arena, which is the preferred benchmark. Shouldn't you wait until the market closes in case it gets uploaded there? Chatbot Arena scores can be quite different from other benchmarks. I was betting on that.
@benshindel Building products & doing stuff takes time. Consider: There were 2½ years between GPT-3 and ChatGPT
@BenjaminShindel updated the description to remove Llama 2 (thought this would be most fair to you given that you're the largest NO holder)
@JonasVollmer People subjectively prefer Llama 2 over GPT-3.5 by far.
Try out https://llmboxing.com/
@JonasVollmer Thx! Although tbh it wouldn’t impact my betting that much as I mostly just think it’s <75% likely they’ll have developed any public LLM at all by June
@firstuserhere Added: "By default, I will use the Arena Elo rating to decide whether a model meets the bar. If there is no such rating, I will use other benchmarks (e.g., MMLU) or my subjective impression."
hm, I mean given the list of people + Dan advising it, most likely a strong yes, and since timelines are pretty quick, June is actually a reasonably solid estimate
I think the hardest roadblock would be time for training + finding good enough data. it could also be the case that they don't actually go towards LMs immediately, which seems pretty low probability (although I would be interested in seeing whether they did some autoformalization stuff especially)