For a YES resolution, I don't require that it be competitive with the state of the art at the time. It must be comparable to ChatGPT (not strictly as good, but not far inferior), and I require that it actually run on a single consumer GPU.
Added March 19: this market does not use the strict OSI definition of open-source (which would forbid terms like "don't be evil"). It is sufficient that it be legal to use the bot for most purposes, including commercial ones.
Same idea as this market:
I think this counts as "open source". It isn't a chatbot yet, but the fact that this exists means that derivatives can also be open source, so I expect to see an open-source chatbot in the next month. Then I'll try to figure out whether it's comparable to ChatGPT (the original, not any GPT-4 deluxe version).
Note that 99% of production ML violates the ImageNet license, the YOLOv5 license, or a text-data license, or all three!
@Gigacasting If companies are using a technically "non-commercial license" chatbot for business purposes, doing so in a publicly visible manner, and doing so without getting sued, then that's good evidence that it's de facto legal.
Meta winked by not deleting the pull request, and Stanford winked right back.
This is YES
"Consumer GPU" and "usable" imply personal use.
"Open source" implies open source.
I was not expecting the most difficult part of resolution to be the phrase "open source". Yikes! Kinda embarrassed that I missed that one, actually...
I think it's unambiguous that LLaMA does not count; leaks aren't the same as open source. That also excludes Alpaca, at least for now. (Of course, if LLaMA is officially released later, then that could make it count.)
At the same time, I'm leaning away from actually invoking OSI's full definition (https://opensource.org/osd/). That's very strict, and would exclude any chatbot developed with a term in the license that says "you may not use this to destroy the world". I don't think that's what anybody has in mind when they read this question.
I think the requirements for this to resolve YES are:
I can download the weights and run it myself.
It's competitive with ChatGPT.
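As a rough sanity check for the "single consumer GPU" requirement, one can estimate the VRAM needed just to hold the weights: parameter count times bytes per parameter (activations and KV cache add more on top). This is a back-of-the-envelope sketch, not part of the resolution criteria; the 24 GB figure assumes a high-end consumer card like an RTX 3090/4090.

```python
def weight_vram_gb(n_params: float, bytes_per_param: float = 2) -> float:
    """Approximate VRAM (in GB) needed just to store model weights.

    bytes_per_param: 4 for fp32, 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit.
    """
    return n_params * bytes_per_param / 1024**3

CONSUMER_GPU_VRAM_GB = 24  # e.g. an RTX 3090 or 4090

# A 7B-parameter model in fp16 needs about 13 GB for weights alone,
# so it fits on a 24 GB card; in fp32 (about 26 GB) it does not.
print(weight_vram_gb(7e9, 2))
print(weight_vram_gb(7e9, 4))
```

This is why quantized (8-bit or 4-bit) releases matter so much for this market: they are often the difference between fitting on a consumer card and not.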
I'm not going to require that it be collaboratively developed, or anything like that. That's in some people's definition of open source, but not mine.
@ScottLawrence Tricky. I normally think of model weights as data, not source.
@MartinRandall Now that I've been away from this question for a few days, I don't think it's tricky at all.
For example: LLaMA. In the absence of model weights, it's not a chatbot. It's not even a language model. It's code for training one of the above, and if that code were applied to a different data set, it would yield a different thing. The reason we call these things "chatbots" and so on is because of how they're trained. (Related: scaling discussions necessarily include estimates of how much training data is available.)
For all we know, with the right training data and enough compute, Karpathy's minGPT might achieve ChatGPT-comparable performance. (In fact this is overwhelmingly likely to be true, right? "Enough compute" and "the right data" are doing heavy lifting here.) But minGPT was available long before the creation of this market, and of course nobody thought that should yield a YES resolution. The fact that this code can in principle be used to construct a chatbot, and that this code is open source, is not sufficient.
@ScottLawrence Comparison: MediaWiki software isn't an encyclopedia, it's code for hosting and collaboratively writing text. MediaWiki is open source. Wikipedia is produced using open source, and its data is covered by a Creative Commons license, so people informally say that it's an "open source encyclopedia".
If you interpret the model weights as the compiled program, then releasing the model weights would not be enough to be traditional open source, because they are even less scrutable than assembly or byte code.
I roughly think that an "open source" chatbot would be one where all of the code needed to create the model from scratch is open source, including the code that gets training data, filters out bad training data, tunes for chat instead of competition, etc, such that I can "compile" it as easily as compiling Linux, albeit at a much higher cost.
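The "compile from scratch" notion above can be sketched as a pipeline: every stage's code (and, ideally, its data sources) would need to be open for the result to count. The stage functions below are hypothetical placeholders standing in for a real training stack, not actual APIs.

```python
# Illustrative skeleton of the "open source = reproducible from scratch" view.
# Every function here is a toy placeholder; a real stack involves large-scale
# data collection and GPU training.

def fetch_training_data() -> list[str]:
    """Stage 1: gather raw training text."""
    return ["example document one", "example document two"]

def filter_bad_data(docs: list[str]) -> list[str]:
    """Stage 2: drop low-quality or disallowed documents."""
    return [d for d in docs if "bad" not in d]

def pretrain(docs: list[str]) -> dict:
    """Stage 3: produce base model weights (a dict stands in for a blob)."""
    return {"base_weights": len(docs)}

def tune_for_chat(model: dict) -> dict:
    """Stage 4: instruction/chat tuning on top of the base model."""
    return {**model, "chat_tuned": True}

def compile_chatbot() -> dict:
    """If all four stages are open, the chatbot is 'compilable' like Linux,
    albeit at vastly greater cost."""
    return tune_for_chat(pretrain(filter_bad_data(fetch_training_data())))

model = compile_chatbot()
```

On this strict view, releasing only the final weights is like shipping only the compiled binary: usable, but not reproducible from source.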
(please nobody make one of those)
But that's not what you're asking about, I think, and I don't know how I'd phrase what you're asking about.
@MartinRandall if the chatbot weights are released under CC, that'll count as open source for this market.
Many websites are hosted by nginx/apache. Those websites are not open source.
This market already requires that it be runnable on a single consumer GPU. If the "compilation" involves training from scratch, that will not be the case.
Twelve meta’rs and twelve hackers >>> OpenAI
LLMs will not be living in the pod or eating the bugs it seems
Interesting question. I think we need a clearer definition. Are you going to use the official error rates, statistically? And then require it to be better than ChatGPT? By that I assume you're referring to GPT-3.5? What about the distinction between raw davinci and davinci plus the safety/helpfulness improvements they've added on top of it?
@StrayClimb As I say in the description, it is not required to be better than ChatGPT, merely comparable. I think I'm requiring a conversational model, so I do mean ChatGPT rather than GPT-3.5.
I'm not at all confident that I can accurately compare official error rates between ChatGPT and another bot, which may choose to use different metrics.
I can't see how to try davinci for myself; is there a place where I can?
How do you feel about a cutoff of "conversational, and better than anything appearing prior to ChatGPT"? Is that both clear enough and a reasonable interpretation of "comparable to ChatGPT"?
@ScottLawrence I was under the impression that ChatGPT is some variant of GPT-3.5.
The link is to the GPT Playground, something similar to ChatGPT but with fewer content filters and less post-generation interference.
"Better" is very hard to evaluate here. I'd also be very interested in whether there is an open source strong LLM.
@StrayClimb I promptly get "you've reached your usage limit", despite never having been on the playground before. Ah well!
ChatGPT is indeed a conversational variant of GPT-3.5. I'll only be resolving this market YES if the open-source bot is conversational, so I suppose there's no need to compare against things like davinci.