When a major new technology platform emerges, an associated need—and opportunity—arises to build tools and infrastructure to enable this new platform. Venture capitalists like to think of these supporting tools as “picks and shovels” (for the upcoming gold rush).
In recent years, machine learning tooling, widely referred to as MLOps, has been one of the startup world’s hottest categories. A wave of buzzy MLOps startups has raised large sums of capital at eye-watering valuations: Weights & Biases ($200 million raised at a $1 billion valuation), Tecton ($160 million raised), Snorkel ($138 million raised at a $1 billion valuation), OctoML ($133 million raised at an $850 million valuation), to name a few.
Now, we are witnessing the emergence of a new AI technology platform: large language models (LLMs). Compared to pre-LLM machine learning, large language models represent a new AI paradigm with distinct workflows, skillsets and possibilities. The easy availability of massive pretrained foundation models via API or open source completely changes what it looks like to develop an AI product. A new suite of tools and infrastructure is therefore destined to emerge.
We predict the term “LLMOps” will catch on as a shorthand to refer to this new breed of AI picks and shovels. Examples of new LLMOps offerings will include, for instance: tools for foundation model fine-tuning, no-code LLM deployment, GPU access and optimization, prompt experimentation, prompt chaining, and data synthesis and augmentation.
If you enjoyed this market, please check out the other 9! https://manifold.markets/group/forbes-2023-ai-predictions
This market is from Rob Toews' annual AI predictions at Forbes magazine. This market will resolve based on Rob's own self-assessed score for these predictions when he publishes his retrospective on them at the end of the year.
Since Rob resolved and graded his 2022 predictions before the end of 2022, I am setting the close date ahead of the end of the year, to (try to) avoid a situation where he posts the resolutions before the market closes. In the event that his resolution post falls in 2024, my apologies in advance. If he hasn't posted resolutions at all by February 1, 2024, I will do my best to resolve them personally, and set N/A for any questions that I can't determine with outside source data.
-----
Edit 2023-07-05: Last year Rob used "Right-ish" to grade some of his predictions. In cases of a similar "Right-ish" (or "Wrong-ish") answer this year, I will resolve to 75% PROB or 25% PROB, respectively. This will apply for similar language too ("mostly right", "partial credit", "in the right direction"). If he says something like "hard to say" or "some right, some wrong", or anything else that feels like a cop-out or 50% answer, I will just call that N/A.
Thanks to Henri Thunberg for requesting clarification in this comment!
@MattCWilson if you do these again next year, please put a big disclaimer in the description that this dude has less integrity than the Dee bridge
@MattCWilson New York's Hottest Club is LLMOps. It's got everything. Alien monsters that hug your face, hyperintelligent furbies, men in wedding gowns. It finally answers the question, "are Open Source LLMs better than Bard?"
Google Trends data (data with both terms; data with just LLMOps) suggests that "MLOps" is still very dominant and "LLMOps" is very fringe, with the LLMOps-to-MLOps interest ratio hovering around 1:40.
I think no because LLMOps is still MLOps, and I don't think the ops would be that distinct from any other kind of ML? I think there is a vicious cycle where if you are researching ML, you are thinking in terms of what will run on an A100 etc., and if you are designing the next GPU you are going to make it work for current ML research. Therefore all ML will look similar at an ops level and carry on doing so.
@Quinn related: I don't super think of making a new monitoring/observability tool (like wandb) as "devops"; I think of using existing such tools as devops. This may not super matter to a lot of people and certainly not to VCs.
@PatrickDelaney Nah, DJ ConfigMap applied for a slot on the schedule, next thing you knew the whole thing came crashing down out of nowhere.