I used "ChatGPT" as a blanket term for all of OpenAI's models available through the UI (GPT-3.5, GPT-4, DALL-E 3, Code Interpreter, bots, etc.) because of the character limit on market titles.
By "a single chat," I mean a conversation with an average query size, an average response size, and an average number of back-and-forth turns.
I will award a fair bit of bounty to anyone who can walk me through the calculations and/or point me to reference material so I can do them myself. The bounty amount (currently 80Ṁ) is just a placeholder; it can be up to 10x more.
Discussion of related topics is also encouraged. Bounties go to everyone who helps answer the question.
SemiAnalysis estimates that each ChatGPT query costs 0.36 cents and each Google query costs 1.06 cents. I'm not sure how that squares with the claim that an LLM query is 10x as expensive as a traditional search query.
[Google]
Google’s Services business unit has an operating margin of 34.15%. If we allocate the COGS/operating expense per query, you arrive at the cost of 1.06 cents per search query, generating 1.61 cents of revenue. This means that a search query with an LLM has to cost significantly less than 0.5 cents per query, or the search business would become tremendously unprofitable for Google.
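For anyone who wants to check the Google arithmetic, here is a minimal sketch. It assumes the 1.06-cent cost can be backed out from the quoted revenue per query and operating margin; the variable names are mine, not SemiAnalysis's, and their actual allocation method may differ.

```python
# Figures quoted from SemiAnalysis above.
revenue_per_query_cents = 1.61
operating_margin = 0.3415  # 34.15%

# Assumption: cost per query = revenue * (1 - operating margin).
cost_per_query_cents = revenue_per_query_cents * (1 - operating_margin)

# Whatever is left over is the operating profit per query.
profit_per_query_cents = revenue_per_query_cents - cost_per_query_cents

print(f"cost per query:   {cost_per_query_cents:.2f} cents")
print(f"profit per query: {profit_per_query_cents:.2f} cents")
```

Under that assumption the cost comes out at about 1.06 cents, and the profit left per query is about 0.55 cents, which appears to be where the roughly 0.5-cent ceiling for an added LLM cost comes from.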
[ChatGPT]
We built a cost model indicating that ChatGPT costs $694,444 per day to operate in compute hardware costs. OpenAI requires ~3,617 HGX A100 servers (28,936 GPUs) to serve ChatGPT. We estimate the cost per query to be 0.36 cents.
Our model is built from the ground up on a per-inference basis, but it lines up with Sam Altman’s tweet and an interview he did recently. We assume that OpenAI used a GPT-3 dense model architecture with a size of 175 billion parameters, hidden dimension of 16k, sequence length of 4k, average tokens per response of 2k, 15 responses per user, 13 million daily active users, FLOPS utilization rates 2x higher than FasterTransformer at <2000ms latency, int8 quantization, 50% hardware utilization rates due to purely idle time, and $1 cost per GPU hour.
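The headline ChatGPT numbers can be reproduced from the quoted assumptions with a few lines of arithmetic. This is only a sketch of the top-level bookkeeping, not SemiAnalysis's per-inference model. I am assuming that "15 responses per user" means per day (since it is paired with daily active users) and that each HGX A100 server holds 8 GPUs (which is what 28,936 / 3,617 implies).

```python
import math

# Figures quoted from SemiAnalysis above.
daily_active_users = 13_000_000
responses_per_user = 15           # assumption: per user per day
daily_compute_cost_usd = 694_444
cost_per_gpu_hour_usd = 1.0
gpus_per_server = 8               # assumption: 8-GPU HGX A100 servers

# Total queries served per day.
queries_per_day = daily_active_users * responses_per_user

# Daily hardware cost spread over those queries, in cents.
cost_per_query_cents = daily_compute_cost_usd / queries_per_day * 100

# GPUs that would have to be rented around the clock to cost that much per day.
gpus_implied = daily_compute_cost_usd / (cost_per_gpu_hour_usd * 24)
servers_implied = math.ceil(gpus_implied / gpus_per_server)

print(f"queries per day: {queries_per_day:,}")
print(f"cost per query:  {cost_per_query_cents:.2f} cents")
print(f"servers implied: {servers_implied:,} ({servers_implied * gpus_per_server:,} GPUs)")
```

With these inputs the sketch gives 195 million queries per day, about 0.36 cents per query, and 3,617 servers (28,936 GPUs), matching the quote. Note that the 0.36-cent figure covers compute hardware only, while the 1.06-cent Google figure allocates all COGS and operating expenses, so the two may not be directly comparable.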
Not quite what you're looking for but may be helpful