I have been seeing many reports/rumors that Apple might be working on local, on-device LLMs.
The LLM feature can be a combination of on-device and cloud-based processing, but must include some on-device functionality.
ps: i will participate in this market because it is very objective - in case some subjective interpretation is required I will delegate to one of the moderators
@horace I don't think so. LLMs are a recent development in AI, and Siri's responses, like Alexa's and Google Assistant's, come from a conventional neural-network pipeline. All the iPhone 16 leaks and rumours talk about Apple developing an LLM for an improved Siri, which suggests they are not using an LLM for the current version of Siri. When Siri answers, it doesn't respond differently each time like an LLM would; it chooses between different semi-predefined answers and uses one of them.
@AxelRthingCano i think you misunderstood @horace's question - @horace was not implying that Siri is already using LLMs
@Soli will bet no then. can't imagine that apple would risk rolling out anything <7B, and a 7B model would tank battery life. i highly doubt that the hardware advancements apple has made are enough to run a model that size on mobile
@ashly_webb makes sense - let’s see ☺️ - right now the 7B model runs super super fast on my macbook and has decent quality so I don’t think they are that far off but I agree it is ambitious
@Soli a macbook is very different from an iphone imo. the current bottlenecks for the iphone are power and ram. i don't see the value prop for apple in doing on-device llm instead of just using a cloud solution. i'm fairly sure apple is going to rework siri with a transformer though
@Soli Siri runs in the cloud for basically anything non-trivial. Why wouldn't they do the same with the chatbot?
@PlainBG because they can provide offline support, save cloud costs and give the users more privacy all at the same time ;)
@Soli imo the cost of inference is basically negligible at this point. even if a user were inferring ~10,000 tokens per day, it only works out to a few dollars per user per year (using mixtral pricing as a reference)
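For reference, the per-user cost arithmetic can be sketched like this. The $0.50 per million tokens price is an assumption roughly in the ballpark of Mixtral-class API pricing, not a figure from this thread, and the usage level is likewise assumed:

```python
# Back-of-envelope cloud inference cost per user per year.
# ASSUMPTIONS (not from the thread): Mixtral-class API pricing of
# roughly $0.50 per million tokens, and ~10,000 tokens/day of usage.
PRICE_PER_MILLION_TOKENS = 0.50  # USD, assumed

def yearly_cost_usd(tokens_per_day: float,
                    price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """USD cost of serving one user's inference tokens for a year."""
    tokens_per_year = tokens_per_day * 365
    return tokens_per_year / 1_000_000 * price_per_million

# ~10,000 tokens/day -> ~3.65M tokens/year -> roughly $1.8/user/year,
# i.e. "a few dollars per user per year" territory.
print(round(yearly_cost_usd(10_000), 2))
```

The conclusion is sensitive to the assumed price: a frontier-model API rate an order of magnitude higher would put the same usage at $15-20 per user per year.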
@ashly_webb I mean it is apple 😅
1) they have a history of saving every cent they can
2) on their scale this is not a trivial cost
3) offline support + privacy don’t work with cloud inference
Apple removed chargers and wired EarPods from iPhone boxes starting with the iPhone 12 series in 2020, which was presented as an environmental decision. This allowed Apple to reduce the packaging size significantly, save on shipping costs by fitting more units on transport planes, and save nearly $6.5 billion. Apple now sells a separate 20W USB-C power adapter for $19, while iPhone prices have increased, including a so-called "5G tax" for newer models ([Apple saved nearly $6.5 billion by removing charger from iPhone box | iLounge](https://www.ilounge.com/news/apple-charger-savings-iphone-box)).
@Soli cost savings from local inference are probably an order of magnitude lower than those from a smaller shipping box. for most users it's likely under a dollar per year, compared to ~$20 for headphones + charger. the value added by a non-neutered, cloud-hosted gpt-3.5-tier model is much larger than that of a small ~2.7B model like phi-2 running locally, even if the 2.7B could run offline. almost every core siri functionality is online, and if this is a siri replacement, it probably will be online too
@ashly_webb I hope many people share your views because then we have a really interesting market here