
"Currently, the technology for 4-bit training does not exists, but research looks promising and I expect the first high performance FP4 Large Language Model (LLM) with competitive predictive performance to be trained in 1-2 years time." (see: https://timdettmers.com/2023/01/16/which-gpu-for-deep-learning/)
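For anyone unfamiliar with what FP4 means in that quote: it's a 4-bit floating-point format, most commonly E2M1, which can only represent the values ±{0, 0.5, 1, 1.5, 2, 3, 4, 6}. A minimal illustrative sketch of round-to-nearest quantization onto that grid (just to show how coarse 4-bit is — this is not the actual training recipe, which would also need per-block scaling, stochastic rounding, etc.):

```python
# FP4 (E2M1) representable magnitudes: 1 sign bit, 2 exponent bits, 1 mantissa bit.
FP4_E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_VALUES = sorted({s * v for v in FP4_E2M1_GRID for s in (-1.0, 1.0)})

def quantize_fp4(x: float) -> float:
    """Round x to the nearest representable FP4 (E2M1) value."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))

if __name__ == "__main__":
    for x in [0.3, 1.2, 2.6, 7.5, -0.7]:
        print(f"{x} -> {quantize_fp4(x)}")
```

Only 15 distinct values in the whole format, which is why getting a competitive LLM trained this way would be such a milestone.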
Granted, the model must be open source for us to know, so the market will resolve based on publicly available information.
@Gabrielle @Bayesian we got a lesson about this in Discord, here
My executive summary:
- There's some evidence that FP4 is "around the corner" and may demonstrate some of these qualities
- But it's not enough to qualify for the market's criteria of:
  - Open source
  - Publicly available information
  - As of 21 January 2025
If someone disagrees with the way I'm spinning the summary of the conversation, post here!
Can any AI expert chime in and help resolve this market? According to a ChatGPT prompt I tried, this should resolve NO
This seems important @typedfemale
Will this resolve YES if scaling laws suggest a 4-bit model would be competitive if compute-matched to a SOTA 16-bit model?
@NoaNabeshima Yes, the model needs to be better than everything else while being trained in 4-bit (to some extent)