"Currently, the technology for 4-bit training does not exists [sic], but research looks promising and I expect the first high performance FP4 Large Language Model (LLM) with competitive predictive performance to be trained in 1-2 years time." (see: https://timdettmers.com/2023/01/16/which-gpu-for-deep-learning/)
Granted, the model must be open source for us to verify this, so the market will resolve based on publicly available information.
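For context on what "4-bit (FP4) training" would mean in practice: each weight is stored in a 4-bit floating-point format, so it can only take one of 16 representable values. Here is a minimal sketch simulating round-to-nearest quantization onto the E2M1 FP4 grid (sign bit, 2 exponent bits, 1 mantissa bit); the value grid is the standard E2M1 set, but the function name and per-tensor scale are illustrative assumptions, not taken from any specific training scheme.

```python
# All 15 distinct values representable in FP4 E2M1
# (sign bit, 2 exponent bits, 1 mantissa bit; +0 and -0 collapse to 0).
FP4_E2M1 = [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5,
            0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(x, scale=1.0):
    """Simulate FP4 storage: round x/scale to the nearest E2M1 value, then rescale.

    `scale` stands in for the per-tensor scaling factor that real low-bit
    schemes use to fit a tensor's dynamic range onto the tiny FP4 grid.
    """
    return scale * min(FP4_E2M1, key=lambda v: abs(v - x / scale))

weights = [0.07, -0.9, 2.4, 5.1]
print([quantize_fp4(w) for w in weights])  # -> [0.0, -1.0, 2.0, 6.0]
```

The coarseness of this grid is the core difficulty the quoted prediction refers to: training (not just inference) has to survive every weight, and possibly activations and gradients, being snapped onto so few levels.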
This seems important @typedfemale
Will this resolve YES if scaling laws suggest a 4-bit model would be competitive if compute-matched to a SOTA 16-bit model?
@NoaNabeshima Yes, the model needs to be better than everything else, but trained in 4-bit (to some extent)
@typedfemale Would finetuning in 4-bit trigger YES? Would 80% of parameters in 4-bit trigger YES?