In 2023, it was said:
Currently, the technology for 4-bit training does not exist, but research looks promising and I expect the first high-performance FP4 Large Language Model (LLM) with competitive predictive performance to be trained in 1-2 years time.
(see: https://timdettmers.com/2023/01/16/which-gpu-for-deep-learning/)
Currently, the technology for 4-bit training does exist, but research doesn't look promising, and I expect a high-performance FP4 Large Language Model (LLM) with competitive predictive performance may never be trained. Will one be trained before 2028?
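For context on why this is hard, here is a minimal sketch of what "4-bit" means numerically, assuming the common E2M1 FP4 layout (1 sign, 2 exponent, 1 mantissa bit, as in the OCP Microscaling spec); the helper name and scale parameter are illustrative, not any model's actual training code:

```python
# The full set of magnitudes representable in FP4 E2M1. Training in FP4
# means the matrix multiplies see weights/activations rounded onto this
# grid of only 15 distinct values, which is why it is so lossy.
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_GRID = sorted({s * v for v in FP4_E2M1 for s in (1.0, -1.0)})

def quantize_fp4(x: float, scale: float = 1.0) -> float:
    """Round x to the nearest representable FP4 value (round-to-nearest)."""
    scaled = x / scale
    best = min(FP4_GRID, key=lambda v: abs(v - scaled))
    return best * scale

print(quantize_fp4(0.7))   # 0.5
print(quantize_fp4(2.6))   # 3.0 -- note the coarse spacing at larger magnitudes
```

Everything outside the grid's range, or between its widely spaced points, collapses to a neighbor; real FP4 training schemes spend most of their effort choosing per-block scales so this rounding error stays tolerable.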
Granted, the model must be open-source, or have public information about its training, to be considered, as we would otherwise not know how it was trained.
Frontier-level performance is a vibe-based estimate that the model is among the top 5 most capable models that have been publicly announced.
See the market this was duplicated from: