In 2023, it was said:
"Currently, the technology for 4-bit training does not exist, but research looks promising and I expect the first high performance FP4 Large Language Model (LLM) with competitive predictive performance to be trained in 1-2 years time."
(see: https://timdettmers.com/2023/01/16/which-gpu-for-deep-learning/)
Currently, the technology for 4-bit training does exist, but research doesn't look promising, and I expect a high performance FP4 Large Language Model (LLM) with competitive predictive performance may never be trained. Will one be trained before 2028?
Granted, the model must be open-source, or have public information about its training, for it to be considered; otherwise we would have no way of knowing.
"Frontier-level performance" is a vibe-based estimate that the model is among the top 5 most capable models publicly announced at the time.
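For context on what "FP4" training implies, here is a minimal sketch of the number format involved. The market text does not pin down a specific layout; this assumes the common E2M1 convention (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1), which yields only 15 distinct values between -6 and 6:

```python
# Sketch of FP4 decoding and round-to-nearest quantization.
# Assumption: the E2M1 layout (1 sign, 2 exponent, 1 mantissa bit,
# exponent bias 1); the market itself does not specify a format.

def fp4_e2m1_values():
    """Return every value representable in signed FP4 E2M1."""
    vals = set()
    for sign in (1.0, -1.0):
        for exp in range(4):          # 2 exponent bits
            for man in range(2):      # 1 mantissa bit
                if exp == 0:          # subnormal: no implicit leading 1
                    v = (man / 2) * 2 ** (1 - 1)
                else:                 # normal: implicit leading 1
                    v = (1 + man / 2) * 2 ** (exp - 1)
                vals.add(sign * v)    # +0 and -0 collapse to one value
    return sorted(vals)

def quantize_fp4(x):
    """Round x to the nearest FP4 E2M1 value."""
    return min(fp4_e2m1_values(), key=lambda v: abs(v - x))
```

For example, `quantize_fp4(2.7)` returns `3.0`, and anything above 6 saturates to `6.0`. Training a competitive LLM with weights, activations, or gradients squeezed into this grid is the crux of the question.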
See the market this was duplicated from:
If DeepSeek-r1 had the same performance metrics and had been trained in FP4, would this market resolve YES? What does "frontier-level" actually mean?
I linked it in the description, but it wasn't that obvious, so I'll show it more properly. Good market!