
Will future large video model (understanding) use pixel loss or embedding loss?
Pixel loss: 35%
Embedding Loss: 27%
Neither: 38%
Examples of models with pixel loss: MAE, iGPT, LVM
Examples of models with embedding loss: I-JEPA
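The distinction between the two camps above can be sketched in a few lines. This is a hedged toy illustration, not any listed model's actual training code: `toy_encoder` is an invented stand-in for a learned feature extractor, and the "frames" are tiny 1-D vectors.

```python
# Toy contrast between pixel-space loss (MAE / iGPT / LVM style) and
# embedding-space loss (I-JEPA style). All values are illustrative.

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def toy_encoder(pixels):
    """Hypothetical frozen encoder: maps pixels to a 2-dim embedding
    (mean and range here, standing in for learned features)."""
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]

target = [0.1, 0.5, 0.9, 0.5]      # ground-truth frame (flattened)
prediction = [0.2, 0.4, 0.8, 0.6]  # model output

# Pixel loss: compare raw pixels directly.
pixel_loss = mse(prediction, target)

# Embedding loss: compare in the encoder's latent space, so pixel-level
# detail that the encoder discards is not penalized.
embed_loss = mse(toy_encoder(prediction), toy_encoder(target))

print(f"pixel loss:     {pixel_loss:.4f}")
print(f"embedding loss: {embed_loss:.4f}")
```

The point of the contrast: a prediction can be wrong at every pixel yet nearly perfect in embedding space (or vice versa), which is exactly what makes the choice of loss a paradigm-level question.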
If people end up using diffusion models (DDPM) to pretrain large video understanding models, this resolves to Pixel loss.
Will resolve by EOY 2027 by consulting expert and public opinion. Among all factors that decide the resolution, the paradigm used by the SOTA video understanding model will be most indicative.
Discrete cross-entropy loss (transformer + VQ-VAE) will resolve to Neither.
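The "Neither" case above trains on discrete tokens rather than pixels or continuous embeddings: a VQ-VAE quantizes each patch to a codebook id, and the transformer predicts a distribution over those ids. A minimal sketch, with an invented codebook size and made-up probabilities:

```python
# Toy cross-entropy over discrete VQ-VAE codebook tokens, the
# transformer + VQ-VAE setup the criteria file under "Neither".
import math

def cross_entropy(probs, target_token):
    """Negative log-likelihood of the ground-truth token index."""
    return -math.log(probs[target_token])

K = 4                                    # hypothetical codebook size
predicted_probs = [0.1, 0.6, 0.2, 0.1]   # softmax output over K tokens
target_token = 1                         # id the VQ-VAE assigned

loss = cross_entropy(predicted_probs, target_token)
print(f"token cross-entropy: {loss:.4f}")
```

This loss lives in neither pixel space nor a continuous embedding space, which is why the criteria treat it as its own category.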
This question is managed and resolved by Manifold.
Related questions
Will video dominate 2024 machine learning?
16% chance
Will Sparse Autoencoders be successfully used on a downstream task in the next year and beat baselines?
76% chance
Will OpenAI's next major LLM release support video input?
48% chance
Will OpenAI release next-generation models with varying capabilities and sizes?
64% chance
Will video generation AI make more product revenue than text models in 2025?
21% chance
Will AI figure out who has the highest visual frame compression efficiency to working memory ratio by 2030?
50% chance
Will we learn by EOY 2024 that large AI labs use something like activation addition on their best models?
23% chance