Will future large video understanding models use pixel loss or embedding loss?

MANIFOLD


Pixel loss: 35%

Embedding Loss: 27%

Neither: 38%

Examples of models with pixel loss:

MAE
iGPT
LVM

Examples of models with embedding loss:

I-JEPA
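
To make the distinction concrete, here is a minimal sketch in plain Python. It is not taken from any of the models listed above: the helper names (`mse`, `toy_encoder`, `pixel_loss`, `embedding_loss`) and the window-averaging encoder are assumptions made purely for illustration. A pixel loss measures error directly on raw pixel values, while an embedding loss measures error against the target after an encoder has mapped it into a representation space.

```python
from typing import List

Vector = List[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_encoder(pixels: Vector, window: int = 2) -> Vector:
    """Hypothetical stand-in for a target encoder: average non-overlapping windows."""
    return [
        sum(pixels[i:i + window]) / len(pixels[i:i + window])
        for i in range(0, len(pixels), window)
    ]


def pixel_loss(predicted_pixels: Vector, target_pixels: Vector) -> float:
    """Pixel loss: error is measured directly on raw pixel values."""
    return mse(predicted_pixels, target_pixels)


def embedding_loss(predicted_embedding: Vector, target_pixels: Vector) -> float:
    """Embedding loss: error is measured against the encoded target, not the pixels."""
    return mse(predicted_embedding, toy_encoder(target_pixels))


target = [0.0, 1.0, 0.0, 1.0]
# Wrong at every pixel, yet its windowed averages match the target exactly.
swapped = [1.0, 0.0, 1.0, 0.0]

print(pixel_loss(swapped, target))                   # 1.0
print(embedding_loss(toy_encoder(swapped), target))  # 0.0
```

The toy case shows why the two objectives can disagree: a prediction that is wrong at every pixel can still match the target perfectly in a coarser representation space.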

If people end up using diffusion models (DDPM) to pretrain large video understanding models, this resolves to Pixel loss.

Will resolve at EOY 2027 by consulting expert and public opinion. Among all the factors that decide the resolution, the paradigm used by the SOTA video understanding model will be the most indicative.

Discrete cross-entropy loss (transformer + VQ-VAE) will resolve to Neither.
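
The resolution rules stated in this description can be summarized as a small lookup. This is only a sketch: the objective labels and the `resolve` function are hypothetical names, and returning `None` for anything else is an assumption that stands in for the expert and public opinion the market will actually rely on.

```python
from typing import Optional

# Mapping from pretraining objective to market outcome, as stated in the description.
# The keys are hypothetical labels chosen for this example.
RESOLUTION_BY_OBJECTIVE = {
    "pixel_reconstruction": "Pixel loss",
    "diffusion": "Pixel loss",               # DDPM pretraining resolves to Pixel loss
    "embedding_prediction": "Embedding Loss",
    "discrete_cross_entropy": "Neither",     # transformer + VQ-VAE
}


def resolve(objective: str) -> Optional[str]:
    """Return the stated outcome, or None when the description gives no rule."""
    return RESOLUTION_BY_OBJECTIVE.get(objective)


print(resolve("diffusion"))               # Pixel loss
print(resolve("discrete_cross_entropy"))  # Neither
print(resolve("hybrid_pipeline"))         # None
```

Objectives that mix several paradigms fall outside these rules, which is why the sketch returns `None` for them rather than guessing.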


1 Comment


I thought pixel loss would increase with DDPM.

Here is the ambiguous part: what if someone uses V-JEPA as the encoder, diffusion as the backbone, and something else as the decoder?
