How many non-Transformer based models will be in the top 10 on HuggingFace Leaderboard in the 7B range by July?
Standard · 16 · Ṁ769 · Jul 2

0: 39%
1-2: 32%
3-5: 24%
6+: 4%
For resolution I’ll go to the HuggingFace leaderboard, select the ~7B size filter, and uncheck everything else.

I’ll refrain from participating in the market to stay neutral in case a hybrid architecture comes up. I’d count both Mamba and StripedHyena as non-Transformers.


Does sliding window count as non-attention?

@HanchiSun I’d say sliding window is a type of attention. I’d consider Longformer a type of Transformer.
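A minimal sketch of why sliding window still counts as attention (my own illustration, not from this thread): it computes the usual softmax(QKᵀ)V, just with a mask restricting each token to a fixed window of recent positions, so it is a masked variant of standard attention rather than a different architecture.

```python
import numpy as np

def sliding_window_mask(seq_len, window):
    """Causal mask: token i may attend to positions [i-window+1, i]."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

def attention(q, k, v, mask):
    """Standard scaled dot-product attention with a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)  # masked positions get zero weight
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(6, 4))
out = attention(q, k, v, sliding_window_mask(6, window=3))
print(out.shape)  # (6, 4)
```

Swapping the mask for a full lower-triangular one recovers ordinary causal attention; only the mask changes, which is the sense in which sliding window is "a type of attention."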

@HanchiSun Out of curiosity, would you bet differently if it was for the 3B category rather than 7B?

@KLiamSmith Good question. It is definitely harder to experiment at 7B than at 3B, but even at 3B I doubt more than 2 non-attention architectures will do better.