MANIFOLD
What will the Anthropic SAE paper contain?
resolved May 22
Resolved YES: Eye-test experiments
Resolved YES: Some cherry-picked proof of concept for a useful *type* of task
Resolved YES: Streetlight edits
Resolved NO: Doing PEFT by training sparse weights and biases for SAE embeddings in a way that beats baselines like LoRA
Resolved NO: Passive scoping
Resolved NO: Finding and manually fixing a harmful behavior that WAS represented in the SAE training data
Resolved NO: Using an SAE as a zero-shot anomaly detector
Resolved NO: Latent adversarial training under perturbations to an SAE's embeddings
Resolved NO: Experiments to do arbitrary manual model edits
Resolved NO: Finding and manually fixing a novel bug in the model that WASN'T represented in the SAE training data

This will resolve according to Stephen Casper's judgments.

Casper's judgements are out:

Thus, my assessment is that Anthropic did 1-3 but not 4-10.

That is, YES eye-test experiments, YES streetlight edits, YES cherry-picked proof of concept, NO everything else.