Will this project in mechanistic interpretability make me happy by the end of 2024?

By the end of 2024 will I think that pursuing the following research project was a good idea?

The project: take an equivariant graph neural network, or a similar architecture designed for learning to solve physics problems; train it to predict some non-trivial simulated dynamics (e.g. the gravitational three-body problem); then apply mechanistic interpretability (mechinterp) techniques to find out exactly what the deep learning model has learned.
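The first ingredient of such a project is a source of training trajectories. As a toy illustration (not part of the actual project plan), a minimal three-body simulator can be written in pure Python with a leapfrog integrator; all constants here (G = 1, equal masses, softening `eps`) are illustrative choices, not taken from the question above:

```python
import math

def accelerations(pos, masses, G=1.0, eps=1e-3):
    """Pairwise gravitational accelerations in 2D, with softening eps."""
    n = len(pos)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r2 = dx * dx + dy * dy + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
    return acc

def simulate(pos, vel, masses, dt=1e-3, steps=1000):
    """Leapfrog (velocity Verlet) integration; returns a list of
    position snapshots, one per step plus the initial state."""
    traj = [[p[:] for p in pos]]
    acc = accelerations(pos, masses)
    for _ in range(steps):
        for i in range(len(pos)):
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
            pos[i][0] += dt * vel[i][0]
            pos[i][1] += dt * vel[i][1]
        acc = accelerations(pos, masses)
        for i in range(len(pos)):
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
        traj.append([p[:] for p in pos])
    return traj
```

Snapshots from `simulate` would then serve as (input, next-state) training pairs for the network; the equivariant architecture and the mechinterp analysis are, of course, the hard part.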

I will be happy about this if the project succeeds to a degree that I feel comfortable submitting a paper to a physics or astrophysics journal, or an extended abstract to some ML conference (including workshops). In that case I will resolve YES. If I do not pursue the project at all (e.g. because of leaving academia), I will resolve N/A. If I instead fail to obtain sufficiently interesting results despite spending some effort on this, I will resolve NO.

This is subjective so I will not bet. I expect some details of the project to change along the way (if we knew what we were doing it wouldn’t be called research).
