The following is an argument for why AI safety organizations should consider my work. If Eliezer is not compelled by this argument, which proposition will he deny?
Will resolve if @EliezerYudkowsky claims to deny any proposition by number in the comment section of this prediction or agrees to review my work.
1. If AI develops the capability to control the environment better than humans, then humanity is doomed.
2. If we continue to scale AI capabilities, then it will eventually be able to control the environment better than humans.
3. 1 and 2 imply that if we continue to scale AI capabilities, then humanity is doomed.
4. We should not be doomed.
5. 3 and 4 imply that we should stop scaling AI.
6. If every person on the planet understood the alignment problem as well as Eliezer Yudkowsky, then we would not scale AI to the point where it can control the environment better than humans.
7. People only understand the things they have learned.
8. People learn the things that they have obvious incentives to learn.
9. 6, 7, and 8 imply that if people have sufficient and obvious incentives to understand the alignment problem, then we would not scale AI to the point where it can control the environment better than humans.
10. It is possible to build a machine that pays individuals for demonstrating they’ve understood something.
11. If individuals can see that they will earn a substantial cash reward for demonstrating they understand something, they will be incentivized to demonstrate they understand it.
12. 10 and 11 imply that it is possible to incentivize people to understand the alignment problem.
13. If a majority of people understood the actual risks posed by scaling AI, then they would vote for representatives that support legislation that prevents the scaling of AI.
14. 9 and 13 imply that if we sufficiently incentivize understanding of the alignment problem, then people would take action to prevent dangerous AI scaling.
15. If your goal is to prevent the scaling of dangerous AI, then you should be working on building mechanisms that incentivize awareness of the issue.
16. Krantz's work is aimed at building a mechanism that incentivizes the demonstration of knowledge.
17. 5, 12, 14, 15 and 16 imply that if your goal is to prevent the scaling of dangerous AI, then you should review the work of Krantz.
18. If AI safety orgs understood there was an effective function that converts capital into public awareness of existential risk from AI, then they would supply that function with capital.
19. 17 and 18 imply that Eliezer Yudkowsky and other safety organizations should review the Krantz system to help prevent doom.
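As a minimal sketch, the purely logical step 3 (that 1 and 2 together imply "if we continue to scale AI capabilities, then humanity is doomed") can be checked as a hypothetical syllogism in Lean 4. The proposition names `S`, `C`, and `D` are labels assumed here for illustration and do not appear in the original argument. The check only confirms that step 3 follows from 1 and 2; it does not address whether 1 or 2 is true.

```lean
-- Hypothetical labels (assumptions, not part of the original list):
--   S : we continue to scale AI capabilities
--   C : AI can control the environment better than humans
--   D : humanity is doomed
-- p1 is proposition 1 (C → D), p2 is proposition 2 (S → C).
theorem step3 (S C D : Prop) (p1 : C → D) (p2 : S → C) : S → D :=
  fun hS => p1 (p2 hS)
```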
@NivlacM I'm glad you liked the format. If Elon looked at my algorithm, we could transform X into a compiled feed of arguments of this sort. Then, we could allow everyone to earn a living by arguing with the world's most comprehensive collection of arguments. A machine that knows all your priors and which goal propositions you aim to prove could recommend the optimally accepted path to the things you don't know and that were deemed beneficial by others. The verification of propositions could then also be used as a demonstrable social contract for the truth that rewards public consent. Individuals could effectively argue against any given person's set of priors. (It would be super helpful if Eliezer earned a living by writing all of his arguments in a database I could access.) It's like building a machine analytic philosopher that pays you to evaluate its arguments.
What's your confidence in the following propositions?
It is possible to build a machine that demonstrates an individual understands something.
If 10 is false, then degrees and credentials are worthless.
@Krantz I suppose it's a matter of how certain you need to be that an individual truly understands the topic. And maybe I'm thinking of belief more than understanding. One can understand all of Yudkowsky's work but still disbelieve that AI is dangerous.
@NivlacM I just want a lot of people to wager on the issue (with money sponsored by individuals who think AI is dangerous) by assigning a confidence to each individual proposition of Eliezer's argument, with the aim of predicting which way consensus will resolve. The same would apply to every other topic people want to gain support for.
13 (and, by extension, 6) seem to me to be obvious candidates for rejection. Aside from the fact that we don't know with certainty that scaling AI will result in existential catastrophe, even if we did know that with certainty, people often simply don't vote rationally (or, perhaps more importantly, altruistically). It would not be at all surprising to me if, even if everyone on Earth knew with certainty that scaling AI would wipe out humanity in 50 years, a majority of people nevertheless voted for a political party that was pro-AI-scaling but also promised to cut taxes/inflation in the interim, for example.
I don't think that's the only (or even the primary) reason. It seems to me that two much more challenging hurdles are hyperbolic discounting and general selfishness. Even if people know that a huge harm is definitely coming at some point in the future, they might nevertheless ignore it in order to focus on avoiding some smaller harm in the interim. And then perhaps more problematic, you have a big percentage of the voting population who simply aren't adequately motivated to avoid the huge future harm because they're old and won't be around to suffer from it.
A couple of gaps that I see right off:
Any incentive ≠ enough incentive
The figure of 2 bitcoins has come up elsewhere. It sounds like a lot, until you compare it to an average salary at OpenAI. No matter how large the incentive, you're up against a change bigger than the industrial revolution; it will have bigger incentives.
It is difficult to get a man to understand something, when his salary depends on his not understanding it.
The Krantz system aims ≠ the system achieves
You skip some steps: what the mechanism is, how it's funded, and evidence that it would be more effective than existing systems. I'm an advocate for off-label uses of prediction markets: insurance contracts, news discovery, etc. This sounds similar, but you don't discuss the difficulties already known in those systems, and you introduce more. You've mentioned elsewhere that your system needs proof-of-humanity; this is an unsolved problem despite many IQ-hours behind it, and it doesn't even make this list! A system that achieves what you're suggesting would require several major research breakthroughs. I'd love to see it happen, but I would not bet on this avenue working out.
1. In '9' I refer to 'sufficient and obvious incentives'. The 2 btc value was an example amount aimed at the general public (primarily individuals who are struggling financially). The actual amount that individuals would be rewarded would be determined by the market. In other words, the more important it is for someone to demonstrate consensus on a particular truth, the more the market will allocate to that particular verification. This value fluctuates as particular truths become more or less controversial.
Let's look at an example. Assume I started a charity that was aimed at raising money to be directed at getting Yann LeCun to consent publicly to the proposition that 'It was dangerous to open source Llama 3.1 405.' Assume we raised 1 million dollars and gave that to Yann, only to be used as a free wager on the prediction 'In 2030, a majority of experts will look back and agree that there was significant justification for believing with greater than 90% confidence that Llama 3.1 405 would not provide the foundation for nefarious actors to produce existentially risky artificially intelligent systems'.
What's important here is not the amounts or the specific propositions I'm using, but the general principle that individuals can invest capital in a way that incentivizes any particular user to engage with and take a public stance on any particular crux issue that they would typically avoid asserting a public position on.
2. Determining whether the Krantz system achieves what it aims to requires review. That's what I'm hoping to get. As for whether it would be more effective than existing systems, I would claim that no comparable system currently exists (that is, a mechanism that allows me to offer a sponsored wager only for the prediction of my choosing). Overall, there are several aspects of this project that will require further development. I don't think that suggests one should write it off. On the contrary, it seems to suggest it could use some funding and collaboration.
Thanks again for the feedback, Rob!
In case anyone has a couple of extra million dollars lying around and wants to send Yann a voucher to wager on this prediction...
https://manifold.markets/Krantz/in-2030-a-majority-of-experts-will?r=S3JhbnR6