This is a solution to alignment.
40 · Ṁ28k · Jan 1 · 33% chance

A game-theoretic solution to alignment would be to create a function that rewards 2 Bitcoin for demonstrating a proper understanding of the "AI situation" (a sort of Eliezer litmus test for whether a person has a firm grasp of the risks posed by AI and what they can do to stop it) to a humanity-verified public ledger: an "intellectual consent" blockchain, a crypto credential designed to record what different people consent to being true, so that AI can later be trained on that data.
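To make the mechanism concrete, here is a toy sketch of the proposed payout rule, assuming (hypothetically) a registry of verified humans and a boolean result from the litmus test; none of these names come from an existing system.

```python
from dataclasses import dataclass, field

REWARD_BTC = 2.0  # prize per verified person, per the proposal


@dataclass
class ConsentLedger:
    """Toy model of the proposed humanity-verified public ledger.

    Every name here is a hypothetical illustration, not a real API.
    """
    verified_humans: set[str] = field(default_factory=set)
    paid: set[str] = field(default_factory=set)

    def claim_reward(self, person_id: str, passed_litmus_test: bool) -> float:
        # Pay only verified humans, only once each, and only if they
        # demonstrated understanding of the "AI situation".
        if person_id not in self.verified_humans:
            return 0.0
        if person_id in self.paid or not passed_litmus_test:
            return 0.0
        self.paid.add(person_id)
        return REWARD_BTC
```

The `paid` set is what blocks the duplicate-claim problem raised further down the thread: a verified identity can collect the reward at most once.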

This function can be built.

Building it would also transform the way the world communicates and turn the economy into one that primarily rewards education, in the sense that people can pay other people to learn things. This will scale to replace all advertising, journalism, media, peer review, educational institutions, government and all communication platforms. It will create a true meritocracy and ensure true free speech (the ability to put an idea into the public domain for consideration), with plenty of resources for everyone on the planet to live an amazing life.

Then.  After everyone is rich and gets along.  We use that data to train AI instead.

Will resolve yes given Eliezer's consent.

Will resolve no given my consent.

I pledge on Kant’s name to try as hard as I can to consent.

If someone can supply the Bitcoin, I'll build this.

If you think that's crazy, please explain why or bet against me.

Thanks 🤓


> everyone is rich and gets along. We use that data to train AI instead.

Your solution does not address various ways in which AI would ruin the world.

  • misalignment prevention: the goal is not to build an aligned AI, it is to make sure no one ever builds a misaligned, powerful AI. What do you do with your AI that is trained on near-utopian data?

  • deadline: building cryptotopia is a multi-decade project, before which misaligned AIs have probably already been deployed.

  • out-of-distribution (OOD) misbehavior: once society is being influenced by your AI, it will start to look vastly different from what the AI's training data looked like. Our current systems are known to be untrustworthy when dealing with OOD data and ensuring OOD robustness is an open problem you don't address.

  • reward hacking: AIs don't look at problems the way humans do and might generalize from the data differently than we would intuitively expect.

Me pre-empting some rebuttals:

You didn't address the crypto part.

Yes, the above problems persist even if the crypto scheme improves society.

The only thing you try to solve for is getting high-quality data, but the alignment problem is far more than just the bad-training-data problem (which I don't claim you solve either).

You speak of preventing AGI-induced catastrophe, I'm just solving the alignment problem.

That's what the alignment problem is.

I can address your points by [raising something new].

I'll hear you out; please resolve your market to NO, though. Or, failing that, edit the market description to say that the "this" in "this is a solution to alignment" does not refer to the market description.

You ought to invest more in my idea by reading/watching 60+ minutes of [recommended stuff].

I plausibly spent more time responding to your market than you did on making it. Read Yudkowsky's AGI Ruin post, since that addresses a variety of ways in which alignment plans like yours tend to fail before you ask me to spend more time.

Nonetheless, thanks for at least trying and good luck.

I'd love a market like "will anyone think this market resolved fairly?" that resolves by poll or something.

What’s stopping someone from claiming the “I understand” prize multiple times?

Great question. Humanity verification is critical.

Users creating multiple accounts to earn extra profit plays a large role in why my work is potentially infohazardous.

As for the same 'humanity verified user' making duplicate claims, that's not really how the program works.

It's similar to asking what prevents a given user from 'liking' a comment on X multiple times.

The basic design of the system?

Biometric verification? Our devices already have the ability.

I'm talking about humanity verification on a decentralized ledger.

Something like worldcoin.

Even if everyone understood the "AI situation" how does that solve alignment? I think it might even increase doom probability because more people would become interested in AI.


It's the 'humanity grows up' possible future.

Not claiming it's guaranteed, just that creating this program drastically increases the odds.

Could work. $40 would be a good start to make everyone just watch a short video and demonstrate understanding. I think a lot of relevant information could be packed into one super high quality video. The idea would then obviously be that enough people will make governments slow down or stop AI capabilities research such that sufficient safeguards can be put in place.

I'm not sure whether there aren't less expensive solutions, however.

Economists can write hundreds of papers confirming huge negative externalities to humanity and society from certain things, and more than 51% of people will still vote, not just against Pigouvian taxes, but for subsidies of those things.


Right, that's because we haven't paid them to consent, in a legally binding way, to particular propositions within those writings being true. In a way where representatives can use that data to legislate on their behalf.

If we had that, these situations would not occur.

That's literally how Pigouvian taxes often work (with rebates from the tax), and it still occurs.


The distinction you are not recognizing is the set of people that get to choose which (Pigouvian taxes / incentive mechanisms) to create.

Sure, once you have control of the government, the media and everyone's attention, you could pick some safe topics for individuals to earn some credit for exploring.

What I'm talking about is the ability for any person on the planet to offer monetary incentives to any set of individuals they select to look at whatever information they choose without needing any governing body to oversee it.

The power lies not in the revenue gained from learning the curriculum.

The power lies in being able to define the curriculum.

There's nothing stopping anyone today from paying people to listen to whatever they have to say. Haven't you heard of time-share scams?

@Krantz what keeps this from settling into the equilibrium where e.g. coca-cola pays many times more for young people's eyeball time than Khan Academy's entire operating budget?


Thank you for the legitimate question.

It isn't about 'eyeball time' it's about acknowledging steps of inference.

If Verizon is aiming to 'teach' its consumers that their new phone is waterproof, that requires individuals to understand that their phone is waterproof. That is Verizon's objective. Currently, they need to blast a ton of advertisements because they do not know which consumers actually saw or paid attention to their ad.

This is different with Khan Academy. There are vastly more points of verifiable information in Khan Academy's 'advertising campaign'.

If I were to ask both Verizon and Khan Academy, "What does the constitution of facts that you would like to get into people's heads look like?", although Verizon's budget for investment will be significantly higher, their constitution will not be nearly as large or profitable in the machine I'm aiming to build.

What it boils down to is 'How much is Verizon willing to pay you to acknowledge the fact that you understand their phone is waterproof?' (This takes a couple of seconds) vs 'How much is Khan Academy (though I'm sure your neighbors will want to contribute) willing to pay you to acknowledge all the facts required to demonstrate you have a good general education?'.
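The comparison above can be sketched as a back-of-the-envelope calculation: a campaign's total value per person is the number of facts to acknowledge times the payment per acknowledgment. All numbers below are invented purely for illustration.

```python
# Hypothetical "constitutions of facts" for the two campaigns.
# Per-fact payments and fact counts are made-up illustrative values.
verizon = {"facts": 1, "pay_per_fact": 0.50}     # "this phone is waterproof"
khan_academy = {"facts": 10_000, "pay_per_fact": 0.02}  # a whole curriculum


def campaign_value_per_person(campaign: dict) -> float:
    """Total payout per person = facts acknowledged x price per fact."""
    return campaign["facts"] * campaign["pay_per_fact"]
```

Under these made-up numbers, Verizon's one-fact campaign is worth $0.50 per person while Khan Academy's is worth $200.00, which is the shape of the argument being made: a large constitution of cheap facts can outweigh a small constitution of expensive ones.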

Isn't this like a terrible conflict of interest because the creator is also the person whose idea it's about, and the resolution criteria are very ambiguous, and that person (the creator / subject) also has a bunch of YES bets?

I agree. I'd much rather defer resolution to @EliezerYudkowsky.

Too bad that isn't an option on the platform.

He's too busy to listen to me though.

Maybe bet against my proposal on his prediction instead?

We create a 'truth economy'.

https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence-539844cd3ba1?r=S3JhbnR6

I mean... you could just say in the market description that you'll ask him and resolve it based on what he says

I've been trying to get his attention (or anyone with 1/3 of his domain knowledge) for many years to charitably look at my work.

If I had that already, I wouldn't need to post predictions like this.

https://manifold.markets/Krantz/if-eliezer-charitably-reviewed-my-w?r=S3JhbnR6

Pick some powerful entity (eg the Chinese government, Microsoft, or the Catholic church)

Do you think your solution could align that entity to humanity, or at least to its own citizens/customers/followers? If not, why would ASI be different? If so, how would that go, and how could one bootstrap the process?


Yes, I believe they can be aligned. You align a powerful entity like that by aligning the individual members that make up the group to perform the game theoretic actions that cause alignment.

For example, there might be an event 'E' that a corrupt government wants to occur while the vast majority of the citizenry do not want it to occur (this could be a particular bill, issue or task, like sealing a physical border). One way to ensure alignment here is to achieve a verifiable public record that the following individuals agree with the following propositions.

Vast majority of citizens:

#1 - I understand how Krantz works.

#467952 - Event 'E' is significant and requires immediate action from the government.

#3497829 - This proposition is intended to provide a record of my intention to support the specific legislation (insert bill here that defines the action to be taken on 'E'; this could also be another proposition in the ledger).

Official congressmen:

#34875 - A majority of the citizens you represent support #3497829.

#68723 - As a congressman, you have pledged to uphold the verifiable requests of a majority of the members of your district.

I could continue, but hopefully this is enough to get the point.
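The proposition chain above can be sketched in a few lines: record which verified citizens have consented to a proposition, then check whether a representative's pledge condition is triggered. The proposition numbers, names and majority rule are illustrative assumptions, not a specification.

```python
# Toy sketch of the proposition ledger described above.
# All identifiers and the majority threshold are hypothetical.
consents: dict[int, set[str]] = {
    3497829: {"alice", "bob", "carol"},  # citizens who signed proposition #3497829
}

district: set[str] = {"alice", "bob", "carol", "dave"}  # one congressman's constituents


def majority_supports(prop_id: int) -> bool:
    """Proposition #34875: a majority of the district supports prop_id."""
    signed = consents.get(prop_id, set()) & district
    return len(signed) > len(district) / 2
```

Here 3 of 4 constituents signed #3497829, so the pledge in #68723 would bind the representative to act on it; for an unsigned proposition the check fails.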

I'm talking about a fundamental change in the way we communicate in the public square.

It's a change that I think will accelerate 'communication about complicated ideas over the internet' so fast that it will eliminate our current understanding of traditional media, education, government, companies and essentially every other mechanism whose primary job, in the end, is to control the sharing of ideas.

That is the primary function of government. We give them tax dollars, they figure out the good stuff that everybody wants done, figure out who is qualified to do those things and then give the money to those people. I think we can do all that on the internet. Surely this is where some people see AI and blockchain headed.

How that transition happens, is a much longer conversation.

'The thing that's in charge' is a complicated thing now. It used to be people. Kings, presidents.

Now its ideas, its technology, its infrastructure.

If (Bitcoin) wanted to make Donald Trump say the words 'Skibidi toilet', it could.

That's interesting to me.

The only thing actually required to bootstrap the process is to get the idea to the right people, privately. That's nearly impossible. It's complicated and requires knowledge from several domains. I'd put my odds at around 20%.

Why do you need the blockchain? I'm not that familiar with crypto, but this seems like just a test with a money prize.

Read this book, maybe:

https://thenetworkstate.com/

Balaji's book is... fundamentally unserious, imma be honest.