Will we fund "Identifying good AI governance behaviours"?
Resolved NO (Oct 7)

Will the project "Identifying good AI governance behaviours" receive any funding from the Clearer Thinking Regranting program run by ClearerThinking.org?


Remember, betting in this market is not the only way you can have a shot at winning part of the $13,000 in cash prizes! As explained here, you can also win money by sharing information or arguments that change our mind about which projects to fund or how much to fund them. If you have an argument or public information for or against this project, share it as a comment below. If you have private information or information that has the potential to harm anyone, please send it to clearerthinkingregrants@gmail.com instead.

Below, you can find some selected quotes from the public copy of the application. The text beneath each heading was written by the applicant. Alternatively, you can click here to see the entire public portion of their application.

Why the applicant thinks we should fund this project

This project will identify the behaviours needed to safely navigate the transition to a world with advanced / transformative AI.

The longtermist AI safety and governance community is starting to recommend behaviours that key actors should adopt. Some of these behaviours are explicit (e.g., Avin 2021) and others are implied by research agendas, programs, or theories of "failure" and "victory" (e.g., Christiano 2019; Dafoe 2020).

But there is no unified list of behaviours. There is also no information about which behaviours would be endorsed, critiqued, or rejected by those key actors. And most mainstream AI governance policy and practice is focused on narrow AI, which will be insufficient to address challenges unique to advanced AI.

This project will identify behaviours relevant to increasing the safety of advanced AI. It will convene researchers, users, and policymakers to evaluate and discuss those behaviours. The behaviours will also be compared with a current Australian voluntary framework for ethical use of narrow AI. 

Identifying and discussing these behaviours is crucial for raising awareness, coordinating action, and measuring progress towards or away from safe futures with advanced AI systems. Gaps or weaknesses in the framework revealed by the comparison will be used to generate recommendations for improving governance of advanced AI in Australia. The behaviour list and evaluation process will be published to improve work on AI governance internationally.

Expected outcomes

Outside the Australian context, this research process will support the articulation and measurement of actions focused on the safe design, development and use of advanced AI. This would help researchers in AI governance / safety to conduct research in which these actions are critiqued, compared with national policies, and assessed in different jurisdictions.

In the Australian context (e.g., among practitioners, engineers, and policymakers), the key outcomes will be (1) an increase in knowledge about the issue of safe advanced AI, and (2) identification of, and reflection on, the actions needed to design, develop and use safe advanced AI, grounded in existing policies such as the AI Ethics Framework and the Australian AI Action Plan.

This will be measured in surveys and interviews immediately following the workshops. A more meaningful indicator of impact will be whether the next iteration of AI safety / ethics policy in Australia includes explicit reference to advanced AI systems and the specific actions that can be taken to improve the safe design, development and use of advanced AI.

How much funding are they requesting?

$49,797

Here you can review the entire public portion of the application (which contains a lot more information about the applicant and their project):

https://docs.google.com/document/d/1-ZZlj7AM_ErngBXhcqgw_Es9TjHabaFO4l2kG8V04-w/edit


There is almost no evidence for a "hard take-off", and AI will have performed less computation than humans at least through 2045-2065, possibly later. (That corresponds to 1,000x-1,000,000x more compute per dollar, and more aggregate compute, than exists today.)

To the extent anyone today talks about "AI Ethics/Governance/Safety", it's basically irrelevant.

(Useful for totalitarian control, encoding certain political biases into AI, and the like—but beyond irrelevant to “existential risk.”)

—/

The only paths to "AI Safety" involve limiting how much compute is available (i.e., ending the push for better/faster semiconductors at a certain point, where further benefits are limited and risks are large—not necessarily from "bad AI" but from bad actors leveraging it, at first).

Frankly, no one seems to talk at all about "gain of function research", which today can (probably accidentally) do trillions in damage, and intentionally could do far more.

Presumably any form of utilitarianism not focused on "talking about cool, sci-fi things" would spend 1,000x or more on a known risk that is still recklessly pursued than on a speculative one.

(And the fact that instead there is focus on “pandemic preparedness” further highlights the biases toward “cool/virtuous sounding things” instead of actually useful things.)

(long way of agreeing with Adam/BTE that it probably gets funded and probably shouldn’t)

bought Ṁ100 of YES

I think you'll fund the project, and I think you shouldn't.

predicted YES

@Adam Go on?

predicted YES

@NathanpmYoung I'm disinclined to give an extensive defense of my position here, as I don't think I'll convince you or anyone else. The short version: I think that to the extent AI X-risk is an issue, research into mitigation is effectively pointless, and I also think AI X-risk is not likely to be an issue. I also feel that a large portion of rat-sphere people and EAs disagree with me on this and will happily fund projects associated with it that are much more poorly defined or more graft-adjacent than this one appears to be. The relatively small grant amount, the reasonable and achievable scope, the well-stated intended outcomes, etc., all make this seem like a prime funding target, provided you agree with the following precepts:

1) AI X-risk is important
2) AI X-risk research can meaningfully mitigate AI X-risk
3) Australian AI researchers can contribute meaningfully to AI X-risk research.

I happen to disagree with 1 and 2 and have some doubts about 3 (possibly just related to my feelings on 2, hard to separate them out entirely).

GiveWell estimates that the requested grant amount could save 14.22 lives. I am of the opinion that the proposed actions will generate less value than that.
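(Editorial note, not part of the original comment: as a rough sanity check, the 14.22-lives figure is consistent with a GiveWell-style benchmark of roughly $3,500 per life saved; the exact benchmark is an assumption here.)

```python
# Editorial sanity check (not from the original comment): back out the
# cost-per-life figure implied by the numbers above.
grant_amount = 49_797   # requested funding, USD
lives_saved = 14.22     # GiveWell-derived figure cited in the comment
implied_cost_per_life = grant_amount / lives_saved
print(f"Implied cost per life saved: ${implied_cost_per_life:,.0f}")  # ~$3,502
```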

predicted YES

@Adam There are lots of risks associated with AI, but none of them come close to existential. So I agree with your assessment, but also think it’s likely to get funded because the EA community thinks the burden of proof is on the AI existential risk skeptics for some strange reason I can’t quite figure out.

predicted YES

@Adam After rereading this proposal, I think it is inaccurate to imply that this project is all about AI X-risk, or even primarily about that. Instead, the focus appears to be AI Safety in a more general and pragmatic sense. It is unfortunate that much of the EA community has come to see the terms AI Safety, AI Alignment, and AI X-risk as basically interchangeable when they are, in fact, each very different in the scope of activities and outcomes they encompass. AI Safety incorporates near-term AI risks, which I won't get into in depth here, but suffice it to say these risks are all decidedly NOT existential in nature.

This should be obvious from the performance measure you mentioned: only 14.22 lives saved. The work of this project will find those lives by increasing knowledge of what could go wrong amongst those likely to be directly impacted or, worse, liable. An example could be a patient whose doctor misused a sepsis alert and failed to prevent a deadly infection; with greater knowledge across Australian society about basic safety risks, such an outcome would occur less frequently.

predicted YES

@Adam very happy to lose my bet here!
