Will the US government require AI labs to run safety/alignment evals by 2025?
38% chance

Resolves positive if, by the end of 2025, major US-based AI companies developing large models are legally required to run evaluations of whether their models have dangerous capabilities and to verify that the models meet certain safety or alignment standards. The evals could be similar in spirit to the Alignment Research Center evals.

predicts YES

I think this question should resolve YES under the AI Executive Order once NIST develops red-teaming standards, because of the following language:

4.2. Ensuring Safe and Reliable AI. (a) Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:

(C) the results of any developed dual-use foundation model’s performance in relevant AI red-team testing based on guidance developed by NIST pursuant to subsection 4.1(a)(ii) of this section, and a description of any associated measures the company has taken to meet safety objectives, such as mitigations to improve performance on these red-team tests and strengthen overall model security.  Prior to the development of guidance on red-team testing standards by NIST pursuant to subsection 4.1(a)(ii) of this section, this description shall include the results of any red-team testing that the company has conducted relating to lowering the barrier to entry for the development, acquisition, and use of biological weapons by non-state actors; the discovery of software vulnerabilities and development of associated exploits; the use of software or tools to influence real or virtual events; the possibility for self-replication or propagation; and associated measures to meet safety objectives;

The requirements should apply to models trained at a scale of ~5× the training compute of GPT-4:

     (b)  The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section.  Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:

          (i)   any model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23 integer or floating-point operations; and

          (ii)  any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10^20 integer or floating-point operations per second for training AI.
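To make the interim thresholds in 4.2(b) concrete, here is a minimal Python sketch of the coverage check as I read the quoted text. The function and constant names are my own (hypothetical), and the comparison operators are my interpretation of "greater than", "over", and "having a ... capacity of"; none of this is official guidance.

```python
# Interim thresholds as quoted from subsection 4.2(b) of the order.
# All names below are illustrative assumptions, not from any official source.
GENERAL_MODEL_OPS = 1e26      # total training operations, any model
BIO_MODEL_OPS = 1e23          # total training operations, primarily biological sequence data
CLUSTER_OPS_PER_SEC = 1e20    # theoretical maximum capacity of a cluster
CLUSTER_NETWORK_GBITS = 100   # data center networking speed in Gbit/s


def model_is_covered(training_ops: float, primarily_bio_data: bool = False) -> bool:
    """Return True if a model exceeds the interim compute thresholds in 4.2(b)(i)."""
    if training_ops > GENERAL_MODEL_OPS:
        return True
    # The lower threshold applies only to models trained primarily on biological sequence data.
    return primarily_bio_data and training_ops > BIO_MODEL_OPS


def cluster_is_covered(peak_ops_per_sec: float, network_gbits: float,
                       single_datacenter: bool) -> bool:
    """Return True if a cluster meets all three interim conditions in 4.2(b)(ii)."""
    return (
        single_datacenter
        and network_gbits > CLUSTER_NETWORK_GBITS      # "over 100 Gbit/s"
        and peak_ops_per_sec >= CLUSTER_OPS_PER_SEC    # assumed to mean "at least 10^20"
    )


if __name__ == "__main__":
    print(model_is_covered(2e26))                           # above the general threshold
    print(model_is_covered(5e25, primarily_bio_data=True))  # above the biological threshold only
    print(cluster_is_covered(1e20, 400, True))
```

Taking the ~5× figure above at face value, the implied GPT-4 training compute would be about 10^26 / 5 = 2 × 10^25 operations. That is an estimate from this thread, not a number that appears in the order.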

@causal_agency I read this as reporting is required, but in theory they could report 'we haven't done any evals 😇'?

predicts YES

Yeah, it doesn't seem that evals are required yet. I think we will have to wait for the language from NIST.

predicts NO

Depends on what "require" means here. By 2025 seems too soon for political groups to build a coalition to get legislation passed at the national level. Bodies that control funding, set discretionary policy, or shape regulation could exert influence, but "major US-based AI companies" seems like a difficult target group to influence in this way. If there are any existing bodies that might be able to claim broad authority over how products are built/released when they include more than some threshold amount of matrix multiplication, I haven't heard from them yet.