“AGI safety effort” can be demonstrated by either:
1. At least three research papers or blog posts within one year that discuss catastrophic risks specific to human-level or superhuman AI (that is, risks of harm far worse than any caused by AI systems to date, affecting more than just the company and its immediate users).
   - Papers or posts must be clearly affiliated with the institution.
   - arXiv papers with a first or last author who claims a company affiliation count, as long as there's no evidence that the company wants to distance itself from the papers.
   - Citing representative work by authors such as Nick Bostrom, Ajeya Cotra, Paul Christiano, Rohin Shah, Richard Ngo, or Eliezer Yudkowsky as part of describing the primary motivation for a project will typically suffice for the above.
2. One blog post, paper, tweet, or similar that clearly announces a new team focused on the issues described above, presented in a way that implies that the team will involve multiple people working over multiple years with reasonable institutional support.
To count, this must be a technical effort: it should be oriented toward a technical AI/ML/CS audience, or primarily use the tools and ideas of that discipline. Joint projects involving multiple disciplines will count only if the technical AI/ML/CS component of the project would be sufficient to count on its own.
Will resolve early if someone presents sufficient evidence that this is happening.
Planning to resolve NO in two weeks unless there's something I've missed. This document, prepared for the UK government, seems like the best recent source on Meta's AI safety plans, and it doesn't look like anything there satisfies the criteria above.
@ShitakiIntaki PR and satisfying regulators. On the other hand, there's probably less product-oriented pressure not to publish than there is for most applied research.