Are open source models uniquely capable of teaching people how to make 1918 flu?

Resolved N/A on Jun 8

In the LW discussion on "Will releasing the weights of large language models grant widespread access to pandemic agents?" (pdf) one of the main questions was whether open source models were uniquely dangerous: could hackathon participants have made similar progress towards learning how to obtain infectious 1918 flu even without access to an LLM, by using traditional sources (Google, YouTube, reading papers, etc.)?

Resolves YES if the authors run a similar no-LLM experiment and find that yes-LLM hackathon participants are far more likely to find key information than no-LLM participants.

Resolves NO if the authors run a similar no-LLM experiment and find that yes-LLM hackathon participants are not far more likely to find key information than no-LLM participants.

Resolves N/A if the authors don't run a similar no-LLM experiment.

Disclosure: I work for SecureBio, as do most of the authors. I work on a different project within the organization and don't have any inside information on whether they intend to run a no-LLM experiment or how it might look if they decide to run one.

If at close (currently 2024-06-01) the authors say they're working on a no-LLM version but haven't finished yet, I'll extend until they do, to a maximum of one year from the opening date (2024-10-31).

I've written to the authors to ask if they have one in progress

@JeffKaufman the authors confirmed that they're not running one

@mods is it no longer possible to resolve old markets as NA?

@JeffKaufman The ability for users to resolve n/a was removed because it could print mana, but mods can still resolve markets to it. Confirming that you want this resolved n/a now?

@jacksonpolack yes please! This was a market about what a study would show if it happened, and the study didn't happen.

predicted YES

Our study assessed uplifts in performance for participants with access to GPT-4 across five metrics (accuracy, completeness, innovation, time taken, and self-rated difficulty) and five stages in the biological threat creation process (ideation, acquisition, magnification, formulation, and release). We found mild uplifts in accuracy and completeness for those with access to the language model. Specifically, on a 10-point scale measuring accuracy of responses, we observed a mean score increase of 0.88 for experts and 0.25 for students compared to the internet-only baseline, and similar uplifts for completeness (0.82 for experts and 0.41 for students). However, the obtained effect sizes were not large enough to be statistically significant, and our study highlighted the need for more research around what performance thresholds indicate a meaningful increase in risk. Moreover, we note that information access alone is insufficient to create a biological threat, and that this evaluation does not test for success in the physical construction of the threats.

https://openai.com/research/building-an-early-warning-system-for-llm-aided-biological-threat-creation

Related research by RAND titled "The Operational Risks of AI in Large-Scale Biological Attacks: Results of a Red-Team Study" https://www.rand.org/pubs/research_reports/RRA2977-2.html

predicted NO

@DanielFilan Key findings (quoted from the linked page):

This research involving multiple LLMs indicates that biological weapon attack planning currently lies beyond the capability frontier of LLMs as assistive tools. The authors found no statistically significant difference in the viability of plans generated with or without LLM assistance.
This research did not measure the distance between the existing LLM capability frontier and the knowledge needed for biological weapon attack planning. Given the rapid evolution of AI, it is prudent to monitor future developments in LLM technology and the potential risks associated with its application to biological weapon attack planning.
Although the authors identified what they term unfortunate outputs from LLMs (in the form of problematic responses to prompts), these outputs generally mirror information readily available on the internet, suggesting that LLMs do not substantially increase the risks associated with biological weapon attack planning.
To enhance possible future research, the authors would aim to increase the sensitivity of these tests by expanding the number of LLMs tested, involving more researchers, and removing unhelpful sources of variability in the testing process. Those efforts will help ensure a more accurate assessment of potential risks and offer a proactive way to manage the evolving measure-countermeasure dynamic.

predicted YES

Just extended this for six months after checking in with one of the study authors. They aren't sure if they will do this, but they're still hoping to and seeing if the logistics can work out.

ChatGPT helps me get on track with lots of stuff that I would otherwise have abandoned due to time constraints. So yes, beginners with limited time should obtain a better score.

@RemiLeopard While that's true to an extent, the sorts of people we'd actually be worried about possessing this information are uniquely motivated. I don't think time constraints are a notable hurdle for that subgroup.

Depends on the open source LM. Are they going to give them the SuperEbolaGPT I keep hearing about on Twitter?

predicted YES

@VAPOR they already ran the yes-LLM version, and the LLM did have some extra fine tuning to make it cooperative and additional training to know more virology: https://arxiv.org/pdf/2310.18233.pdf

I'm an idiot and initially bought this in the wrong direction: I think "LLM is adding a bunch" is likely, not unlikely

Related question:

Will Llama-3 (or next open Meta model) be obviously good in its first-order effects on the world?

63% chance. Since Meta released Llama and Llama-2, protestors have compared open source LLMs to bioterrorism and accused Meta of reckless irresponsibility and general willingness to destroy all value everywhere in the universe. This question is meant to concretize some such accusations. This question resolves true if Llama-3 gets released, and is used in normal products, for ML research, and for interpretability research, and doesn't cause any results different in kind than we've seen so far from LLMs. That is -- people use it for bad things every now and then, just like they use word processors to write hate mail, but given that 99% of people use it for normal useful things and there are no outsized bad effects, it's pretty clearly overall good. This question resolves N/A if Llama-3 is not released for ~2 years. If Meta releases a new LLM that is a spiritual successor to Llama I may consider that instead. This question resolves false if Llama-3 is released and someone uses it to build a robot army to kill people, if it rises up to kill all humans, if it is incorporated into a computer virus causing billions of damage, if it is incorporated into a new social media website that brainwipes the populace at large, or otherwise seems to be clearly negative in EV. I'll be generous towards false -- if there's a close call, for instance, where a Llama-3 model was an irreplaceable part of an effort that would have killed many people but which was stopped due to the police, I'll count that towards the false side. By "first order" I'm excluding, for instance "hype around AI". Some people think people being interested in AI is bad because they think AI is going to kill everyone; some people think people being interested in AI is good because AI is useful. Obviously I cannot resolve based on this overall view, so I'm trying again to resolve simply on the things you can more or less directly attribute to Llama-3's existence.
I will resolve either 24 months after Llama-3 is released, or ~6 months after an open LLM or other foundation model has come out that seems pretty clearly better than it in every respect. Edit: I won't bet, per standard subjectivity involved in calling it.

predicted NO

The most powerful open-source models are still pretty bad, compared to GPT-4 etc. So right now the value added seems pretty low.

If this question were about future models, I'd change the direction of my bet.
