In 2030, can you get a chain of 3 neural nets that jailbreak each other?
Basic market · Ṁ35 · resolves 2030 · 73% chance

You prompt a NN which must get the second NN to jailbreak the third. All must be considered top language NNs, not old ones (or the target may be a top chess NN that isn't protected against jailbreaking).

A user must be able to give a prompt to the first NN, which will in turn prompt the second, which will in turn prompt the third to do something the third would, under normal circumstances, refuse to do.

It cannot just be encoded text; the aim is for the first and second NNs to be actively trying to jailbreak the third.
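
To make the resolution mechanics concrete, here is a minimal sketch of the chain being described, assuming a hypothetical `query_model(name, prompt)` helper (not a real API; the model names are placeholders) that sends a prompt to a named model and returns its text reply:

```python
# Minimal sketch of the three-NN chain this market describes.
# query_model is a hypothetical stand-in for a call to a hosted model;
# "model-1" etc. are placeholder names, not real endpoints.

def query_model(name: str, prompt: str) -> str:
    """Hypothetical helper: send `prompt` to the model `name`, return its reply."""
    raise NotImplementedError("wire this up to a real model API")

def run_chain(user_prompt: str) -> str:
    # The user prompts the first NN...
    first_output = query_model("model-1", user_prompt)
    # ...whose output becomes, verbatim, the prompt to the second NN...
    second_output = query_model("model-2", first_output)
    # ...whose output becomes, verbatim, the prompt to the third NN.
    # For a YES resolution, this final reply must be something the third
    # NN would refuse to produce under normal circumstances.
    return query_model("model-3", second_output)
```

The key constraint is that the user only controls `user_prompt`; everything the second and third models see is generated by the models upstream of them, so the jailbreak itself must come from the first and second NNs rather than from encoded text smuggled through the chain.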


Do they have to be SOTA or at least comparable? Less impressive if GPT8 or whatever can jailbreak GPT3.

Do biological neural networks count?

I think I understand what's intended, but I'd appreciate more specific resolution criteria. For example it's not clear that the current definition of "jailbreak" will generalize unambiguously to 2030.
