"Outer alignment" as in the model is not incentivized to lie to humans (of course it must still do things, the question isn't just about can you build an AI that doesn't lie)
Can you explain what the difference is from your other question here: https://manifold.markets/vluzko/by-2027-will-there-be-a-wellaccepte ?
Does the other question have to handle inner alignment?
@LeoGao Lying that arises from other causes does not count. For this market the system does not need to be close to SOTA. I will not accept trivial/very stripped-down solutions, but otherwise I will be quite forgiving. For instance, if someone came up with this today for RNNs, I would resolve YES. (The target is a procedure for some architecture/problem setup that can in theory generalize, even if no one has achieved that generalization in practice.)