Will a soundness hole be discovered in lean4checker?
56
1K
1K
Dec 31
24%
chance

Resolves YES if, before market close, a test case is published that proves 0 = 1 (or similar nonsense) and passes lean4checker.

The test case must use either the latest version of Lean + lean4checker at time of publication, or the version that was available immediately beforehand. (The latter condition allows for the Lean developers fixing a hole as soon as they find it.)

Shenanigans do not count. Shenanigans can include (but are not limited to):

  • Redefining core types and operations

  • Replacing the lean4checker executable

  • Changing the syntax to make a true statement look like a contradiction (e.g. 0 = 1 in the trivial ring)

In the event of an ambiguity, I will try to resolve in the spirit of the market.

Get Ṁ200 play money
Sort by:
bought Ṁ150 of YES

I have no specific info but on base rates surely there’s a flaw

bought Ṁ150 of YES

This should qualify, doens’t it? It is a lean file that proves 0 = 1 and passes lean4checker. It does include it’s own definitions for Nat, but they are still nats; so this should not be a “

Shenanigans (e.g. redefining Nat to be a unit type)”

https://github.com/leanprover/lean4checker/pull/3

@JoachimBreitner Nice! I assume that with noConfusion you can turn that into a proof of False as well?

Though, while I appreciate that the definition of Nat is equivalent, redefining the prelude does feel suspiciously shenanigans-adjacent to me. What do other people think?

predicts YES

@tfae
I tried to proof ∀ P, P but for some reason, Nat.noConfusion is not defined this way. I expect this can be fixed.

I agree that it’s shenanigans-adjacent. But since the market was not about “is the theory implemented by lean4checker sound”, I would assume it includes implementation bugs and oversights (e.g. not checking for prelude, like not tracking axioms properly).

My bet (heh) is that we’ll see many shenanigans-adjacentnesses before we see a real flaw in the system. But that’s also fun, isn’t it?

bought Ṁ0 of YES

@JoachimBreitner You probably need to define some inductive types like PUnit before Nat for Nat.no_confusion to be generated. But yes, it is not hard to get a proof of False from 0 = 1.

This definitely falls into the "shenanigans"-category that doesn't count, IMO.

predicts YES

@FlorisvanDoorn Actually I'm not sure about my last statement. It is an accepted proof of false, not just something that appears to be a proof of false.
It's pretty close to "redefining Nat to be a unit type" which is explicitly mentioned as a shenanigan, and I bet that actually defining Nat to be a unit type will also easily get you a contradiction.

predicts YES

@JoachimBreitner This seems to be the minimal file to get Nat.noConfusion: https://gist.github.com/fpvandoorn/84f4d9df802a114e3fe39508ff6414b7

predicts YES

@FlorisvanDoorn “It's pretty close to "redefining Nat to be a unit type"”

I’d disagree. Imagine we had another, independent checker that does not implement the “Fast Nat in kernel” optimization - it would clearly and correctly reject this code. So it is a flaw in the current implementation. A known flaw, but one not explicitly caught by the checker process as described in the market.

predicts YES

I updated the pull request to show
theorem boom (P : Prop) : P
a bit more directly.

predicts NO

I was asked to weigh in so here are my thoughts:

The fact that the file defines Nat does get this very close to the "shenanigans" exception clause. If this does not count as triggering this clause, it's hard for me to imagine something that would trigger that clause. So I lean toward not resolving YES on this.

On the other hand, the context of "replacing lean4checker with echo OK" makes it seem like the purpose of the shenanigans clause is to prevent the checked file from using its generality to do something that lean4checker can't or isn't designed to prevent. If "checking files that overwrite prelude" is in the purview of lean4checker, and the eventual solution to the PR reflects this, then perhaps the clause should not trigger and this could resolve YES.

predicts YES

@BoltonBailey Thanks! I still don’t see how this is close to defining Nat to Unit. The way the market is described is to avoid proving a theorem that looks like 1 = 0, but doesn’t actually talk about the natural numbers, just some type that is called Nat.

In my example, I define the natural numbers just fine; it is the standard definition of natural numbers after all. Furthermore, the final boom theorem is clearly stating unsoundness of the resulting theory. Very different from just writing a true statement that could be mistaken to mean 1 = 0.

I agree with your “on the other hand” thoughts – the resolution depends on whether we would expect lean4checker to check that.

predicts NO

@JoachimBreitner These are good points. You may be right that the file doesn't "re"define Nat. It just "defines Nat again", which is a little different. It redefines the additive structure of Nat, which feels akin to redefining Nat somehow.

There is a weird case to be made about how the explosion itself relates back to the shenanigans clause. Any proof of 0 = 1 implies false, which implies that Nat = Unit. Perhaps any definition in such a file can therefore be said to be triggering this in a sense.

predicts YES

@BoltonBailey

> It redefines the additive structure of Nat, which feels akin to redefining Nat somehow.

Not quite. I define a function that happens to be called Nat.add; but this is a plain function definition. If I had called it Nat.addition, the whole thing wouldn’t work.

My assumption still is that shenigan's clause was meant to rule out this where I have a theorem that looks like 0 = 1, but it actually isn't about the naturals, and from which I can certainly not derive boom.

But in my case I do prove 0 = 1 (or any other theorem you want) and I can do so because there is a check missing in the type-checker. A well-known one, but still a missing check. I’d say it’s similar to the second soundness flaw listed on the HOL Zero soundness bounty page.

sold Ṁ19 of NO

I just realized I had a balance here, sorry about that, I have sold it now.

I do prove 0 = 1

This raises the question: Can you actually even state this theorem as 0 = 1 in your file with prelude over-ridden? If I use whatever checker the Lean4 web version uses, it throws an error when I try to state the theorem.

I suppose you could go back and do whatever the prelude does to make those numerals parse as zero and succ zero. But you could probably make them both parse as zero, which would again feel like cheating.

predicts YES

@BoltonBailey Hmm, I didn’t try to get numeral syntax to work, after all, boom, proving any theorem, certainly qualifies as “or similar nonsense” :-)

And yes, defining an OfNat instance that interprets 1 and 0 the same is certainly cheating at a different level.

Maybe a good criteria is: Can it be done in a prefect system, and/or in most other theorem prover implementations out there. Proving 1 = 0 where these things just look like numbers when they mean something else is possible in most systems, and is not considered a soundness issue. Proving boom certainly is.

(Sorry for the messed markup.)

predicts NO

My take on this example is:
* Any sensible interpretation of the title of this market would say this is not a soundness hole; i.e. it would simply rule out examples using prelude.
* The interpretation given in the market description accidentally rules out using prelude, in my opinion (despite lean4checker currently not complaining about this file). In any file using prelude, any 0 and 1 you introduce are not Lean's 0 and 1 (because you are defining your own Nat!), and hence it is impossible to prove "1 = 0", even if you have proved boom!

Hi all, after reading through the comments and thinking about this a bit, I will not accept this example as resolving the market.

Clearly the criteria are ambiguous, given the volume of debate so far. So I will need to decide based on the spirit of the question instead.

When I wrote the "shenanigans" rule, my intention was to rule out "replace parts of the system" as an attack vector, because those attacks weren't interesting to me. Redefining prelude would be excluded under that reasoning.

From the reactions on Discord, it's clear that people that people familiar with Lean – but unfamiliar with Manifold-rules-lawyering – see it that way too.

Thank you everyone for the insightful discussion. I'll be sure to account for all of this in the next iteration of this market 😄

@KimLiesinger Note that Lean's number literals are overloaded, and indeed you can prove 0 = (1 : TrivialRing) without redefining prelude at all.

predicts YES

@tfae Thanks for the resolution. I don’t agree (To me a, kernel that requires a specific prelude due to optimization, rather than being sound with any code you throw at it, is of the same level as untracked axiom or unguarded environment access from tactics), but it’s just a game, so fine.

But allow me to note that of course I can prove 1 = 0 after having proven boom. And I claim they are Lean’s Nat – the definitions of everything involved in that theorem (type Nat, instance OfNat, Eq), are exactly the same.

predicts YES

That incident doesn't seem to resolve this market, if I understand it correctly:

Use of `reduceBool` increases the trustbase considerably, including to every single `implemented_by` in any dependency, and is expected (though not intended) to likely be inconsistent. The lean kernel should report proofs using `reduceBool` as depending on this axiom, but did not (does not?) do that; this lack of axiom tracking could be considered a soundness bug. Such proofs (correct or not) currently cannot be checked with lean4checker, because it simply doesn’t support `reduceBool`. In that sense lean4checker is sound. Whether it would correctly report the axiom as used if it *had* support for `reduceBool` is a hypothetical.

predicts YES

So just to check: if the Lean developers fix a soundness hole without any public discussion, and then announce that they fixed a soundness hole (or someone sees the fix and publicly mentions that a soundness hole has been fixed a day later), then does this resolve YES?

@FlorisvanDoorn yep that's right

The case I wanted to exclude was if someone finds some obscure bug that briefly existed in master 6 months ago, was never exploited, and was fixed without anyone realizing there was a soundness bug in the first place.

Perhaps there's a better way to operationalize that (and @jskf raises a good point about not matching people's expectations) but it might be too late to change at this point...

bought Ṁ230 NO

@Quinn So I've read up once or twice about coq bugs, and based on plugging my sloppy memory into a subjective no-actual-napkin-involved estimate of the bugs per unit code velocity in coq, I think this was more or less priced decently and I didn't want to move it much.

bought Ṁ25 of NO

@tfae Is this for a particular version of lean4checker, or does it have to be a case that breaks the current version when it comes out?

@BoltonBailey Let's go with the current version interpretation. That sounds more fun!

bought Ṁ6 of YES

I don't think that's reasonable. If a future test breaks the version that exists on github today now, then that means the kernel is unsound today. That's the question I bet on. If a flaw exists, it is very likely the developers will be the first to notice and release a fix!

predicts NO

@jskf Perhaps you should have asked for clarification before you bet, like I did.

Good point @jskf...

I think in practice the only difference will be if someone fixes a soundness bug without anyone realizing it was a soundness bug in the first place?

From what I've seen, Lean development is quite open. It's likely that if someone recognizes such a bug, they'll talk about it on the community chat, not fix it silently.

Anyway, I've added "... or the version that was available immediately beforehand" as well, since I don't think "whether they talk about the bug before fixing it" should be relevant to this market.

predicts NO

@tfae People on the lean discord have commented that the title is slightly different from the description, so this is another thing you could consider modifying.