Assert vs Check vs Enforce vs Constrain in Mechanical
Check
, Assert
, and Enforce
are all runtime error checks, but with very different semantics:
Check
statements let you check for runtime error conditions and early-return, usually#err _
or#none
.Assert
statements let you check for runtime error conditions and crash, to impose function preconditions or declare code invariants.Enforce
statements also check for runtime error conditions and crash, to enforce operational requirements of your program.
Check
statements let you early-return #err
values:
user = (name, email) => {
Check name.length > 0 Else #err #empty_name
Check email.is_valid_email() Else #err #invalid_email
Return #user {name, email}
}
You can also use them for any other value, most commonly #none
. Since they're meant for early-returns, they're only allowed at the beginning of a function, before other statements (but after Constrain
/Assert
/Enforce
statements).
Assert
statements let you impose preconditions on function arguments:
user = (name, email) => {
Assert name.length > 0
Assert email.is_valid_email()
Return #user {name, email}
}
This has the same semantics as bounds-checking on array access: self-test will perform static analysis to find ways to trigger a crash. They can also be used for general code invariants:
Do x = Get user_inputted_number()
Do y = Get user_inputted_number()
z = x*y
sqrt_z = Math.sqrt(z)
If x < sqrt_z:
Assert x < y
Do something_that_errors_unless_x_less_than_y(x, y)
The compiler isn't smart enough to know that between x
and y
, one must be less than sqrt_z
and the other greater (unless all 3 are equal), so without the Assert
it will complain about the error, but you know it's true, so just use an Assert
. The compiler won't be able to prove otherwise (because it's true), and all will be well.
If you Assert
something that's actually wrong, the compiler will error if it notices, although it usually won't, it's pretty dumb because it's designed to be fast. The awesome thing about Assert
ions, though, is that they provide exactly the information that slower, smarter static analysis tools need to look for bugs and/or prove correctness: fuzz testing, generative testing, QuickCheck-style tools, model checkers, symbolic execution can all be used to look for assertion violations; SMT-based tools like symbolic execution and LiquidHaskell-style refinement typing can be used to identify assertions that can and can't be proven correct, so you know which to skip and which to focus your testing and auditing efforts on.
You could have CI run fuzz testers; or you could even have CI try to prove the correctness of every assertion, and have an allow-list of assertions allowed not to be provable--if a new unprovable assertion shows up, that's a CI failure that can only be fixed by refining assertions so that it becomes provable, or adding it to the allow-list and also double-checking the codepath and adding more tests for it.
Enforce
statements also check for runtime error conditions and crash. They are meant for enforcing operational requirements of your program, so their condition is required to depend on some kind of external input or the result of an external effect. If static analysis can prove they won't crash regardless of external input, then a redundant Enforce
statement will be reported. Note that static analysis treats Assert
and Enforce
almost opposite: it complains if an Assert
can crash, whereas it complains if an Enforce
can't crash. (They're not fully opposite because if static analysis can't prove either way, it leaves both be.)
Enforce
statements let you enforce operational requirements:
Do age = Get user_inputted_number()
Enforce age >= 0
x = Math.sqrt(age)
Do something_that_errors_if_x_is_NaN(x)
If you use Assert
instead of Assume
here, the compiler will tell that's wrong, user_inputted_number()
can be any number, positive or negative. But what if your desired runtime behavior if age
is negative is to raise a fatal error, because after all human ages are non-negative? That's where Enforce
comes in.
Whereas Assert
is merely nudging the compiler towards invariants already expressed in your code, Enforce
allows you to expression intentions that aren't expressed anywhere else in your code (in this case, that this code is only intended to work on non-negative human ages).
Whereas static analysis tools will seek out possible assertion violations (to let you know that an invariant that you thought was expressed in your code, wasn't), they will normally ignore possible requirement enforcement violations and any assertion violations. The exception is requirements that provably cannot be satisfied, like if the user input was from something that returns only positive numbers, then Enforce age < 0
would be an error. But if it's satisfiable under at least some circumstance (e.g. Enforce -10 < age < 10
), then it'll be allowed.
Because Enforce
has such a strong silencing effect on static analysis tools, there's an affordance to discourage misuse: it's only allowed on "external input" (TBD: decide on the best terminology for this). For example, this is an error:
Do age = Get user_inputted_number()
x = Math.sqrt(age)
Enforce !isNaN(x) // error: x is not external input
Do something_that_errors_if_x_is_NaN(x)
What if you have a bug in calculating x
from age
? If Enforce
s were allowed everywhere Assert
s were, you might intended to use it temporarly, forget about it and accidentally leave it in, and unknowingly silence important static analysis errors.
Because Mechanical has very strict semantics about purity and determinism, the compiler can statically trace the data flow of all externally-originated data. The current plan is for Enforce
to only be allowed at package boundaries (so the Enforce age >= 0
is an error if user_inputted_number
is defined elsewhere in the same package, but allowed if it's a system built-in, or imported from an external library package and the data it returns comes from user input).
(One could imagine it working at module boundaries instead? Maybe with a ^module_boundary
annotation? TBD)
Finally, Constrain
statements statically preclude error conditions. They're meant to be used just like Assert
but static analysis has to affirmatively prove that they won't be violated, and they're elided by the compiler whereas assertions are compiled to runtime error checks. They're safer than assertions since they're checked statically and cannot crash at runtime, but static analysis is very limited in its ability prove things, so nontrivial constraints will burden the programmer with proving that they're satisfied. Assert
does function as an escape hatch though, for example if you're calling a function with a precondition on an argument declared as a constraint, you can simply assert that the argument does satisfy the constraint before passing it to the function, then the constraint will be statically known to satisfy the constraint since you just did a runtime check on it immediately prior. From this point of view, constraints simply increase the friction of providing arguments that aren't provably correct, and hence encourage you to prove your code correct without necessarily requiring it.
Besides boolean expressions with a runtime meaning, constraints can also be custom predicates defined by Certify
statements, which certify custom predicates on function result values. This is inspired by Ghost of Departed Proofs, but using our contract system instead of the type system.
Definite Error (provable) | Possible Error (can't prove otherwise) | |
---|---|---|
Constrain | fails self-test | fails self-test |
Assert | fails self-test | allowed by self-test |
Enforce | allowed by self-test | allowed by self-test |
All 3 have nearly-identical runtime behavior, which is no-op if a condition is true, otherwise raise a fatal runtime error. (The runtime checks can ofc be omitted by compiler config options.)
However, they are treated very differently by static analysis. Briefly:
Assert
tells the compiler something it doesn't know (but in theory could figure out from the code, if it were smarter)Assume
tells the compiler something it couldn't possibly know (because it can't be deduced from the code even in theory; no matter how smart the compiler is, it can't read your mind)Assume ^force
tells the compiler something it has good reason to think is actually wrong (as a general rule, avoid this one)
This is an escape hatch from the "external input" restriction on Assume
. You can use it anywhere as long as it's satisfiable under some circumstance (if it clearly cannot ever be true, then it's still an error regardless of the ^force
annotation).
I can't think of an example where this would be necessary, but Mechanical always provides escape hatches.
TBD: maybe there should be a few "levels" of annotated assumptions?
Assume ^module_boundary
is only allowed at module boundaries; may makes sense for really large projects or somethingAssume ^internal
is allowed if there's clearly external data involved, even if it's not right at the external boundary, for exampleAssume ^internal !isNaN(x)
would've been allowed above. However, this isn't allowed inside an internal function that is called both with internal and external data; this is only allowed if there is necessarily external data involvedAssume ^force_internal
is allowed if there's the potential for external data, such as an internal function that is called both with internal and external dataAssume ^force
(or^force_pure
?) is allowed anywhere, even if you're just dealing with pure calculations. You really, really shouldn't ever need this.
The "levels" aren't really enforced by the compiler, but you could imagine linters allowing some of these but not others.