So recently a religious debate intense discussion happened on #lisp
about whether the following form:
;; Form 1
(loop for i from 1 to 5 finally (return i))
should return 5
or 6
(or, in other words, (1+ 5)
- this notation is important as it will be used later).
I argue that it is invalid for it to return 6
and 5
must be returned instead.
Let us assume that (declaim (optimize (safety 3)))
is in effect. In other words, we are working on safe code.
An of-type
declaration is allowed in for-as-arithmetic variable bindings in LOOP
and it is meant to be used for optimization. Adding a type declaration should not have any change on the meaning of the program, other than the fact that it must signal a type error in safe code. Therefore, form 1 is equivalent to
;; Form 2
(loop for i of-type fixnum from 1 to 5 finally (return i))
This is true, since, according to the standard, The type fixnum is required to be a supertype of (signed-byte 16).
Numbers from 1 to 5, and even 6, are therefore all fixnums.
The intent of the above form is that the user wants to do something five times - for integers from 1 to 5. Let us modify this loop - for instance, the user might want to do something for all fixnums. This gives us:
;; Form 3
(loop for i of-type fixnum from most-negative-fixnum to most-positive-fixnum finally (return i))
This LOOP
form is logically consistent, since the value of i
is between most-negative-fixnum
and most-positive-fixnum
, and therefore falls within the range of fixnum
. (The intent is also clear - the user wants to iterate over all fixnums.)
In the LOOP
epilogue, we have the form (return i)
. Therefore the variable i
is accessed which is of type fixnum
. Its value therefore must be of type fixnum
, due to the OF-TYPE
declaration.
This means that it is impossible for the value of i
to be (1+ most-positive-fixnum)
, since this value is not of type fixnum
. It would also mean that this value was assigned to the variable i
accessible in the loop epilogue, which is a type violation.
A TYPE
declaration is not allowed to change the semantics of the program since CLHS 3.3.1 states:
In general, an implementation is free to ignore declaration specifiers except for the declaration, notinline, safety, and special declaration specifiers.
Accessing i
in the loop prologue is valid, as the forms inside the LOOP
form (in particular, the epilogue) are in scope of the bindings, so they have access to the variables. According to CLHS 6.1.1.4:
A loop macro form expands into a form containing one or more binding forms (that establish bindings of loop variables) and a block and a tagbody (that express a looping control structure). The variables established in loop are bound as if by let or lambda.
CLHS Declaration TYPE states:
- During the execution of any reference to the declared variable within the scope of the declaration, the consequences are undefined if the value of the declared variable is not of the declared type.
- During the execution of any setq of the declared variable within the scope of the declaration, the consequences are undefined if the newly assigned value of the declared variable is not of the declared type.
CLHS 1.5.2 states:
Conforming code shall not depend on the consequences of undefined or unspecified situations.
The implies that that LOOP
with OF-TYPE
invokes undefined behaviour if the OF-TYPE
is incorrect, even in safe code. Therefore the question is whether the code from form 3 is conforming or not. I argue that it is conforming, since the values that the variable is permitted to take and meant to be taken inside the iteration are fully within the fixnum
type; the implementation is not required to assign a non-fixnum
value to the variable i
in order to successfully perform the iteration. (This agrees with the likely user intent of this LOOP
call - that the user wants to evaluate some code for every fixnum.)
If this code conforms, then the value of i
must not exceed the fixnum range, which requires that (1+ most-positive-fixnum)
must NOT be assigned to the variable i
, which means that finally (return i)
must NOT return (1+ most-positive-fixnum)
, which in turn implies that most-positive-fixnum
must be returned, which in turn implies that the LOOP
variable must not be assigned an out-of-range value.
This behaviour in turn translates to looping from 1
to 5
, at which point 5
must be returned, which completes the proof.
There's a real logic problem in the last few paragraphs. Your assumption of conformance is that the value remains a fixnum "inside the iteration" (presumably meaning inside the
loop
body clauses), but thefinally
clause occurs outside the iteration (specifically, in the epilogue) and thus outside the scope of that assumption. The scope of the iteration variable extends past the iteration and into the prologue and epilogue. The only way the code is conforming is if the type declaration properly describes its value in all of the prologue, iterations, and epilogue. Since no statement about the epilogue's value is made in your assumption, yet the problem is about defining the bounds inclusive of the epilogue value, no logical implication seems to occur.Also, the situation should be addressed where the
to
value is overshot instead of exactly reached. This current example can be processed to your desired implication by equality testing before stepping (which is defined in the glossary as mutating the iteration variable), while the first example in 6.1.2.1.1 cannot. It must perform a>
test on the post-stepped value, which affects whether the (effectively)to 10 by 2
iterated variable should contain 9 or 11 in the epilogue. Both the overshot and exactly-reached situations must be necessarily handled by the same code, as it's generally unpredictable how it will terminate.Also also, since so many implementations return 6 instead of 5, there might be different rationale from the spec that justifies the existing behavior. That should be explored here as well, if this document is a basis for changing behavior. If returning 6 in this case is also defensible according to the spec, and the arguments of the gist hold, it would render the CLHS inconsistent for this matter: Whether the final stepping occurs would be reasonably undefined, instead of defined as being unstepped in the epilogue (e.g. 5 in this example).
To my reading, I don't think the CLHS specifically demands whether a final stepping beyond the
to
limit shall occur or must not occur, although there's much to support the final stepping (returning 6) to the point where I would consider that behavior standard:I would prefer that the main focus be on what the spec says instead of general programmer expectations. But if we are including the latter, there also exists a common and general programming precedent in ensuring the iteration variable can be stepped to hold an exit condition value. If it is bounded to contain only non-exit values, it will generally loop infinitely. This often happens to beginners when trying to iterate to the maximum value of a fixed-size integer by looping while the value is
<=
the max. All values will continue to loop, and there is no available range to step the value into an exit condition.A common way in many programming languages to avoid the final stepping such that there need be no exit condition value in the top-of-loop processing is to break out before stepping past any problematic fenceposts:
This also manages the post-iteration value of the iteration variable, as the default iteration mechanisms usually step past the limit into exit condition territory. Within the loop body, of course, the iteration variable's value always stays within the expected bounds as agreeably presented in the gist, regardless of the post-iteration handling.
CL's
do
also completes a mutating stepping before looping back and testing the stepped values, as another precedent for returning 6.