Skip to content

Instantly share code, notes, and snippets.

@phoe
Last active February 7, 2020 13:22
Show Gist options
  • Save phoe/335fecfdc195bddd47ab0928b0e62e52 to your computer and use it in GitHub Desktop.
Save phoe/335fecfdc195bddd47ab0928b0e62e52 to your computer and use it in GitHub Desktop.
loop return value

So recently a religious debate intense discussion happened on #lisp about whether the following form:

;; Form 1
(loop for i from 1 to 5 finally (return i))

should return 5 or 6 (or, in other words, (1+ 5) - this notation is important as it will be used later).

I argue that it is invalid for it to return 6 and 5 must be returned instead.

Let us assume that (declaim (optimize (safety 3))) is in effect. In other words, we are working on safe code.

An of-type declaration is allowed in for-as-arithmetic variable bindings in LOOP and it is meant to be used for optimization. Adding a type declaration should not have any change on the meaning of the program, other than the fact that it must signal a type error in safe code. Therefore, form 1 is equivalent to

;; Form 2
(loop for i of-type fixnum from 1 to 5 finally (return i))

This is true, since, according to the standard, The type fixnum is required to be a supertype of (signed-byte 16). Numbers from 1 to 5, and even 6, are therefore all fixnums.

The intent of the above form is that the user wants to do something five times - for integers from 1 to 5. Let us modify this loop - for instance, the user might want to do something for all fixnums. This gives us:

;; Form 3
(loop for i of-type fixnum from most-negative-fixnum to most-positive-fixnum finally (return i))

This LOOP form is logically consistent, since the value of i is between most-negative-fixnum and most-positive-fixnum, and therefore falls within the range of fixnum. (The intent is also clear - the user wants to iterate over all fixnums.)

In the LOOP epilogue, we have the form (return i). Therefore the variable i is accessed which is of type fixnum. Its value therefore must be of type fixnum, due to the OF-TYPE declaration.

This means that it is impossible for the value of i to be (1+ most-positive-fixnum), since this value is not of type fixnum. It would also mean that this value was assigned to the variable i accessible in the loop epilogue, which is a type violation.

A TYPE declaration is not allowed to change the semantics of the program since CLHS 3.3.1 states:

In general, an implementation is free to ignore declaration specifiers except for the declaration, notinline, safety, and special declaration specifiers.

Accessing i in the loop prologue is valid, as the forms inside the LOOP form (in particular, the epilogue) are in scope of the bindings, so they have access to the variables. According to CLHS 6.1.1.4:

A loop macro form expands into a form containing one or more binding forms (that establish bindings of loop variables) and a block and a tagbody (that express a looping control structure). The variables established in loop are bound as if by let or lambda.

CLHS Declaration TYPE states:

  1. During the execution of any reference to the declared variable within the scope of the declaration, the consequences are undefined if the value of the declared variable is not of the declared type.
  2. During the execution of any setq of the declared variable within the scope of the declaration, the consequences are undefined if the newly assigned value of the declared variable is not of the declared type.

CLHS 1.5.2 states:

Conforming code shall not depend on the consequences of undefined or unspecified situations.

The implies that that LOOP with OF-TYPE invokes undefined behaviour if the OF-TYPE is incorrect, even in safe code. Therefore the question is whether the code from form 3 is conforming or not. I argue that it is conforming, since the values that the variable is permitted to take and meant to be taken inside the iteration are fully within the fixnum type; the implementation is not required to assign a non-fixnum value to the variable i in order to successfully perform the iteration. (This agrees with the likely user intent of this LOOP call - that the user wants to evaluate some code for every fixnum.)

If this code conforms, then the value of i must not exceed the fixnum range, which requires that (1+ most-positive-fixnum) must NOT be assigned to the variable i, which means that finally (return i) must NOT return (1+ most-positive-fixnum), which in turn implies that most-positive-fixnum must be returned, which in turn implies that the LOOP variable must not be assigned an out-of-range value.

This behaviour in turn translates to looping from 1 to 5, at which point 5 must be returned, which completes the proof.

@bhaible
Copy link

bhaible commented Nov 3, 2019

It is not convincing to make an argument about program A by saying "let me modify the program to program B, then look at what the standard says about B, then from that draw conclusions about program A". That's because the standard is not free of contradictions; there are some known mistakes in it.

Especially, citing a sentence that starts with "In general ..." is not going to provide a convincing argument.

If you were to build an argument about what the standard says about program A directly, that would be more convincing.

@bhaible
Copy link

bhaible commented Nov 4, 2019

It is well-known since at least 2004 that accessing FOR-AS variables in the FINALLY body leads to unexpected results. See https://www.cliki.net/Issue%20LOOP-FINALLY-VARIABLES .

@phoe
Copy link
Author

phoe commented Nov 4, 2019

Copying the full comment from @bhaible from the CLISP bug tracker.


There are two more related cases:

> (loop for i from 1 to 5 finally (return i))
6
> (loop for i from 1 upto 5 finally (return i))
6
> (loop for i from 1 below 6 finally (return i))
6

ANSI CL 6.1.2.1.1 http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-1.html says: "The variable var is bound to the value of form1 in the first iteration and is stepped[1] by the value of form3 in each succeeding iteration, or by 1 if form3 is not provided." Therefore here, the variable I is set to the values 1, 2, 3, ... consecutively.

The question is whether the last value attained is 5 or 6.

The text regarding TO and UPTO in http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-1.html is not precise: For UPTO it says "the loop terminates when the variable var passes the value of form2" which - to a certain degree - indicates that FOR I FROM 1 UPTO 5 will set I to 6.

The text regarding BELOW in http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-1.html, on the other hand, is clearly contradictory: "These keywords stop iteration just before the value of the variable var reaches the value supplied by form2" but also "iteration continues until the value var is stepped to the exclusive or inclusive limit supplied by form2".

I think all three cases should be handled in the same way - either use a temporary variable for the stepping in all three cases, or use an assignment to var during the stepping, in all three cases.

Since SBCL produces the the same results as CLISP for the three cases above, I vote for doing nothing.

Users need to be aware that accessing the loop variables in the FINALLY clause is - and always has been - hairy. This is also true for the other iteration controls, such as FOR I ACROSS or FOR I BEING EACH HASH-KEY and so on.

@white-flame
Copy link

white-flame commented Nov 5, 2019

I think all three cases should be handled in the same way - either use a temporary variable for the stepping in all three cases, or use an assignment to var during the stepping, in all three cases.

While this is an imported comment from elsewhere, it bears repeating that the glossary defines stepping as mutating the iteration variable:

  1. v.t. (an iteration variable) to assign the variable a new value at the end of an iteration, in preparation for a new iteration.

However, this definition matches DO and the notion of "end of an iteration" breaks down for LOOP. Iteration clauses appear only at the beginning of a loop, and an iteration with regards to that clause itself is intended to perform the test at the end of its execution, and the step before (6.1.1.6). But as far as mutating the variable versus using a temporary variable, it clearly denotes the former.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment