on sig_on() and sig_on_no_except()

Summary

When you use sig_on() … sig_off() and need to clean up resources if the code is interrupted, don't use sig_on_no_except(). Instead, put sig_on() and sig_off() inside a try … finally block:

cpdef void matrix_multiply_interruptable_3(np.ndarray[np.int_t, ndim=2] c, np.ndarray[np.int_t, ndim=2] a, np.ndarray[np.int_t, ndim=2] b):
    debug_print("allocate")
    try:
        sig_on()
        matrix_multiply_uninterruptable(c, a, b)
        sig_off()
    finally:
        debug_print("cleanup")

Introduction

cysignals (the package that makes SageMath functions interruptible) provides the functions sig_on, sig_off and sig_on_no_except.

If we want to clean up resources properly when the code is interrupted, there are several ways to do it. Here we compare them by code readability and performance.

Let's start with some helper code.

This overrides the %%cython magic with the version provided by Cython, which allows specifying arguments. See also sagemath/sage#38945.

%load_ext cython

Then run this cell.

%%cython -c-w --annotate --annotate-fullc
import numpy as np
from time import time
from cysignals.signals cimport sig_on, sig_off, sig_on_no_except, cython_check_exception
from libc.stdio cimport puts
cimport numpy as np
cimport cython

cdef bint enable_debug_print = True

cpdef void set_enable_debug_print(bint x):
    global enable_debug_print
    enable_debug_print = x

cpdef void debug_print(const char* s):
    global enable_debug_print
    if enable_debug_print:
        puts(s)

@cython.boundscheck(False)
cpdef void matrix_multiply_uninterruptable(np.ndarray[np.int_t, ndim=2] c, np.ndarray[np.int_t, ndim=2] a, np.ndarray[np.int_t, ndim=2] b):
    cdef int i, j, k, m=a.shape[1], n=a.shape[0], p=b.shape[1]
    if not (m==b.shape[0] and n==c.shape[0] and p==c.shape[1]):
        puts("error")
        return
    for i in range(n):
        for j in range(p):
            c[i, j] = 0
            for k in range(m):
                c[i, j] += a[i, k]*b[k, j]

cpdef void matrix_multiply_interruptable_1(np.ndarray[np.int_t, ndim=2] c, np.ndarray[np.int_t, ndim=2] a, np.ndarray[np.int_t, ndim=2] b):
    """
    This is possibly invalid, because in the generated C code ``sig_on()`` ends up
    inside a block. I can't find conclusive evidence, though: the stack frame
    itself still exists. The relevant rule for setjmp/longjmp is:

    > If the function that called setjmp has exited (whether by return or by a different
    > longjmp higher up the stack), the behavior is undefined. In other words,
    > only long jumps up the call stack are allowed.

    Here the function has not exited.
    """
    debug_print("allocate")
    try:
        sig_on()
    except:  # KeyboardInterrupt, AlarmInterrupt
        debug_print("cleanup")
        raise
    matrix_multiply_uninterruptable(c, a, b)
    sig_off()
    debug_print("cleanup")

cpdef void matrix_multiply_interruptable_2(np.ndarray[np.int_t, ndim=2] c, np.ndarray[np.int_t, ndim=2] a, np.ndarray[np.int_t, ndim=2] b):
    debug_print("allocate")
    if not sig_on_no_except():
        debug_print("cleanup")
        cython_check_exception()  # raise the pending exception (KeyboardInterrupt, AlarmInterrupt)
    matrix_multiply_uninterruptable(c, a, b)
    sig_off()
    debug_print("cleanup")

cpdef void matrix_multiply_interruptable_3(np.ndarray[np.int_t, ndim=2] c, np.ndarray[np.int_t, ndim=2] a, np.ndarray[np.int_t, ndim=2] b):
    debug_print("allocate")
    try:
        sig_on()
        matrix_multiply_uninterruptable(c, a, b)
        sig_off()
    finally:
        debug_print("cleanup")

cpdef benchmark():
    cdef np.ndarray[np.int_t, ndim=2] c, a, b
    c=np.random.randint(0, 5, size=(3, 5))
    a=np.random.randint(0, 5, size=(3, 4))
    b=np.random.randint(0, 5, size=(4, 5))

    cdef int i, n=2000000

    start=time()
    for i in range(n):
        matrix_multiply_uninterruptable(c, a, b)
    print(time()-start)

    start=time()
    for i in range(n):
        matrix_multiply_interruptable_1(c, a, b)
    print(time()-start)

    start=time()
    for i in range(n):
        matrix_multiply_interruptable_2(c, a, b)
    print(time()-start)

    start=time()
    for i in range(n):
        matrix_multiply_interruptable_3(c, a, b)
    print(time()-start)

This compiles the Cython code and loads it into the user namespace. If you're using SageMath in a terminal (instead of a notebook), you can view the annotated code by running the following cell:

d=_
with open("/tmp/a.html", "w") as fi:
    fi.write(d.data)
import webbrowser
webbrowser.open("file:///tmp/a.html")
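The path /tmp exists only on Unix-like systems. A more portable variant is sketched below using the tempfile module. Here save_html is a hypothetical helper, not part of the original session, and it assumes (as in the cell above) that the annotated output is available as an HTML string.

```python
import tempfile
import webbrowser
from pathlib import Path


def save_html(html, open_browser=False):
    """Write `html` to a fresh temporary .html file and return its path.

    Hypothetical helper for illustration; the caller is responsible
    for deleting the file afterwards.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as f:
        f.write(html)
        path = Path(f.name)
    if open_browser:
        # as_uri() needs an absolute path, which NamedTemporaryFile provides.
        webbrowser.open(path.as_uri())
    return path
```

In the session above this would be called as save_html(d.data, open_browser=True).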

Explanation

First, we have a simple implementation of matrix multiplication matrix_multiply_uninterruptable.

Then we have three versions of interruptible matrix multiplication. See also sagemath/cysignals#205.

The final function, benchmark, is used for the measurements below.
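The three interruptible versions differ only in how the cleanup is reached. The pure-Python sketch below models that control flow without cysignals. The names fake_work, run_with_finally, run_with_manual_cleanup and trace are hypothetical, and the interrupt is simulated by raising KeyboardInterrupt directly instead of delivering a real signal. It only illustrates that both styles reach the cleanup exactly once on the normal and the interrupted path, and that try … finally needs a single cleanup site to do so.

```python
# Pure-Python model of the control flow; this is not cysignals code.
# All names here are hypothetical and exist only for illustration.

def fake_work(interrupt):
    """Stand-in for the long computation; optionally simulates an interrupt."""
    if interrupt:
        raise KeyboardInterrupt("simulated interrupt")


def run_with_finally(log, interrupt):
    """Models version 3: one cleanup site, reached on every exit path."""
    log.append("allocate")
    try:
        fake_work(interrupt)
    finally:
        log.append("cleanup")


def run_with_manual_cleanup(log, interrupt):
    """Models versions 1 and 2: the cleanup is written once per exit path."""
    log.append("allocate")
    try:
        fake_work(interrupt)
    except KeyboardInterrupt:
        log.append("cleanup")  # interrupted path
        raise
    log.append("cleanup")      # normal path


def trace(func, interrupt):
    """Run func and return (log, True if KeyboardInterrupt propagated)."""
    log = []
    try:
        func(log, interrupt)
    except KeyboardInterrupt:
        return log, True
    return log, False
```

Calling trace(run_with_finally, True) and trace(run_with_manual_cleanup, True) yields the same log, but the second function has to repeat the cleanup code, which is easy to get out of sync once the cleanup grows.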

Checking for correctness

c=np.random.randint(0, 5, size=(3, 5))
a=np.random.randint(0, 5, size=(3, 4))
b=np.random.randint(0, 5, size=(4, 5))

for f in [matrix_multiply_uninterruptable, matrix_multiply_interruptable_1, matrix_multiply_interruptable_2, matrix_multiply_interruptable_3]:
    c*=0
    f(c, a, b)
    assert (c==a@b).all()
    f(c, a, b)
    assert (c==a@b).all()

Checking that the code is indeed interruptible

Run each of the following cells.

c=np.random.randint(0, 5, size=(1000, 1000))
a=np.random.randint(0, 5, size=(1000, 1000))
b=np.random.randint(0, 5, size=(1000, 1000))

set_enable_debug_print(False)

start=time(); matrix_multiply_uninterruptable(c, a, b); print(time()-start)

On my machine, it takes around 2 seconds.

try:
    start=time(); alarm(1); matrix_multiply_uninterruptable(c, a, b)
    print()  # to check interrupt
    # (if this line is not here then the interrupt may only be noticed in the finally)
finally: print(time()-start)

Again, this takes around 2 seconds. We conclude that the alarm does not interrupt the computation at all.

Nevertheless, AlarmInterrupt is still raised, only after the computation has finished. So the "interruptible" checks in the SageMath source code are mostly useless here: they only check that, once a signal is received, the code terminates within a reasonable time such as 10 minutes. To a user, however, a program that takes more than a second to respond already feels unresponsive.

Testing the interruptible versions:

try:
    start=time(); alarm(1); matrix_multiply_interruptable_1(c, a, b)
finally: print(time()-start)

try:
    start=time(); alarm(1); matrix_multiply_interruptable_2(c, a, b)
finally: print(time()-start)

try:
    start=time(); alarm(1); matrix_multiply_interruptable_3(c, a, b)
finally: print(time()-start)

Each of the above takes about 1 second. We conclude that the interruption works.
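The three cells above repeat the same try … finally timing pattern. It can be factored into a small context manager, sketched below. Here timed is a hypothetical helper that is not part of the original notebook, and it uses time.perf_counter instead of time.time.

```python
from contextlib import contextmanager
from time import perf_counter


@contextmanager
def timed(results, label):
    """Store the elapsed wall-clock time of the body in results[label].

    The time is recorded even if the body raises (for example an
    AlarmInterrupt); the exception itself still propagates to the caller.
    """
    start = perf_counter()
    try:
        yield
    finally:
        results[label] = perf_counter() - start


# Intended use in the session above (needs alarm and the compiled functions):
#
#     timings = {}
#     with timed(timings, "version 3"):
#         alarm(1); matrix_multiply_interruptable_3(c, a, b)
```

This is the same try … finally idea the write-up recommends for resource cleanup, applied to timing.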

Measuring performance / overhead of `try..finally`, `sig_on` and `sig_off`

benchmark()

On my machine the output is:

0.9549407958984375
1.837050199508667
1.742887020111084
1.7591187953948975

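Here are the results from 9 runs (one row per run, in seconds; the columns are matrix_multiply_uninterruptable and matrix_multiply_interruptable_1, _2 and _3, in that order):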

array([[0.9549, 1.8371, 1.7429, 1.7591],
       [0.9437, 1.8573, 1.6858, 1.6856],
       [0.9185, 1.7233, 1.7168, 1.7363],
       [0.9217, 1.7647, 1.7049, 1.7072],
       [0.9259, 1.7195, 1.6706, 1.6917],
       [0.9207, 1.7355, 1.7315, 1.6674],
       [0.9367, 1.7022, 1.7048, 1.6854],
       [0.941 , 1.7376, 1.6709, 1.679 ],
       [0.9239, 1.7362, 1.6858, 1.6694]])

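Some statistics (here b holds the array above; the two lines give the mean and the standard error of the mean for each column):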

sage: np.mean(b, axis=0).round(4)
array([0.9319, 1.757 , 1.7016, 1.6979])
sage: (np.std(b, axis=0)/np.sqrt(len(b)-1)).round(4)
array([0.0042, 0.018 , 0.0085, 0.0104])
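The second expression is the standard error of the mean: np.std defaults to the population standard deviation, and dividing it by sqrt(n-1) is the same as dividing the sample standard deviation by sqrt(n). As a cross-check, the sketch below recomputes both rows from the 9-run table using only the standard library. The variable names are illustrative and not from the original session, and the last step treats the columns as independent, which is an assumption.

```python
from math import sqrt
from statistics import mean, stdev

# Timings in seconds, copied from the 9-run table above.
# Columns: uninterruptable, interruptable_1, interruptable_2, interruptable_3.
runs = [
    [0.9549, 1.8371, 1.7429, 1.7591],
    [0.9437, 1.8573, 1.6858, 1.6856],
    [0.9185, 1.7233, 1.7168, 1.7363],
    [0.9217, 1.7647, 1.7049, 1.7072],
    [0.9259, 1.7195, 1.6706, 1.6917],
    [0.9207, 1.7355, 1.7315, 1.6674],
    [0.9367, 1.7022, 1.7048, 1.6854],
    [0.9410, 1.7376, 1.6709, 1.6790],
    [0.9239, 1.7362, 1.6858, 1.6694],
]

columns = list(zip(*runs))

# Per-column mean.
column_means = [mean(col) for col in columns]

# Standard error of the mean: sample standard deviation / sqrt(n).
standard_errors = [stdev(col) / sqrt(len(col)) for col in columns]

# Average extra cost per call of version 3 over the plain version,
# given the 2,000,000 iterations used in benchmark().
overhead_ns = (column_means[3] - column_means[0]) / 2_000_000 * 1e9

# Gap between versions 2 and 3 and the standard error of that gap
# (assumption: the two columns are independent).
gap_2_vs_3 = column_means[2] - column_means[3]
gap_error = sqrt(standard_errors[2] ** 2 + standard_errors[3] ** 2)

print([round(m, 4) for m in column_means])
print([round(s, 4) for s in standard_errors])
print(round(overhead_ns), "ns per call")
print(abs(gap_2_vs_3) < gap_error)
```

Under these assumptions the gap between versions 2 and 3 is smaller than its own standard error, so this data does not separate the two.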

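We conclude that sig_on and sig_off have a significant overhead: the interruptible versions take roughly 1.7 s against 0.93 s for the plain one, which is about 0.4 microseconds extra per call over the 2,000,000 calls (this figure also includes the two debug_print calls). Versions 2 and 3 are within one standard error of each other, and version 1 appears slightly slower, so try..finally with sig_on and sig_off inside is at least as fast as the alternatives. It also leads to the cleanest code, thus it should be preferred.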

As such, it appears that sig_on_no_except is not useful in practice.

user202729/README.md