```python
from numpy_lru_cache_decorator import np_cache

@np_cache()
def function(array):
    ...
```

Sometimes processing numpy arrays can be slow, especially when doing image analysis. Simply using `functools.lru_cache` won't work, because `numpy.array` is mutable and not hashable. This workaround allows caching functions that take an arbitrary `numpy.array` as their first parameter; the other parameters are passed as-is. The decorator accepts the standard `lru_cache` parameters (`maxsize=128`, `typed=False`).
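For reference, one way such a decorator can be implemented is to recursively convert the array into a nested tuple (which is hashable) to form the cache key, then rebuild the array inside the cached wrapper. This is a minimal sketch of that technique, not necessarily the package's exact code:

```python
from functools import lru_cache, wraps

import numpy as np


def np_cache(*lru_args, **lru_kwargs):
    """Cache a function whose first argument is a numpy array.

    Sketch: the array is turned into a hashable nested tuple for the
    cache key, then converted back before calling the real function.
    """
    def decorator(function):
        @lru_cache(*lru_args, **lru_kwargs)
        def cached_wrapper(hashable_array, *args, **kwargs):
            # Rebuild the array so `function` still receives a numpy array.
            array = np.array(hashable_array)
            return function(array, *args, **kwargs)

        @wraps(function)
        def wrapper(array, *args, **kwargs):
            return cached_wrapper(array_to_tuple(array), *args, **kwargs)

        def array_to_tuple(a):
            try:
                # Recurse until the elements are scalars (not iterable).
                return tuple(array_to_tuple(item) for item in a)
            except TypeError:
                return a

        # Expose lru_cache's introspection helpers on the outer wrapper.
        wrapper.cache_info = cached_wrapper.cache_info
        wrapper.cache_clear = cached_wrapper.cache_clear
        return wrapper

    return decorator
```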
```python
>>> array = np.array([[1, 2, 3], [4, 5, 6]])
>>> @np_cache(maxsize=256)
... def multiply(array, factor):
...     print("Calculating...")
...     return factor * array
>>> product = multiply(array, 2)
Calculating...
>>> product
array([[ 2,  4,  6],
       [ 8, 10, 12]])
>>> multiply(array, 2)
array([[ 2,  4,  6],
       [ 8, 10, 12]])
```

Note that the second call does not print `Calculating...`: the result comes from the cache.

The user must be very careful when mutable objects (`list`, `dict`, `numpy.array`...) are returned. The cache returns a reference to the same object in memory each time, not a copy. So if that object is modified, the cache itself loses its validity.
```python
>>> array = np.array([1, 2, 3])
>>> @np_cache()
... def to_list(array):
...     print("Calculating...")
...     return array.tolist()
>>> result = to_list(array)
Calculating...
>>> result
[1, 2, 3]
>>> result.append("this shouldn't be here")  # WARNING, DO NOT do this
>>> result
[1, 2, 3, "this shouldn't be here"]
>>> new_result = to_list(array)
>>> new_result
[1, 2, 3, "this shouldn't be here"]  # CACHE BROKEN!!
```

To avoid this mutability problem, the usual approaches must be followed. In this case, either `list(result)` or `result[:]` will create a (shallow) copy; if `result` were a nested list, `copy.deepcopy` must be used. For a `numpy.array`, use `array.copy()` or `numpy.array(array)` (which copies by default); note that `array[:]` returns a view, not a copy, and `numpy.asarray(array)` does not copy either.
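These copy semantics can be checked directly. A short sketch, using plain Python and numpy (nothing here is specific to `np_cache`):

```python
from copy import deepcopy

import numpy as np

# Flat list: list(result) or result[:] both make a safe shallow copy.
result = [1, 2, 3]
safe = list(result)
safe.append(99)
assert result == [1, 2, 3]          # original untouched

# Nested list: a shallow copy still shares the inner lists.
nested = [[1, 2], [3, 4]]
shallow = list(nested)
shallow[0].append(99)
assert nested[0] == [1, 2, 99]      # inner list was shared!
deep = deepcopy(nested)
deep[1].append(99)
assert nested[1] == [3, 4]          # deepcopy is independent

# numpy: basic slicing returns a *view*, not a copy.
arr = np.array([1, 2, 3])
view = arr[:]
view[0] = 99
assert arr[0] == 99                 # arr was modified through the view
real = arr.copy()
real[1] = 99
assert arr[1] == 2                  # .copy() is independent
```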
Hi, @CarloNicolini
I spotted 3 lines that were giving you errors:
First, the decorator must be called, not merely mentioned: `@df_cache()` instead of `@df_cache`. Notice the parentheses. This could be worked around with something like this: https://stackoverflow.com/questions/3931627/how-to-build-a-decorator-with-optional-parameters

Second, my exception handling in `array_to_tuple` was intended to handle the final case, when the recursion tries to unpack a single element. In your `dataframe_to_tuple`, that exception handling was hiding an error thrown when constructing the returned tuple: `tuple` expects a single iterable parameter. You can instead create the tuple implicitly, with a tuple literal, rather than calling `tuple` with several arguments.
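For completeness, the optional-parentheses workaround discussed in that Stack Overflow question can be sketched like this. The name `flexible_cache` is hypothetical, and plain `lru_cache` is used (without any array conversion) to keep the pattern itself visible:

```python
import functools


def flexible_cache(arg=None, *, maxsize=128, typed=False):
    """Decorator usable both as @flexible_cache and @flexible_cache(...)."""
    def decorator(function):
        return functools.lru_cache(maxsize=maxsize, typed=typed)(function)

    if callable(arg):
        # Used bare: @flexible_cache — `arg` is the decorated function.
        return decorator(arg)
    # Used with parentheses: @flexible_cache() or @flexible_cache(maxsize=...)
    return decorator
```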
Finally, the `test` function was failing for me because `cached_wrapper` was passing `function` a tuple, and not a dataframe as expected; the dataframe has to be rebuilt before `function` is called.
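Putting those three fixes together, a hypothetical `df_cache` might look roughly like this. This is a sketch under my assumptions, not the exact code from this thread; note that dtypes and the index are not preserved by the round-trip:

```python
import functools

import pandas as pd


def df_cache(*lru_args, **lru_kwargs):
    """Sketch of a dataframe-caching decorator (hypothetical)."""
    def decorator(function):
        @functools.lru_cache(*lru_args, **lru_kwargs)
        def cached_wrapper(values, columns, *args, **kwargs):
            # Rebuild the dataframe before the call, so `function`
            # receives a dataframe, not a tuple.
            df = pd.DataFrame(list(values), columns=list(columns))
            return function(df, *args, **kwargs)

        @functools.wraps(function)
        def wrapper(df, *args, **kwargs):
            # Build the hashable key with tuple literals (implicit
            # tuples), never calling tuple() with several arguments.
            values = tuple(tuple(row) for row in df.itertuples(index=False))
            return cached_wrapper(values, tuple(df.columns), *args, **kwargs)

        wrapper.cache_info = cached_wrapper.cache_info
        wrapper.cache_clear = cached_wrapper.cache_clear
        return wrapper

    return decorator


@df_cache(maxsize=64)
def total(df):
    return df.values.sum()
```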
This is the complete code that works for me
Bear in mind that caching full dataframes this way could result in a lot of RAM usage. Maybe this post could be interesting if you run into memory-overhead issues: https://stackoverflow.com/questions/23477284/memory-aware-lru-caching-in-python