import tensorflow as tf


def keras_model_memory_usage_in_bytes(model, *, batch_size: int):
    """
    Return the estimated memory usage of a given Keras model in bytes.

    This includes the model weights and layers, but excludes the dataset.
    The model shapes are multiplied by the batch size, but the weights are not.

    Args:
        model: A Keras model.
        batch_size: The batch size you intend to run the model with. If you
            have already specified the batch size in the model itself, then
            pass `1` as the argument here.
    Returns:
        An estimate of the Keras model's memory usage in bytes.
    """
    default_dtype = tf.keras.backend.floatx()
    shapes_mem_count = 0
    internal_model_mem_count = 0
    for layer in model.layers:
        # Nested models are estimated recursively.
        if isinstance(layer, tf.keras.Model):
            internal_model_mem_count += keras_model_memory_usage_in_bytes(
                layer, batch_size=batch_size
            )
        # Start from the element size in bytes, then multiply by each
        # known dimension of the layer's output shape.
        single_layer_mem = tf.as_dtype(layer.dtype or default_dtype).size
        out_shape = layer.output_shape
        # For layers with several outputs, only the first one is counted.
        if isinstance(out_shape, list):
            out_shape = out_shape[0]
        for s in out_shape:
            # Unknown dimensions (such as the batch axis) are skipped.
            if s is None:
                continue
            single_layer_mem *= s
        shapes_mem_count += single_layer_mem

    # Note: these two values are parameter counts, not byte counts.
    trainable_count = sum(
        [tf.keras.backend.count_params(p) for p in model.trainable_weights]
    )
    non_trainable_count = sum(
        [tf.keras.backend.count_params(p) for p in model.non_trainable_weights]
    )

    total_memory = (
        batch_size * shapes_mem_count
        + internal_model_mem_count
        + trainable_count
        + non_trainable_count
    )
    return total_memory
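The arithmetic in the function above can be sketched in plain Python, without TensorFlow. This is only an illustration: the helper name `estimate_bytes`, the two-layer model, and its shapes are hypothetical. One deliberate difference is labeled in the code: the sketch multiplies the parameter count by the element size so that every term is in bytes, whereas the function above adds raw parameter counts.

```python
from math import prod

FLOAT32_BYTES = 4  # size of one float32 element


def estimate_bytes(output_shapes, param_count, batch_size, dtype_bytes=FLOAT32_BYTES):
    """Lower-bound estimate: per-sample activations * batch size + weights."""
    # Unknown (None) dimensions such as the batch axis are skipped,
    # mirroring the loop over the output shape in the function above.
    per_sample = sum(
        dtype_bytes * prod(d for d in shape if d is not None)
        for shape in output_shapes
    )
    # Assumption (differs from the function above): weights are converted
    # to bytes by multiplying the parameter count by the element size.
    return batch_size * per_sample + dtype_bytes * param_count


# Hypothetical model: Dense(64) followed by Dense(10) on 32 input features.
shapes = [(None, 64), (None, 10)]
params = (32 * 64 + 64) + (64 * 10 + 10)  # kernels plus biases
estimate = estimate_bytes(shapes, params, batch_size=128)
print(estimate)
```

The activation term grows linearly with the batch size while the weight term stays fixed, which is why the batch size multiplies only the shapes.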
[...] I don't think I ever found a simple way to manually compute the output shape for all layers (it has been a long time since I looked at this, so I may be wrong on this point).
@Bidski, I've come to the same conclusion since I last replied to you.
In general, any Keras layer can create an arbitrary number of tensors in the layer's __init__(), build(), and call() methods. These tensors will not appear in the layer's output shape, so my keras_model_memory_usage_in_bytes() will consistently underestimate a model's actual memory usage.
However, I still find an underestimate to be useful. When I am automatically generating models during a hyperparameter search, I can skip over models that are 100% guaranteed to be too large for my GPUs.
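The pruning idea can be sketched as follows. This is a minimal illustration, not the author's search code: `filter_candidates`, the candidate names, and the byte figures are all hypothetical, and the estimates are assumed to come from something like keras_model_memory_usage_in_bytes().

```python
def filter_candidates(candidates, gpu_budget_bytes):
    """Drop candidates whose lower-bound estimate already exceeds the budget."""
    kept = []
    for name, lower_bound_bytes in candidates:
        # A lower bound above the budget means the model cannot fit.
        # A lower bound at or below the budget proves nothing either way,
        # so those candidates are kept and still have to be tried.
        if lower_bound_bytes <= gpu_budget_bytes:
            kept.append(name)
    return kept


GIB = 1024**3

# Hypothetical candidates paired with their estimated lower bounds in bytes.
candidates = [("small", 2 * GIB), ("medium", 7 * GIB), ("huge", 40 * GIB)]
survivors = filter_candidates(candidates, gpu_budget_bytes=16 * GIB)
print(survivors)
```

Because the estimate is a lower bound, the filter can only reject models. Every survivor may still run out of memory in practice.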
Unfortunately, an underestimate doesn't fit the use case I had in mind: I wanted to know whether a particular model would fit on a particular GPU (and what batch size would result in "optimal" memory usage). Since we have no real idea how much we are underestimating by, it is impossible to answer this question.
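What an underestimate can still provide is a ceiling on the batch size. The sketch below assumes the estimate has the linear form batch_size * per_sample_bytes + fixed_bytes; the function name and all numbers are hypothetical. Any batch size above the returned value is estimated not to fit, while batch sizes at or below it may still fail, so this does not answer the "optimal" batch size question.

```python
def max_batch_size_upper_bound(per_sample_bytes, fixed_bytes, gpu_budget_bytes):
    """Largest batch size whose lower-bound estimate still fits the budget."""
    if per_sample_bytes <= 0:
        raise ValueError("per_sample_bytes must be positive")
    available = gpu_budget_bytes - fixed_bytes
    if available < per_sample_bytes:
        return 0  # even a batch of one is estimated not to fit
    return available // per_sample_bytes


# Hypothetical figures: 296 bytes of activations per sample,
# 11,048 bytes of weights, and a 1,000,000-byte budget.
bound = max_batch_size_upper_bound(
    per_sample_bytes=296, fixed_bytes=11_048, gpu_budget_bytes=1_000_000
)
print(bound)
```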
Yes, direct subclassing of tf.keras.Model.
This may be a different issue entirely, but I have a different model that is also subclassed from tf.keras.Model. After loading the trained model and printing the summary, the output shapes are still listed as multiple (the model and all of its layers have been called, as it is a fully trained model, but it was also just loaded from disk, so it's possible that this information isn't saved in the model).
I may be missing something here, but it seems that both of these issues are basically saying that the "best" option is to call the network with dummy data and, hence, actually allocate memory for all of the layers?
I mean, all of the layers are basically combinations of TensorFlow ops (convolutions, dense layers, reshapes, etc.), so it should be possible to calculate all of the output shapes at instantiation time if you know the input shapes, but I don't think I ever found a simple way to manually compute the output shape for all layers (it has been a long time since I looked at this, so I may be wrong on this point).
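For a few common layer types, the output shape can be worked out by hand from the input shape. The sketch below is plain Python and covers only an undilated, channels-last 2D convolution, a flatten, and a dense layer. The helper names and the example network are hypothetical, and it does nothing for layers that create extra tensors internally, which is the hard part discussed above.

```python
from math import ceil, prod


def conv2d_output_shape(input_shape, filters, kernel_size, strides=(1, 1), padding="valid"):
    """Per-sample (height, width, channels) of a channels-last conv, no dilation."""
    height, width, _ = input_shape
    spatial = []
    for size, kernel, stride in zip((height, width), kernel_size, strides):
        if padding == "same":
            spatial.append(ceil(size / stride))
        elif padding == "valid":
            spatial.append((size - kernel) // stride + 1)
        else:
            raise ValueError(f"unsupported padding: {padding}")
    return (*spatial, filters)


def flatten_output_shape(input_shape):
    """Flatten collapses all per-sample dimensions into one."""
    return (prod(input_shape),)


def dense_output_shape(input_shape, units):
    """A dense layer replaces only the last dimension."""
    return (*input_shape[:-1], units)


# Hypothetical network on 28x28 single-channel inputs (batch axis omitted).
shape = (28, 28, 1)
shape = conv2d_output_shape(shape, filters=32, kernel_size=(3, 3))
first_conv = shape
shape = conv2d_output_shape(shape, filters=64, kernel_size=(3, 3), strides=(2, 2), padding="same")
second_conv = shape
shape = flatten_output_shape(shape)
flat = shape
shape = dense_output_shape(shape, units=10)
print(first_conv, second_conv, flat, shape)
```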