import sys

def get_size(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum([get_size(v, seen) for v in obj.values()])
        size += sum([get_size(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size(i, seen) for i in obj])
    return size
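A small usage sketch may help show what the `seen` set buys you. The function is repeated so the snippet runs on its own; the names `payload`, `shared` and `cycle` are made up for illustration. An object referenced twice should be counted once, and a container that holds itself should not recurse forever.

```python
import sys

def get_size(obj, seen=None):
    """Same logic as the gist above, repeated so this snippet runs on its own."""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum(get_size(v, seen) for v in obj.values())
        size += sum(get_size(k, seen) for k in obj.keys())
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size(i, seen) for i in obj)
    return size

payload = "x" * 1000
shared = [payload, payload]   # the same string object referenced twice
cycle = []
cycle.append(cycle)           # a list that contains itself

# The shared string is counted once: the list itself plus one copy of the string.
shared_total = get_size(shared)
expected_shared = sys.getsizeof(shared) + sys.getsizeof(payload)

# The self-reference is skipped, so only the outer list is counted.
cycle_total = get_size(cycle)

print(shared_total == expected_shared, cycle_total == sys.getsizeof(cycle))
```

The exact byte counts depend on the interpreter version and platform, so the snippet compares totals rather than printing fixed numbers.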
This is interesting, but I'm confused about one thing and hope you can help. To get the total size of an object's contents, I multiply the length of the object, len(object),
by the size of a single element, like this: sys.getsizeof(1) * len(object).
That solves my simple problem. However, I don't understand in practice which sub-attributes of an object
this simple approach fails to measure.
Sample script:
import sys
simple_list = range(3)
print("size of list", sys.getsizeof(1) * len(simple_list))
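One way to see where the multiplication shortcut falls short is to try it on elements of different sizes. The sketch below is only an illustration with a made-up `items` list: the shortcut assumes every element is as big as the int `1`, it ignores the memory of the container itself, and even summing `sys.getsizeof` per element stays shallow for nested containers.

```python
import sys

# Three elements of clearly different sizes.
items = [1, "a" * 100, [0] * 50]

# The shortcut: assume every element is as big as the int 1.
estimate = sys.getsizeof(1) * len(items)

# Summing the real per-element sizes (still shallow for the nested list).
per_element = sum(sys.getsizeof(x) for x in items)

# The list object itself also takes space for its header and pointer array.
container = sys.getsizeof(items)

print(estimate, per_element, container)
```

The shortcut is reasonable only when all elements are known to be the same size and you do not care about the container overhead.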
Amazing code! Thanks.
Is this not weird?
>>> from pysize import get_size
>>> get_size(b)
152
>>> get_size(b[0])
28
>>> get_size(b[1])
28
>>> len(b)
300
I've been happily using this code for a long time, but I just encountered a use case where this breaks down: a class built over a simple namedtuple data core. This pattern is desirable for certain multi-processing/cloud computing contexts.
from __future__ import print_function
from collections import namedtuple
import sys
import numpy as np

my_tup = namedtuple('MyNamedTuple', ['Array','Name'])

class my_class(my_tup):
    def __init__(self, *kwargs):
        # The tuple fields are already set in __new__; object.__init__
        # rejects extra arguments on Python 3, so none are passed on.
        super(my_class, self).__init__()
    # Add workhorse functions...

dat_tuple = my_tup(np.zeros([1000,1000]), 'long name'*10)
dat_obj = my_class(np.zeros([1000,1000]), 'long name'*10)
print(get_size(dat_tuple), get_size(dat_obj))
These sizes should be almost the same, but they are not.
8000946 360
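The problem is that dat_obj has an empty __dict__ while its data is only reachable through __iter__. The __dict__ branch is checked first, so the iteration branch never runs and the stored elements are never counted.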
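The difference between the two kinds of object can be checked directly. This is a minimal sketch assuming a recent CPython 3 release; `Point` and `PointWithMethods` are hypothetical stand-ins for `my_tup` and `my_class`, without the NumPy payload.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])   # hypothetical stand-in for my_tup

class PointWithMethods(Point):            # hypothetical stand-in for my_class
    def norm1(self):
        return abs(self.x) + abs(self.y)

plain = Point(1, 2)
wrapped = PointWithMethods(1, 2)

# The plain namedtuple instance has no __dict__ at all ...
print(hasattr(plain, '__dict__'))
# ... while the subclass instance gets one, and it is empty.
print(hasattr(wrapped, '__dict__'), wrapped.__dict__)
# The fields are still only reachable by iterating the tuple.
print(list(wrapped))
```

Declaring `__slots__ = ()` on the subclass should also keep instances from getting a `__dict__`, which would sidestep the issue without touching the size function.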
Here is the fix I made. It doesn't come out exactly the same, but it's a lot closer than before:
def get_size2(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    # Recurse through get_size2 (not get_size) so nested objects get the fix too
    if isinstance(obj, dict):
        size += sum([get_size2(v, seen) for v in obj.values()])
        size += sum([get_size2(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size2(i, seen) for i in obj])
        # An iterable may also carry instance attributes
        if hasattr(obj, '__dict__'):
            size += get_size2(obj.__dict__.values(), seen)
    elif hasattr(obj, '__dict__'):
        size += get_size2(obj.__dict__, seen)
    return size
print(get_size2(dat_tuple), get_size2(dat_obj))
8000671 8000647
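As an alternative to nesting the __dict__ check inside the iterable branch, the two sources of content can be visited independently. The following is a rough sketch, not a drop-in replacement: `get_size_sketch`, `Rec` and `RecWithMethods` are hypothetical names, and a plain list stands in for the NumPy array so the snippet needs only the standard library.

```python
import sys
from collections import namedtuple

def get_size_sketch(obj, seen=None):
    """Hypothetical variant: visit iterable contents and __dict__ independently."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(get_size_sketch(k, seen) + get_size_sketch(v, seen)
                    for k, v in obj.items())
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size_sketch(i, seen) for i in obj)
    # Checked separately, not as an elif, so an empty __dict__ cannot hide the items.
    if hasattr(obj, '__dict__'):
        size += get_size_sketch(vars(obj), seen)
    return size

Rec = namedtuple('Rec', ['data', 'name'])

class RecWithMethods(Rec):
    pass

data = list(range(1000))
plain_size = get_size_sketch(Rec(data, 'n'))
wrapped_size = get_size_sketch(RecWithMethods(data, 'n'))
data_size = get_size_sketch(data)

# Both totals include the list contents; they differ only by per-object overhead.
print(plain_size, wrapped_size, data_size)
```

Like the original, this iterates whatever it is given, so it would consume generators and other one-shot iterables.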
There is a Python module called Pympler that provides similar functionality, plus other features such as tracking the memory consumption of instances of a specific class.
https://pympler.readthedocs.io/en/latest/
Thanks!