@cpelley
Last active August 29, 2015 14:12
netcdf memory_blow-up
from netCDF4 import Dataset
import numpy as np
import os


def process_usage():
    """Print current and peak memory usage from /proc/<pid>/status."""
    fnme = os.path.join('/', 'proc', str(os.getpid()), 'status')
    usage = {}
    with open(fnme, 'r') as fh:
        for line in fh:
            key, value = line.split(':', 1)
            usage[key.strip()] = value.strip()
    print('Virtual memory usage {} (peak {})\n'
          'Resident set size {} (peak {})\n'.format(
              usage['VmSize'], usage['VmPeak'],
              usage['VmRSS'], usage['VmHWM']))


dat = np.arange(100000)
print('{} GB'.format(dat.nbytes * 1e-9))
process_usage()

ncfile = Dataset('tmp.nc', 'w', format='NETCDF4_CLASSIC')
ncfile.createDimension('x', 100000)
#ncfile.createDimension('x', 0)  # Unlimited dimension
data = ncfile.createVariable('data', np.dtype('float64').char, ('x',))
data[:] = dat
ncfile.close()
process_usage()
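As an aside (not part of the gist): for the peak resident set size alone, the standard-library `resource` module avoids parsing `/proc/<pid>/status` by hand. On Linux, `ru_maxrss` is reported in kilobytes.

```python
import resource

# Sketch of an alternative to the /proc parsing above: getrusage
# reports the peak resident set size (VmHWM equivalent) directly.
# Note: ru_maxrss is in kilobytes on Linux, bytes on macOS.
peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('Peak resident set size: {} kB'.format(peak_kb))
```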
cpelley commented Jan 6, 2015

For the test case above, saving with an unlimited dimension increases the file size on disk by roughly 5.5x and the resident set size (RAM) by roughly 12x compared with a fixed-size dimension.
