@rbiswas4
Created June 23, 2017 03:47
{
"cells": [
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "from sys import getsizeof\nimport numpy as np\nimport pandas as pd",
"execution_count": 5,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "# lsst sims stack \nfrom lsst.sims.photUtils import BandpassDict, Bandpass",
"execution_count": 22,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "# This is a dictionary that take each of the strings 'u', 'g', 'r', 'i', 'z', 'y' as keys\n# and have an instance of `lsst.sims.Bandpass` objects. There are therefore only 6 such \n# instances for what we are looking at\nbpdict = BandpassDict.loadTotalBandpassesFromFiles()",
"execution_count": 4,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "# We can use `sys.getsizeof` to find the sizes in bytes of these objects\nprint(list(getsizeof(bpdict[b]) for b in list('ugrizy')))",
"execution_count": 9,
"outputs": [
{
"output_type": "stream",
"text": "[64, 64, 64, 64, 64, 64]\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "# Now let us create sequences of such band names and bandpass objects corresponding to the bands\nnum = 100000\nbands = np.random.choice(list('ugrizy'), size=num)\nbos = list(bpdict[b] for b in bands)",
"execution_count": 13,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "# We can look at the elements of bands or the first element of bos",
"execution_count": null,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "bands",
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "array(['r', 'i', 'i', ..., 'z', 'i', 'i'], \n dtype='|S1')"
},
"metadata": {},
"execution_count": 14
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "bos[0]",
"execution_count": 19,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "<lsst.sims.photUtils.Bandpass.Bandpass at 0x11f6cd290>"
},
"metadata": {},
"execution_count": 19
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "# And confirm that each element of bands has the same size, as is the case for bos",
"execution_count": null,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "np.unique(list(getsizeof(band) for band in bands))",
"execution_count": 18,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "array([38])"
},
"metadata": {},
"execution_count": 18
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "np.unique(np.array(list(getsizeof(b) for b in bos)))",
"execution_count": 17,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "array([64])"
},
"metadata": {},
"execution_count": 17
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "# We can also confirm that all the elements of bos are Bandpass Objects",
"execution_count": 20,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "all(isinstance(b, Bandpass) for b in bos)",
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "True"
},
"metadata": {},
"execution_count": 24
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "The sizes of these sequences are (per element):"
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "getsizeof(bands) / np.float(num)",
"execution_count": 27,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "1.00096"
},
"metadata": {},
"execution_count": 27
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "getsizeof(bos) / np.float(num)",
"execution_count": 28,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "8.79848"
},
"metadata": {},
"execution_count": 28
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "While the ratio of sizes is more than the ratio of sizes of the elements, I can imagine that the sequences are created in different ways."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "What happens in pandas"
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "df = pd.DataFrame(dict(band=bands))",
"execution_count": 43,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "df_long = pd.DataFrame(dict(band=bos))",
"execution_count": 45,
"outputs": []
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "getsizeof(df_long) / np.float(num)",
"execution_count": 50,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "40.00104"
},
"metadata": {},
"execution_count": 50
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "getsizeof(df) / np.float(num)",
"execution_count": 51,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "46.00104"
},
"metadata": {},
"execution_count": 51
}
]
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "python2",
"display_name": "Python [default]",
"language": "python"
},
"language_info": {
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"name": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12",
"file_extension": ".py",
"codemirror_mode": {
"version": 2,
"name": "ipython"
}
}
},
"nbformat": 4,
"nbformat_minor": 1
}
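
The per-element numbers in the notebook can be reproduced without the LSST stack. The following is a minimal sketch, not the notebook's own code: the `Band` class is a hypothetical stand-in for `Bandpass`, a `bytes` object stands in for the one-byte-per-element NumPy array, and the helper `shallow_size_per_element` is introduced only for illustration. It is written for Python 3, whereas the notebook ran on Python 2.7, so the absolute sizes will differ from those recorded above.

```python
import random
import sys


class Band(object):
    """Hypothetical stand-in for lsst.sims.photUtils.Bandpass."""

    def __init__(self, name):
        self.name = name


def shallow_size_per_element(seq):
    """Bytes that sys.getsizeof reports for the container alone, per element."""
    return sys.getsizeof(seq) / float(len(seq))


num = 100000
names = "ugrizy"

# Only six Band instances exist, one per band name.
band_objects = {name: Band(name) for name in names}

random.seed(0)
choices = [random.choice(names) for _ in range(num)]

# Packed storage: one byte per element, analogous to an array of dtype 'S1'.
packed = "".join(choices).encode("ascii")

# A list stores one reference per element; the Band objects live elsewhere.
refs = [band_objects[c] for c in choices]

# Count how many distinct objects the list actually refers to.
distinct = len(set(id(obj) for obj in refs))

print(shallow_size_per_element(packed))  # close to 1 byte per element
print(shallow_size_per_element(refs))    # roughly one pointer per element
print(distinct)                          # number of distinct Band objects
```

Because `sys.getsizeof` is shallow, the figure for the list measures only the table of references, so it stays near one pointer per element regardless of how large each referenced object is.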