@fomightez
Created June 17, 2025 18:47
Converting Bytes to MBytes in the sense used by Sequence Read Archive tables and metadata and Logan Search Results. Allows inter-relating the various numbers given in exported data.
{
"cells": [
{
"cell_type": "markdown",
"id": "d84560cc-8702-4136-a726-6ac742cff1ee",
"metadata": {},
"source": [
"# Converting Bytes to MBytes\n",
"\n",
"I need this to relate the Logan Search Results to what the Sequence Read Archive shows on individual run pages and to what it provides in the 'Bytes' column of the downloadable metadata.\n",
"\n",
"### Helpful Resources\n",
"- [SO post 'Converting bytes to megabytes'](https://stackoverflow.com/q/2365100/8508004), especially [this succinct answer](https://stackoverflow.com/a/24325991/8508004)\n",
"- [Superuser post 'Is it true that 1 MB can mean either 1000000 bytes, 1024000 bytes, or 1048576 bytes?'](https://superuser.com/a/373601)\n",
" \n",
"## Situation\n",
"\n",
"- Logan Search Results include an 'mbytes' column whose value closely matches what the Sequence Read Archive (SRA) shows in the table on a run's page. For example, the table at the bottom of the page for [SRR12634513](https://www.ncbi.nlm.nih.gov/sra/?term=SRR12634513) shows `478.7Mb`, and Logan Search Results has `478` in the 'mbytes' column, so the two are very similar. \n",
"However, the metadata you get from the Run Selector page at the Sequence Read Archive comes with a 'Bytes' column instead, and converting it is not a simple matter of moving the decimal place. For the metadata downloaded by clicking 'Metadata' under the 'Select' section on the page (https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SAMN15915379&o=acc_s%3Aa), I see `501919299` in the 'Bytes' column.\n",
"\n",
"Luckily, I can work backwards because I have data from all three sources (the Sequence Read Archive page, the Sequence Read Archive downloadable metadata, and Logan Search Results) and can check how they relate."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b69d7547-9d21-4b40-b037-740bb7fce56b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1048576"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2**20 # this agrees with https://stackoverflow.com/a/24325991/8508004; importantly, you don't want `2^20` because `^` is bitwise XOR in Python, so `2^20` evaluates to 22"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ff3249c5-199e-4e01-bdfa-18b45c821674",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1048576"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"1024 * 1024"
]
},
{
"cell_type": "markdown",
"id": "b96360ce-17f8-4005-9acc-221577e84e93",
"metadata": {},
"source": [
"--------------------\n",
"\n",
"Using that to convert bytes to mbytes:"
]
},
{
"cell_type": "markdown",
"id": "0936257f-c6b0-439c-a7f6-51e0ef5735bb",
"metadata": {},
"source": [
"For example, in the metadata for SRR12634513 downloaded from [the Run Selector page for 'SRR12634513'](https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SAMN15915379&o=acc_s%3Aa), I see `501919299` in the 'Bytes' column."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "54f1c94d-a89f-42ab-ab4e-ee3b6d2c286d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"478.6675443649292"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"501919299/(2**20)"
]
},
{
"cell_type": "markdown",
"id": "17a64c2a-27cb-4673-9f34-ec3430060d33",
"metadata": {},
"source": [
"478 mbytes agrees with the Logan Search Results and with the table at the bottom of [here](https://www.ncbi.nlm.nih.gov/sra/?term=SRR12634513) and [bottom of here](https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SAMN15915379&o=acc_s%3Aa)."
]
},
{
"cell_type": "markdown",
"id": "f72ea008-aad1-44a4-9389-f22483daf934",
"metadata": {},
"source": [
"---------\n",
"\n",
"Enjoy!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
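
The conversion worked out in the notebook can be wrapped in a small helper for reuse across many rows of exported metadata. This is only a minimal sketch: the names `MIB` and `bytes_to_mbytes` are hypothetical and not part of any SRA or Logan tooling, and the idea that Logan's 'mbytes' value is the truncated (not rounded) result is an assumption based on this single run.

```python
# Hypothetical helper; the names here are illustrative only.
MIB = 2**20  # 1,048,576 bytes, the binary "megabyte" used in the notebook


def bytes_to_mbytes(n_bytes: int) -> float:
    """Convert a raw byte count to MBytes using 2**20 bytes per MByte."""
    return n_bytes / MIB


# 'Bytes' value for SRR12634513 from the Run Selector metadata
sra_bytes = 501919299
mbytes = bytes_to_mbytes(sra_bytes)

print(round(mbytes, 1))  # 478.7, matching the `478.7Mb` on the SRA page
print(int(mbytes))       # 478, matching Logan's 'mbytes' (assuming truncation)
print(2 ^ 20)            # 22, because `^` is bitwise XOR, not exponentiation
```

Dividing by `10**6` instead would give roughly 501.9, which matches neither the SRA page nor Logan, so the `2**20` divisor is the one that relates the three sources for this run.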