Last active
February 11, 2016 11:04
OneCodex real-time search API
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# OneCodex 'real-time' *k*-mer search API\n",
    "OneCodex has a mature asynchronous API for dealing with whole files, but it also provides an undocumented API that returns lowest common ancestor (LCA) results for a single query sequence from its in-memory 31-mer database. The lookup itself takes well under a millisecond (the example below reports an `elapsed_secs` of 0.0001), so the round trip to the US west coast is the limiting factor in terms of speed.\n",
    "\n",
    "You'll need to [register](https://app.onecodex.com/register) an account to receive your [API key](https://app.onecodex.com/settings)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'elapsed_secs': '0.0001',\n",
       " 'k': 31,\n",
       " 'n_hits': 40,\n",
       " 'n_lookups': 40,\n",
       " 'tax_id': 11676}"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import json\n",
    "import requests\n",
    "\n",
    "onecodex_api_key = 'YOUR_API_KEY'\n",
    "test_sequence = 'TAGAACGATTCGCAGTTAATCCTGGCCTGTTAGAAACATCAGAAGGCTGTAGACAAATACTGGGACAGCT'\n",
    "\n",
    "def onecodex_lca(seq, onecodex_api_key):\n",
    "    # POST a single sequence; the API key is sent as the basic-auth username\n",
    "    url = 'https://app.onecodex.com/api/v0/search'\n",
    "    payload = {'sequence': str(seq)}\n",
    "    auth = requests.auth.HTTPBasicAuth(onecodex_api_key, '')\n",
    "    response = requests.post(url, payload, auth=auth, timeout=5)\n",
    "    response.raise_for_status()  # fail loudly on HTTP error responses\n",
    "    result = json.loads(response.text)\n",
    "    return result\n",
    "\n",
    "lca = onecodex_lca(test_sequence, onecodex_api_key)\n",
    "lca"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The returned taxid can then be resolved to a scientific name and lineage using an EBI API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('Human immunodeficiency virus 1',\n",
       " ['Viruses',\n",
       "  'Retro-transcribing viruses',\n",
       "  'Retroviridae',\n",
       "  'Orthoretrovirinae',\n",
       "  'Lentivirus',\n",
       "  'Primate lentivirus group'])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def ebi_taxid_to_lineage(tax_id):\n",
    "    url = 'http://www.ebi.ac.uk/ena/data/taxonomy/v1/taxon/tax-id/{}'\n",
    "    # taxids 0 and 1 are not sent to EBI; return an empty result instead\n",
    "    if tax_id in (0, 1):\n",
    "        return None, None\n",
    "    response = requests.get(url.format(tax_id), timeout=5)\n",
    "    response.raise_for_status()  # fail loudly on HTTP error responses\n",
    "    result = json.loads(response.text)\n",
    "    sciname = result['scientificName']\n",
    "    # the lineage arrives as one '; '-separated string; drop empty fields\n",
    "    taxonomy = [x for x in result['lineage'].split('; ') if x]\n",
    "    return sciname, taxonomy\n",
    "\n",
    "taxon = ebi_taxid_to_lineage(lca['tax_id'])\n",
    "taxon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This can be trivially parallelised for fast sequence characterisation, bearing in mind that this is an undocumented API which I'm informed runs on a single node… Be gentle."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
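
The notebook's closing remark about parallelising lookups could be sketched roughly as follows. This is a minimal sketch, not part of the original notebook: `lookup_many` and `fake_lca` are hypothetical names, and the default of four workers is an arbitrary, deliberately small choice given that the service reportedly runs on a single node. `fake_lca` is an offline stand-in so the block runs without an API key or network access.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup_many(sequences, lookup, max_workers=4):
    """Apply `lookup` to each sequence on a small thread pool.

    Results come back in input order. A lookup that raises is recorded
    as the exception object rather than aborting the whole batch.
    """
    def safe_lookup(seq):
        try:
            return lookup(seq)
        except Exception as exc:  # e.g. a timeout or HTTP error from requests
            return exc

    # Threads suit this workload: each call spends its time waiting on the network.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_lookup, sequences))


def fake_lca(seq):
    """Hypothetical offline stand-in for onecodex_lca (no network, no key)."""
    return {'sequence_length': len(seq)}


if __name__ == '__main__':
    reads = ['A' * 70, 'C' * 31]
    print(lookup_many(reads, fake_lca, max_workers=2))
```

To use it against the real service, one option is to bind the key first, for example `functools.partial(onecodex_lca, onecodex_api_key=onecodex_api_key)`, and pass that as `lookup`. Keeping `max_workers` low is the "be gentle" part.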