Created September 3, 2014 15:09
#! /usr/bin/env python
# ID mapping using mygene
# https://pypi.python.org/pypi/mygene
# http://nbviewer.ipython.org/gist/newgene/6771106
# http://mygene-py.readthedocs.org/en/latest/
# 08/30/14

__author__ = 'tommy'

import mygene
import fileinput
import sys

mg = mygene.MyGeneInfo()

# Map gene symbols to Entrez gene ids and Ensembl gene ids.

# fileinput loops through all the lines of the files named in the command-line arguments,
# or of standard input if no arguments are provided.

# Build a list from an input file with one gene symbol on each line.
def get_gene_symbols():
    gene_symbols = []
    for line in fileinput.input():
        gene_symbol = line.strip()  # assume each line contains only one gene symbol
        gene_symbols.append(gene_symbol)
    fileinput.close()
    return gene_symbols


Entrez_ids = mg.querymany(get_gene_symbols(), scopes='symbol', fields='entrezgene,ensembl.gene', species='human',
                          as_dataframe=True, verbose=False)
# Setting as_dataframe=True returns a pandas DataFrame; verbose=False suppresses messages like "finished".

# Entrez_ids.to_csv(sys.stdout, sep="\t")  # write the dataframe to stdout, but NaNs will not appear on the screen
# if no matches were found

sys.stdout.write(Entrez_ids.to_string())  # sys.stdout.write() expects a string
# Entrez_ids.to_csv("Entrez_ids.txt", sep="\t")  # write the pandas dataframe to a tab-separated file
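The commented-out `to_csv` call in the script notes that unmatched symbols do not show up as NaN when the data frame is written to stdout. One possible workaround, sketched here under the assumption that pandas is installed, is the `na_rep` argument of `DataFrame.to_csv`. The small data frame is a hand-made stand-in for the `querymany` result so the example runs without network access; `NOTAGENE` is a hypothetical symbol with no match, and `demo` and `tsv_text` are names made up for this sketch.

```python
import sys

import pandas as pd

# Stand-in for the querymany() result (illustrative values only);
# the second symbol has no match, so its columns are NaN.
demo = pd.DataFrame(
    {
        "entrezgene": ["7157", float("nan")],
        "ensembl.gene": ["ENSG00000141510", float("nan")],
    },
    index=pd.Index(["TP53", "NOTAGENE"], name="query"),
)

# na_rep sets the text written for missing values; the default is an empty string.
tsv_text = demo.to_csv(sep="\t", na_rep="NaN")
sys.stdout.write(tsv_text)
```

This keeps the tab-separated layout, which is easier to feed into other tools than the fixed-width text from `to_string()`, while still marking the symbols that were not found.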
I didn't know that mygene existed until I came across this blog post on Friday.
This was greatly helpful. Also, I created a gist. https://gist.github.com/sdhutchins/57db1bd1d979bb5408241212dfc1aec9