@elipapa
Created April 16, 2016 02:55
Map human gene symbols to Ensembl gene IDs using the Ensembl REST API, curl, and jq
#!/usr/bin/env bash
# USAGE:
# 0. install jq from https://stedolan.github.io/jq/ if not already present
# 1. make this script executable
# chmod u+x gene2ensembl.sh
# 2. then run it on a gene list with one gene symbol per line
# ./gene2ensembl.sh genesymbollist.txt > ensembl_list.txt
# NOTE: the lookups deliberately run one at a time, not in parallel,
# so that the output order matches the order of the input symbols
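while IFS= read -r gene; do
  # Quote the URL so the shell does not glob on "?" or word-split the symbol.
  # jq prints "null" when the returned array is empty, which keeps the
  # output aligned line by line with the input.
  curl -s "http://rest.ensembl.org/xrefs/symbol/homo_sapiens/${gene}?content-type=application/json" | jq -r '.[0].id'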
done < "${1:-/dev/stdin}"
# The substitution ${1:-/dev/stdin} expands to $1 if it is set and non-empty;
# otherwise it expands to /dev/stdin, so the script reads its own standard input.
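The `${1:-/dev/stdin}` fallback described above can be tried in isolation, without any network access. The following is a minimal sketch; `number_lines` is a hypothetical helper introduced only for illustration and is not part of the original script. It reads from the file named in its first argument if one is given, and from standard input otherwise.

```bash
#!/usr/bin/env bash
# number_lines is a hypothetical helper (not part of the gist).
# It prints each input line prefixed with its line number and a tab.
number_lines() {
  local n=0 line
  while IFS= read -r line; do
    n=$((n + 1))
    printf '%d\t%s\n' "$n" "$line"
  # Use the first argument as the input file if set and non-empty,
  # otherwise fall back to this process's standard input.
  done < "${1:-/dev/stdin}"
}

# Usage (both forms produce the same numbering):
#   number_lines genesymbollist.txt
#   cat genesymbollist.txt | number_lines
```

Because the redirection is attached to the loop, the same loop body serves both the file and the pipe case, which is why the script can be used either as `./gene2ensembl.sh list.txt` or at the end of a pipeline.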