Skip to content

Instantly share code, notes, and snippets.

@vjcitn
Created September 3, 2019 16:49
Show Gist options
  • Save vjcitn/7295067e6592213823c28b9421eb5fdc to your computer and use it in GitHub Desktop.
Save vjcitn/7295067e6592213823c28b9421eb5fdc to your computer and use it in GitHub Desktop.
function to use ensembl API to convert rsid to hg38 coordinates ... note the seqnames
get_rslocs_38 = function(rsids = c("rs6060535", "rs56116432")) {
server <- "https://rest.ensembl.org"
ext <- "/variant_recoder/homo_sapiens"
r <- httr::POST(paste(server, ext, sep = ""),
httr::content_type("application/json"),
httr::accept("application/json"),
body = list(ids=rsids), encode="json")
httr::stop_for_status(r)
ans = rjson::fromJSON( rjson::toJSON( httr::content(r)))
ids = lapply(ans, "[[", "id")
spd = sapply(lapply(ans, "[[", "spdi"), "[", 1)
sspd = strsplit(spd, ":")
chrs = sapply(sspd, "[", 1)
locs = as.numeric(sapply(sspd, "[", 2))+1
ans = GenomicRanges::GRanges(chrs, IRanges::IRanges(locs, width=1))
GenomeInfoDb::genome(ans) = "GRCh38"
names(ans) = ids
ans
}
@vjcitn
Copy link
Author

vjcitn commented Sep 3, 2019

> get_rslocs_38()
GRanges object with 2 ranges and 0 metadata columns:
                 seqnames    ranges strand
                    <Rle> <IRanges>  <Rle>
   rs6060535 NC_000020.11  35647600      *
  rs56116432 NC_000009.12 133256042      *
  -------
  seqinfo: 2 sequences from GRCh38 genome; no seqlengths

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment