Sean Davis seandavi

@seandavi
seandavi / Dockerfile
Created February 6, 2019 19:04
Dockerfile for blog post on using GCR. Builds SRA-toolkit with dbGaP access as an example
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y wget
# We do things this way to keep the docker image
# size down. See https://nickjanetakis.com/blog/docker-tip-3-chain-your-docker-run-instructions-to-shrink-your-images
RUN wget http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.9.2/sratoolkit.2.9.2-ubuntu64.tar.gz \
&& tar -xvzf sratoolkit.2.9.2-ubuntu64.tar.gz \
&& rm sratoolkit.2.9.2-ubuntu64.tar.gz
seandavi / read_and_process_files_beam.py
Created January 29, 2019 22:21
Read and process full files based on wildcard path using Apache Beam/Google Cloud Platform/DataFlow
from __future__ import print_function
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.io.filesystems import FileSystems
import urllib
import json
import argparse
import logging
logging.basicConfig(level=logging.INFO)
seandavi / fusion_genes.Rmd
Created January 24, 2019 00:21
TARGET osteosarcoma fusion gene analysis sketch
---
title: "fusion genes"
output:
  html_document:
    self_contained: true
---
```{r include=FALSE}
library(knitr)
seandavi / dataflow_example_sra.py
Last active February 1, 2019 18:49
simple dataflow pipeline from sra json
# requires python 2.7
# pip install apache_beam
from __future__ import print_function
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
import json
import argparse
import logging
import urllib2
import urllib
seandavi / mapping_example.json
Created November 20, 2018 15:33
For luqum issue
{
  "sra_experiment_joined2": {
    "mappings": {
      "doc": {
        "properties": {
          "library_name": {
            "fields": {
              "keyword": {
                "ignore_above": 256,
                "type": "keyword"
seandavi / workshop-syllabus.md
Last active September 25, 2021 16:53
Workshop syllabus template

Insert Workshop Title

Instructor(s) name(s) and contact information

Provide names and contact information for all instructors.

Workshop Description

Along with the topic of your workshop, include how students can expect to spend their time. The description may also include information

seandavi / CSHL_Biological_Data_2018_software.csv
Last active November 8, 2018 14:39
Software list mined from twitter feed for CSHL Biological Data meeting, 2018
url,name,user,type
https://github.com/mploenzke/talk,talk,mploenzke,github
https://github.com/dewyman/TranscriptClean,TranscriptClean,dewyman,github
https://github.com/dewyman/TALON,TALON,dewyman,github
https://github.com/Illumina/strelka,strelka,Illumina,github
https://github.com/gymreklab/GangSTR,GangSTR,gymreklab,github
https://github.com/dewyman/talon,talon,dewyman,github
https://github.com/haghshenas/PhISCS,PhISCS,haghshenas,github
https://github.com/alshai/r-index,r-index,alshai,github
https://github.com/shenwei356/bwt,bwt,shenwei356,github
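The `name`, `user`, and `type` columns in the rows above look like they can be derived from the URL alone. A minimal sketch of that derivation, assuming every URL has the form `https://github.com/<user>/<name>`; the function name `split_repo_url` is hypothetical and is not taken from the gist:

```python
from urllib.parse import urlparse


def split_repo_url(url):
    """Return (name, user, type) for a repository URL (hypothetical helper)."""
    parsed = urlparse(url)
    # Path looks like "/<user>/<name>"; drop empty segments from the slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected https://<host>/<user>/<name>, got %r" % url)
    user, name = parts[0], parts[1]
    # Assumption: "type" is the first label of the host, e.g. "github".
    site = parsed.netloc.split(".")[0]
    return name, user, site


row = split_repo_url("https://github.com/dewyman/TALON")
```

The helper preserves case, so `TALON` and `talon` stay distinct entries, as they are in the list.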
# salmon wdl
#
# Assumes gencode here!!!!
#
#############################
task salmon_index {
  File transcript_file
  Int kmer
seandavi / test.nf
Last active October 6, 2018 22:05
nextflow test script
import groovy.json.JsonSlurper
def jsonSlurper = new JsonSlurper()
res = jsonSlurper.parse(new URL('https://api-omicidx.cancerdatasci.org/sra/1.0/search/run?q=experiment_accession:' + params.experiment + '&size=500'))
l = res.hits.hits.collect {
[ it._source.experiment_accession, it._source.accession ]
}
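For readers less familiar with Groovy, the `collect` step above can be sketched in Python. This assumes the search response has the `hits.hits[]._source` shape that the script indexes into; the sample payload, its accession values, and the function name `experiment_run_pairs` are made up for illustration and do not come from the API:

```python
import json


def experiment_run_pairs(response):
    """Collect [experiment_accession, run accession] pairs from a parsed response."""
    return [
        [hit["_source"]["experiment_accession"], hit["_source"]["accession"]]
        for hit in response["hits"]["hits"]
    ]


# Made-up payload with the same nesting the Nextflow script reads.
sample = json.loads("""
{"hits": {"hits": [
  {"_source": {"experiment_accession": "SRX000001", "accession": "SRR000001"}},
  {"_source": {"experiment_accession": "SRX000001", "accession": "SRR000002"}}
]}}
""")

pairs = experiment_run_pairs(sample)
```

Each pair keeps the experiment accession alongside the run accession, so runs can later be grouped by experiment.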
seandavi / aws_sqs_example.ipynb
Created September 27, 2018 11:46
Example usage of AWS SQS