
Sean Davis seandavi

@seandavi
seandavi / workshop-syllabus.md
Last active September 25, 2021 16:53
Workshop syllabus template

Insert Workshop Title

Instructor(s) name(s) and contact information

Provide names and contact information for all instructors.

Workshop Description

Along with the topic of your workshop, include how students can expect to spend their time. The description may also include information

seandavi / CSHL_Biological_Data_2018_software.csv
Last active November 8, 2018 14:39
Software list mined from twitter feed for CSHL Biological Data meeting, 2018
url,name,user,type
https://github.com/mploenzke/talk,talk,mploenzke,github
https://github.com/dewyman/TranscriptClean,TranscriptClean,dewyman,github
https://github.com/dewyman/TALON,TALON,dewyman,github
https://github.com/Illumina/strelka,strelka,Illumina,github
https://github.com/gymreklab/GangSTR,GangSTR,gymreklab,github
https://github.com/dewyman/talon,talon,dewyman,github
https://github.com/haghshenas/PhISCS,PhISCS,haghshenas,github
https://github.com/alshai/r-index,r-index,alshai,github
https://github.com/shenwei356/bwt,bwt,shenwei356,github
# salmon wdl
#
# Assumes gencode here!!!!
#
#############################
task salmon_index {
File transcript_file
Int kmer
seandavi / test.nf
Last active October 6, 2018 22:05
nextflow test script
import groovy.json.JsonSlurper
def jsonSlurper = new JsonSlurper()
res = jsonSlurper.parse(new URL('https://api-omicidx.cancerdatasci.org/sra/1.0/search/run?q=experiment_accession:' + params.experiment + '&size=500'))
l = res.hits.hits.collect {
[ it._source.experiment_accession, it._source.accession ]
}
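For comparison, the `collect` step above can be sketched in Python. This is only an illustration: the function name `run_pairs` is hypothetical, and `sample_response` is made-up data that mimics the `hits.hits[]._source` shape the script reads, not real API output.

```python
def run_pairs(response):
    """Return [experiment_accession, run_accession] pairs from a search response."""
    return [
        [hit["_source"]["experiment_accession"], hit["_source"]["accession"]]
        for hit in response["hits"]["hits"]
    ]


# Hypothetical response with placeholder accessions (assumption, not API output).
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"experiment_accession": "SRX000001", "accession": "SRR000001"}},
            {"_source": {"experiment_accession": "SRX000001", "accession": "SRR000002"}},
        ]
    }
}

pairs = run_pairs(sample_response)
print(pairs)
```

Each pair keeps the experiment accession next to its run accession, which is the same shape the Groovy closure builds.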
seandavi / aws_sqs_example.ipynb
Created September 27, 2018 11:46
Example usage of AWS SQS
seandavi / prepTargetVCFMetadata.R
Created February 13, 2018 04:03
Preparing VCF file metadata for TARGET Osteosarcoma
library(dplyr)
library(RMySQL)
con = dbConnect(MySQL(),user='USERNAME',password='PASSWORD',host='solexadb.XXXXXXXXXX.us-east-1.rds.amazonaws.com',port=3306,dbname='solexa')
res = dbGetQuery(con,
"select study_id,source.name as source_name, sample.name as sample_name, fcb.type as software,
fcb.dateStamp as basecalldate, fcb.softwareVersion as version, study_id,
sr.date as run_date, sr.sequencer as sequencer, sm.model, ssr.library_id as library_id,
sr.ID as run_id, sample.ID as sample_id, nt.value as sample_source,
seandavi / Snakefile
Created February 2, 2018 12:27
Snakemake with s3 and custom profile
import boto3
# set the profile name based on ~/.aws/credentials entry
boto3.setup_default_session(profile_name='s3')
from snakemake.remote.S3 import RemoteProvider as S3RemoteProvider
s3 = S3RemoteProvider()
# This simply copies the file from local storage to s3
rule all:
seandavi / oncoprint_targetOsteo.R
Created January 8, 2018 13:48
TARGET osteosarcoma oncoprint R code
library(aws.s3)
aws.signature::use_credentials(profile='s3')
disco_maf = s3read_using(readr::read_tsv,object="s3://target-osteosarcoma/TargetOsteoDiscovery/summary/strelka.maf.filtered.tab")
disco_gistic = s3read_using(readr::read_tsv,object="s3://target-osteosarcoma/TargetOsteoDiscovery/all_thresholded.by_genes.txt")
library(tidyr)
library(dplyr)
library(ComplexHeatmap)
x = disco_gistic %>%
gather(key = 'Sample', value='CN', -c('Gene Symbol', "Locus ID", "Cytoband")) %>%
dplyr::select(Sample, Hugo_Symbol = `Gene Symbol`, CN) %>%
seandavi / TCGAtranslateID.R
Last active January 8, 2024 21:12
Translate GDC file_ids to TCGA barcodes
library(GenomicDataCommons)
library(magrittr)
TCGAtranslateID = function(file_ids) {
info = files() %>%
GenomicDataCommons::filter( ~ file_id %in% file_ids) %>%
GenomicDataCommons::select('cases.samples.submitter_id') %>%
results_all()
# The mess of code below is to extract TCGA barcodes
# id_list will contain a list (one item for each file_id)
seandavi / xmlsplitter.py
Created December 22, 2017 12:47
split xml into smaller xmls based on a split "tag"
#!/usr/bin/env python
import argparse
import lxml.etree
import os, sys
import bz2
parser = argparse.ArgumentParser()
parser.add_argument('tag')
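# nargs='?' makes these positionals optional, so the defaults actually apply
parser.add_argument('n', type=int, nargs='?', default=100000)
parser.add_argument('wrapper', nargs='?', default=None)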
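The rest of the script is not shown in this preview. As a rough sketch of the idea in the description (splitting an XML document into smaller documents of at most `n` elements of a given tag), the following uses the standard library's `xml.etree.ElementTree` rather than `lxml`. The function name `split_xml`, the default wrapper name `root`, and the assumption that split-tag elements are never nested are all illustrative choices, not the gist's actual implementation.

```python
import io
import xml.etree.ElementTree as ET


def split_xml(source, tag, n, wrapper="root"):
    """Yield XML strings, each wrapping at most n <tag> elements in <wrapper>.

    Assumes elements named `tag` are not nested inside one another.
    """
    batch = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            elem.tail = None  # drop whitespace that follows the element
            batch.append(ET.tostring(elem, encoding="unicode"))
            elem.clear()  # free the element's children once serialized
            if len(batch) == n:
                yield f"<{wrapper}>{''.join(batch)}</{wrapper}>"
                batch = []
    if batch:  # emit the final, possibly short, batch
        yield f"<{wrapper}>{''.join(batch)}</{wrapper}>"


# Small in-memory document used only for demonstration.
doc = b"<set><item id='1'/><item id='2'/><item id='3'/></set>"
chunks = list(split_xml(io.BytesIO(doc), "item", 2))
for chunk in chunks:
    print(chunk)
```

Streaming with `iterparse` avoids loading the whole input at once, which matters for the large files a default batch size of 100000 suggests.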