Sean Davis seandavi

digraph Workflow {
node [shape="box"];
subgraph cluster_0 {
SRA_XML [shape=record label="SRA XML|{Study|Sample|Experiment|Run}"]
XML_TO_JSON [label="XML to JSON"]
SRA_JSON [shape=record label="SRA JSON|{Study|Sample|Experiment|Run}"]
BIOSAMPLE_XML [label="Biosample XML"]
BIOSAMPLE_JSON [label="Biosample JSON"]

Install ingress-nginx on AWS

Install nginx ingress on AWS

This installs a set of resources.

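Using Helm:

Assumes a Kubernetes cluster has already been created, kubectl is configured to talk to it, and Helm 2 is installed (the commands below set up Tiller, which Helm 3 no longer uses).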

kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller

Apache Airflow setup

Base installations

sudo apt-get update
sudo apt-get install python3 python3-pip virtualenv
# sudo apt-get install emacs tmux 
seandavi / parallel_bulk_example.py
Created April 27, 2019 13:02
Elasticsearch parallel bulk indexing example in Python
"""parallel bulk indexing
Indexes a json file using parallel_bulk.
"""
import elasticsearch
from elasticsearch import Elasticsearch
from elasticsearch.helpers import parallel_bulk as pb
from collections import deque
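The preview ends at the imports. As a minimal sketch of how these imports are commonly combined (not necessarily what the gist itself does): a generator yields one bulk action per JSON line, and `deque(..., maxlen=0)` drains the lazy `parallel_bulk` generator without keeping its results. The function names, the index name, and the one-document-per-line file layout are all assumptions made for illustration.

```python
import json
from collections import deque


def generate_actions(lines, index_name):
    """Yield one bulk action per non-empty JSON line (assumed file layout)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        yield {"_index": index_name, "_source": json.loads(line)}


def index_lines(client, lines, index_name, thread_count=4):
    """Send the actions with parallel_bulk and drain the result generator."""
    # Imported here so generate_actions can be tried without the library.
    from elasticsearch.helpers import parallel_bulk

    # parallel_bulk is lazy: nothing is sent until its results are consumed.
    # deque(maxlen=0) consumes the generator without storing the results.
    deque(
        parallel_bulk(
            client,
            generate_actions(lines, index_name),
            thread_count=thread_count,
        ),
        maxlen=0,
    )


# Hypothetical input: two documents separated by a blank line.
sample = ['{"id": 1}', "", '{"id": 2}']
actions = list(generate_actions(sample, "demo-index"))
```

With a real cluster, `index_lines(Elasticsearch(), open("data.json"), "demo-index")` would be the corresponding call; the file name and index name there are placeholders.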
seandavi / create_eks_cluster.sh
Created April 21, 2019 19:12
Small bash script to install eksctl and then create cluster
#!/bin/bash
# Start EKS cluster on AWS
# Install eksctl (https://eksctl.io/)
# On mac, homebrew
brew tap weaveworks/tap
brew install weaveworks/tap/eksctl
# start cluster (takes a few minutes)
seandavi / omicidx_intro.Rmd
Last active March 15, 2019 15:33
Quick introduction to using OmicIDX from R
---
title: "Playing with OmicIDX"
author: "Sean Davis"
date: "3/14/2019"
output:
  BiocStyle::html_document:
    toc_float: True
---
# Introduction to the OmicIDX API
seandavi / omicidx-beta-graphql-intro.ipynb
Created March 12, 2019 00:47
Quick introduction to the OmicIDX GraphQL API
seandavi / Dockerfile
Created February 6, 2019 19:04
Dockerfile for blog post on using GCR. Builds SRA-toolkit with dbGaP access as an example
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y wget
# We do things this way to keep the docker image
# size down. See https://nickjanetakis.com/blog/docker-tip-3-chain-your-docker-run-instructions-to-shrink-your-images
RUN wget http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.9.2/sratoolkit.2.9.2-ubuntu64.tar.gz \
    && tar -xvzf sratoolkit.2.9.2-ubuntu64.tar.gz \
    && rm sratoolkit.2.9.2-ubuntu64.tar.gz
seandavi / read_and_process_files_beam.py
Created January 29, 2019 22:21
Read and process full files based on a wildcard path using Apache Beam/Google Cloud Platform/Dataflow
from __future__ import print_function
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.io.filesystems import FileSystems
import urllib
import json
import argparse
import logging
logging.basicConfig(level=logging.INFO)
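The preview ends after the logging setup. As a hedged sketch of the approach named in the description (expand a wildcard, then read each matching file in full), the steps might look like the following. It is not the gist's own code: `parse_records`, `build_pipeline`, the step labels, and the one-JSON-document-per-line layout are assumptions, and only `FileSystems.match` and `FileSystems.open` are taken from the imported Beam module.

```python
import json


def parse_records(text):
    """Parse a file's full text as one JSON document per non-empty line (assumed layout)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def build_pipeline(pipeline, file_pattern):
    """Attach match -> read whole file -> parse steps to an existing pipeline."""
    # Imported lazily so parse_records can be tried without Apache Beam.
    import apache_beam as beam
    from apache_beam.io.filesystems import FileSystems

    def expand_pattern(pattern):
        # FileSystems.match returns one MatchResult per pattern given.
        for match_result in FileSystems.match([pattern]):
            for metadata in match_result.metadata_list:
                yield metadata.path

    def read_whole_file(path):
        # Read the entire file in one call rather than line by line.
        with FileSystems.open(path) as handle:
            return handle.read().decode("utf-8")

    return (
        pipeline
        | "Pattern" >> beam.Create([file_pattern])
        | "Expand" >> beam.FlatMap(expand_pattern)
        | "ReadFile" >> beam.Map(read_whole_file)
        | "Parse" >> beam.FlatMap(parse_records)
    )


# The parsing step can be checked on its own, without a pipeline.
records = parse_records('{"a": 1}\n\n{"a": 2}\n')
```

Expanding the pattern inside the pipeline, rather than before it is built, keeps the file listing on the workers; whether that matters depends on how many files the wildcard matches.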