Thomas Sandmann tomsing1

tomsing1 / vcr_configure.R
Created February 8, 2021 04:03
Dealing with regular expressions & parentheses in the vcr::vcr_configure calls
# This is an example of the setup function used to filter sensitive data from
# recorded API responses generated by the vcr R package
library("vcr") # *Required* as vcr is set up on loading
# escapeRegex function from the Hmisc package
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}
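
A minimal sketch of why the escaping matters, using only base R. The `secret` and `text` values are hypothetical and chosen for illustration; the vcr configuration itself is not shown.

```r
# escape_regex() as defined above (escapeRegex from the Hmisc package)
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

# Hypothetical sensitive value that contains regex metacharacters
secret <- "key(1+2)"
text <- "token=key(1+2)"

# Unescaped, "(" and ")" form a group and "+" is a quantifier,
# so the pattern does not match the literal string.
unescaped_match <- grepl(secret, text)

# Escaped, every metacharacter is preceded by a backslash and matches literally.
escaped <- escape_regex(secret)
escaped_match <- grepl(escaped, text)

# The escaped pattern can then be used to redact the value.
redacted <- sub(escaped, "<redacted>", text)
```

Passing `fixed = TRUE` to `grepl()` or `sub()` would avoid the escaping in your own code, but that option is not available when a pattern is handed to another package that applies it as a regular expression.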
tomsing1 / somalier.sh
Created September 18, 2020 03:27
Run somalier on a set of BAM files stored on AWS S3
#!/usr/bin/env bash
# This script retrieves BAM files from AWS and runs the samtools and somalier tools.
# Requirements:
# - AWS CLI with credentials
# - samtools
# - docker
# - paths.txt file with path to BAM files on AWS S3, one per line
set -e
set -x
tomsing1 / amplicon_seq.sh
Created September 18, 2020 03:24
Quick stitch and align pipeline for amplicon sequencing reads
#!/usr/bin/env bash
# This script creates QC reports for all FASTQ files found in the 'fastq' directory,
# aligns the (paired-end) reads to the reference index & attempts to stitch overlapping
# read pairs into single sequences.
set -e
set -x
set -o pipefail
set -o nounset
tomsing1 / vtree_examples.R
Created August 29, 2020 19:17
Examples of using the vtree R package to create variable trees of nested information
# Examples are modified from the vtree vignette: https://cran.r-project.org/web/packages/vtree/vignettes/vtree.html
library(vtree)
library(dplyr)
library(gtsummary)
class(Titanic) # array with counts
titanic <- crosstabToCases(Titanic)
head(titanic) # data.frame with individual outcomes
tomsing1 / goofyfs.md
Created July 14, 2020 16:18
Mounting an S3 bucket as a folder on Mac OS X

This document shows how to mount an AWS S3 bucket on Mac OS X using goofys.

The first three steps illustrate how to use goofys

  1. Install goofys via brew
brew cask install osxfuse
brew install goofys
tomsing1 / stringr.Rmd
Last active July 10, 2020 21:57
Introduction to the stringr R package
---
title: "Introduction to the stringr package"
author: "Thomas Sandmann"
date: "7/10/2020"
output: html_document
editor_options:
  chunk_output_type: console
---
```{r setup, include=FALSE}
tomsing1 / feature_barcode_classification.Rmd
Created June 24, 2020 01:25
Deconstructing Seurat's HTODemux R function
# A closer look at the HTODemux() function
1. Centered log-ratio (CLR) normalization of the counts
2. Define K as the number of barcodes + 1
3. Run k-means / k-median clustering
4. Calculate mean (normalized) barcode expression within each cluster
5. For each barcode
   - identify the cluster with the lowest mean counts
   - fit a negative binomial distribution to the raw counts of each cluster
   - define the 99th percentile of the (background) distribution as threshold
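
The steps above can be sketched in base R plus the MASS package. This is an illustration under stated assumptions, not Seurat's implementation: the counts are simulated, `kmeans()` stands in for the clustering step, `MASS::fitdistr()` is used for the negative binomial fit, and the fit is applied only to the lowest-mean (background) cluster of each barcode.

```r
library(MASS)  # provides fitdistr(); a recommended package shipped with R

set.seed(1)

# Simulated raw counts (hypothetical data): 3 barcodes x 300 cells,
# where each cell carries exactly one barcode at high abundance.
n_barcodes <- 3
n_per <- 100
counts <- matrix(
  rnbinom(n_barcodes * n_barcodes * n_per, mu = 5, size = 2),
  nrow = n_barcodes,
  dimnames = list(paste0("HTO", seq_len(n_barcodes)), NULL)
)
for (i in seq_len(n_barcodes)) {
  cells <- ((i - 1) * n_per + 1):(i * n_per)
  counts[i, cells] <- rnbinom(n_per, mu = 200, size = 5)
}

# 1. Centered log-ratio (CLR) normalization of each barcode across cells
clr <- function(x) log1p(x / exp(sum(log1p(x[x > 0])) / length(x)))
norm <- apply(counts, 1, clr)  # matrix of cells x barcodes

# 2. K is the number of barcodes + 1
k <- n_barcodes + 1

# 3. Cluster the cells (k-means here; an assumption of this sketch)
km <- kmeans(norm, centers = k, nstart = 10)

# 4. Mean normalized barcode expression per cluster (clusters x barcodes)
cluster_means <- apply(norm, 2, function(v) tapply(v, km$cluster, mean))

# 5. Per barcode: take the cluster with the lowest mean as background,
#    fit a negative binomial to its raw counts, and use the 99th
#    percentile of the fitted distribution as the threshold.
thresholds <- vapply(rownames(counts), function(b) {
  background_cluster <- as.integer(names(which.min(cluster_means[, b])))
  background <- counts[b, km$cluster == background_cluster]
  fit <- suppressWarnings(fitdistr(background, "negative binomial"))
  qnbinom(0.99, size = fit$estimate[["size"]], mu = fit$estimate[["mu"]])
}, numeric(1))

# A cell is called positive for a barcode if its raw count exceeds the threshold.
positive <- counts > thresholds
```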
tomsing1 / negative_binomial_mixtures.R
Created June 21, 2020 02:05
Exploring fitting mixtures of Poisson or Negative binomial distributions to count data using the flexmix R package
library(flexmix)
library(countreg)
# poisson
## two clusters of 1000 measurements each
y1 <- rpois(n = 1000, lambda = 20)
y2 <- rpois(n = 1000, lambda = 1)
y <- c(y1, y2)
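
To illustrate what fitting such a mixture involves, here is a base R sketch of the expectation-maximization (EM) algorithm for a two-component Poisson mixture. It does not use flexmix or countreg; `fit_poisson_mixture()` is a hypothetical helper written for this example, and the starting values and the fixed number of iterations are assumptions.

```r
set.seed(42)

# Same simulation as above: two clusters of 1000 measurements each
y <- c(rpois(n = 1000, lambda = 20), rpois(n = 1000, lambda = 1))

# Hypothetical helper: EM for a mixture of two Poisson distributions
fit_poisson_mixture <- function(y, lambda = c(min(y) + 1, max(y)), n_iter = 100) {
  prior1 <- 0.5  # initial mixing proportion of component 1
  for (i in seq_len(n_iter)) {
    # E-step: posterior probability that each observation belongs to component 1
    d1 <- prior1 * dpois(y, lambda[1])
    d2 <- (1 - prior1) * dpois(y, lambda[2])
    w <- d1 / (d1 + d2)
    # M-step: update the mixing proportion and the weighted means
    prior1 <- mean(w)
    lambda <- c(sum(w * y) / sum(w), sum((1 - w) * y) / sum(1 - w))
  }
  list(prior = c(prior1, 1 - prior1), lambda = lambda)
}

fit <- fit_poisson_mixture(y)
```

The estimated rates in `fit$lambda` should land close to the simulated values of 1 and 20, with mixing proportions near 0.5.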
tomsing1 / user_defined_faceting_RLA_plot.R
Last active April 7, 2020 02:50
A function with user-specified arguments for faceting & filling a Relative Log Abundance (RLA) plot with ggplot2
#' RLA plot
#'
#' @param df data.frame
#' @param x Scalar character, column of `df` used for the x-axis
#' @param y Scalar character, column of `df` used for the y-axis
#' @param feature Scalar character, column of `df` containing the feature identifier
#' @param facet_rows Character vector, column(s) of `df` used to facet rows
#' @param facet_cols Character vector, column(s) of `df` used to facet columns
#' @param group_cols Character vector, column(s) of `df` that define the
#' grouping variable for median centering.
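
A minimal sketch of the median-centering step, assuming it means subtracting each feature's median within every group from the (log-scale) abundances. `median_center()` and the example data are hypothetical, and the plotting code is not reproduced here.

```r
# Hypothetical example data: two features measured in four samples from two groups
df <- data.frame(
  feature = rep(c("f1", "f2"), each = 4),
  group = rep(c("a", "a", "b", "b"), times = 2),
  sample = rep(paste0("s", 1:4), times = 2),
  abundance = c(1, 3, 10, 14, 5, 5, 2, 8)
)

# Hypothetical helper: subtract the per-feature, per-group median from column `y`
median_center <- function(df, y, feature, group_cols) {
  df$rla <- ave(
    df[[y]],
    df[c(feature, group_cols)],
    FUN = function(v) v - median(v)
  )
  df
}

centered <- median_center(df, y = "abundance", feature = "feature",
                          group_cols = "group")
```

Because `group_cols` is a character vector, several columns can define the groups without changing the helper.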
tomsing1 / gist:f9f87cca8eb159ef0f20ef2ed857acb3
Created October 29, 2019 22:55
R code snippet to extract adjusted voomed abundances from the output of lmFit
#' Adjust voomed expression based on a limma fit
#'
#' When a design that contains both covariates of interest and nuisance
#' covariates (e.g. batch terms) is fit with limma's `lmFit` function,
#' it is helpful to examine the effect of the adjustment, e.g. by
#' performing MDS on the adjusted results. `limma::removeBatchEffect`
#' could be used, but it repeats the full fitting procedure. Alternatively,
#' the coefficients for the nuisance terms from the full model can be
#' extracted from the fit and used to correct the input matrix (usually
#' log transformed abundances, e.g. from `limma::voom`).
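
The correction described above can be sketched in base R. This is an illustration, not the gist's function: `lm.fit()` stands in for `limma::lmFit` (so no precision weights are used), the data are simulated, and `adjust_for_nuisance()` is a hypothetical helper.

```r
set.seed(1)

# Simulated design: a covariate of interest (group) and a nuisance term (batch)
n_genes <- 5
group <- factor(rep(c("ctrl", "trt"), each = 4))
batch <- factor(rep(c("a", "b"), times = 4))
design <- model.matrix(~ group + batch)  # columns: (Intercept), grouptrt, batchb

# Simulated log abundances (genes x samples) with a shift of 2 in batch "b"
expr <- matrix(rnorm(n_genes * length(batch)), nrow = n_genes)
expr <- sweep(expr, 2, 2 * (batch == "b"), "+")

# Hypothetical helper: subtract the fitted contribution of the nuisance terms
adjust_for_nuisance <- function(expr, design, nuisance) {
  idx <- match(nuisance, colnames(design))
  # Least-squares coefficients, transposed to genes x coefficients
  coefs <- t(lm.fit(design, t(expr))$coefficients)
  expr - coefs[, idx, drop = FALSE] %*% t(design[, idx, drop = FALSE])
}

adjusted <- adjust_for_nuisance(expr, design, nuisance = "batchb")

# Refitting the same model to the adjusted matrix gives a batch coefficient of zero,
# while the coefficient for the covariate of interest is unchanged.
original_fit <- lm.fit(design, t(expr))$coefficients
refit <- lm.fit(design, t(adjusted))$coefficients
```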