Thomas Sandmann tomsing1

This document shows how to mount an AWS S3 bucket on Mac OS X using goofys.

The first three steps illustrate how to use goofys

brew cask install osxfuse
brew install goofys
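
Once both packages are installed, a bucket can be mounted with the `goofys <bucket> <mountpoint>` invocation. The sketch below wraps that call in a hypothetical `mount_bucket` helper. The helper name, the `DRY_RUN` switch, and the bucket and mount point in the comments are assumptions for illustration and are not taken from the original gist. It also assumes that AWS credentials are already configured.

```shell
#!/usr/bin/env bash

# Hypothetical helper: mount an S3 bucket with goofys.
# With DRY_RUN=1 the command is only printed, not executed.
mount_bucket() {
  local bucket="$1" mountpoint="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "goofys ${bucket} ${mountpoint}"
    return 0
  fi
  mkdir -p "${mountpoint}"          # the mount point must exist
  goofys "${bucket}" "${mountpoint}"
}

# Example (placeholder names):
#   mount_bucket my-example-bucket "${HOME}/s3/my-example-bucket"
#   umount "${HOME}/s3/my-example-bucket"   # unmount when done
```

Running the helper with `DRY_RUN=1` first is a cheap way to check the bucket name and mount point before anything is mounted.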

	# This is an example of the setup function used to filter sensitive data from
	# recorded API responses generated by the vcr R package

	library("vcr") # Required as vcr is set up on loading

	# escapeRegex function from the Hmisc package
	escape_regex <- function(x) {
	  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
	}

	#!/usr/bin/env bash
	# This script retrieves BAM files from AWS and runs the samtools and somalier tools.
	# Requirements:
	# - AWS CLI with credentials
	# - samtools
	# - docker
	# - paths.txt file with paths to BAM files on AWS S3, one per line

	set -e
	set -x
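
The header above describes a `paths.txt` file with one S3 path per line. As a rough sketch of how such a file might be consumed, the hypothetical `plan_downloads` helper below prints one `aws s3 cp` command per listed path. The helper name, the output directory argument, and the choice to print rather than execute the commands are assumptions for illustration, not the original script.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print an "aws s3 cp" command for every S3 path listed
# in a file (one path per line). Blank lines are skipped, and a final line
# without a trailing newline is still processed.
plan_downloads() {
  local paths_file="$1" outdir="${2:-.}" s3_path
  while IFS= read -r s3_path || [ -n "${s3_path}" ]; do
    [ -z "${s3_path}" ] && continue
    echo "aws s3 cp ${s3_path} ${outdir}/$(basename "${s3_path}")"
  done < "${paths_file}"
}

# Example: plan_downloads paths.txt bam
```

Printing the commands first makes it easy to review what would be transferred; replacing the `echo` with the quoted command itself would perform the downloads.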

	#!/usr/bin/env bash

	# This script creates QC reports for all FASTQ files found in the 'fastq' directory,
	# aligns the (paired-end) reads to the reference index and attempts to stitch overlapping
	# read pairs into single sequences.

	set -e
	set -x
	set -o pipefail
	set -o nounset
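
The four `set` calls above enable a stricter shell mode: `-e` aborts on a failing command, `-x` traces each command, `pipefail` makes a pipeline fail if any of its stages fails, and `nounset` treats unset variables as errors. The sketch below illustrates the last two in isolated subshells. The function and variable names are hypothetical and only serve this illustration.

```shell
#!/usr/bin/env bash

# Print the exit status of the pipeline "false | true" with pipefail
# switched "on" or "off". The subshell keeps the options from leaking out.
pipeline_status() {
  (
    set +e
    if [ "$1" = "on" ]; then set -o pipefail; else set +o pipefail; fi
    false | true
    echo "$?"
  )
}

# Report whether a subshell survives expanding an unset variable
# while nounset is active.
nounset_result() {
  if ( set -o nounset; unset demo_unset_variable; : "${demo_unset_variable}" ) 2>/dev/null; then
    echo "continued"
  else
    echo "aborted"
  fi
}

# Example:
#   pipeline_status off
#   pipeline_status on
#   nounset_result
```

Without `pipefail`, the failure of `false` is hidden by the succeeding `true` at the end of the pipeline, which is why the option matters for pipelines that stream one tool's output into another.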

	# Examples are modified from the vtree vignette: https://cran.r-project.org/web/packages/vtree/vignettes/vtree.html

	library(vtree)
	library(dplyr)
	library(gtsummary)

	class(Titanic) # array with counts
	titanic <- crosstabToCases(Titanic)
	head(titanic) # data.frame with individual outcomes

	---
	title: "Introduction to the stringr package"
	author: "Thomas Sandmann"
	date: "7/10/2020"
	output: html_document
	editor_options:
	  chunk_output_type: console
	---

	```{r setup, include=FALSE}

	# A closer look at the HTODemux() function

	1. Centered log-ratio (CLR) normalization of the counts
	2. Define K as the number of barcodes + 1
	3. Run k-means / k-medoids clustering
	4. Calculate mean (normalized) barcode expression within each cluster
	5. For each barcode:
	   - identify the cluster with the lowest mean counts
	   - fit a negative binomial distribution to the raw counts of that cluster
	   - define the 99th percentile of the (background) distribution as threshold

	library(flexmix)
	library(countreg)

	# poisson

	## two clusters of 1000 measurements each
	y1 <- rpois(n = 1000, lambda = 20)
	y2 <- rpois(n = 1000, lambda = 1)

	y <- c(y1, y2)

	#' RLA plot
	#'
	#' @param df data.frame
	#' @param x Scalar character, column of `df` used for the x-axis
	#' @param y Scalar character, column of `df` used for the y-axis
	#' @param feature Scalar character, column of `df` containing the feature identifier
	#' @param facet_rows Character vector, column(s) of `df` used as facet rows
	#' @param facet_cols Character vector, column(s) of `df` used as facet columns
	#' @param group_cols Character vector, column(s) of `df` that define the
	#' grouping variable for median centering.

	#' Adjust voomed expression based on a limma fit
	#'
	#' When a design that contains both covariates of interest and nuisance
	#' covariates (e.g. batch terms) is fit with limma's `lmFit` function,
	#' it is helpful to examine the effect of the adjustment, e.g. by
	#' performing MDS on the adjusted results. The `limma::removeBatchEffect`
	#' function could be used, but it performs the full fitting procedure. Alternatively,
	#' the coefficients for the nuisance terms from the full model can be
	#' extracted from the fit and used to correct the input matrix (usually
	#' log transformed abundances, e.g. from `limma::voom`).