Marcus Breese mbreese

  • Indiana University
  • Cincinnati, OH
@mbreese
mbreese / gist:4634262
Last active December 11, 2015 17:28
gitorious ldap config for connecting to Active Directory
production:
  disable_default: true
  methods:
    - adapter: Gitorious::Authentication::LDAPAuthentication
      server: 10.1.?.?
      port: 389
      base_dn: DC=domain,DC=org
      # bind_username: binduser
      # bind_password: binduserpass
      user_filter:
mbreese / gist:c79150b6d504ff0cf7ff
Created May 9, 2014 05:42
Using modules to version your processing pipeline

Environment Modules

Environment Modules is a utility that has been used to manage executables and paths on high-performance computing clusters since 1991. The basic idea is that you load and unload modules to adapt your processing environment (including $PATH) so that it stays consistent from one session to the next. Importantly, this lets system administrators install and maintain multiple versions of the same software for different users. The same tool can also be used to manage your bioinformatics processing pipelines and ensure consistent analysis runs. For example, if a set of samples needs to be analyzed consistently over a long time span, you can use modules to make sure that the same version of a program is used throughout the entire experiment, while still using newer versions for other experiments.
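
As a rough illustration of the mechanism described above (not the actual Environment Modules implementation), the sketch below mimics what loading a versioned module does to `$PATH`: it puts one version's `bin` directory first and drops any other version of the same tool. The function name `load_module`, the install root `/opt/apps`, and the tool name and version numbers are all hypothetical.

```python
import os


def load_module(path, root, name, version):
    """Return a new PATH string with root/name/version/bin placed first.

    Any other version of the same tool under root/name is dropped, which is
    roughly the effect of swapping one module version for another.
    """
    tool_prefix = os.path.join(root, name) + os.sep
    kept = [p for p in path.split(os.pathsep)
            if p and not p.startswith(tool_prefix)]
    new_bin = os.path.join(root, name, version, "bin")
    return os.pathsep.join([new_bin] + kept)


# Hypothetical layout: /opt/apps/<tool>/<version>/bin
current = os.pathsep.join(["/opt/apps/samtools/1.2/bin", "/usr/bin", "/bin"])
print(load_module(current, "/opt/apps", "samtools", "0.1.19"))
```

Pinning the version string in a pipeline script, rather than relying on whatever is first on `$PATH`, is what keeps a long-running experiment on the same program version.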

Installation

If you are running your samples on a computing cluster, chances are you are already using modules to configure your environment (add progra

mbreese / docker-heredoc-snippet
Last active June 12, 2023 21:28
Running docker with a HEREDOC to script the commands to run inside the container.
docker run -v /Users/mbreese/tmp:/tmp1 -w /tmp1 -i centos:7 /bin/bash -s <<EOF
date > foo
echo 'foo' >> foo
cat /etc/redhat-release >> foo
whoami >> foo
EOF
mbreese / build-tmux.sh
Last active January 23, 2024 10:11
HOWTO build a statically linked tmux in one script (downloads and builds dependencies from source)
#!/bin/bash
TARGETDIR=$1
if [ -z "$TARGETDIR" ]; then
    # Default to ./local; print() works under both Python 2 and Python 3
    TARGETDIR=$(python -c 'import os; print(os.path.realpath("local"))')
fi
mkdir -p "$TARGETDIR"
libevent() {
    curl -LO https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz
    tar -zxvf libevent-2.0.22-stable.tar.gz
mbreese / db.sql
Created January 30, 2018 15:56
Add a new user to Postgres
CREATE ROLE db_user LOGIN ENCRYPTED PASSWORD 'secret-password' NOSUPERUSER INHERIT NOCREATEDB NOCREATEROLE NOREPLICATION;
CREATE DATABASE db_name WITH OWNER db_user;
GRANT ALL PRIVILEGES ON DATABASE db_name TO db_user;
mbreese / expand_vep_vcf.py
Created February 21, 2018 14:29
Given a VCF file annotated with VEP, expand the VEP "CSQ" INFO field to add INFO fields for all annotations
#!/usr/bin/env python
import sys
prefix = "VEP_"
vep_info_name = "CSQ"
def parse_info_format(line):
    ###INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|...">
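
The preview above is cut off, so the following is only a minimal sketch of the idea in the description, not the gist's actual code: read the field names from the `Format:` list in the CSQ header, then split each CSQ value into `VEP_`-prefixed fields. The helper names `csq_field_names` and `expand_csq`, the shortened header, and the sample annotation values are assumptions made for illustration.

```python
import re


def csq_field_names(header_line):
    """Extract the '|'-separated field names from a VEP CSQ ##INFO header."""
    match = re.search(r'Format: ([^"]+)"', header_line)
    if match is None:
        raise ValueError("no 'Format:' list found in header line")
    return match.group(1).strip().split("|")


def expand_csq(csq_value, names, prefix="VEP_"):
    """Map one CSQ value to {prefix + name: [one value per annotation]}."""
    expanded = {prefix + name: [] for name in names}
    for annotation in csq_value.split(","):  # annotations are comma-separated
        for name, value in zip(names, annotation.split("|")):
            expanded[prefix + name].append(value)
    return expanded


# Shortened, hypothetical header and CSQ value (illustrative data only)
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
          'annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL">')
names = csq_field_names(header)
print(expand_csq("A|missense_variant|MODERATE|GENE1,A|intron_variant|MODIFIER|GENE1", names))
```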
mbreese / vep_vcf_worst_consequence.py
Created February 21, 2018 14:31
Given a VCF file annotated with VEP (and expanded), calculate the worst consequence, gene, SIFT, PolyPhen, etc...
#!/usr/bin/env python
import sys
import itertools
cons_key = "VEP_Consequence"
impact_key = "VEP_IMPACT"
gene_key = "VEP_SYMBOL"
csn_key = "VEP_CSN"
sift_key = "VEP_SIFT"
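
Since the script body is not shown here, the sketch below is one simplified way to pick a "worst" annotation, assuming the per-annotation values for `VEP_IMPACT` and `VEP_SYMBOL` are already available as parallel lists. It ranks only by VEP's four IMPACT levels (HIGH, MODERATE, LOW, MODIFIER) and ignores the individual consequence terms, SIFT, and PolyPhen; the names `IMPACT_RANK` and `worst_annotation` are hypothetical.

```python
# Higher number means more severe; unknown values rank below MODIFIER.
IMPACT_RANK = {"HIGH": 3, "MODERATE": 2, "LOW": 1, "MODIFIER": 0}


def worst_annotation(impacts, genes):
    """Return (impact, gene) for the most severe entry; ties keep the first."""
    if not impacts:
        return None
    worst = max(range(len(impacts)),
                key=lambda i: IMPACT_RANK.get(impacts[i], -1))
    return impacts[worst], genes[worst]


# Illustrative values only
print(worst_annotation(["MODIFIER", "HIGH", "LOW"], ["GENE1", "GENE2", "GENE3"]))
```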
mbreese / sample_tree.py
Created March 14, 2018 10:30
Sampling program that copies stdin to stdout and also writes every Xth line to stderr (useful for monitoring stream progress)
#!/usr/bin/env python
import sys
import datetime
rate = 100000
if len(sys.argv) > 1:
    rate = int(sys.argv[1])
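
The rest of the script is not shown, so this is only a sketch of the pass-through sampling idea from the description: every line is copied unchanged, and every `rate`-th line is also echoed to a separate log stream with its line count. The function name `sample_stream` and the `[count]` log format are assumptions; the original may well log something different, such as a timestamp.

```python
import io


def sample_stream(src, dst, log, rate=100000):
    """Copy src to dst line by line; echo every rate-th line to log."""
    for count, line in enumerate(src, start=1):
        dst.write(line)
        if count % rate == 0:
            log.write("[%d] %s" % (count, line))


# Small in-memory demonstration with rate=2
src = io.StringIO("a\nb\nc\nd\n")
out, log = io.StringIO(), io.StringIO()
sample_stream(src, out, log, rate=2)
print(log.getvalue(), end="")
```

In a real pipeline the same function would be called as `sample_stream(sys.stdin, sys.stdout, sys.stderr, rate)`, so the sampled lines appear on the terminal while the data continues down the pipe.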
mbreese / daily_zfs_snapshot.sh
Last active January 7, 2022 22:57
This is a script to take daily snapshots of a ZFS filesystem (and prune older snapshots).
#!/bin/bash
#
# This will by default take snapshots of all zfs filesystems,
# but could be adapted to only pull specific ones. This script
# will also rotate snapshots, removing those that are older
# than 2 weeks.
#
CURDATE=$(date +%Y%m%d)
TWOWEEKSAGO=$(date -d'2 weeks ago' +%Y%m%d)
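
The pruning step itself is not visible in this preview. As a sketch of the rotation logic described in the header comment, assuming snapshots are named `<filesystem>@YYYYMMDD` (a hypothetical naming scheme, as are the pool names below), the following selects the snapshots older than the cutoff date and leaves anything without a date suffix alone.

```python
from datetime import date, timedelta


def snapshots_to_prune(snapshots, today, keep_days=14):
    """Return the snapshot names whose YYYYMMDD suffix is before the cutoff."""
    cutoff = (today - timedelta(days=keep_days)).strftime("%Y%m%d")
    old = []
    for snap in snapshots:
        stamp = snap.rpartition("@")[2]
        # Zero-padded YYYYMMDD strings sort the same way as the dates do.
        if len(stamp) == 8 and stamp.isdigit() and stamp < cutoff:
            old.append(snap)
    return old


# Hypothetical snapshot names
names = ["tank/home@20211231", "tank/home@20220101", "tank/home@manual"]
print(snapshots_to_prune(names, today=date(2022, 1, 15)))
```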
mbreese / fit.sh
Created February 15, 2023 14:52
Quick script to convert stdin to a line size that fits the screen. For example, `ps -e` will return only lines that fit on the current screen. If you want to grep that output, you get all of the data and it wraps the screen. If you pipe `ps -e | grep foo | fit`, then you get the trimmed output again.
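#!/bin/bash
# Trim each line of stdin to the current terminal width.
COLS="$(tput cols)"
while IFS= read -r LINE; do
    # Quoting preserves whitespace; printf always ends the line with a newline
    printf '%s\n' "${LINE:0:COLS}"
done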