@arq5x
arq5x / grantham-dict.py
Last active January 30, 2025 06:31
Convert Grantham Amino Acid matrix into Python dict.
#!/usr/bin/env python
import sys
import pprint
def make_grantham_dict(grantham_mat_file):
"""
Citation: http://www.ncbi.nlm.nih.gov/pubmed/4843792
Provenance: http://www.genome.jp/dbget-bin/www_bget?aaindex:GRAR740104
@leipzig
leipzig / hg19gaps
Created July 31, 2013 16:35
hg19 gaps
#bin chrom chromStart chromEnd ix n size type bridge
0 chr1 124535434 142535434 1271 N 18000000 heterochromatin no
23 chr1 121535434 124535434 1270 N 3000000 centromere no
76 chr1 3845268 3995268 47 N 150000 contig no
85 chr1 13219912 13319912 154 N 100000 contig no
89 chr1 17125658 17175658 196 N 50000 clone yes
101 chr1 29878082 30028082 337 N 150000 contig no
188 chr1 120697156 120747156 1263 N 50000 clone yes
188 chr1 120936695 121086695 1265 N 150000 contig no
188 chr1 121485434 121535434 1269 N 50000 clone no
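A minimal sketch of how rows like these could be summarized, assuming the columns are whitespace-delimited in the order given by the header line. The function name `gap_size_by_type` and the constant `GAP_ROWS` are hypothetical, and only the rows shown in this preview are used.

```python
from collections import defaultdict

# Rows copied from the preview above; the full file contains more.
GAP_ROWS = """\
#bin chrom chromStart chromEnd ix n size type bridge
0 chr1 124535434 142535434 1271 N 18000000 heterochromatin no
23 chr1 121535434 124535434 1270 N 3000000 centromere no
76 chr1 3845268 3995268 47 N 150000 contig no
85 chr1 13219912 13319912 154 N 100000 contig no
89 chr1 17125658 17175658 196 N 50000 clone yes
101 chr1 29878082 30028082 337 N 150000 contig no
188 chr1 120697156 120747156 1263 N 50000 clone yes
188 chr1 120936695 121086695 1265 N 150000 contig no
188 chr1 121485434 121535434 1269 N 50000 clone no
"""


def gap_size_by_type(text):
    """Sum the 'size' column per gap 'type' (hypothetical helper)."""
    totals = defaultdict(int)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header line
        _bin, _chrom, start, end, _ix, _n, size, gap_type, _bridge = line.split()
        # Sanity check: in the rows shown, size equals chromEnd - chromStart.
        assert int(size) == int(end) - int(start)
        totals[gap_type] += int(size)
    return dict(totals)


totals = gap_size_by_type(GAP_ROWS)
for gap_type, size in sorted(totals.items()):
    print(gap_type, size)
```

Grouping by the `type` column gives a quick view of how much of the listed sequence is heterochromatin, centromere, contig or clone gap.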
@ericminikel
ericminikel / exampleRScript1.r
Created January 14, 2014 23:53
An example of how to use Rscript and optparse to run R in batch mode with command line args.
#!/broad/software/free/Linux/redhat_5_x86_64/pkgs/r_3.0.2/bin/Rscript
# Eric Vallabh Minikel
# CureFFI.org
# 2014-01-14
# example of how to use optparse in R scripts
# usage: ./exampleRScript1.r -a thisisa -b hiagain
# ./exampleRScript1.r --avar thisisa --bvar hiagain
@arq5x
arq5x / inheritance_scenarios.md
Last active March 28, 2022 21:45
mendelian violations
| dad | mom | kid | Inheritance description |
| --- | --- | --- | --- |
| HOM_REF | HOM_REF | HOM_REF | Expected |
| HOM_REF | HOM_REF | HET | Mendelian violation (plausible de novo) |
| HOM_REF | HOM_REF | HOM_ALT | Mendelian violation (implausible de novo) |
| HOM_REF | HOM_ALT | HOM_REF | Mendelian violation (uniparental disomy) |
| HOM_REF | HOM_ALT | HET | Expected |
| HOM_REF | HOM_ALT | HOM_ALT | Mendelian violation (uniparental disomy) |
| HOM_REF | HET | HOM_REF | Expected |
| HOM_REF | HET | HET | Expected |
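The Expected/violation split in these rows can be reproduced with a small allele-based check. This is a sketch, not the gist author's method. It assumes a biallelic site, and the names `ALLELES` and `is_mendelian_consistent` are hypothetical. It only separates expected genotypes from violations and does not assign the de novo or uniparental disomy sub-labels.

```python
from itertools import product

# Alleles carried by each genotype class (0 = reference, 1 = alternate).
ALLELES = {"HOM_REF": (0, 0), "HET": (0, 1), "HOM_ALT": (1, 1)}


def is_mendelian_consistent(dad, mom, kid):
    """True if the kid's genotype can be built from one allele of each parent."""
    possible = {tuple(sorted(pair)) for pair in product(ALLELES[dad], ALLELES[mom])}
    return tuple(sorted(ALLELES[kid])) in possible


# The trio genotypes listed above, in the same order.
rows = [
    ("HOM_REF", "HOM_REF", "HOM_REF"),
    ("HOM_REF", "HOM_REF", "HET"),
    ("HOM_REF", "HOM_REF", "HOM_ALT"),
    ("HOM_REF", "HOM_ALT", "HOM_REF"),
    ("HOM_REF", "HOM_ALT", "HET"),
    ("HOM_REF", "HOM_ALT", "HOM_ALT"),
    ("HOM_REF", "HET", "HOM_REF"),
    ("HOM_REF", "HET", "HET"),
]

for dad, mom, kid in rows:
    ok = is_mendelian_consistent(dad, mom, kid)
    print(dad, mom, kid, "Expected" if ok else "Mendelian violation")
```

Enumerating one allele from each parent keeps the rule explicit: any child genotype outside that set is flagged, whatever the underlying cause.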
@peterklipfel
peterklipfel / spark_master.sh
Last active November 24, 2015 19:38
install spark ubuntu 14.04
sudo apt-get update
sudo apt-get install -y openjdk-7-jdk
sudo su -c 'echo "JAVA_HOME=\"/usr/lib/jvm/java-7-openjdk-amd64\"" >> /etc/environment'
cd /opt
sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.1-bin-hadoop2.tgz
sudo tar -zxf spark-1.0.1-bin-hadoop2.tgz
cd spark-1.0.1-bin-hadoop2
# assumes that hostname is set correctly, and the main interface is on eth0, add HOSTNAME to /etc/hosts
sudo su -c "echo `ifconfig eth0 2>/dev/null|awk '/inet addr:/ {print $2}'|sed 's/addr://'` `cat /etc/hostname` >> /etc/hosts"
sbin/start-master.sh
@arq5x
arq5x / autosomal-dominant.sh
Last active November 13, 2019 17:55
GEMINI Tutorial Commands
# assumes you have SSH'ed and qlogin'ed
cd thu
cd mydata
# slide 5
curl https://s3.amazonaws.com/gemini-tutorials/trio.trim.vep.vcf.gz > trio.trim.vep.vcf.gz
curl https://s3.amazonaws.com/gemini-tutorials/dominant.ped > dominant.ped
gemini load --cores 2 \
-v trio.trim.vep.vcf.gz \
-t VEP \
@tomhopper
tomhopper / plot_aligned_series.R
Last active June 25, 2023 17:36
Align multiple ggplot2 graphs with a common x axis and different y axes, each with different y-axis labels.
#' When plotting multiple data series that share a common x axis but different y axes,
#' we can just plot each graph separately. This suffers from the drawback that the shared axis will typically
#' not align across graphs due to different plot margins.
#' One easy solution is to reshape2::melt() the data and use ggplot2's facet_grid() mapping. However, there is
#' no way to label individual y axes.
#' facet_grid() and facet_wrap() were designed to plot small multiples, where both x- and y-axis ranges are
#' shared across all plots in the faceting. While the facet_ calls allow us to use different scales with
#' the \code{scales = "free"} argument, they should not be used this way.
#' A more robust approach is to use grid.draw() from the grid package, together with rbind() and ggplotGrob(), to create a grid of
#' individual plots where the plot axes are properly aligned within the grid.
@sterding
sterding / venn_pie_chart.r
Last active July 28, 2019 20:44
R script to generate a multi-layer pie chart (a "venn pieagram") to visualize the distribution of NGS reads across annotation regions
## data input (number of reads mapped to each category)
total=100
rRNA=5 # mapped to nuclear rRNA regions
mtRNA=7 # mapped to the mitochondrial genome
# the remaining reads are divided into the categories below, as in http://www.biomedcentral.com/1741-7007/8/149
intergenic=48
introns=12
exons=30
upstream=3
downstream=6
@mblondel
mblondel / multiclass_svm.py
Last active March 3, 2023 07:57
Multiclass SVMs
"""
Multiclass SVMs (Crammer-Singer formulation).
A pure Python re-implementation of:
Large-scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex.
Mathieu Blondel, Akinori Fujino, and Naonori Ueda.
ICPR 2014.
http://www.mblondel.org/publications/mblondel-icpr2014.pdf
"""
@slowkow
slowkow / bigWigRegions.sh
Created August 24, 2015 20:45
Print bigWig data for each region in a BED file
#!/usr/bin/env bash
# bigWigRegions
# Kamil Slowikowski
#
# Depends on bigWigToBedGraph: http://hgdownload.cse.ucsc.edu/admin/exe/
if [[ $# -ne 2 ]]; then
echo "bigWigRegions - Print bigWig data for each region in a bed file."
echo -e "usage:\n bigWigRegions in.bigWig in.bed > out.bedGraph"
echo " bigWigRegions in.bigWig <(zcat in.bed.gz) > out.bedGraph"