aseetharam / notebook_launcher.py

Created June 24, 2014 16:42 — forked from timo/notebook_launcher.py

	"""==============================
	Branded IPython Notebook Launcher
	=================================

	Executing this module creates an overlay over IPython Notebook's own static
	files and templates, overriding them where needed, copies all example
	notebooks into a temporary folder, and launches the IPython Notebook server.

	You can use this to offer an interactive tutorial for your library/framework/...
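The "copy the example notebooks into a temporary folder" step can be sketched as follows. This is only an illustration under assumptions: `stage_notebooks` and the `notebook_tutorial_` prefix are hypothetical names, not taken from the gist, and the overlay and server-launch steps are left out on purpose.

```python
import shutil
import tempfile
from pathlib import Path


def stage_notebooks(examples_dir):
    """Copy every .ipynb file in examples_dir into a fresh temporary folder.

    Hypothetical helper for illustration; not the launcher's actual code.
    """
    workdir = Path(tempfile.mkdtemp(prefix="notebook_tutorial_"))
    for notebook in sorted(Path(examples_dir).glob("*.ipynb")):
        # copy2 keeps timestamps, so the originals stay untouched and comparable
        shutil.copy2(notebook, workdir / notebook.name)
    return workdir
```

Working on copies means a tutorial user can edit and break the notebooks freely without touching the shipped examples.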

aseetharam / heatmap_pca_metabolomics.R

Last active August 29, 2015 14:28

Clustering metabolomics data for different tissues from various locations

	library(Heatplus)
	library(vegan)
	library(RColorBrewer)
	library("gplots")
	all.data <- read.csv("C:/Users/Arun Seetharam/OneDrive/PostDoc/Projects/20150303_Perera_metabolomics/bloodroot_data_v2d.csv", quote="")
	row.names(all.data) <- all.data$ID
	all.data <- all.data[, -1]
	data.prop <- all.data/rowSums(all.data)
	scaleyellowred <- colorRampPalette(c("lightyellow", "red"), space = "rgb")(100)
	heatmap(as.matrix(data.prop), Rowv = NA, Colv = NA, col = scaleyellowred)

aseetharam / extract_seq.sh

Created September 11, 2015 01:34

aseetharam / spacer.sh

Last active April 22, 2016 20:31

to extract the sequence of interest

	## OPTION 1
	# convert the gzipped fastq file to a file with one sequence per line
	zcat input.fastq.gz | sed -n '2~4p' > single_line_sequences.txt
	# this asks sed to print the 2nd line and every 4th line after that (GNU sed syntax).
	perl -ne 'while ($_ =~ m/GTGTTCCCCGCGCCAGCGGGGATAAACC([ATCG]{32})/g) {print $1."\t"} {print "\n"}' single_line_sequences.txt
	# here, perl prints the 32 bases that follow each match of the repeat, using tab as the delimiter.
	# the input is the file created in the first step.
	# this generates the output for the first part.

	## OPTION 2
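For readers who prefer a script to a one-liner, Option 1 can be sketched in Python roughly like this. It is a minimal sketch, not the gist's method: `read_sequences` and `spacers_in` are hypothetical names, and it assumes a well-formed FASTQ file with exactly four lines per record.

```python
import gzip
import re

# Repeat sequence taken from the commands above; the 32 bases after it are the spacer.
REPEAT = "GTGTTCCCCGCGCCAGCGGGGATAAACC"
SPACER_RE = re.compile(REPEAT + r"([ATCG]{32})")


def read_sequences(fastq_gz_path):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    with gzip.open(fastq_gz_path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # same lines that sed -n '2~4p' selects
                yield line.rstrip("\n")


def spacers_in(sequence):
    """Return every non-overlapping 32-base spacer that follows the repeat."""
    return SPACER_RE.findall(sequence)
```

Printing `"\t".join(spacers_in(seq))` for each sequence would give one tab-delimited line per read, close to what the perl command writes.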

aseetharam / spacers_2.sh

Last active April 23, 2016 13:07

	## find the spacers
	grep -one "GTGTTCCCCGCGCCAGCGGGGATAAACC.\{32\}" example.list | \
	sed 's/:GTGTTCCCCGCGCCAGCGGGGATAAACC/\t/g' | \
	awk '{print ">"$1"\n"$2}' > example_spacers.fa
	# find the CRISPR sequences, extract the 32-base spacers adjacent to them, and print them in FASTA format
	# multiple spacers from the same sequence are printed with the same sequence id
	# to count:
	grep -c ">" example_spacers.fa
	# gives the total number of spacers
	grep ">" example_spacers.fa | sort | uniq | wc -l
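The same grep/sed/awk flow, including the two counts, can be sketched in Python. All function names here are hypothetical, and the reading of the second count (distinct sequence ids, i.e. input lines with at least one spacer) is my interpretation of the pipeline rather than something the gist states.

```python
import re

REPEAT = "GTGTTCCCCGCGCCAGCGGGGATAAACC"
# Any 32 characters after the repeat, mirroring the .\{32\} in the grep pattern.
SPACER_RE = re.compile(REPEAT + r"(.{32})")


def spacer_records(lines):
    """Return (line_number, spacer) pairs, numbering lines from 1 like grep -n."""
    records = []
    for number, line in enumerate(lines, start=1):
        for spacer in SPACER_RE.findall(line.rstrip("\n")):
            records.append((number, spacer))
    return records


def to_fasta(records):
    """Format records as FASTA, using the line number as the sequence id."""
    return "".join(f">{number}\n{spacer}\n" for number, spacer in records)


def count_summary(records):
    """Return (total spacers, number of distinct sequence ids)."""
    total = len(records)
    distinct_ids = len({number for number, _ in records})
    return total, distinct_ids
```

Keeping the records as tuples until the last step makes both counts a one-liner, without re-reading the FASTA file.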

aseetharam / spacers_3.sh

Last active April 25, 2016 14:08

third version

	# here is an example that writes separate files based on the number of occurrences of spacers:
	perl -ne 'while ($_ =~ m/GTGTTCCCCGCGCCAGCGGGGATAAACC([ATCG]{32})/g) {push(@matches, $1)}if(@matches){ print "@matches\t";undef @matches; print "\n"}' example.list | awk 'NF==2'
	GGTAACTTGCCGGAGGGCAGCGACCAGTTTAA GATGCACAGCCTGTTGCCATTCCGCCTCCTGT
	GCAACTCGGTCGCCGCATACACTATTCTCAGA GGAAAGCCTCTTTCCTTTGTTTACGATATTGC
	GGTTTTGCGCCATTCGATGGTGTCCGGGATCT GCGGCCCACGCTGGTTTGCCCCAGCAGGCGAA
	GCGCTGATTTCTTAATGTGATCGGTAGCACGT AAAAAATTATATTGACGCGGCGAGTTATAATA
	GTGCTCCAGTGGCTTCTGTTTCTATCAGCTGT GGGTGAACACTATCCCATATCACCAGCTCACC
	GGAATATTCAGCGATTTGCCCGAGCTTGCGAG GCGTGCCGCCCCCAGCAACAATACGCTACTGA

	# see the last part (awk 'NF==2'), here you are specifying that the number of fields should be exactly 2, you can change this to 1 to-
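Rather than re-running the command once per field count, the grouping can be done in a single pass. A minimal Python sketch, assuming one sequence per input line; `group_by_spacer_count` is a hypothetical name, not part of the gist.

```python
import re
from collections import defaultdict

REPEAT = "GTGTTCCCCGCGCCAGCGGGGATAAACC"
SPACER_RE = re.compile(REPEAT + r"([ATCG]{32})")


def group_by_spacer_count(lines):
    """Map each spacer count to the list of spacer lists that have that count."""
    groups = defaultdict(list)
    for line in lines:
        spacers = SPACER_RE.findall(line)
        if spacers:  # lines without any spacer are skipped, as in the perl one-liner
            groups[len(spacers)].append(spacers)
    return dict(groups)
```

`groups[2]` then holds what `awk 'NF==2'` would keep, and each key could be written to its own file by looping over `groups.items()`.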

aseetharam / mem_run

Created September 29, 2017 14:16

	canu \
	-d /project/rw_genome/arun_test/20170928_canu_c \
	-p 20170928_canu_c genomeSize=770m \
	maxMemory=128g \
	maxThreads=16 \
	gridOptionsJobName=canu_as1 \
	gridOptionsExecutive="--mem-per-cpu=5g --time=2:00:00" \
	gridOptionsCORMHAP="--mem-per-cpu=5g --time=1:00:00" \
	gridOptionsOBTMHAP="--mem-per-cpu=5g --time=1:00:00" \

aseetharam / read_limit

Created September 29, 2017 14:18

	canu \
	-d /project/rw_genome/arun_test/20170928_canu_b \
	-p 20170928_canu_b \
	genomeSize=770m \
	maxMemory=1096g \
	maxThreads=40 \
	minReadLength=1000 \
	corOutCoverage=35 \
	gridOptionsJobName=canu_as-b \
	corMhapFilterThreshold=0.0000000002 \

aseetharam / bedtools_cheatsheet.md

Last active January 5, 2021 16:44 — forked from raivivek/bedtools_cheatsheet.md

Bedtools cheatsheet

Bedtools Cheatsheet

General:

Tools	Description
flank	Create new intervals from the flanks of existing intervals.
slop	Adjust the size of intervals.
shift	Adjust the position of intervals.
subtract	Remove intervals based on overlaps between two files.
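To make the four descriptions concrete, here is a toy Python model of each operation on a single 0-based, half-open (start, end) interval. This is not how bedtools is implemented, the function signatures are made up for illustration, and edge cases at chromosome boundaries are simplified to plain clipping.

```python
def slop(start, end, pad, chrom_size):
    """Grow the interval by pad bases on both sides, clipped to the chromosome."""
    return max(0, start - pad), min(chrom_size, end + pad)


def flank(start, end, pad, chrom_size):
    """Return the two new intervals that sit just outside the original one."""
    left = (max(0, start - pad), start)
    right = (end, min(chrom_size, end + pad))
    return [left, right]


def shift(start, end, offset, chrom_size):
    """Move the whole interval by offset bases (negative moves it left)."""
    def clip(position):
        return min(max(0, position), chrom_size)
    return clip(start + offset), clip(end + offset)


def subtract(a, b):
    """Return the parts of interval a that are not covered by interval b."""
    a_start, a_end = a
    b_start, b_end = b
    if b_end <= a_start or b_start >= a_end:
        return [a]  # no overlap, a is kept whole
    pieces = []
    if b_start > a_start:
        pieces.append((a_start, b_start))
    if b_end < a_end:
        pieces.append((b_end, a_end))
    return pieces
```

The key contrast: `slop` keeps the original interval and extends it, while `flank` returns only the newly added regions.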

aseetharam / geneBed.sh

Last active March 30, 2023 16:44 — forked from dinovski/geneBed.sh

Add padding up and/or downstream of TSS coordinates

	#!/bin/bash

	BEDTOOLS=$(dirname "$(which bedtools)")
	FEATURECOUNTS=$(which featureCounts)
	R=$(which R)

	IDIR=$(pwd)
	DIR=$(pwd)/GeneBed_output

	GTF=gencode.vM30.annotation.gtf
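The idea of padding around a TSS depends on strand, which is easy to get wrong. Below is a small Python sketch of that logic only; it is not the script's code, `tss_window` is a hypothetical name, and it assumes BED-style 0-based, half-open gene coordinates (GTF coordinates are 1-based and would need converting first).

```python
def tss_window(start, end, strand, upstream, downstream):
    """Return a (start, end) window around the TSS of a gene interval.

    On the + strand the TSS is the first base of the gene and upstream means
    lower coordinates; on the - strand the TSS is the last base and upstream
    means higher coordinates.
    """
    if strand == "+":
        tss = start
        window_start, window_end = tss - upstream, tss + 1 + downstream
    elif strand == "-":
        tss = end - 1
        window_start, window_end = tss - downstream, tss + 1 + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # Clip at the chromosome start; the chromosome end is not checked here.
    return max(0, window_start), window_end
```

For example, `tss_window(1000, 5000, "+", 500, 100)` covers 500 bases before the TSS, the TSS base itself, and 100 bases after it.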

Arun Seetharam aseetharam