#!/usr/bin/env python
import sys
import scipy.stats as stats

# The script reports two p-values:
#   1. the probability that, by random chance, the number of genes with both condition A and B is <= your observed number
#   2. the probability that, by random chance, the number of genes with both condition A and B is >= your observed number
# The second p-value is probably the one you want.
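The preview stops before the actual computation, so the following is only a minimal sketch of how the two p-values described in the comments could be obtained. It assumes the overlap between the two gene sets follows a hypergeometric distribution; the function name `overlap_pvalues` and its parameters are hypothetical and not taken from the script. It uses only the standard library; with scipy, `stats.hypergeom.cdf` and `stats.hypergeom.sf` should give the same quantities.

```python
from math import comb


def overlap_pvalues(total, n_a, n_b, n_both):
    """Return (P(X <= n_both), P(X >= n_both)) for a hypergeometric overlap X.

    total  : number of genes in the background (assumed input)
    n_a    : genes with condition A
    n_b    : genes with condition B
    n_both : genes observed with both conditions
    """
    denom = comb(total, n_b)

    def pmf(k):
        # Probability that exactly k of the n_b genes also carry condition A.
        return comb(n_a, k) * comb(total - n_a, n_b - k) / denom

    lo = max(0, n_a + n_b - total)  # smallest possible overlap
    hi = min(n_a, n_b)              # largest possible overlap
    p_le = sum(pmf(k) for k in range(lo, min(n_both, hi) + 1))
    p_ge = sum(pmf(k) for k in range(max(n_both, lo), hi + 1))
    return p_le, p_ge


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    p_le, p_ge = overlap_pvalues(total=20000, n_a=300, n_b=500, n_both=25)
    print("P(overlap <= observed):", p_le)
    print("P(overlap >= observed):", p_ge)
```

Both tails include the observed count, so the two values sum to slightly more than 1.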
# search PubMed for records matching "glioblastoma enhancer"
$ esearch -db pubmed -query "glioblastoma enhancer"
<ENTREZ_DIRECT>
  <Db>pubmed</Db>
  <WebEnv>NCID_1_539964707_130.14.18.34_9001_1422280320_2091337226_0MetA0_S_MegaStore_F_1</WebEnv>
  <QueryKey>1</QueryKey>
  <Count>97</Count>
  <Step>1</Step>
</ENTREZ_DIRECT>
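If the hit count is needed in a script rather than read by eye, the XML printed above can be parsed directly. This is a small sketch using the standard library; `parse_esearch` is a hypothetical helper name, and the example string is simply the output shown above.

```python
import xml.etree.ElementTree as ET

# The esearch output shown above, pasted as a string for illustration.
EXAMPLE = """<ENTREZ_DIRECT>
  <Db>pubmed</Db>
  <WebEnv>NCID_1_539964707_130.14.18.34_9001_1422280320_2091337226_0MetA0_S_MegaStore_F_1</WebEnv>
  <QueryKey>1</QueryKey>
  <Count>97</Count>
  <Step>1</Step>
</ENTREZ_DIRECT>"""


def parse_esearch(xml_text):
    """Turn an ENTREZ_DIRECT block into a dict, with Count as an int."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "").strip() for child in root}
    fields["Count"] = int(fields["Count"])
    return fields


if __name__ == "__main__":
    info = parse_esearch(EXAMPLE)
    print(info["Db"], info["Count"])
```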
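#!/usr/bin/env bash
# Put the coordinates in a three-column, BED-like file (chr, start, end) and pass it as the first argument.
# Note: samtools faidx regions are 1-based and inclusive; for a true 0-based BED file use $((start + 1)).
infile=$1
while read -r chr start end _
do
    # append the sequence of each region to test.fa
    samtools faidx ref.fasta "$chr:$start-$end" >> test.fa
done < "$infile"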
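The loop above passes the coordinates to `samtools faidx` unchanged. BED intervals are 0-based and half-open, while samtools region strings are 1-based and inclusive, so a conversion step may be needed depending on how the input file was made. The sketch below shows that conversion; `bed_to_region` and its `zero_based` flag are hypothetical names introduced here for illustration.

```python
def bed_to_region(line, zero_based=True):
    """Convert one 'chr start end' line into a samtools region string.

    With zero_based=True the start is shifted by one, because a BED start
    of 100 refers to the base that samtools calls position 101.
    """
    chrom, start, end = line.split()[:3]
    start = int(start) + 1 if zero_based else int(start)
    return f"{chrom}:{start}-{int(end)}"


if __name__ == "__main__":
    for bed_line in ["chr1\t100\t200", "chr2\t0\t50\tsome_name"]:
        print(bed_to_region(bed_line))
```

The resulting strings can be passed to `samtools faidx ref.fasta` in place of `$chr:$start-$end`.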
### This part is from the edX online Harvard course
## HarvardX: PH525.3x Advanced Statistics for the Life Sciences, week 1
library(devtools)
install_github("genomicsclass/GSE5859Subset")
library(GSE5859Subset)
data(GSE5859Subset)
dim(geneExpression)
# create a test file
$ time seq 1 10000000 > ten_million.txt
seq 1 10000000 > ten_million.txt  3.51s user 0.13s system 99% cpu 3.663 total
# it is a "big" file, 109M in size
$ ls -lh ten_million.txt
-rw-r--r-- 1 Tammy staff 109M Mar 22 20:49 ten_million.txt
$ man gshuf
# randomly select 1000 lines from it
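The preview ends before the sampling command itself. As an alternative illustration of the same task, here is a sketch of reservoir sampling, which picks k lines uniformly at random in a single pass without loading the whole file into memory. This is a swapped-in technique, not what `gshuf` is documented to do internally; `reservoir_sample` is a hypothetical helper name.

```python
import random


def reservoir_sample(lines, k, seed=None):
    """Return k items chosen uniformly at random from an iterable (Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = line
    return sample


if __name__ == "__main__":
    # Stand-in for iterating over ten_million.txt line by line.
    picked = reservoir_sample(range(1, 10_000_001), 1000, seed=42)
    print(len(picked), picked[:5])
```

To sample from the real file, pass an open file handle instead of the `range`, for example `reservoir_sample(open("ten_million.txt"), 1000)`.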
## Overview
# The central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed, regardless of the underlying distribution.
# I am going to draw 40 numbers from an exponential distribution 1000 times and calculate the mean
# of each draw (giving 1000 means), then use this simulation to test whether the
# distribution of the means is normal.
## start simulation
# number of simulations, sample size and lambda
nosim <- 1000
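The rest of the simulation is cut off in this preview. The sketch below reproduces the described experiment (1000 draws of 40 exponential values, one mean per draw) using only the Python standard library. The rate `lam=0.2` is an assumption, since the value of lambda is not visible in the source, and `simulate_means` is a hypothetical name.

```python
import random
import statistics


def simulate_means(nosim=1000, n=40, lam=0.2, seed=0):
    """Return nosim sample means, each over n exponential(lam) draws.

    lam=0.2 is an assumed rate; the original value is not shown.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(lam) for _ in range(n))
        for _ in range(nosim)
    ]


if __name__ == "__main__":
    lam, n = 0.2, 40
    means = simulate_means(n=n, lam=lam)
    # Under the CLT the means should centre on 1/lam with spread (1/lam)/sqrt(n).
    print("mean of means:", statistics.fmean(means), "theory:", 1 / lam)
    print("sd of means:  ", statistics.stdev(means), "theory:", (1 / lam) / n ** 0.5)
```

Comparing the simulated centre and spread with the theoretical values is only a first check; a histogram or Q-Q plot of `means` would show the shape of the distribution.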
options(stringsAsFactors=FALSE)
library(gdata)
library(parallel)
# list.files() takes a regular expression, not a glob, so match the ".bd" suffix explicitly
files = list.files(path='ctx/', pattern='\\.bd$')
meta = read.csv("WGS.coverage.csv")
mclapply(files, function(f) {
  # read one file, skipping '#' comment lines and dropping columns 12 to 14
  dat = read.delim(sprintf('ctx/%s', f), comment.char='#', header=FALSE, as.is=TRUE)[,-(12:14)]
  message(sprintf("File: %s, Dim: (%s)", f, paste(dim(dat), collapse=",")))
## get all the promoter sequences for human hg19 genome
## Author: Ming Tang (Tommy)
## Date: 04/30/2015
## load the libraries
library(GenomicRanges)
library(BSgenome.Hsapiens.UCSC.hg19)
BSgenome.Hsapiens.UCSC.hg19
# or
Hsapiens
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### I am going to demonstrate how to use the IPython notebook bash_kernel to do reproducible research.\n",
    "I can run command-line tools in the notebook and take notes along the way.\n",
    "Let's go to the directory first."
   ]
# This R script generates a TF or histone modification heatmap
# at certain genomic features (TSS, enhancers) from ChIP-seq data.
# The input matrix comes from the Homer software. As an alternative to R, use cluster3 to cluster and visualize with java Treeviewer.
# Generate the matrix with Homer: annotatePeaks.pl peak_file.txt hg19 -size 6000 -hist 10 -ghist -d TF1/ > outputfile_matrix.txt
# See several posts about heatmaps:
# http://davetang.org/muse/2010/12/06/making-a-heatmap-with-r/
# http://www.r-bloggers.com/r-using-rcolorbrewer-to-colour-your-figures-in-r/
# 08/20/13 by Tommy Tang
# it is such a simple script but took me several days to get it to work...I mean the desired