Ming Tang crazyhottommy


Pandoc is a very useful tool for converting between common document formats.

First, install pandoc on macOS with Homebrew:

brew install pandoc

Pandoc needs pdflatex to convert documents to PDF.

Install MacTeX to get pdflatex:
download the installer package and double-click it to install.

human mouse
A1BG A1bg
A1CF A1cf
A2LD1 A2ld1
A2M A2m
A4GALT A4galt
A4GNT A4gnt
AAAS Aaas
AACS Aacs
AADAC Aadac
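The two-column table above pairs each human gene symbol with its mouse counterpart. A minimal Python sketch of loading such a table into a lookup dictionary follows. It assumes whitespace-separated columns with a single header line; `load_symbol_map` and the inline sample text are hypothetical, used only for illustration.

```python
import io


def load_symbol_map(handle):
    """Return a dict mapping human gene symbols to mouse gene symbols."""
    mapping = {}
    next(handle)  # skip the "human mouse" header line
    for line in handle:
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore blank or malformed lines
        human, mouse = fields
        mapping[human] = mouse
    return mapping


# A few rows copied from the table above; a real run would open the file instead.
sample = io.StringIO("human mouse\nA1BG A1bg\nA1CF A1cf\nA2M A2m\n")
symbol_map = load_symbol_map(sample)
print(symbol_map["A2M"])
```

Using an explicit table rather than changing the letter case programmatically avoids assuming that every mouse symbol is simply the capitalized form of the human one.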
#!/bin/bash
# function Extract for common file formats
function extract {
  if [ -z "$1" ]; then
    # display usage if no parameters given
    echo "Usage: extract <path/file_name>.<zip|rar|bz2|gz|tar|tbz2|tgz|Z|7z|xz|ex|tar.bz2|tar.gz|tar.xz>"
  else
    if [ -f "$1" ] ; then
      NAME=${1%.*}
cat ref.gene.txt
chr1 1736 4272 DDX11L1 +
chr1 4224 19233 WASH7P -
chr1 4224 7502 LOC100288778 -
chr1 7231 7299 MIR6859-1 -
chr1 7231 7299 MIR6859-2 -
chr1 7231 7299 MIR6859-3 -
chr1 7231 7299 MIR6859-4 -
## http://stackoverflow.com/questions/19876505/boxplot-show-the-value-of-mean
## Box plot with jittered points; the mean is drawn as a point and printed as a label.
## Requires ggplot2 >= 3.3.0 (`fun` replaces `fun.y`; `after_stat(y)` replaces `..y..`).
library(ggplot2)
ggplot(NLR.tidy, aes(x = NLR, y = ratio_value, color = NLR, fill = NLR)) +
  geom_point(position = position_jitterdodge(dodge.width = 0.9)) +
  geom_boxplot(fill = "white", alpha = 0.1, outlier.colour = NA,
               position = position_dodge(width = 0.9)) +
  coord_cartesian(ylim = c(-0.5, 15)) +
  stat_summary(fun = mean, geom = "point", colour = "black", size = 3, show.legend = FALSE) +
  stat_summary(fun = mean, colour = "red", geom = "text", show.legend = FALSE,
               vjust = -0.7, aes(label = round(after_stat(y), digits = 1)))
crazyhottommy / paired-sample2.py
Last active July 1, 2016 18:54
snakemake-paired-sample
aDict = {"B":"inputG1", "A":"inputG1", "C":"inputG2"}

rule all:
    input: ["C.bed", "A.bed", "B.bed"]

def get_files(wildcards):
    case = wildcards.case
    control = aDict[case]
    return [case + ".sorted.bam", control + ".sorted.bam"]
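The `get_files` input function above looks up the control sample for each case in `aDict` and returns the pair of BAM files. A small sketch that exercises that lookup outside Snakemake is shown here. It assumes a stand-in object with a `case` attribute: `SimpleNamespace` substitutes for Snakemake's wildcards object and is not part of the original workflow.

```python
from types import SimpleNamespace

# Case sample -> control (input) sample, as in the Snakefile preview.
aDict = {"B": "inputG1", "A": "inputG1", "C": "inputG2"}


def get_files(wildcards):
    """Return the case BAM followed by its matching control BAM."""
    case = wildcards.case
    control = aDict[case]
    return [case + ".sorted.bam", control + ".sorted.bam"]


# Stand-in for the wildcards object Snakemake would pass in (an assumption).
for case in ["A", "B", "C"]:
    print(case, get_files(SimpleNamespace(case=case)))
```

Keeping the pairing in a plain dictionary means a case without a registered control fails early with a `KeyError` rather than silently running unpaired.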
crazyhottommy / htseq_cnt_effective.md
Last active July 25, 2016 02:56
salmon_htseq_compare

Strip the version suffix (the digits after the dot) from the end of the gene IDs (GENCODE v19):

cat STAR_WT-30393468_htseq.cnt| sed -E 's/\.[0-9]+//' > WT_htseq.cnt
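Roughly the same substitution can be sketched in Python. This assumes each line holds a versioned gene ID followed by a tab and an integer count; `strip_version` and the two sample lines are hypothetical and only illustrate the pattern.

```python
import re


def strip_version(line):
    # Remove the first ".<digits>" run on the line, like sed without the g flag.
    return re.sub(r"\.[0-9]+", "", line, count=1)


# Illustrative lines in "gene_id<TAB>count" form (not taken from the real file).
lines = ["ENSG00000223972.4\t0", "ENSG00000227232.4\t12"]
for line in lines:
    print(strip_version(line))
```

The pattern is not anchored to the ID column, so it relies on the gene ID being the first place on each line where a dot is followed by digits.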

Transcript-to-gene mapping file:

library(EnsDb.Hsapiens.v75)
---
title: "lncRNA_heatmap"
author: "Ming Tang"
date: "July 28, 2016"
output: html_document
---
Read in the bigWig files for each mark. The bigWig files were generated from BAM files with deepTools.
```{r}
library(EnrichedHeatmap)
## devtools::install_github("stephenturner/msigdf")
library(msigdf)
library(dplyr)
library(clusterProfiler)
c2 <- msigdf.human %>%
  filter(collection == "c2") %>%
  select(geneset, entrez) %>%
  as.data.frame
data(geneList)
de <- names(geneList)[1:100]
Make a heatmap with a colored dendrogram using `ComplexHeatmap` and `dendsort`.
See help [here](https://bioconductor.org/packages/release/bioc/vignettes/ComplexHeatmap/inst/doc/s2.single_heatmap.html)
```r
##### make_hc: build a hierarchical clustering from a given distance_measure and linkage_method
make_hc <- function(x, distance_measure, linkage_method){
  if (distance_measure == "pearson"){
    ## cor() computes correlations between columns, so transpose x first to compare rows
    distance <- as.dist(1 - cor(t(x), method = "pearson"))
    hc <- hclust(distance, method = linkage_method)