Sebastian seb-mueller

seb-mueller / fastq2blast.sh
Last active May 6, 2018 17:04
Contamination checking for fasta/fastq files
# Point BLAST at the local NCBI databases, then pass either -db nt or the full path -db /data/public_data/NCBI/NCBI_nt/nt
export BLASTDB=/data/public_data/NCBI/NCBI_all/
export PATH=/applications/UCSC-tools/:/applications/ncbi-blast+/ncbi-blast-2.2.30+/bin/:$PATH
# Subset the FASTQ file and convert the subset to FASTA
file=sample.fq.gz
filefa=${file%.fq.gz}_subset.fa
n=400000  # lines to keep; with 4-line FASTQ records this is 100000 reads, so keep n a multiple of 4
zcat "$file" | head -n "$n" > "${file%.fq.gz}_subset.fq"
fastqToFa "${file%.fq.gz}_subset.fq" "$filefa"
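The subsetting and conversion steps above can be tried on a tiny input without the UCSC tools installed. The following is only a sketch under stated assumptions: the demo file, the read names, and the `reads` variable are made up for illustration, the records are assumed to be strict 4-line FASTQ, and the `awk` one-liner is a stand-in for `fastqToFa`, not a description of how that tool works internally.

```shell
set -eu

# Hypothetical demo input: three 4-line FASTQ records, gzipped into a temp dir.
workdir=$(mktemp -d)
file="$workdir/demo.fq.gz"
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n@r3\nTTAA\n+\nIIII\n' | gzip > "$file"

reads=2            # number of reads to keep (illustrative value)
n=$((reads * 4))   # 4 lines per read, so head always cuts on a record boundary
filefa="${file%.fq.gz}_subset.fa"

# gzip -dc is the portable spelling of zcat.
# awk keeps line 1 (header, '@' replaced by '>') and line 2 (sequence) of each record.
gzip -dc "$file" | head -n "$n" |
  awk 'NR % 4 == 1 { print ">" substr($0, 2) } NR % 4 == 2 { print }' > "$filefa"

cat "$filefa"
```

Deriving `n` from a read count avoids a truncated final record, which would happen if `head` were given a line count that is not a multiple of 4.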
seb-mueller / upsetR_group_overlap.R
Created August 23, 2018 15:45
Since UpSetR doesn't report the elements of the plotted groups, this function was created to do just that. See https://github.com/hms-dbmi/UpSetR/issues/85
# source of this function: https://github.com/hms-dbmi/UpSetR/issues/85#issuecomment-327900647
fromList <- function (input) {
  # Same as original fromList()...
  elements <- unique(unlist(input))
  data <- unlist(lapply(input, function(x) {
    x <- as.vector(match(elements, x))
  }))
  data[is.na(data)] <- as.integer(0)
  data[data != 0] <- as.integer(1)
  data <- data.frame(matrix(data, ncol = length(input), byrow = FALSE))