Antonio Fernandez-Guerra genomewalker

--- pvec.h 2018-12-09 12:08:14.727171805 +0100
+++ pvec_mod.h 2018-12-09 12:09:36.186071505 +0100
@@ -539,22 +539,26 @@
static inline double gap_posterior( double v1, double v2 ) {
- assert( pgap_model.is_valid_ptr() );
- //return v1 / (v1 + v2);
+ assert( pgap_model.is_valid_ptr() );
+ //return v1 / (v1 + v2);
# Fetch SRA run info for every accession in SRC_sample_ids.txt, 16 queries in parallel
cat SRC_sample_ids.txt | parallel -j 16 'esearch -db sra -query {} | efetch -format runinfo' > stats
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel Core Processor (Haswell, no TSX, IBRS)
stepping : 1
microcode : 0x1
cpu MHz : 2294.684
cache size : 16384 KB
physical id : 0
Program call:
search QUERY DB results tmp --num-iterations 2 -e 1e-5 -c 0.4
MMseqs Version: 199d9b81f8cd6af9e66f97ef4dc0bd53c8fce12b
Sub Matrix blosum62.out
Add backtrace true
Alignment mode 2
E-value threshold 1e-05
Seq. Id Threshold 0
Seq. Id. Mode 0
library(Nonpareil)
library(tidyverse)

# Alternative location used on the cluster:
# f <- list.files(path = "/scratch/antonio/nonpareil/out/", pattern = "npo")
f <- list.files(path = "~/tara_nonpareil/", pattern = "npo")
sample.names <- sapply(strsplit(f, ".npo", fixed = TRUE), `[`, 1)

# Single curve for the first sample, then one curve per sample
Nonpareil.curve(file.path("~/tara_nonpareil", f[[1]]))
results_np_tara <- lapply(file.path("~/tara_nonpareil", f), Nonpareil.curve)
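The Nonpareil package can also load and plot all samples at once; a minimal sketch using Nonpareil.set(), assuming the same ~/tara_nonpareil/ files and the sample.names derived above:

# Sketch: read all .npo files into one set, plot the curves together
# and extract the per-sample summary (coverage, diversity, etc.).
np_set <- Nonpareil.set(file.path("~/tara_nonpareil", f), labels = sample.names, plot = TRUE)
np_summary <- summary(np_set)
head(np_summary)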
# Let’s get the non-merged reads ------------------------------------------
# Combine the per-sample mergePairs() tables, keep only the rejected pairs
# and summarise their match/mismatch/indel statistics.
purrr::map_df(merged, tidyr::extract, col = "sequence", into = "sequence") %>%
  filter(accept == FALSE) %>%
  select(nmatch, nmismatch, nindel) %>%
  skimr::skim()

# Re-run the merging with justConcatenate = TRUE so unmerged pairs are kept
# as forward read + 10 Ns + reverse-complemented reverse read.
concat <- mergePairs(dada_forward, derep_forward, dada_reverse, derep_reverse, justConcatenate = TRUE)
get_nonmerged <- function(X, merg = merg, conc = conc, fn = fn){
m <- merg[[X]]
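The preview cuts off the body of get_nonmerged(); purely as illustration, a hypothetical sketch of the general idea of combining mergePairs() output with a justConcatenate fallback (this is an assumption, not the gist's actual implementation):

# Hypothetical sketch, not the gist's code: for sample X, keep the pairs that
# merged properly and fall back to the concatenated version for the pairs
# that mergePairs() rejected (matched by their forward/reverse indices).
get_nonmerged_sketch <- function(X, merg, conc) {
  m <- merg[[X]]
  k <- conc[[X]]
  rejected <- paste(m$forward[!m$accept], m$reverse[!m$accept])
  rbind(m[m$accept, , drop = FALSE],
        k[paste(k$forward, k$reverse) %in% rejected, , drop = FALSE])
}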

18S magicblast

| sample     | dada2_input | filtered | denoised | merged | table | no_chimeras | perc_reads_survived |
|------------|-------------|----------|----------|--------|-------|-------------|---------------------|
| ERR1018543 | 6525        | 4002     | 3727     | 3629   | 3629  | 3076        | 47.1                |
| ERR1018546 | 2421        | 1513     | 1276     | 1229   | 1229  | 1095        | 45.2                |
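The table follows the standard dada2 read-tracking pattern; a minimal sketch of how such a table is usually assembled (object names out, dada_forward, merged, seqtab and seqtab_nochim are assumptions based on the dada2 tutorial, not taken from this gist):

# Sketch of the usual dada2 read-tracking table; object names are assumed.
getN <- function(x) sum(dada2::getUniques(x))
track <- cbind(out,                         # input and filtered counts from filterAndTrim()
               sapply(dada_forward, getN),  # denoised
               sapply(merged, getN),        # merged
               rowSums(seqtab),             # sequence table
               rowSums(seqtab_nochim))      # after chimera removal
colnames(track) <- c("dada2_input", "filtered", "denoised", "merged", "table", "no_chimeras")
track <- cbind(track,
               perc_reads_survived = round(100 * track[, "no_chimeras"] / track[, "dada2_input"], 1))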

Total counts: 4,171

|Kingdom | n|

# Detach every attached package except the base set of packages.
detachAllPackages <- function() {
  basic.packages <- c("package:stats", "package:graphics", "package:grDevices",
                      "package:utils", "package:datasets", "package:methods", "package:base")
  # Entries on the search path that are packages (start with "package:")
  package.list <- search()[grepl("^package:", search())]
  package.list <- setdiff(package.list, basic.packages)
  if (length(package.list) > 0) for (package in package.list) detach(package, character.only = TRUE)
}
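Typical usage, e.g. to reset the search path before re-running an analysis:

# Detach everything except the base packages, then attach only what is needed.
detachAllPackages()
library(tidyverse)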

Exploration of the ubiquitous EUs

This is not related to the niche breadth analysis; here we simply use the clusters that are found in every sample.

Exporting the clusters belonging to the components found in all samples

super_cl <- read_tsv("/Users/ufo/Downloads/all_cluster_components.tsv", col_names = TRUE, trim_ws = TRUE) %>%
  filter(component %in% clstrs_comp_eu_ubi) %>%
  select(clstr_name)
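A minimal sketch of the export step described above, assuming a hypothetical output file name (not taken from the gist):

# Hypothetical export: write the cluster names of the ubiquitous components to disk.
super_cl %>%
  readr::write_tsv("eu_ubiquitous_clusters.tsv")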
genomewalker / nog_id.sh
Create TARA subsampled (at nt level) read files
#!/bin/bash
# 1. We read a folder and take all fastq files
# 2. We quality trim and remove short sequences
# 3. Run kaiju and generate output
#. "${MODULESHOME}"/init/bash
set -x
set -e
main(){