Created April 29, 2019 20:16
+ set -e
+ MMSEQS=/vol/attached/opt/MMseqs2-7-4e23d/bin/mmseqs
+ DIR=/vol/attached/gtdb
+ SDIR=/vol/scratch/gtdb
+ export 'OMPI_MCA_btl=^openib'
+ OMPI_MCA_btl='^openib'
+ export OMP_NUM_THREADS=28
+ OMP_NUM_THREADS=28
+ RUNNER='mpirun --mca btl_tcp_if_include ens3 -n 10 --map-by ppr:1:node --bind-to none '
+ /vol/attached/opt/MMseqs2-7-4e23d/bin/mmseqs clusterupdate /vol/scratch/gtdb/marine_hmp_db_03112017 /vol/scratch/gtdb/mg_gtdb_orfs_db /vol/scratch/gtdb/marine_hmp_db_03112017_clu /vol/attached/gtdb/mg_gtdb_update/mg_gtdb_db_052019 /vol/attached/gtdb/mg_gtdb_update/mg_gtdb_db_052019_clu /vol/attached/gtdb/mg_gtdb_update/tmp --min-seq-id 0.3 -s 5 --cov-mode 0 -c 0.8
Program call:
clusterupdate /vol/scratch/gtdb/marine_hmp_db_03112017 /vol/scratch/gtdb/mg_gtdb_orfs_db /vol/scratch/gtdb/marine_hmp_db_03112017_clu /vol/attached/gtdb/mg_gtdb_update/mg_gtdb_db_052019 /vol/attached/gtdb/mg_gtdb_update/mg_gtdb_db_052019_clu /vol/attached/gtdb/mg_gtdb_update/tmp --min-seq-id 0.3 -s 5 --cov-mode 0 -c 0.8
MMseqs Version: GITDIR-NOTFOUND-MPI
Sub Matrix blosum62.out
Add backtrace false
Alignment mode 0
E-value threshold 0.001
Seq. Id Threshold 0.3
Seq. Id. Mode 0
Alternative alignments 0
Coverage threshold 0.8
Coverage Mode 0
Max. sequence length 65535
Compositional bias 1
Realign hit false
Max Reject 2147483647
Max Accept 2147483647
Include identical Seq. Id. false
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Score bias 0
Gap open cost 11
Gap extension cost 1
Threads 28
Verbosity 3
Sensitivity 5
K-mer size 0
K-score 2147483647
Alphabet size 21
Offset result 0
Split DB 0
Split mode 2
Split Memory Limit 0
Diagonal Scoring 1
Exact k-mer matching 0
Mask Residues 1
Minimum Diagonal score 15
Spaced Kmer 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq.id. and coverage false
Sort results 0
In substitution scoring mode, performs global alignment along the diagonal false
Mask profile 1
Profile e-value threshold 0.001
Use global sequence weighting false
Filter MSA 1
Maximum sequence identity threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select n most diverse seqs 1000
Omit Consensus false
Min codons in orf 1
Max codons in length 2147483647
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Add Orf Stop false
Number search iterations 1
Start sensitivity 4
Search steps 1
Run a seq-profile search in slice mode false
Strand selection 1
Disk space limit 0
Sets the MPI runner mpirun --mca btl_tcp_if_include ens3 -n 10 --map-by ppr:1:node --bind-to none
Remove Temporary Files false
Cluster mode 0
Max depth connected component 1000
Similarity type 2
Single step clustering true
Cascaded clustering steps 3
Kmer per sequence 21
Shift hash 5
Include only extendable false
Skip sequence with n repeating k-mers 0
Match sequences by their ID false
Recover Deleted false
===================================================
=== Update the new sequences with the old keys ====
===================================================
===================================================
====== Filter out the new from old sequences ======
===================================================
===================================================
======= Extract representative sequences ==========
===================================================
===================================================
======== Search the new sequences against =========
========= previous (rep seq of) clusters ==========
===================================================
Program call:
search /vol/attached/gtdb/mg_gtdb_update/tmp/NEWDB.newSeqs /vol/attached/gtdb/mg_gtdb_update/tmp/OLDDB.repSeq /vol/attached/gtdb/mg_gtdb_update/tmp/newSeqsHits /vol/attached/gtdb/mg_gtdb_update/tmp/search --sub-mat blosum62.out -a 0 --alignment-mode 0 -e 0.001 --min-seq-id 0.3 --seq-id-mode 0 --alt-ali 0 -c 0.8 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --realign 0 --max-rejected 2147483647 --max-accept 1 --add-self-matches 0 --db-load-mode 0 --pca 1 --pcb 1.5 --score-bias 0 --gap-open 11 --gap-extend 1 --threads 28 -v 3 -s 5 -k 0 --k-score 2147483647 --alph-size 21 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --spaced-kmer-mode 1 --rescore-mode 0 --filter-hits 0 --sort-results 0 --global-alignment 0 --mask-profile 1 --e-profile 0.001 --wg 0 --filter-msa 1 --max-seq-id 0.9 --qid 0 --qsc -20 --cov 0 --diff 1000 --omit-consensus 0 --min-length 1 --max-length 2147483647 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 0 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --use-all-table-starts 0 --id-offset 0 --add-orf-stop 0 --num-iterations 1 --start-sens 4 --sens-steps 1 --slice-search 0 --strand 1 --disk-space-limit 0 --remove-tmp-files 0
MMseqs Version: GITDIR-NOTFOUND-MPI
Sub Matrix blosum62.out
Add backtrace false
Alignment mode 0
E-value threshold 0.001
Seq. Id Threshold 0.3
Seq. Id. Mode 0
Alternative alignments 0
Coverage threshold 0.8
Coverage Mode 0
Max. sequence length 65535
Max. results per query 300
Compositional bias 1
Realign hit false
Max Reject 2147483647
Max Accept 1
Include identical Seq. Id. false
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Score bias 0
Gap open cost 11
Gap extension cost 1
Threads 28
Verbosity 3
Sensitivity 5
K-mer size 0
K-score 2147483647
Alphabet size 21
Offset result 0
Split DB 0
Split mode 2
Split Memory Limit 0
Diagonal Scoring 1
Exact k-mer matching 0
Mask Residues 1
Minimum Diagonal score 15
Spaced Kmer 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq.id. and coverage false
Sort results 0
In substitution scoring mode, performs global alignment along the diagonal false
Mask profile 1
Profile e-value threshold 0.001
Use global sequence weighting false
Filter MSA 1
Maximum sequence identity threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select n most diverse seqs 1000
Omit Consensus false
Min codons in orf 1
Max codons in length 2147483647
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Add Orf Stop false
Number search iterations 1
Start sensitivity 4
Search steps 1
Run a seq-profile search in slice mode false
Strand selection 1
Disk space limit 0
Sets the MPI runner mpirun --mca btl_tcp_if_include ens3 -n 10 --map-by ppr:1:node --bind-to none
Remove Temporary Files false
MPI Init...
Rank: 0 Size: 10
Program call:
prefilter /vol/attached/gtdb/mg_gtdb_update/tmp/NEWDB.newSeqs /vol/attached/gtdb/mg_gtdb_update/tmp/OLDDB.repSeq /vol/attached/gtdb/mg_gtdb_update/tmp/search/5596834897263602452/pref_5.0 --sub-mat blosum62.out -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 300 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0.8 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 28 -v 3 -s 5.0
MMseqs Version: GITDIR-NOTFOUND-MPI
Sub Matrix blosum62.out
Sensitivity 5
K-mer size 0
K-score 2147483647
Alphabet size 21
Max. sequence length 65535
Max. results per query 300
Offset result 0
Split DB 0
Split mode 2
Split Memory Limit 0
Coverage threshold 0.8
Coverage Mode 0
Compositional bias 1
Diagonal Scoring 1
Exact k-mer matching 0
Mask Residues 1
Minimum Diagonal score 15
Include identical Seq. Id. false
Spaced Kmer 1
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Spaced k-mer pattern
Local temporary path
Threads 28
Verbosity 3
Initialising data structures...
Using 28 threads.
Could not find precomputed index. Compute index.
Touch data file /vol/attached/gtdb/mg_gtdb_update/tmp/OLDDB.repSeq ... Done.
Substitution matrices...
Substitution matrices...
Use kmer size 7 and split 1 using Query split mode.
Needed memory (81983279223 byte) of total memory (243445174272 byte)
Target database: /vol/attached/gtdb/mg_gtdb_update/tmp/OLDDB.repSeq(Size: 32465074)
Index table k-mer threshold: 106
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
..............................................
Index table: Masked residues: 94184143
Index table: fill...
................................................................................................... 1 Mio. sequences processed | |
................................................................................................... 2 Mio. sequences processed | |
................................................................................................... 3 Mio. sequences processed | |
................................................................................................... 4 Mio. sequences processed | |
................................................................................................... 5 Mio. sequences processed | |
................................................................................................... 6 Mio. sequences processed | |
................................................................................................... 7 Mio. sequences processed | |
................................................................................................... 8 Mio. sequences processed | |
................................................................................................... 9 Mio. sequences processed | |
................................................................................................... 10 Mio. sequences processed | |
................................................................................................... 11 Mio. sequences processed | |
................................................................................................... 12 Mio. sequences processed | |
................................................................................................... 13 Mio. sequences processed | |
................................................................................................... 14 Mio. sequences processed | |
................................................................................................... 15 Mio. sequences processed | |
................................................................................................... 16 Mio. sequences processed | |
................................................................................................... 17 Mio. sequences processed | |
................................................................................................... 18 Mio. sequences processed | |
................................................................................................... 19 Mio. sequences processed | |
................................................................................................... 20 Mio. sequences processed | |
................................................................................................... 21 Mio. sequences processed | |
................................................................................................... 22 Mio. sequences processed | |
................................................................................................... 23 Mio. sequences processed | |
................................................................................................... 24 Mio. sequences processed | |
................................................................................................... 25 Mio. sequences processed | |
................................................................................................... 26 Mio. sequences processed | |
................................................................................................... 27 Mio. sequences processed | |
................................................................................................... 28 Mio. sequences processed | |
................................................................................................... 29 Mio. sequences processed | |
................................................................................................... 30 Mio. sequences processed | |
................................................................................................... 31 Mio. sequences processed | |
................................................................................................... 32 Mio. sequences processed | |
.............................................. | |
Index table: removing duplicate entries...
Index table init done.
DB statistic
Entries: 3435016814
DB Size: 30850100884 (byte)
Avg Kmer Size: 2.68361
Top 10 Kmers
GITSPKL 7839
GNGGTPS 5634
DGVIGSP 3658
LLGPGKT 3280
GNGGTPT 3266
IDSNVGT 3030
FLNSHRT 2799
SERSRET 2370
DLIHDNS 2326
ERRDSNV 2257
Min Kmer Size: 0
Empty list: 596886817
Time for index table init: 0h 2m 35s 107ms
Query database type: Aminoacid
Target database type: Aminoacid
Time for init: 0h 2m 39s 698ms
Query database: /vol/attached/gtdb/mg_gtdb_update/tmp/NEWDB.newSeqs(size=93723190)
Process prefiltering step 1 of 10
k-mer similarity threshold: 106
k-mer match probability: 0
Starting prefiltering scores calculation (step 1 of 10)
Query db start 1 to 9887005
Target db start 1 to 32465074