Last active
March 9, 2016 19:11
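Filters reads in a given fasta file by length and truncates the reads that are kept.
Usage: python truncate_seq_lens.py X Y Z A B
where
X is input fasta file
Y is the minimum sequence length (discards reads shorter than this)
Z is the maximum sequence length (discards reads longer than this)
A is target truncation length
B is output fasta file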
#!/usr/bin/env python
""" Usage
python truncate_seq_lens.py X Y Z A B
where
X is input fasta file
Y is the minimum sequence length (discards reads shorter than this)
Z is the maximum sequence length (discards reads longer than this)
A is target truncation length
B is output fasta file
"""
from sys import argv

# MinimalFastaParser is provided by PyCogent
from cogent.parse.fasta import MinimalFastaParser

f = open(argv[1], "U")
min_trunc_len = int(argv[2])  # reads shorter than this are discarded
max_trunc_len = int(argv[3])  # reads longer than this are discarded
trunc_len = int(argv[4])      # kept reads are cut to at most this many bases
out_f = open(argv[5], "w")

for label, seq in MinimalFastaParser(f):
    # Skip reads outside the allowed length range
    if len(seq) < min_trunc_len or len(seq) > max_trunc_len:
        continue
    # Slicing leaves reads shorter than trunc_len unchanged
    out_f.write(">%s\n%s\n" % (label, seq[0:trunc_len]))

f.close()
out_f.close()
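If PyCogent is not available (it targets Python 2), the same filter-then-truncate logic can be sketched with the standard library only. This is a minimal sketch, not the author's code: `parse_fasta` and `filter_and_truncate` are hypothetical helper names, and the parser is a simple stand-in that assumes well-formed FASTA input rather than a reimplementation of `MinimalFastaParser`.

```python
#!/usr/bin/env python3
import sys


def parse_fasta(lines):
    """Yield (label, sequence) pairs from an iterable of FASTA lines.

    Hypothetical stand-in for MinimalFastaParser; assumes every record
    starts with a '>' header line.
    """
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        elif label is not None:
            chunks.append(line)  # sequence may span several lines
    if label is not None:
        yield label, "".join(chunks)


def filter_and_truncate(records, min_len, max_len, trunc_len):
    """Drop reads outside [min_len, max_len], then cut the rest to trunc_len."""
    for label, seq in records:
        if min_len <= len(seq) <= max_len:
            yield label, seq[:trunc_len]


# Same five positional arguments as the original script: X Y Z A B
if __name__ == "__main__" and len(sys.argv) == 6:
    in_path, min_len, max_len, trunc_len, out_path = sys.argv[1:6]
    with open(in_path) as in_f, open(out_path, "w") as out_f:
        kept = filter_and_truncate(
            parse_fasta(in_f), int(min_len), int(max_len), int(trunc_len)
        )
        for label, seq in kept:
            out_f.write(">%s\n%s\n" % (label, seq))
```

As in the original script, the length check is applied to the full read before truncation, so a read longer than the maximum is discarded rather than cut down to size.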