@chapmanb
Created July 24, 2016 18:09
Small Python script to subset FASTQ files
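"""Trim each read in a FASTQ file to its first num_bases bases.

Usage:
  bcbio_python subset_fastq.py in.fastq num_bases > out.fastq
"""
import sys

from Bio import SeqIO
from bcbio import utils


def trimmed(in_recs, trim):
    """Yield each record truncated to its first `trim` bases."""
    for rec in in_recs:
        # Slicing a SeqRecord also slices its per-base quality scores.
        yield rec[:trim]


trim = int(sys.argv[2])
# Write inside the `with` block: SeqIO.parse is lazy, so the handle must stay open.
with utils.open_gzipsafe(sys.argv[1]) as in_handle:
    in_recs = SeqIO.parse(in_handle, "fastq")
    SeqIO.write(trimmed(in_recs, trim), sys.stdout, "fastq")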
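
Where bcbio or Biopython are not installed, the same trimming can be approximated with the standard library alone. The following is a minimal sketch, not the script's own method: it assumes strict four-line FASTQ records (no wrapped sequence or quality lines), and `open_maybe_gzip` and `trim_fastq` are hypothetical helper names that do not come from bcbio or Biopython.

```python
import gzip
import io


def open_maybe_gzip(path):
    """Open a plain or gzip-compressed text file (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path)


def trim_fastq(in_handle, out_handle, num_bases):
    """Write each record truncated to its first num_bases bases.

    Assumes strict four-line FASTQ records: header, sequence, '+', quality.
    """
    while True:
        header = in_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = in_handle.readline().rstrip("\n")
        plus = in_handle.readline().rstrip("\n")
        qual = in_handle.readline().rstrip("\n")
        out_handle.write(header.rstrip("\n") + "\n")
        # Sequence and quality must be cut to the same length.
        out_handle.write(seq[:num_bases] + "\n")
        out_handle.write(plus + "\n")
        out_handle.write(qual[:num_bases] + "\n")


# Small in-memory demonstration with a single 8-base read.
example = io.StringIO("@read1\nACGTACGT\n+\nIIIIHHHH\n")
result = io.StringIO()
trim_fastq(example, result, 4)
```

On a real file, the call would look like `with open_maybe_gzip(path) as handle: trim_fastq(handle, sys.stdout, n)`. Unlike `SeqIO.parse`, this sketch does not validate the records, so malformed input passes through unchecked.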