@walterst
Last active December 21, 2015 09:19
This script combines fastq index (barcode) reads, e.g., those created by using the parse_bc_reads_labels.py script. Usage: python combine_fastq_barcodes.py X Y Z where X is the first input fastq barcodes file, Y is the second fastq barcodes file, and Z is the output combined fastq barcodes file. This script assumes these are the raw data, i.e., …
#!/usr/bin/env python
from sys import argv
from itertools import izip
from cogent.parse.fastq import MinimalFastqParser
# Usage: python combine_fastq_barcodes.py X Y Z
# where X is the first input fastq barcodes file, Y is the second fastq
# barcodes file, and Z is the output combined fastq barcodes file.
# This script assumes these are the raw data, i.e., actual paired reads, but
# no explicit check is made. izip stops at the end of the shorter input, so
# one should at least confirm that both input files have the same number of
# lines, with a command such as:
# wc -l X
# where X is the fastq file name.
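# Open both barcode read files and the combined output file.
bc_reads1 = open(argv[1], "U")
bc_reads2 = open(argv[2], "U")
combined_bcs = open(argv[3], "w")

# Positions of the fields in each record yielded by MinimalFastqParser.
header_index = 0
sequence_index = 1
quality_index = 2

for bc_data1, bc_data2 in izip(MinimalFastqParser(bc_reads1, strict=False),
                               MinimalFastqParser(bc_reads2, strict=False)):
    # The label of the first read is reused for the combined record.
    curr_label = bc_data1[header_index].strip()
    bc_read1 = bc_data1[sequence_index]
    bc_qual1 = bc_data1[quality_index]
    bc_read2 = bc_data2[sequence_index]
    bc_qual2 = bc_data2[quality_index]
    # Concatenate sequences and quality strings in the same order.
    combined_bcs.write("@%s\n" % curr_label)
    combined_bcs.write("%s\n" % (bc_read1 + bc_read2))
    combined_bcs.write("+\n")
    combined_bcs.write("%s\n" % (bc_qual1 + bc_qual2))

bc_reads1.close()
bc_reads2.close()
combined_bcs.close()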
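
The script above leaves the record-count check to the user. The following is a minimal sketch of the same concatenation that makes the check itself, assuming Python 3, plain four-line FASTQ records, and no PyCogent. The names read_fastq and combine_barcodes are hypothetical helpers written for this illustration, not part of the original script or of any library.

```python
from itertools import zip_longest


def read_fastq(handle):
    """Yield (label, sequence, quality) tuples from a four-line-per-record FASTQ handle.

    Hypothetical stand-in for MinimalFastqParser; assumes unwrapped records.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        if not header.startswith("@"):
            raise ValueError("expected a FASTQ header line, got: %r" % header)
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip("\r\n")
        yield header[1:], seq, qual


def combine_barcodes(handle1, handle2, out):
    """Write combined barcode records to out and return how many were written.

    Raises ValueError if the two inputs hold different numbers of records.
    """
    count = 0
    for rec1, rec2 in zip_longest(read_fastq(handle1), read_fastq(handle2)):
        if rec1 is None or rec2 is None:
            raise ValueError("input files contain different numbers of records")
        label, seq1, qual1 = rec1
        _, seq2, qual2 = rec2
        # As in the script above, the first read's label is kept.
        out.write("@%s\n%s\n+\n%s\n" % (label, seq1 + seq2, qual1 + qual2))
        count += 1
    return count
```

Using zip_longest rather than zip is the design choice here: a missing record on either side shows up as None, so unequal inputs raise an error instead of being silently truncated.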