"""
Run IgBLAST and output the AIRR-formatted result table
"""
# The code in here was copied from IgDiscover 0.15, https://github.com/NBISweden/IgDiscover-legacy
# and streamlined a bit to work stand-alone. Relevant files:
# - src/igdiscover/cli/igblastwrap.py
# - src/igdiscover/igblast.py
# - src/igdiscover/species.py
# - src/igdiscover/utils.py
#!/usr/bin/env python3
"""Quality trimming using a running sum from the 5' to 3' end"""
import sys
from argparse import ArgumentParser
import dnaio
def qual_trim_index(qualities_ascii, threshold):
    qualities = [ord(c) - 33 for c in qualities_ascii]
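The preview above stops after decoding the quality string, so the rest of the function is not visible. As a rough illustration of what a 5'-to-3' running sum can look like, the following sketch keeps the prefix of the read for which the cumulative sum of (quality minus threshold) is largest. The function name `running_sum_trim_index` and the tie-breaking rule are assumptions made for this example, not the original implementation.

```python
def running_sum_trim_index(qualities_ascii, threshold):
    """Return the length of the prefix to keep (hypothetical helper).

    Qualities are decoded as Phred+33. Walking from the 5' to the 3' end,
    a running sum of (quality - threshold) is maintained; the read is cut
    after the position where that sum is maximal. Ties favor the longer
    prefix. If the sum never reaches zero again, nothing is kept.
    """
    qualities = [ord(c) - 33 for c in qualities_ascii]
    running_sum = 0
    best_sum = 0
    best_index = 0
    for i, quality in enumerate(qualities):
        running_sum += quality - threshold
        if running_sum >= best_sum:
            best_sum = running_sum
            best_index = i + 1
    return best_index


if __name__ == "__main__":
    sequence = "ACGTACGT"
    qualities = "IIII####"  # four bases at Q40 followed by four at Q2
    cut = running_sum_trim_index(qualities, threshold=20)
    print(sequence[:cut], qualities[:cut])
```

Because the sum is taken over the whole read, a single low-quality base in the middle does not end the kept region as long as the following bases bring the sum back above its earlier maximum.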
# Split reads in a FASTQ file at adapter occurrences
#
# Run:
# cutadapt -O 100 --times=1000 -g MYADAPTERSEQ --info-file=info.txt -o /dev/null reads.fastq.gz
#
# Then:
# awk -F "\t" -f split.awk info.txt | gzip > split.fastq.gz
# Relevant info file fields:
#!/usr/bin/env python3
"""
Run with:
fasta2fastq < in.fasta > out.fastq
"""
import dnaio
import sys
with dnaio.open(sys.stdin.buffer) as inf:
    with dnaio.open(sys.stdout.buffer, mode="w", fileformat="fastq") as outf:
        for record in inf:
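The loop body is cut off in the preview. Since FASTA records carry no quality values, any FASTA-to-FASTQ conversion has to invent them. The sketch below shows the idea with the standard library only, without dnaio; the helper names `read_fasta` and `fasta_to_fastq` and the placeholder quality character `I` (Phred 40 in Phred+33 encoding) are assumptions for this example, and the original script may choose differently.

```python
def read_fasta(lines):
    """Yield (name, sequence) tuples from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line and name is not None:
            # Sequences may be wrapped over several lines.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(lines, quality_char="I"):
    """Yield one four-line FASTQ record (as a string) per FASTA record.

    Every base gets the same placeholder quality character.
    """
    for name, sequence in read_fasta(lines):
        yield f"@{name}\n{sequence}\n+\n{quality_char * len(sequence)}\n"


if __name__ == "__main__":
    example = [">read1 example\n", "ACG\n", "T\n", ">read2\n", "GG\n"]
    for record in fasta_to_fastq(example):
        print(record, end="")
```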
#!/bin/bash
# This script creates both
# - environment.osx.lock.yml and
# - environment.linux.lock.yml
# regardless of the operating system it is running on. The trick is
# temporarily setting the subdir and subdirs keys in .condarc to
# what would be appropriate for the other operating system.
#
# It assumes that there exists a (manually managed) environment.yml file
#!/bin/bash
# A workaround for an issue with Nextflow (which may actually be a bash bug),
# see <https://github.com/SciLifeLab/Sarek/issues/420>
#
# The problem is that Nextflow does not notice that a job has finished and
# hangs indefinitely.
#
# This script looks for zombie processes that are children of a script named
# .command.stub, and kills that script. This seems to let the pipeline continue
#!/usr/bin/env python3
"""
Mask low-quality bases in a FASTQ file with 'N'.
Adjust cutoff_front and cutoff_back below to use
different thresholds (currently: 20 at 5' end,
0 at 3' end).
Usage:
python3 qualmask.py input.fastq.gz > output.fastq
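The masking logic itself is not part of the preview. One plausible reading of the docstring is that the low-quality stretches at each end of a read are located with a BWA-style running sum and then overwritten with 'N' instead of being cut off. The sketch below follows that reading; the function names `quality_trim_bounds` and `mask_low_quality_ends` are hypothetical, and the actual script may mask differently.

```python
def quality_trim_bounds(qualities, cutoff_front, cutoff_back):
    """Return (start, stop) of the region to keep.

    At each end, a running sum of (cutoff - quality) is accumulated while
    moving inwards; the boundary is placed where that sum is maximal, and
    the scan stops as soon as the sum turns negative.
    """
    start = 0
    total = 0
    best = 0
    for i, quality in enumerate(qualities):
        total += cutoff_front - quality
        if total < 0:
            break
        if total > best:
            best = total
            start = i + 1

    stop = len(qualities)
    total = 0
    best = 0
    for i in reversed(range(len(qualities))):
        total += cutoff_back - qualities[i]
        if total < 0:
            break
        if total > best:
            best = total
            stop = i

    if start >= stop:
        # Nothing of the read survives.
        start, stop = 0, 0
    return start, stop


def mask_low_quality_ends(sequence, qualities_ascii, cutoff_front=20, cutoff_back=0):
    """Replace the low-quality ends of a read with 'N' (Phred+33 input)."""
    qualities = [ord(c) - 33 for c in qualities_ascii]
    start, stop = quality_trim_bounds(qualities, cutoff_front, cutoff_back)
    return "N" * start + sequence[start:stop] + "N" * (len(sequence) - stop)


if __name__ == "__main__":
    print(mask_low_quality_ends("ACGTACGT", "##IIII##"))
```

Masking rather than trimming keeps every read at its original length, which can matter for downstream tools that expect fixed-length input or rely on positions within the read.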
#!/bin/bash
set -euo pipefail
# Test the argument count first so that "$1" is never expanded when it is
# unset (which would abort the script under "set -u").
if [ $# -ne 1 ] || [ "$1" == -h ] || [ "$1" == --help ]; then
  echo \
"Usage:
samtools sort -O bam -T prefix ... | bambai BAMPATH
Read a sorted BAM file from standard input, write it to BAMPATH and
index it at the same time (creating BAMPATH.bai)."
from pysam import AlignmentFile
from pyfaidx import Fasta
def has_mismatch_in_interval(reference, bamfile, chrom, start, end):
    """
    Return whether there is a mismatch in the interval (start, end) in any read mapping to the given chromosome.
    reference -- a pyfaidx.Fasta object or something that behaves similarly
    """
    for column in bamfile.pileup(chrom, start, end):
"""
Plot multiple figures into a single PDF with matplotlib, using the
object-oriented interface.
"""
from matplotlib.backends.backend_pdf import FigureCanvasPdf, PdfPages
from matplotlib.figure import Figure
import numpy as np
with PdfPages('multi.pdf') as pages:
    for i in range(10):