# Extract fasta files by their descriptors stored in a separate file.
# Requires biopython
from Bio import SeqIO
import sys
import argparse

def getKeys(args):
    """Turns the input key file into a list. May be memory intensive."""
# -*- coding: utf-8 -*-
"""
This script takes the .hhr files output by HHSuite and
turns the quite verbose format into a fully tabulated
version with all the fields separated, one line per
file. Thus, the result can be viewed simply in Excel etc.
It requires the non-standard pandas module.
"""
#!/usr/bin/python
# This script is designed to take a GenBank file and 'slice out'/'subset'
# regions (genes/operons etc.) and produce a separate file. This can be
# done explicitly by telling the script which base sites to use, or it can
# 'decide' for itself by blasting a fasta of the sequence you're interested
# in against the GenBank you want to slice a record out of.
# Note, the script (obviously) does not preserve the index number of the
# bases from the original
# This script will calculate Shannon entropy from an MSA.
# Dependencies:
# Biopython, Matplotlib [optionally], math
"""
Shannon's entropy equation (latex format):
H=-\sum_{i=1}^{M} P_i\,\log_2\,P_i
Entropy is a measure of the uncertainty of a probability distribution (p1, ..., pM)
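As a rough illustration of the equation above, here is a minimal sketch of per-column Shannon entropy using only the standard library. It is not the script's own code: the names `column_entropy` and `msa_entropy` are hypothetical, the alignment is invented toy data, and treating gap characters as ordinary symbols is an assumption.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column given as a string."""
    counts = Counter(column)
    total = len(column)
    # H = -sum(P_i * log2(P_i)) over the M distinct symbols in the column.
    # Assumption: gaps ('-') are counted like any other symbol.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def msa_entropy(sequences):
    """Entropy of every column of an MSA given as equal-length strings."""
    # zip(*sequences) transposes rows (sequences) into columns (sites).
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Toy alignment, invented for illustration only.
alignment = ["ACGT", "ACGA", "ACTA", "AGTA"]
for site, h in enumerate(msa_entropy(alignment), start=1):
    print(site, round(h, 3))
```

A fully conserved column scores 0 bits, and a column with four equally frequent symbols scores 2 bits, the maximum for a four-letter alphabet.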