Created
November 29, 2012 17:02
-
-
Save radaniba/4170401 to your computer and use it in GitHub Desktop.
In which we explore the ill-defined and undocumented: In BioRuby, alignments are equipped with several methods for obtaining consensus sequences. Unfortunately, these have terse descriptions which point you at the BioPerl documentation, with the added bon
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| # For demonstration purposes, let's create a very simple alignment, where | |
| # everything agrees xcept the last sequence which leads with a differing | |
| # character and ends with a gap: | |
| require 'bio' | |
| aln = Bio::Alignment.new(['acgt', 'acgt', 'acgt', 'ccg-']) | |
| # consensus_iupac produces a "true" consensus sequence across all members. | |
| # If sequences differ, the consensus sequence has an ambiguous character | |
| # that sums these differences: | |
| print aln.consensus_iupac() # "mcg?" | |
| # Note the first position is m (a or c), but where gaps exist, like the final | |
| # position, the final character is marked gnomically as ?. | |
| # An extra complication is the handling of gaps. The option :gap_mode | |
| # takes the values 0, 1 or -1, which correspond to treating gaps like a | |
| # character (the default), any gaps at that site appearing in the output | |
| # (i.e. effectively stripping no-gap characters) and stripping all gaps | |
| # before calculating the consensus: | |
| print aln.consensus_iupac(:gap_mode=>0) # "mcg?" | |
| print aln.consensus_iupac(:gap_mode=>1) # "mcg-" | |
| print aln.consensus_iupac(:gap_mode=>-1) # "mcgt" | |
| # consensus_string uses a threshold to calculate the consensus character | |
| # at that site: if the most frequent residue meets the threshold, it is | |
| # selected. Gap characters are treated as above: | |
| print aln.consensus_string(threshold=0.5) # "acgt" | |
| print aln.consensus_string(threshold=0.5, :gap_mode=>0) # "acgt" | |
| print aln.consensus_string(threshold=0.5, :gap_mode=>1) # "acg-" | |
| print aln.consensus_string(threshold=0.5, :gap_mode=>-1) # "acgt" |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment