Created February 26, 2013 21:25
When reading in sequences, you may want to arrange or index them in some way (rather than just get one big list o' sequences). Fortunately, Biopython's SeqIO has a useful function for this: "to_dict" returns a dictionary whose values are the SeqRecords and whose keys are derived from those records (the record ID by default).
from Bio import SeqIO

# Parse the FASTA file and index the records by sequence ID (the default key).
with open("example.fasta", "r") as handle:
    record_dict = SeqIO.to_dict(SeqIO.parse(handle, "fasta"))
# You now have a dict whose keys are the sequence IDs, e.g. record_dict["gi:12345678"].
# You can index in other ways with "key_function".
# For example, to index by the description of each sequence:
with open("example.fasta", "r") as handle:
    record_dict = SeqIO.to_dict(SeqIO.parse(handle, "fasta"), key_function=lambda s: s.description)
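To make the role of the key function concrete, here is a minimal pure-Python sketch of the indexing idea. It is not Biopython's implementation: `Record` and `index_records` are hypothetical names invented for this illustration, and `Record` is only a stand-in for a SeqRecord with the few fields used here. The sketch assumes that a repeated key should be treated as an error (SeqIO.to_dict raises a ValueError in that case) rather than silently overwriting an earlier record.

```python
from collections import namedtuple

# Hypothetical stand-in for a SeqRecord, carrying only the fields used below.
Record = namedtuple("Record", ["id", "description", "seq"])


def index_records(records, key_function=None):
    """Build a dict of records keyed by key_function(record).

    Illustrative sketch only; defaults to keying by the record's id.
    """
    if key_function is None:
        key_function = lambda rec: rec.id
    indexed = {}
    for rec in records:
        key = key_function(rec)
        # Assumption: duplicate keys are an error, not a silent overwrite.
        if key in indexed:
            raise ValueError(f"Duplicate key {key!r}")
        indexed[key] = rec
    return indexed


# Made-up example records, not taken from any real file.
records = [
    Record("seq1", "seq1 first example", "ACGT"),
    Record("seq2", "seq2 second example", "GGCC"),
]

by_id = index_records(records)
by_description = index_records(records, key_function=lambda rec: rec.description)
```

The same duplicate-key concern applies when choosing a key function for real data: descriptions are free text and are less likely to be unique than IDs, so indexing by description may fail on files where two records share one.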