Parse a gzipped multi-FASTA file using Biopython ($ pip install biopython).
In [1]: from Bio import SeqIO
In [2]: import gzip
In [3]: handle = gzip.open("ftp.ncbi.nih.gov/snp/organisms/human_9606/rs_fasta/rs_ch1.fas.gz", "rt")
In [4]: record = next(SeqIO.parse(handle, "fasta"))
In [5]: record
Out[5]: SeqRecord(seq=Seq('tcattgatggacatttgggttggttccaggtctttgctattgcgagtagtgcca...att', SingleLetterAlphabet()), id='gnl|dbSNP|rs171', name='gnl|dbSNP|rs171', description='gnl|dbSNP|rs171 rs=171|pos=500|len=702|taxid=9606|mol="genomic"|class=1|alleles="A/G"|build=138|suspect=?', dbxrefs=[])
In [6]: record.seq
Out[6]: Seq('tcattgatggacatttgggttggttccaggtctttgctattgcgagtagtgcca...att', SingleLetterAlphabet())
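
The description string shown in Out[5] packs the SNP metadata into "key=value" pairs separated by "|". As a rough sketch (not part of Biopython; parse_dbsnp_description is a hypothetical helper name, and the layout is assumed from this one record only), the fields could be pulled apart with the standard library like this:

```python
def parse_dbsnp_description(description):
    """Split a dbSNP FASTA description into (record_id, fields).

    Hypothetical helper: assumes the layout seen in the record above,
    i.e. an ID, one space, then "|"-separated key=value pairs.
    """
    # Everything before the first space is the record ID.
    record_id, _, rest = description.partition(" ")
    fields = {}
    for item in rest.split("|"):
        key, sep, value = item.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        # Values stay as strings; surrounding double quotes are dropped.
        fields[key] = value.strip('"')
    return record_id, fields


# Description copied from the record shown above.
description = ('gnl|dbSNP|rs171 rs=171|pos=500|len=702|taxid=9606|'
               'mol="genomic"|class=1|alleles="A/G"|build=138|suspect=?')
record_id, fields = parse_dbsnp_description(description)
print(record_id)
print(fields["alleles"], fields["pos"])
```

In the session above this would be called as parse_dbsnp_description(record.description). All values come back as strings, so numeric fields such as pos and len need an explicit int() if they are used in arithmetic.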