Created March 31, 2014 14:23
Biostars 96573 solution
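from pyfaidx import Fasta

# regions.txt holds whitespace-delimited rows (sample rows follow the script).
with open("regions.txt") as regions, Fasta("sequence.fasta") as fasta:
    for line in regions:
        fields = line.rstrip().split()
        # Columns 5-7: sequence name, start, end. Columns 10-11: repeat name and class.
        rname, start, end = fields[4:7]
        repeat = ' '.join(fields[9:11])
        # Coordinates are 1-based and inclusive, so the slice stops at int(end),
        # not int(end)-1, which would drop the last base of every region.
        seq = fasta[rname][int(start)-1:int(end)]
        print(seq.name + repeat)
        print(seq.seq)
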
21 3.6 0.0 0.0 gi|411024077|gb|CM001634.1| 11554 11581 (28623556) + AT_rich Low_complexity 1 28 (0) 1
22 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 12224 12245 (28622892) + AT_rich Low_complexity 1 22 (0) 2
22 8.0 0.0 0.0 gi|411024077|gb|CM001634.1| 12609 12658 (28622479) + AT_rich Low_complexity 1 50 (0) 3
22 5.6 0.0 0.0 gi|411024077|gb|CM001634.1| 12691 12726 (28622411) + AT_rich Low_complexity 1 36 (0) 4
26 3.0 0.0 0.0 gi|411024077|gb|CM001634.1| 13965 13997 (28621140) + AT_rich Low_complexity 1 33 (0) 5
222 3.7 0.0 0.0 gi|411024077|gb|CM001634.1| 16373 16399 (28618738) + (TA)n Simple_repeat 1 27 (0) 6
219 12.5 0.0 0.0 gi|411024077|gb|CM001634.1| 17247 17286 (28617851) + (G)n Simple_repeat 1 40 (0) 7
198 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 20074 20095 (28615042) + (CAAAAA)n Simple_repeat 3 24 (0) 8
189 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 20344 20364 (28614773) + (TA)n Simple_repeat 1 21 (0) 9
23 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 21437 21459 (28613678) + AT_rich Low_complexity 1 23 (0) 10
198 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 22420 22441 (28612696) + (GAA)n Simple_repeat 2 23 (0) 11
189 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 27191 27211 (28607926) + (TTTTTG)n Simple_repeat 4 24 (0) 12
21 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 27481 27501 (28607636) + AT_rich Low_complexity 1 21 (0) 13
24 16.1 0.0 0.0 gi|411024077|gb|CM001634.1| 33144 33174 (28601963) + AT_rich Low_complexity 1 31 (0) 14
186 4.3 0.0 0.0 gi|411024077|gb|CM001634.1| 34503 34525 (28600612) + (CGAAT)n Simple_repeat 2 24 (0) 15
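
The rows above look like 1-based, inclusive intervals: in the first row, 11554 to 11581 covers 28 bases, matching the repeat positions 1 to 28 on the same line. A minimal sketch of the matching Python slice arithmetic follows. It needs no FASTA file, and the helper names `parse_region` and `to_slice` are hypothetical, introduced only for illustration.

```python
def parse_region(line):
    """Split one whitespace-delimited row into (name, start, end, repeat label)."""
    fields = line.split()
    rname = fields[4]
    start, end = int(fields[5]), int(fields[6])
    repeat = " ".join(fields[9:11])
    return rname, start, end, repeat


def to_slice(start, end):
    """Convert a 1-based inclusive interval to a 0-based half-open Python slice."""
    return slice(start - 1, end)


# First sample row from the data above.
row = ("21 3.6 0.0 0.0 gi|411024077|gb|CM001634.1| 11554 11581 "
       "(28623556) + AT_rich Low_complexity 1 28 (0) 1")

rname, start, end, repeat = parse_region(row)
region = to_slice(start, end)
span = region.stop - region.start  # number of bases the slice would return

print(rname, repeat, span)
```

On this row the slice spans 28 bases. Slicing with `int(end)-1` as the stop would return 27.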