Last active
March 9, 2020 15:51
-
-
Save mdshw5/6c06a272a7fde2a5288214080210c8cf to your computer and use it in GitHub Desktop.
biostars 265811
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from pyfaidx import Fasta | |
# positions.txt: | |
# chr1 1 T | |
# chr1 100 C | |
# chr2 10 G | |
# ... | |
with open('positions.txt') as mut_table: | |
# mutable Fasta modifies input file in-place | |
# make sure you're editing a copy of the original file | |
with Fasta('input.fasta', mutable=True) as fasta: | |
for line in mut_table: | |
rname, pos, base = line.rstrip().split() | |
# convert 1-based to 0-based coordinates | |
fasta[rname][int(pos) - 1] = base |
Sorry to hear that you're having trouble. I've tried to reproduce your issue using the example you provided, however I do see that the sites are mutated when running the code in this gist.
$ cat input.fasta
>ACVR1_R206H_1
CACCGCTGGCGAGCCACTGTTCTTTAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAAGAATTC
>ACVR1_R206H_2
CACCGTCTGGCGAGCCACTGTTCTTCAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAGAATTC
>ACVR1_S290L_3
CACCGATCGTTGTACGACTATCTTTCATGAAATGGGATCGTTGTACGACTATCTTCAGCTTACTAGAATTC
$ cat positions.txt
ACVR1_R206H_1 1 T
ACVR1_R206H_2 1 T
ACVR1_S290L_3 1 T
ACVR1_R206H_1 4 T
ACVR1_R206H_2 4 T
ACVR1_S290L_3 4 T
$ cat mutate_guides.py
from pyfaidx import Fasta
# positions.txt:
# chr1 1 T
# chr1 100 C
# chr2 10 G
# ...
with open('positions.txt') as mut_table:
# mutable Fasta modifies input file in-place
# make sure you're editing a copy of the original file
with Fasta('input.fasta', mutable=True) as fasta:
for line in mut_table:
rname, pos, base = line.rstrip().split()
# convert 1-based to 0-based coordinates
fasta[rname][int(pos) - 1] = base
$ python --version
Python 3.6.5
$ pip freeze
pyfaidx==0.5.8
six==1.14.0
Running the script produces the expected output:
$ python mutate_guides.py
$ cat input.fasta
>ACVR1_R206H_1
TACTGCTGGCGAGCCACTGTTCTTTAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAAGAATTC
>ACVR1_R206H_2
TACTGTCTGGCGAGCCACTGTTCTTCAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAGAATTC
>ACVR1_S290L_3
TACTGATCGTTGTACGACTATCTTTCATGAAATGGGATCGTTGTACGACTATCTTCAGCTTACTAGAATTC
Hopefully this can give you a starting point to figure out what's going wrong. Do note that the Fasta(mutable=True)
does mean that the FASTA file will be edited in-place, and so you'll want to run the script on a copy of your guide sequence file so that you can compare the changes to the original.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Thanks @mdshw5 for getting back to me. I actually have a slightly different problem.
`>ACVR1_R206H_1
CACCGCTGGCGAGCCACTGTTCTTTAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAAGAATTC
I have a fasta index of around 6000 guides like above where i'd like to mutate a C in a specific position to a T. I thought I could use the pyfaidx package but the output does not change from the original input.