Created
January 20, 2015 21:38
-
-
Save rmzelle/dbb504d2e64030023b2f to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
##gff-version 3 | |
scaffold_28 prediction gene 1 402 0 + . ID=545184;Name=545184 | |
scaffold_28 prediction gene 805 981 0 - . ID=93782;Name=93782 | |
scaffold_28 prediction gene 2030 2721 0 + . ID=545205;Name=545205 | |
scaffold_28 prediction gene 3273 3545 0 - . Name=YOL159C-A;Synteny=no_synteny;SystematicGeneName=YOL159C-A;ID=38792 | |
scaffold_28 prediction gene 5318 5833 0 - . Name=YOL159C;Synteny=no_synteny;SystematicGeneName=YOL159C;ID=38793 | |
scaffold_28 prediction gene 6780 8600 0 - . Name=ENB1;Synteny=no_synteny;SystematicGeneName=YOL158C;StandardGeneName=ENB1;ID=38794 | |
scaffold_28 prediction gene 9698 11467 0 - . Name=IMA4;Synteny=no_synteny;SystematicGeneName=YJL221C;StandardGeneName=IMA4;ID=38795 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import gffutils | |
import os | |
def updateGenome(): | |
genomeGFF="genome-minimal-fail.gff" | |
genomeGFFdb = genomeGFF.replace(".gff", ".db") | |
#create database for genome | |
if os.path.isfile(genomeGFFdb): | |
db = gffutils.FeatureDB(genomeGFFdb) | |
else: | |
db = gffutils.create_db(genomeGFF, dbfn=genomeGFFdb, force=True, merge_strategy="merge") | |
#shift coordinates of features that follow the original integration site by deletion size | |
downstreamGenomeFeatures = db.region('scaffold_28:6001-11467', completely_within=True) | |
for feature in downstreamGenomeFeatures: | |
print(feature["Name"][0]) | |
feature.start += 10000 | |
feature.end += 10000 | |
print(feature.end) | |
db.update(downstreamGenomeFeatures, merge_strategy="replace") | |
return | |
updateGenome() |
Thanks a lot. Could you please check my new example, though, at https://gist.github.com/rmzelle/22fc9c537ddf8f40de4b
I'm trying to shift the last two features of the GFF file downstream by 10000 bp. While the generator shows that the coordinates change, somehow the changes don't make it into the database after running db.update()
.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I can't figure out how to submit a web-based pull request on a gist, so I'll just describe here:
downstreamGenomeFeatures
on line 15 is a generator. Like all generators (and just like reading a file object), iterating through it will consume it. You can't rewind. So on line 17 when you iterate over it, it consumes everything in the generator. Then when you calldb.update
and give itdownstreamGenomeFeatures
, by that time it's empty. So you get the error, "ValueError: No lines parsed -- was an empty file provided?" because no features were provided.One solution is to make your own generator that consumes
downstreamGenomeFeatures
but yields the modified feature. Then give that generator todb.update
. Like this: