@daler
Last active August 29, 2015 14:00
infer GTF genes, for biostars #99287
import gffutils
# Import the GTF file into a sqlite3 database.
# This only ever has to be done once.
db = gffutils.create_db("example.gtf", dbfn='example.gtf.db')
# In other scripts, you can connect to the database like this:
db = gffutils.FeatureDB('example.gtf.db')
# Note that gene and transcript have been inferred
print(list(db.featuretypes()))
# ['CDS', 'exon', 'gene', 'start_codon', 'stop_codon', 'transcript']
# Here's how to write genes out to file
with open('example_genes.gtf', 'w') as fout:
    for gene in db.features_of_type('gene'):
        fout.write(str(gene) + '\n')
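
As a rough illustration of what "inferred" means in the script above: when a GTF file has no explicit gene lines, a gene's extent can be derived from the child features that share its gene_id, by taking the smallest start and the largest end. The sketch below is plain Python and is not gffutils's actual implementation; the helper name `infer_gene_extents` and the tuple layout are hypothetical, chosen only for this example.

```python
def infer_gene_extents(exons):
    """Collapse exon records into one (chrom, strand, start, end) per gene.

    `exons` is an iterable of (gene_id, chrom, strand, start, end) tuples.
    This layout is an assumption made for the sketch, not a gffutils type.
    """
    extents = {}
    for gene_id, chrom, strand, start, end in exons:
        if gene_id not in extents:
            # First feature seen for this gene: start from its coordinates.
            extents[gene_id] = [chrom, strand, start, end]
        else:
            # Widen the gene to cover this feature as well.
            rec = extents[gene_id]
            rec[2] = min(rec[2], start)
            rec[3] = max(rec[3], end)
    return {gene_id: tuple(rec) for gene_id, rec in extents.items()}


# Made-up coordinates, for illustration only.
exons = [
    ("geneA", "chr1", "+", 100, 200),
    ("geneA", "chr1", "+", 300, 450),
    ("geneB", "chr1", "-", 50, 80),
]
genes = infer_gene_extents(exons)
```

Here `genes["geneA"]` spans both exons of geneA. In the gffutils script, the same idea is what makes `gene` and `transcript` appear among the feature types even though the input has no such lines.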