Skip to content

Instantly share code, notes, and snippets.

@daler
Last active January 3, 2016 06:39
Show Gist options
  • Save daler/8424229 to your computer and use it in GitHub Desktop.
Save daler/8424229 to your computer and use it in GitHub Desktop.
"""
Illustrates how to use pybedtools to count motifs along sliding windows
of UTRs from a GFF file.
This uses example data shipped with pybedtools.
"""
import pybedtools
def renamer(f):
# BEDTools sees the featuretype of a GFF file as the "name" when using
# makewindows, so we need to set the featuretype to be the ID. Or whatever
# GFF attribute is of interest.
if f[2] == 'UTR':
f[2] = f['ID']
return f
# Prepare the UTRs file
utrs = pybedtools.example_bedtool('gdc.gff')\
.each(renamer)
# 20-bp windows, labeled by their name and an integer
windows = pybedtools.BedTool().window_maker(b=utrs, w=20, i='srcwinnum' )
motifs = pybedtools.example_bedtool('gdc.bed')
# Count up the number of features in each window.
# Last column of this file will have the number of motifs hitting that window
results = windows.intersect(motifs, c=True)
@yarden
Copy link

yarden commented Jan 14, 2014

If I try this I get:

Command was:

    bedtools makewindows -i srcwinnum -b "chr2L .   mRNA:xs2:UTR:41-70  41  70  0   +   .   ID=mRNA:xs2:UTR:41-70;Parent=mRNA:xs2
chr2L   .   CDS 71  130 0   +   .   ID=mRNA:xs2:CDS:71-130;Parent=mRNA:xs2;
chr2L   .   intron  131 170 0   +   .   ID=mRNA:xs2:intron:131-170;Parent=mRNA:xs2;
chr2L   .   CDS 171 200 0   +   .   ID=mRNA:xs2:CDS:171-200;Parent=mRNA:xs2;
chr2L   .   mRNA:xs2:UTR:201-220    201 220 0   +   .   ID=mRNA:xs2:UTR:201-220;Parent=mRNA:xs2
chr2L   .   exon    41  130 0   +   .   ID=mRNA:xs2:exon:41-130;Parent=mRNA:xs2;
chr2L   .   exon    171 220 0   +   .   ID=mRNA:xs2:exon:171-220;Parent=mRNA:xs2;
chr2L   .   mRNA    41  220 0   +   .   ID=mRNA:xs2;Parent=g2;
chr2L   .   CDS 161 230 0   -   .   ID=tRNA:t2:CDS:161-230;Parent=tRNA:t2;
chr2L   .   exon    161 230 0   -   .   ID=tRNA:t2:exon:161-230;Parent=tRNA:t2;
chr2L   .   tRNA    161 230 0   -   .   ID=tRNA:t2;Parent=t2;
chr2L   .   gene    41  220 0   +   .   ID=g2;
" -w 20

Error message was:
Error: The requested bed file (chr2L    .   mRNA:xs2:UTR:41-70  41  70  0   +   .   ID=mRNA:xs2:UTR:41-70;Parent=mRNA:xs2
chr2L   .   CDS 71  130 0   +   .   ID=mRNA:xs2:CDS:71-130;Parent=mRNA:xs2;
chr2L   .   intron  131 170 0   +   .   ID=mRNA:xs2:intron:131-170;Parent=mRNA:xs2;
chr2L   .   CDS 171 200 0   +   .   ID=mRNA:xs2:CDS:171-200;Parent=mRNA:xs2;
chr2L   .   mRNA:xs2:UTR:201-220    201 220 0   +   .   ID=mRNA:xs2:UTR:201-220;Parent=mRNA:xs2
chr2L   .   exon    41  130 0   +   .   ID=mRNA:xs2:exon:41-130;Parent=mRNA:xs2;
chr2L   .   exon    171 220 0   +   .   ID=mRNA:xs2:exon:171-220;Parent=mRNA:xs2;
chr2L   .   mRNA    41  220 0   +   .   ID=mRNA:xs2;Parent=g2;
chr2L   .   CDS 161 230 0   -   .   ID=tRNA:t2:CDS:161-230;Parent=tRNA:t2;
chr2L   .   exon    161 230 0   -   .   ID=tRNA:t2:exon:161-230;Parent=tRNA:t2;
chr2L   .   tRNA    161 230 0   -   .   ID=tRNA:t2;Parent=t2;
chr2L   .   gene    41  220 0   +   .   ID=g2;
) could not be opened. Exiting!
Traceback (most recent call last):
  File "/home/yarden/jaen/msi_code/msi/test.py", line 25, in <module>
    windows = pybedtools.BedTool().window_maker(b=utrs, w=20, i='srcwinnum' )
  File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.6.2-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 623, in decorated
    result = method(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.6.2-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 216, in wrapped
    check_stderr=check_stderr)
  File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.6.2-py2.7-linux-x86_64.egg/pybedtools/helpers.py", line 412, in call_bedtools
    'stderr, above', stderr)
pybedtools.helpers.BEDToolsError: ('Error message from BEDTools written to stderr, above', 'Error: The requested bed file (chr2L\t.\tmRNA:xs2:UTR:41-70\t41\t70\t0\t+\t.\tID=mRNA:xs2:UTR:41-70;Parent=mRNA:xs2\nchr2L\t.\tCDS\t71\t130\t0\t+\t.\tID=mRNA:xs2:CDS:71-130;Parent=mRNA:xs2;\nchr2L\t.\tintron\t131\t170\t0\t+\t.\tID=mRNA:xs2:intron:131-170;Parent=mRNA:xs2;\nchr2L\t.\tCDS\t171\t200\t0\t+\t.\tID=mRNA:xs2:CDS:171-200;Parent=mRNA:xs2;\nchr2L\t.\tmRNA:xs2:UTR:201-220\t201\t220\t0\t+\t.\tID=mRNA:xs2:UTR:201-220;Parent=mRNA:xs2\nchr2L\t.\texon\t41\t130\t0\t+\t.\tID=mRNA:xs2:exon:41-130;Parent=mRNA:xs2;\nchr2L\t.\texon\t171\t220\t0\t+\t.\tID=mRNA:xs2:exon:171-220;Parent=mRNA:xs2;\nchr2L\t.\tmRNA\t41\t220\t0\t+\t.\tID=mRNA:xs2;Parent=g2;\nchr2L\t.\tCDS\t161\t230\t0\t-\t.\tID=tRNA:t2:CDS:161-230;Parent=tRNA:t2;\nchr2L\t.\texon\t161\t230\t0\t-\t.\tID=tRNA:t2:exon:161-230;Parent=tRNA:t2;\nchr2L\t.\ttRNA\t161\t230\t0\t-\t.\tID=tRNA:t2;Parent=t2;\nchr2L\t.\tgene\t41\t220\t0\t+\t.\tID=g2;\n) could not be opened. Exiting!\n')

@yarden
Copy link

yarden commented Jan 14, 2014

My pybedtools is:

pybedtools-0.6.2-py2.7-linux-x86_64.egg

@daler
Copy link
Author

daler commented Jan 15, 2014

(upgrading to 0.6.4 and BEDTools 2.18 solved this)

@daler
Copy link
Author

daler commented Jan 15, 2014

BEDTools makewindows doesn't keep strand information, so if you need strand then you need a different strategy:

def stranded_windows(f, n=5):
    name = f['ID']
    for i, start in enumerate(range(0, len(f) - n, n)):
        f.start = start
        f.stop = start + n
        f.attrs['ID'] = '%s_%s' % (name, i)
        yield f

windows = pybedtools.example_bedtool('gdc.gff')\
    .filter(lambda x: x[2] == 'UTR')\
    .split(stranded_windows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment