@edsu
Created January 21, 2010 02:29
#!/usr/bin/env python
"""
Look for odd ampersands in URIs in O'Reilly data and print out the
context URI (where the assertions came from).
Output looks something like:
ed@inkdroid:~/bzr/oreilly-crawler$ ./amps.py
<http://oreilly.com/catalog/0636920001744/> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Sleepycat'].
<http://oreilly.com/catalog/9781593271749/> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Sleepycat'].
<http://oreilly.com/catalog/9781933952246/> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Sleepycat'].
"""
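import rdflib.graph
import rdflib.term

# Open the on-disk Sleepycat (Berkeley DB) store in the local "store" directory.
g = rdflib.graph.ConjunctiveGraph('Sleepycat')
g.open('store')

for s, p, o in g:
    # Flag the triple if its subject, or an object that is a URI, has an ampersand.
    has_amp = False
    if ' & ' in s:
        has_amp = True
    elif isinstance(o, rdflib.term.URIRef) and ' & ' in o:
        has_amp = True

    if has_amp:
        # Print every context (named graph) in which this triple is asserted.
        for context in g.contexts((s, p, o)):
            print(context)

g.close()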
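
The ampersand check above can be tried on its own, without a Sleepycat store on disk. The following is only a minimal sketch under stated assumptions: `URI` is a hypothetical stand-in for `rdflib.term.URIRef`, `has_odd_ampersand` is an illustrative helper name that does not appear in the script, and the `example.org` triples are made up for the demonstration.

```python
# Hypothetical stand-in for rdflib.term.URIRef: a str subclass that marks URIs.
class URI(str):
    pass


def has_odd_ampersand(s, o, needle=' & '):
    """Return True if the subject, or an object that is a URI, contains needle."""
    if needle in s:
        return True
    return isinstance(o, URI) and needle in o


# Made-up triples: only the first two should be flagged, because the
# ampersand in the third sits in a plain literal rather than in a URI.
triples = [
    (URI('http://example.org/a & b'), URI('http://example.org/p'), 'plain literal'),
    (URI('http://example.org/c'), URI('http://example.org/p'), URI('http://example.org/d & e')),
    (URI('http://example.org/f'), URI('http://example.org/p'), 'literal with & inside'),
]

flagged = [(s, p, o) for s, p, o in triples if has_odd_ampersand(s, o)]
for s, p, o in flagged:
    print(s, p, o)
```

Keeping the test in a small function makes it easy to change the substring being searched for without touching the loop over the store.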