@rtroncy
Created July 22, 2014 20:05
SPARQL query to get items from the Europeana endpoint matching a search term present in the creator/title/description properties
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?item ?mediaurl ?poster ?date1 ?date2 ?description ?publisher1 ?publisher2 WHERE {
  # Descriptive metadata is read from the proxy; all four properties are required
  ?proxy ore:proxyFor ?item ;
         dc:title ?title ;
         dc:creator ?creator ;
         dc:description ?description ;
         dc:type ?type .
  # Media URL, when available
  OPTIONAL {
    ?proxy ore:proxyIn ?provider .
    ?provider edm:object ?mediaurl .
  }
  # Preview image, when available
  OPTIONAL {
    ?proxy ore:proxyIn ?providerAux .
    ?aggregation ore:aggregates ?providerAux . # the ORE vocabulary defines ore:aggregates, not ore:aggregate
    ?aggregation edm:preview ?poster .
  }
  OPTIONAL { ?proxy dc:date ?date1 . }
  OPTIONAL { ?proxy dct:created ?date2 . }
  OPTIONAL { ?proxy dc:publisher ?publisher1 . }
  OPTIONAL { ?proxy dc:source ?publisher2 . }
  # Case-insensitive match on creator, title, or description
  FILTER (REGEX(STR(?creator), "martena", "i") ||
          REGEX(STR(?title), "martena", "i") ||
          REGEX(STR(?description), "martena", "i"))
}
LIMIT 10
@boyan-simeonov

Your query uses three REGEX filters, each of which has to scan every candidate object in the dataset, so it will be very slow. We have full-text search (FTS) indexes on the objects' titles and texts, and they can be queried through SPARQL. To search the title index, use:

PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT DISTINCT ?uri ?title
WHERE {
  ?uri luc:titleIndex "martena*" ;
       dc:title ?title .
}
The other option is to search across all texts:

PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT DISTINCT ?uri ?title
WHERE {
  ?uri luc:full "martena*" ;
       dc:title ?title .
}
These queries are only examples; adapt them to return the results you need.
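
As one possible adaptation, the full-text match could replace the three REGEX filters in the original query while still returning the item. This is only a sketch and is untested against the endpoint: it assumes that `luc:full` binds the proxy resources (the nodes that carry `dc:title`), which the examples above suggest but do not confirm, and it keeps only a few of the original result columns for brevity.

```sparql
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>

SELECT DISTINCT ?item ?title ?description
WHERE {
  # Assumption: the full-text index returns proxies, not items
  ?proxy luc:full "martena*" ;
         dc:title ?title ;
         ore:proxyFor ?item .
  # Description is optional here so that title-only matches are kept
  OPTIONAL { ?proxy dc:description ?description . }
}
LIMIT 10
```

If the index turns out to bind items rather than proxies, the `ore:proxyFor` triple would need to be reversed so that the proxy is looked up from the matched item.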
