This tutorial introduces the basics of Wikidata, an offshoot of Wikipedia. Since its launch on October 30, 2012, Wikidata has developed rapidly and has become a repository of all kinds of facts across the Wikipedia language editions. In what follows, you will become familiar with the purpose and goals of Wikidata, including how to contribute to its development.
- Articulate the short and long term goals of Wikidata
- Learn various ways to contribute to Wikidata
- Write simple SPARQL queries to retrieve data from Wikidata
- Understand the potential of Wikidata for scholarly communications
To understand the problems that Wikidata addresses, you need to reflect on the challenges posed by Wikipedia's 295 language editions. Ideally, articles in these language editions should be linked together. So, if you create an article about a Portuguese scholar on the English Wikipedia, you'd want to connect that article to any existing articles in other language editions: not only the Portuguese edition but also the Chinese, French, and so on. In the past, you had to create these language links manually. As might be expected, synchronization between editions was sometimes spotty, meaning that an equivalent article might exist in another language edition without being linked from yours. The Wikipedia article on Wikidata lists three phases of development:
- Phase One set out to connect articles on the same topic across language editions. Rather than asking editors to link from every language edition to every other language edition, Wikidata centralized the process. In other words, the Wikidata project moved the relationship between Wikipedia editions from a matrix model to a hub-and-spoke model.
- Phase Two connected the statements made on the different language editions. If, for instance, you put up an infobox with a photograph of an intellectual figure on one language edition, you'd like editors of other editions to have access to that information. This phase of the Wikidata project makes that possible and is gradually extending the range of statements that you can query across language editions.
- Phase Three will make it possible to share lists across language editions, such as this List of female scientists in the 21st century, and to make sure that they get updated automatically when new information is entered.
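To get a feel for why the move from a matrix model to a hub-and-spoke model matters, here is a minimal Python sketch that counts the links needed for a single topic under each model. The function names are illustrative, and the sketch assumes the simplest possible case: one article per edition and one directed link per pair of editions.

```python
def matrix_links(editions: int) -> int:
    """Directed links needed when every edition links to every other edition."""
    return editions * (editions - 1)


def hub_links(editions: int) -> int:
    """Links needed when every edition points to one central Wikidata item."""
    return editions


if __name__ == "__main__":
    n = 295  # the number of language editions cited in this tutorial
    print(f"matrix model: {matrix_links(n):,} links per topic")
    print(f"hub-and-spoke model: {hub_links(n):,} links per topic")
```

Under these assumptions, the matrix model grows quadratically with the number of editions, while the hub-and-spoke model grows linearly.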
Now that you have a sense of what Wikidata is about, let's make some contributions. Wikidata is not just about automating the exchange of information. Just like Wikipedia, it's freely editable and you can contribute information to it directly.
The first step is to set up an account (though you can also edit Wikidata without any account). You create a username for Wikidata just like you do on Wikipedia. In fact, if you have signed into Wikipedia already, you already have a username on Wikidata. If you need to create an account, create one here. You create your user page and talk page just as you would on Wikipedia. If you're interested in editing labels in languages other than English, you can also add language codes using the Babel extension to Wikidata: {{#babel:en-N|fr-3|de-5}}
Let's start by taking an introductory tour of how to edit Wikidata statements. An item, according to Wikidata, is "a real-world object, concept, event that is given an identifier." Items have identifiers like Q30. Wikidata predicates, or "properties," have identifiers like P571, which records the inception date of an item. While these identifiers look weird to speakers of English (or any other natural language), they provide a way of identifying information across editions without privileging any particular linguistic community. In other words, we all need to learn Wikidata's version of Esperanto to make statements in Wikidata. Fortunately, Wikidata provides tools to help.
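As a rough illustration of how these identifiers work, the following Python sketch tells item identifiers (Q...) apart from property identifiers (P...) and expands them into Wikidata's concept URIs under `http://www.wikidata.org/entity/`. The helper names `identifier_kind` and `entity_uri` are made up for this example; they are not part of any Wikidata library.

```python
import re

# Base of Wikidata's concept URIs for both items and properties.
ENTITY_BASE = "http://www.wikidata.org/entity/"

# A "Q" (item) or "P" (property) followed by a positive integer.
_ID_PATTERN = re.compile(r"^([QP])([1-9][0-9]*)$")


def identifier_kind(identifier: str) -> str:
    """Return "item" for Q-identifiers and "property" for P-identifiers."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a Q or P identifier: {identifier!r}")
    return "item" if match.group(1) == "Q" else "property"


def entity_uri(identifier: str) -> str:
    """Expand an identifier such as Q30 into its full concept URI."""
    identifier_kind(identifier)  # raises ValueError if malformed
    return ENTITY_BASE + identifier


if __name__ == "__main__":
    for example in ("Q30", "P571"):
        print(example, identifier_kind(example), entity_uri(example))
```

In SPARQL queries, the `wd:` prefix stands in for this same base URI, so `wd:Q30` and the expanded URI refer to the same item.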
The best part about Wikidata is that you don't need to look up information across its various pages. You can write simple (or complex) queries to pinpoint precisely the data you want to receive. The query language for Wikidata is called SPARQL, short for the SPARQL Protocol and RDF Query Language.
Try these SPARQL queries out by entering them into the Wikidata Query Service.
#Women pastors
#Forked from http://tinyurl.com/zjnpbm5
SELECT DISTINCT ?women ?womenLabel WHERE {
?women wdt:P31 wd:Q5 .
?women wdt:P21 wd:Q6581072 .
?women wdt:P106 wd:Q152002 .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
#defaultView:ImageGrid
SELECT ?paintings ?paintingsLabel ?image WHERE {
?paintings wdt:P31 wd:Q3305213.
?paintings wdt:P180 wd:Q63070.
OPTIONAL { ?paintings wdt:P18 ?image. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
#Paintings of Jesus vs. Mary
SELECT ?Jesus ?Mary WHERE {
{
SELECT (COUNT(?paintings) AS ?Jesus) WHERE {
?paintings wdt:P31 wd:Q3305213.
?paintings wdt:P180 wd:Q302.
}
}
{
SELECT (COUNT(?paintings) AS ?Mary) WHERE {
?paintings wdt:P31 wd:Q3305213.
?paintings wdt:P180 wd:Q345.
}
}
}
#Religious Buildings in New York City
#defaultView:Map
SELECT DISTINCT ?institutionLabel ?image ?coor WHERE {
?entity wdt:P279* wd:Q24398318.
wd:Q60 wdt:P150* ?location.
?institution wdt:P31 ?entity.
?institution wdt:P131 ?location.
?institution wdt:P625 ?coor.
OPTIONAL { ?institution wdt:P18 ?image. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
#Religious Denomination Timeline
#Forked from http://tinyurl.com/jrv757r
#defaultView:Timeline
SELECT ?denomination ?denominationLabel ?founding (SAMPLE(?image) AS ?image) WHERE
{
?denomination (wdt:P31/wdt:P279*) wd:Q13414953.
?denomination wdt:P571 ?founding.
OPTIONAL { ?denomination wdt:P18 ?image. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?denomination ?denominationLabel ?founding
LIMIT 20
#Occupations of Graduates of Princeton Theological Seminary
#defaultView:BarChart
SELECT ?occupationLabel (COUNT(?student) AS ?studentCount) WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
?student wdt:P69 wd:Q909696.
?student wdt:P106 ?occupation.
}
GROUP BY ?occupationLabel
#Religious Orders in Wikidata
#Forked from http://tinyurl.com/zo2cl2c
#defaultView:BubbleChart
SELECT ?order ?orderLabel (COUNT(*) AS ?count) WHERE {
?pid wdt:P31 wd:Q5.
?pid wdt:P611 ?order.
OPTIONAL {
?order rdfs:label ?orderLabel.
FILTER((LANG(?orderLabel)) = "en")
}
}
GROUP BY ?order ?orderLabel
ORDER BY DESC(?count)
LIMIT 50
#Pastors' Citizenship per Denomination
#defaultView:TreeMap
SELECT ?religionLabel ?countryLabel (COUNT(?pastor) as ?count)
WHERE
{
?pastor wdt:P106 wd:Q152002 .
?pastor wdt:P140 ?religion .
?pastor wdt:P27 ?country .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?religionLabel ?countryLabel
#Chinese temples
#defaultView:Map
SELECT ?temple ?templeLabel ?coordinate_location WHERE {
?temple wdt:P31 wd:Q44539.
?temple wdt:P17 ?country.
FILTER(?country IN(wd:Q148, wd:Q865, wd:Q334))
OPTIONAL { ?temple wdt:P625 ?coordinate_location. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
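The queries above can also be run from a script rather than pasted into the web interface. The following is a minimal Python sketch, assuming the public endpoint at `https://query.wikidata.org/sparql` and the standard SPARQL JSON results format. The helper names (`build_url`, `simplify`, `run_query`) and the User-Agent string are placeholders invented for this example; replace the User-Agent with one that identifies you.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

ENDPOINT = "https://query.wikidata.org/sparql"


def build_url(query: str) -> str:
    """Encode a SPARQL query as a GET request URL asking for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return f"{ENDPOINT}?{params}"


def simplify(results: dict) -> List[Dict[str, str]]:
    """Flatten SPARQL JSON results into one plain dict per result row."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


def run_query(query: str, user_agent: str) -> List[Dict[str, str]]:
    """Send the query to the endpoint and return simplified rows."""
    request = urllib.request.Request(
        build_url(query), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return simplify(json.load(response))


if __name__ == "__main__":
    # The Chinese temples query from this tutorial, shortened to five rows.
    query = """
    SELECT ?temple ?templeLabel WHERE {
      ?temple wdt:P31 wd:Q44539.
      ?temple wdt:P17 wd:Q148.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    LIMIT 5
    """
    # Hypothetical User-Agent; use your own tool name and contact address.
    for row in run_query(query, "WikidataTutorialExample/0.1 (you@example.org)"):
        print(row)
```

Only the code under `if __name__ == "__main__":` touches the network; `build_url` and `simplify` can be tried offline with a saved JSON response.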
- Check out WikiCite, a project to collect all bibliographic citations in Wikidata. See WikiProject Source Metadata on Wikidata for additional information.
#Staff of an institution (wd:Q29052) who have an ORCID iD
SELECT ?staff ?staffLabel ?ORCID_iD WHERE
{
?staff wdt:P108 wd:Q29052 ;
wdt:P496 ?ORCID_iD .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
#Main subjects of journals in Wikidata
SELECT ?main_subjectLabel (COUNT(?journal) AS ?jourNum) WHERE {
?journal wdt:P31 wd:Q5633421.
?journal wdt:P921 ?main_subject.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?main_subjectLabel
ORDER BY DESC(?jourNum)
- Check out Scholia, a project to visualize scholarly data in Wikidata. See, for instance, Vanderbilt University in Scholia. See also "From Wikidata to Scholia: Creating Structured-Linked Data to Generate Scholarly Profiles" by Mairelys Lemus-Rojas and Jere D. Odell at IUPUI.
By way of conclusion, let's try out the Wikidata Game and see whether we can make improvements to the quality of Wikidata's data.