Update: my blog post The lie of the API details the issues with current APIs.
Background: I'm a researcher in semantic hypermedia, at the moment comparing different APIs for accessing metadata for human and machine consumption.
Story: I am browsing a cultural website and want to retrieve the metadata of the object I'm looking at in a machine-readable format. The steps below are the actual steps that I've undertaken on different sites.
I'm looking at the object http://collection.cooperhewitt.org/objects/35460799/.
- To retrieve this in JSON, I just copy that URL and do:
$ curl -H "Accept: application/json" http://collection.cooperhewitt.org/objects/35460799/
I'm looking at the person http://dbpedia.org/resource/Arthur_Rimbaud
- To retrieve this in JSON, I just copy that URL and do:
$ curl -L -H "Accept: application/json" http://dbpedia.org/resource/Arthur_Rimbaud
There's even RDF if I need it (same URL):
$ curl -L -H "Accept: text/turtle" http://dbpedia.org/resource/Arthur_Rimbaud
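What these two sites do is plain HTTP content negotiation: the URL stays the same and only the Accept header changes. A minimal Python sketch of the same idea follows, using only the standard library. The helper name `negotiated_request` is made up for illustration, and the requests are only built, not sent.

```python
from urllib.request import Request


def negotiated_request(url: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one media type."""
    # Hypothetical helper: it only sets the Accept header, like `curl -H`.
    return Request(url, headers={"Accept": media_type})


URL = "http://dbpedia.org/resource/Arthur_Rimbaud"
json_req = negotiated_request(URL, "application/json")
turtle_req = negotiated_request(URL, "text/turtle")

# Same URL for both requests; only the requested representation differs.
print(json_req.full_url == turtle_req.full_url)
print(json_req.get_header("Accept"))
print(turtle_req.get_header("Accept"))
```

Passing either object to `urllib.request.urlopen` would send it. `urlopen` follows redirects by default, which roughly corresponds to the `-L` flag in the curl calls.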
I'm looking at the object http://www.europeana.eu/portal/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html.html?start=1&query=david+ochterlony+hookah&startPage=1&rows=24
- To retrieve JSON, I try
$ curl -H "Accept: application/json" http://www.europeana.eu/portal/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html.html
- I try to make sense of the following output:
<html><head><title>Apache Tomcat/6.0.24 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 406 - </h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u></u></p><p><b>description</b> <u>The resource identified by this request is only capable of generating responses with characteristics not acceptable according to the request "accept" headers ().</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.24</h3></body></html>
- I search for the documentation.
- I end up on this page and click "API documentation".
- I end up on the Introduction page, where I see that I have to register.
- On the registration page, I enter my e-mail address.
- I receive an e-mail and click the link.
- I receive my API key.
- I click through to Working with the API and make a mental note of a field named apikey.
- I go to Sample code. No, that's not it.
- I go to API methods and see that record.json (is it a method or a file?) looks like what I need, so I click it.
- I am informed that I need to use the URL template
http://europeana.eu/api/v2/record/[recordID].json
This URL template has the parameters recordID, callback, and profile. I only understand the second one without reading, but I don't need it (not using JSON-P).
- Hoping to find the Record ID, I go back to the page I opened in the beginning. I look through the whole page and find nothing called "Record ID", but I find a field "Identifier" with the string 019ADDOR0000002U00000000.
- I now feel ready to make my first API call and try
$ curl http://europeana.eu/api/v2/record/019ADDOR0000002U00000000.json?apikey=xxxxxxxxx
where xxxxxxxxx is my actual API key, using the apikey field name I found earlier.
- I try to make sense of the following output:
<html><head><title>Apache Tomcat/6.0.24 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 404 - /api/v2/record/019ADDOR0000002U00000000.json</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>/api/v2/record/019ADDOR0000002U00000000.json</u></p><p><b>description</b> <u>The requested resource (/api/v2/record/019ADDOR0000002U00000000.json) is not available.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.24</h3></body></html>
- Thinking I might not have used the API key properly, I go back to Working with the API and now see something about a wskey parameter. So the field is called apikey but the parameter is called wskey. I assume this is a URL query string parameter.
- I try the request again:
$ curl http://europeana.eu/api/v2/record/019ADDOR0000002U00000000.json?wskey=xxxxxxxxx
- I visually check whether the error output is the same:
<html><head><title>Apache Tomcat/6.0.24 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 404 - /api/v2/record/019ADDOR0000002U00000000.json</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>/api/v2/record/019ADDOR0000002U00000000.json</u></p><p><b>description</b> <u>The requested resource (/api/v2/record/019ADDOR0000002U00000000.json) is not available.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.24</h3></body></html>
- I suspect I might have gotten the identifier wrong. I go back to the original page and look through its source code to see whether I can find an identifier. I only find 019ADDOR0000002U00000000, which I have tried already.
- I go back to the Working with the API page and click the link Europeana ID next to the recordID field, where I read the following explanation: _Digital records delivered to Europeana are assigned a unique identifier, Europeana ID, that serves to further identify the records when using the API. Usually, this identifier is based on the original metadata that are provided for the record and internal Europeana identifiers of the provider and the dataset containing the record. For example, a Europeana ID of an object can look as follows: /09102/_GNM_1234 where 091 is the identifier of the provider, 02 is the id of the dataset and GNM_1234 is derived from the unique identifier of the record in the context of the provider._
- I inspect the URL to see whether I can find such an identifier: http://www.europeana.eu/portal/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html.html?start=1&query=david+ochterlony+hookah&startPage=1&rows=24. Indeed, there is a part "92037/", but what follows it does not look like the example. I find this strange, but try it anyway:
$ curl http://europeana.eu/api/v2/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html?apikey=xxxxxxxxx
- I get the error message
<html><head><title>Apache Tomcat/6.0.24 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 404 - /api/v2/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>/api/v2/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html</u></p><p><b>description</b> <u>The requested resource (/api/v2/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html) is not available.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.24</h3></body></html>
- I try to Google for "http://europeana.eu/api/v2/record" to see if anybody else got the API working.
- I arrive at the npm package registry and find a JSON fragment that mentions the link
http://europeana.eu/api/v2/record/08501/03F4577D418DC84979C4E2EE36F99FECED4C7B11.json?wskey=abc123
- I add my own API key to test whether I can retrieve this random object:
$ curl http://europeana.eu/api/v2/record/08501/03F4577D418DC84979C4E2EE36F99FECED4C7B11.json?wskey=xxxxxxxxx
- This works, but it's not the object that I wanted. Now let's try replacing the object identifier with 92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html:
$ curl http://europeana.eu/api/v2/record/92037/_http___www_bl_uk_onlinegallery_onlineex_apac_addorimss_s_019addor0000002u00000000_html.json?wskey=xxxxxxxxx
This works.
- I wonder why it didn't work in step 21 (my first attempt with this identifier), only to find out that I had not added the extension .json. I also wonder if there is any way of getting the object ID other than copying it from the URL.
Hi Ruben,
Ouch! And thanks. Well, it's obvious that we have a lot of improvements to make to our documentation! We'll take your experience to heart as we're now working on a major update of our API docs. I'll get back to you once we've improved our docs, and hopefully your next review will be a bit more positive.
As to your question in step 27: I guess one of the mistakes we've made is that we wrongly assumed that API users would begin with a search, e.g. http://www.europeana.eu/portal/api/console.html?function=search&query=multatuli and then pick up the id and/or the provided record call directly from the response (both are included) for the full record call, e.g. http://www.europeana.eu/portal/api/console.html?function=record&profile=full&recordId=%2F92062%2F8E88751AB58C3D950E96A4C92505DB8600BB99C4
Bad assumption.
Cheers,
David