Created
June 14, 2012 01:40
Possible way to provide POSTable URI in RDF
<http://www.amazon.com/gp/product/B000QECL4I>
  eg:reviews <http://www.amazon.com/product-reviews/B000QECL4I> ;
  eg:order "http://www.amazon.com/gp/product/B000QECL4I{?copies}" ;
  .

and then the definition of eg:reviews would say "the object of this property
provides reviews of the subject of this property" and the definition of
eg:order would say "POST to the URI generated by expanding the URI template
value of this property where the copies variable is the number of copies to
be ordered"

dunno on question of whether URI template should have its own datatype
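
As a rough illustration of what a client would do with the eg:order value, here is a minimal sketch that expands only the form-style query operator (`{?name}`) from RFC 6570 and yields the URI to POST to. The function name `expand_form_query` is made up for this example, and a real client would more likely use a full URI template library.

```python
import re
from urllib.parse import quote


def expand_form_query(template, variables):
    """Expand form-style query expressions such as {?copies}.

    Only the '{?a,b}' operator is handled; this is not a full
    RFC 6570 implementation.
    """
    def replace(match):
        names = match.group(1).split(",")
        pairs = [
            f"{name}={quote(str(variables[name]), safe='')}"
            for name in names
            if variables.get(name) is not None
        ]
        # Undefined variables are omitted from the expansion.
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template)


# The template is the eg:order literal from the example above.
ORDER_TEMPLATE = "http://www.amazon.com/gp/product/B000QECL4I{?copies}"
order_uri = expand_form_query(ORDER_TEMPLATE, {"copies": 2})
print(order_uri)
```

The client would then POST to `order_uri`; what (if anything) goes in the request body is left open by the property definition above.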
Author
JeniT
commented
Jun 16, 2012
via email
On 15 Jun 2012, at 23:17, Erik Wilde wrote:
> But if we want to describe the format of a graph that describes an
> order, then according to REST over HTTP, we really need a media type:
> e.g. x-example/order+turtle (or something). That allows us to document
> the required graph structure in a media type and achieve some shared
> understanding between client and server.
yes, i absolutely agree with that, but i think @JeniT disagrees with that.
I don't disagree that a custom media type is useful, I am questioning whether it is the only thing that works.
At one level, I have a pragmatic concern that there are multiple syntaxes for RDF, and people who operate LDP-based services will find it burdensome to define specific media types for each of the different flavours: text/vnd.amazon.order+turtle, application/vnd.amazon.order+xml, application/vnd.amazon.order+json and so on. (Note that I'm assuming that the +xml variant is RDF/XML and the +json variant is JSON-LD; if we went down this path we should work with IETF to define a structured syntax suffix registration for at least +turtle.)
At another level, I want there to be specification-level clarity that states that a custom media type is required for each service that accepts a POST/PUT, and guidance on how to use them. It is not clear to me, when someone says "according to REST over HTTP we must...", which specification they are referring to where this constraint is specified. It could be:
1. that the HTTP specification states that on an OPTIONS request, the server
MUST provide a response with a (eg) Accept-Content-Types header that lists
acceptable media types, and further that all POST/PUT requests that include
content that is valid according to that media type MUST be successful (ie
that the media type given in Accept-Content-Types must be defined at
a granular level, so you can't just say Accept-Content-Types: application/xml
unless you really do accept all XML)
2. that the need for a specific media type is only actually at the level of a
REST best practice rather than a constraint at the HTTP specification
level, but we want to make it a tighter constraint in LDP because we want
LDP to follow all REST best practices
#1 does not seem to be the case. I'm totally fine with #2 as long as we are honest that this is what we are doing and provide sufficient detail such that developers writing servers and clients know what they need to do to satisfy it.
I think that the REST best practice is not so much "the constraints on a POST/PUT should be identified through a media type" as "the constraints on a POST/PUT should be discoverable". @dret said:
in media types, that knowledge would be coupled to the link relation, either implicitly (submit something using this vocabulary when traversing such a link), or explicitly (often using ***@***.*** or ***@***.*** attributes in XML vocabularies). this allows clients to choose according to their capabilities and preferences, if servers provide alternatives, and those alternatives are communicated through media types. new capabilities may show up when a server starts supporting additional interactions, but clients often need to be updated (learning about the new media types) to be able to take advantage of these new capabilities.
Taking Atom as an example of good RESTful practice, I note that its `link` element has a `@type` attribute, but it is only defined in terms of the media type of the response to a GET on the `@href`, not on limiting what can be submitted when POSTing to that URI. The `edit` link relation defined in the Atom publishing protocol doesn't say anything about the interpretation of the `@type` attribute in this context either. The Atom service descriptions do have an `accept` element, but it's not specified how these are located. It would be really good to have an example of a RESTful API that is actually doing this _right_, that we could follow. Presumably you have an example in mind, @dret?
But anyway, let's explore some possible patterns in an XML world. As @dret said, the first possibility would be for the link relation to implicitly describe what is expected by the endpoint, so when you GET information about a product, it includes a link like:
<link rel="http://example.com/relation/order"
href="http://amazon.com/order" />
and by knowledge of the link relation http://example.com/relation/order (which is presumably defined at that URI, although there's no constraint to make that so within Atom so far as I can tell), an application can work out what it can send to the endpoint. This only works if there aren't endpoint-specific constraints on what's acceptable.
A second pattern is that the owner of the web service specifies a media type `application/vnd.amazon.order+xml` and that's used in the `@type` attribute, with the link relation `http://example.com/relation/order` specifying that the `@type` attribute indicates the media type of what can be POSTed to the URI in the `@href`:
<link rel="http://example.com/relation/order"
href="http://amazon.com/order"
type="application/vnd.amazon.order+xml" />
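
To make the second pattern concrete, here is a minimal client-side sketch. It assumes the link relation really does define `@type` as the media type of the request entity (that is the hypothetical part), and the order payload is invented purely for illustration. The request is only constructed, not sent.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request

ORDER_REL = "http://example.com/relation/order"

LINK_XML = """<link rel="http://example.com/relation/order"
      href="http://amazon.com/order"
      type="application/vnd.amazon.order+xml" />"""


def build_order_request(link_xml, body):
    """Build (but do not send) a POST request from an order link."""
    link = ET.fromstring(link_xml)
    if link.get("rel") != ORDER_REL:
        raise ValueError("not an order link")
    # Assumption: for this link relation, @type names the media type
    # the endpoint accepts, not the media type a GET would return.
    return Request(
        link.get("href"),
        data=body,
        headers={"Content-Type": link.get("type")},
        method="POST",
    )


# Hypothetical payload; the real structure would be defined by the media type.
request = build_order_request(LINK_XML, b"<Order><copies>2</copies></Order>")
print(request.get_method(), request.full_url)
```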
A third possible pattern would be to have the `application/xml` media type specify some media type parameters that enabled people to specify a schema location for and document element of some XML (there are multiple ways to cut that of course; I'm more interested in the pattern of using media type parameters than the niceties of what that would mean for XML). In that case, the link would look like:
<link rel="http://example.com/relation/order"
href="http://amazon.com/order"
type='application/xml;schema="http://amazon.com/schema/order.xsd";root="{http://amazon.com/schema/order}Order"' />
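
A client following the third pattern would have to pull the parameters back out of the `@type` value. The sketch below shows one way that might look; the `schema` and `root` parameters are the hypothetical ones from the example above (they are not defined for `application/xml` today), and the helper names are made up.

```python
import re


def parse_media_type(value):
    """Split a media type string into (type, {parameter: value})."""
    base = value.split(";", 1)[0].strip().lower()
    params = {}
    # Each parameter is ';name=value', where value may be a quoted string.
    pattern = r';\s*([^=;\s]+)\s*=\s*(?:"([^"]*)"|([^;]*))'
    for match in re.finditer(pattern, value):
        name, quoted, bare = match.groups()
        params[name.lower()] = quoted if quoted is not None else bare.strip()
    return base, params


def split_clark_name(name):
    """Split '{namespace}local' into (namespace, local)."""
    namespace, local = name[1:].split("}", 1)
    return namespace, local


TYPE_ATTR = (
    'application/xml;schema="http://amazon.com/schema/order.xsd";'
    'root="{http://amazon.com/schema/order}Order"'
)
media_type, params = parse_media_type(TYPE_ATTR)
print(media_type, params["schema"], split_clark_name(params["root"]))
```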
A fourth possible pattern would be to add `@x:schema` and `@x:root` attributes to Atom's `link` element to provide equivalent information, like this:
<link rel="http://example.com/relation/order"
href="http://amazon.com/order"
type="application/xml"
x:schema="http://amazon.com/schema/order.xsd"
x:root="{http://amazon.com/schema/order}Order" />
A fifth pattern would be to not define anything on the `link` element itself, but for the documentation of the link relation `http://example.com/relation/order` to state that applications can query on the `@href` URI using the OPTIONS method, and what should be returned in that case, and for that response to specify the constraints. So the document containing the link would have:
<link rel="http://example.com/relation/order"
href="http://amazon.com/order" />
just like in the first example, but doing an OPTIONS request on `http://amazon.com/order` would result in something like:
<service xmlns="http://www.w3.org/2007/app"
xmlns:atom="http://www.w3.org/2005/Atom">
<workspace>
<atom:title>Amazon</atom:title>
<collection href="http://amazon.com/order" >
<atom:title>Orders</atom:title>
<accept>application/vnd.amazon.order+xml</accept>
</collection>
</workspace>
</service>
with of course also the possibility for the `accept` element in this case to follow any of the patterns above.
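
For the fifth pattern, the client-side work after the OPTIONS request is mostly a lookup in the returned service document. A minimal sketch, assuming the response body is an Atom Publishing Protocol service document like the one above (the function name `accepted_types` is made up, and the OPTIONS request itself is not shown):

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Stand-in for the body of the OPTIONS response.
SERVICE_XML = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Amazon</atom:title>
    <collection href="http://amazon.com/order">
      <atom:title>Orders</atom:title>
      <accept>application/vnd.amazon.order+xml</accept>
    </collection>
  </workspace>
</service>"""


def accepted_types(service_xml, href):
    """Return the media types the collection at href says it accepts."""
    root = ET.fromstring(service_xml)
    for collection in root.iterfind("app:workspace/app:collection", NS):
        if collection.get("href") == href:
            return [
                accept.text.strip()
                for accept in collection.findall("app:accept", NS)
                if accept.text
            ]
    return []


print(accepted_types(SERVICE_XML, "http://amazon.com/order"))
```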
There may be other plausible patterns. All these patterns are possible for RDF-based services too.
My hypothesis is that it's impossible in the general case to specify all possible constraints on acceptable POST/PUT entities. Some constraints are going to be unknowable because they depend on the state of the world at submission time (eg are there sufficient items in stock to fulfil the order). Other constraints are going to be endpoint specific (eg is the item of a type that the vendor sells).
So you have to draw the line somewhere. I think as a developer the crucial thing is discoverability: I would prefer to have the link relation/link/endpoint specify `application/xml` plus the schema and document element of the expected XML than for it to specify an unregistered media type of `application/vnd.amazon.order+xml`. But I may have missed some REST theory that states that this is not a good way of specifying constraints?
The equivalent for RDF would be for the property/endpoint metadata/endpoint itself to specify an RDF serialisation (application/rdf+xml, text/turtle etc) plus something that defines acceptable RDF graphs. As @ldodds said:
> An RDF Schema or OWL ontology don't let us describe the structure of a
> graph in the same way that we could define a schema for an order
> document in XML. So that doesn't give us enough leverage.
the problem is that RDF has no concept of validation. but there should
be something similar, right? checking a graph against expectations can
surely be done somehow, is there some framework for that?
This is a gap in the RDF stack (and one that's come up a few times during TAG discussions over the last few days). OWL inference can be run in a "closed world" mode that does a kind of validation. We have SPARQL graph patterns, but using them as a means of validating RDF would be like doing XML validation solely through XPath expressions. It would be nice to have a grammar more like RELAX NG for RDF graphs; I think that Eric Prud'hommeaux is interested in doing something like that, but it would surprise me if there weren't something similar around already from which we could learn.
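
To show the kind of check I mean, here is a deliberately naive sketch of closed-world validation over a graph held as a set of (subject, predicate, object) tuples. Everything in it is an assumption for illustration: the `eg` vocabulary, the "shape" dictionary of cardinalities, and the `validate` function are invented here and do not correspond to any existing RDF validation language.

```python
EG = "http://example.com/ns#"  # hypothetical vocabulary

# Hypothetical "shape": property -> (minimum, maximum) number of values.
ORDER_SHAPE = {
    EG + "product": (1, 1),
    EG + "copies": (1, 1),
}


def validate(graph, node, shape):
    """Return a list of violations for node; an empty list means valid."""
    errors = []
    for prop, (low, high) in shape.items():
        count = sum(1 for s, p, o in graph if s == node and p == prop)
        if not low <= count <= high:
            errors.append(f"{prop}: expected {low}..{high} values, found {count}")
    return errors


ORDER = "http://example.com/order/1"
good_graph = {
    (ORDER, EG + "product", "http://www.amazon.com/gp/product/B000QECL4I"),
    (ORDER, EG + "copies", 2),
}
bad_graph = {
    (ORDER, EG + "product", "http://www.amazon.com/gp/product/B000QECL4I"),
}

print(validate(good_graph, ORDER, ORDER_SHAPE))
print(validate(bad_graph, ORDER, ORDER_SHAPE))
```

Unlike RDFS or OWL under the open-world assumption, a missing eg:copies triple is reported as an error here rather than treated as merely unknown, which is the behaviour you want when deciding whether to accept a POSTed order.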
We should really be on the LDP mailing list to discuss this rather than here...
| At one level, I have a pragmatic concern that there are multiple syntaxes for RDF, and people who operate LDP-based services will find it burdensome to define specific media types for each of the different flavours: text/vnd.amazon.order+turtle, application/vnd.amazon.order+xml, application/vnd.amazon.order+json and so on. (Note that I'm assuming that the +xml variant is RDF/XML and the +json variant is JSON-LD if we went down this path we should work with IETF to define a structured syntax suffix registration for at least +turtle.)
i absolutely agree that this is not nice on a variety of levels. i
wouldn't get my hopes too high on fixing the media types spec, though.
there's a lot of history to it, it's even bigger than the web, so making
any changes is a very sensitive thing to do. regarding the suffixes,
maybe that's something that could be done, but you'd end up answering a
lot of questions that are very hard to answer.
| 1. that the HTTP specification states that on an OPTIONS request, the server
| MUST provide a response with a (eg) Accept-Content-Types header that lists
| acceptable media types, and further that all POST/PUT requests that include
| content that is valid according to that media type MUST be successful (ie
| that the media type given in Accept-Content-Types must be defined at
| a granular level, so you can't just say Accept-Content-Types: application/xml
| unless you really do accept all XML)
i agree that this is not written down in these absolute terms anywhere.
and as you know, 99.99% of application/xml services then would have to
be application/xdm anyway (if there were such a media type).
| I think that the REST best practice is not so much "the constraints on a POST/PUT should be identified through a media type" as "the constraints on a POST/PUT should be discoverable". @dret said:
i like the term discoverable here, but then again the question remains
through what means. it could be HTTP (even though it's not mandatory),
it could be registrations somewhere (that's the media type route), or it
could be through runtime mechanisms (which then need machinery that is
capable of using them).
| Taking Atom as an example of good RESTful practice, I note that its `link` element has a `@type` attribute, but it is only defined in terms of the media type of the response to a GET on the `@href`, not on limiting what can be submitted when POSTing to that URI. The `edit` link relation defined in the Atom publishing protocol doesn't say anything about the interpretation of the `@type` attribute in this context either. The Atom service descriptions do have an `accept` element, but it's not specified how these are located. It would be really good to have an example of an RESTful API that is actually doing this _right_, that we could follow. Presumably you have an example in mind, @dret?
atompub does specify the expected media types in the media type
registration itself (defining the link relations and what clients are
supposed to do when they follow these links). i haven't written the spec,
but i assume the idea was to only specify those media types which are
dynamic at runtime (@accept). service descriptions are discoverable
through "service", which for some reason i still don't understand is
listed in
http://www.iana.org/assignments/link-relations/link-relations.xml as
specified in RFC 5023, when it very clearly isn't. @jasnell may have the
background on this, but i think it became apparent that making service
documents discoverable was a good idea, and adds very little overhead
(just one link relation).
i think overall, atompub gets it right. like you mentioned earlier,
clients need to be coded to support these interaction patterns of a
media type anyway, and because of that, it does not hurt that not all
expectations about media types in link interactions are discoverable at
runtime. only if they are variable there should be a runtime mechanism.
| But anyway, let's explore some possible patterns in an XML world. As @dret said, the first possibility would be for the link relation to implicitly describe what is expected by the endpoint, so when you GET information about a product, it includes a link like:
| <link rel="http://example.com/relation/order"
| href="http://amazon.com/order" />
| and by knowledge of the link relation http://example.com/relation/order (which is presumably defined at that URI, although there's no constraint to make that so within Atom so far as I can tell), an application can work out what it can send to the endpoint. This only works if there aren't endpoint-specific constraints on what's acceptable.
"http://example.com/relation/order" is not a link, it's an identifier
(http://tools.ietf.org/html/rfc5988#section-4.2). clients have knowledge
of the link relations they can traverse (because they implement them),
and other links are meaningless to them. i am not 100% sure what you
mean by "endpoint-specific constraints". if the media type or the
registered link relation specify a media type that is expected when
following that link, then that's what a server should accept. of course
it might reject it because of service aspects (invalid product number in
order), is that what you're referring to?
| A second pattern is that the owner of the web service specifies a media type `application/vnd.order+xml` and that's used in the `@type` attribute, with the link relation `http://example.com/relation/order` specifying that the `@type` attribute indicates the media type of what can be POSTed to the URI in the `@href`:
| <link rel="http://example.com/relation/order"
| href="http://amazon.com/order"
| type="application/vnd.amazon.order+xml" />
i've seen that quite a bit for GET, but not for POST, i think. but it
would work for POSTs as well, as long as the link relation (either in
the media type or in the link relation registration) makes it clear that
@type refers to the request, and not to the response.
| A third possible pattern would be to have the `application/xml` media type specify some media type parameters that enabled people to specify a schema location for and document element of some XML (there are multiple ways to cut that of course; I'm more interested in the pattern of using media type parameters than the niceties of what that would mean for XML). In that case, the link would look like:
| <link rel="http://example.com/relation/order"
| href="http://amazon.com/order"
| type='application/xml;schema="http://amazon.com/schema/order.xsd";root="{http://amazon.com/schema/order}Order"' />
| A fourth possible pattern would be to add `@x:schema` and `@x:root` attributes to Atom's `link` element to provide equivalent information, like this:
| <link rel="http://example.com/relation/order"
| href="http://amazon.com/order"
| type="application/xml"
| x:schema="http://amazon.com/schema/order.xsd"
| x:root="{http://amazon.com/schema/order}Order" />
that i don't like that much because in many cases, media types not just
specify a schema, but also a processing model for the client (how to
handle extensions of the base schema, for example). if all you can
specify is a schema, then you cannot specify a processing model.
| A fifth pattern would be to not define anything on the `link` element itself, but for the documentation of the link relation `http://example.com/relation/order` to state that applications can query on the `@href` URI using the OPTIONS method, and what should be returned in that case, and for that response to specify the constraints. So the document containing the link would have:
| <link rel="http://example.com/relation/order"
| href="http://amazon.com/order" />
| just like in the first example, but doing an OPTIONS request on `http://amazon.com/order` would result in something like:
| <service xmlns="http://www.w3.org/2007/app"
| xmlns:atom="http://www.w3.org/2005/Atom">
| <workspace>
| <atom:title>Amazon</atom:title>
| <collection href="http://amazon.com/order">
| <atom:title>Orders</atom:title>
| <accept>application/vnd.amazon.order+xml</accept>
| </collection>
| </workspace>
| </service>
| with of course also the possibility for the `accept` element in this case to follow any of the patterns above.
that would be perfectly legitimate behavior for a media type, making as
many things runtime as possible. the question is what you're buying with
this pattern, i.e. are you really expecting that clients will support
different order media types and then can maybe specify their supported
media types in the request via accept when they follow the order link.
it's doable, but i have not seen that level of radical openness. i'd say
that typically, media types encode an application scenario and assume
that clients are interacting within that framework. they might define
extension points and places where clients can find additional links, but
within the media type scenario, things are typically designed with
making some decisions design time, and only making those decisions
runtime where there's a specific goal for doing that.
| My hypothesis is that it's impossible in the general case to specify all possible constraints on acceptable POST/PUT entities. Some constraints are going to be unknowable because they depend on the state of the world at submission time (eg are there sufficient items in stock to fulfil the order). Other constraints are going to be endpoint specific (eg is the item of a type that the vendor sells).
of course you should not hardcode the available products into the media
type, that would be a pretty terrible design. but you can hardcode all
the things that make sense for your application scenario, strategically
leaving those things up to runtime that can change at runtime. that's
how you usually design services that are as easy to use as possible, at
least from the SOA point of view.
| So you have to draw the line somewhere. I think as a developer the crucial thing is discoverability: I would prefer to have the link relation/link/endpoint specify `application/xml` plus the schema and document element of the expected XML than for it to specify an unregistered media type of `application/vnd.amazon.order+xml`. But I may have missed some REST theory that states that this is not a good way of specifying constraints?
the crucial thing is "understandability", which might be a little bit
different. media types are supposed to be "self-describing" (not in the
semweb sense of the word) in the sense that you see an instance, there
is a way how to find information that helps you to understand what it
means. the media type is the label you start with, and then you go to
the registry and can find the definition.
`application/vnd.amazon.order+xml` should be documented somewhere, and
there's google. that will get you to a document that tells you the
conversational context. if you're just linked to the schema, you can
auto-generate an instance, but you don't understand the conversational
scenario (get a shopping cart id, add items to it, get your customer id,
and then submit an order with your shopping cart and customer id). a
service has almost always more context than just one isolated
interaction, and the media type establishes that context. that's why the
important part about atompub is the protocol, and not the schemas (which
are fairly minimal, as a diff with atom).
| This is a gap in the RDF stack (and one that's come up a few times during TAG discussions over the last few days). OWL inference can be run in a "closed world" mode that does a kind of validation. We have SPARQL graph patterns, but using them as a means of validating RDF would be like doing XML validation solely through XPath expressions. It would be nice to have a grammar more like RELAX NG for RDF graphs; I think that Eric Prud'hommeaux is interested in doing something like that, but it would surprise me if there weren't something similar around already from which we could learn.
i think that for any kind of service scenario, validation is essential.
it's the first line of defense, effective when backed by a good schema
language, and thus takes load off the actual service implementation. and
even for the "just POST some RDF graph to an RDF database", i would
guess that in all settings with loose coupling, you would want to have
some control over what people are POSTing.
| We should really be on the LDP mailing list to discuss this rather than here...
now that you're mentioning it ;-) feel free to link to the gist, maybe
for tomorrow's meeting people would like to read some of that. and as
usual, thanks a lot for your great comments!
Hi,
why not use SPARQL (or rather, graph patterns) to describe inputs, outputs and relation between input and output? Most of the current approaches on http://linkedservices.org/ use that type of description.
HATEOAS URIs could just be embedded into the RDF that's returned.
Best regards,
Andreas.