The types of data we will concern ourselves with:
- [T]riples
- [Q]uads
- [M]ixed triples and quads
- [S]PARQL Results (mapping variable names to RDF Terms)
- [O]perations (insert/delete operations for RDF Term tuples)
- [B]ytes (flat, unstructured, or opaque data)
We will discuss data of type t as being either structured [t] or serialized [t'].
The potential sources of data and their respective types:
- variable [*, *']
- IO handle [*']
- iterator [T,Q,M,S,O]
- URL [*'] # the actual type should be determined at dereference time from the Content-Type header
- model [T,Q]
- store [T,Q]
The potential destinations (sinks) for data and their respective types:
- variable [*, *']
- IO handle [B']
- iterator [*]
- model [M,O]
- store [M,O; implementation-dependent: some stores may accept only triples, while others accept both triples and quads]
An Input is a typed source (source[type]). An Output is a typed sink (sink[type]).
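The notation above can be modelled directly in code. The following is a minimal sketch, not part of the design itself: the names DataType, Typed, Input, and Output are hypothetical, and the serialized flag stands in for the prime mark in [t'].

```python
import io
from dataclasses import dataclass
from enum import Enum
from typing import Any


class DataType(Enum):
    """The six data types listed above, keyed by their bracket letter."""
    TRIPLES = "T"
    QUADS = "Q"
    MIXED = "M"
    RESULTS = "S"
    OPERATIONS = "O"
    BYTES = "B"


@dataclass(frozen=True)
class Typed:
    """A data type plus a flag: structured [t] or serialized [t']."""
    dtype: DataType
    serialized: bool = False

    def __str__(self) -> str:
        return self.dtype.value + ("'" if self.serialized else "")


@dataclass
class Input:
    """A typed source: source[type]."""
    source: Any  # a variable, IO handle, iterator, URL, model, or store
    type_: Typed


@dataclass
class Output:
    """A typed sink: sink[type]."""
    sink: Any
    type_: Typed


# An IO handle carrying serialized triples, i.e. source[T'].
inp = Input(io.StringIO("<s> <p> <o> .\n"), Typed(DataType.TRIPLES, serialized=True))
# A plain list used as a structured triple sink, i.e. sink[T].
out = Output([], Typed(DataType.TRIPLES))
print(inp.type_, out.type_)
```

Keeping the serialized flag separate from the type letter makes it easy to compare the underlying types of a source, a parser, and a sink.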
The parsers and their respective types:
- N-Quads [M]
- N-Triples [T]
- Turtle [T]
- RDF/XML [T]
- RDF/JSON [T]
- RDFa [T]
- TriG [M]
- RDFPatch [O]
- SPARQL/XML [S]
- SPARQL/JSON [S]
- TSV [S]
The serializers and their respective types:
- N-Quads [M]
- N-Triples [T]
- N-Triples/Canon [T]
- Turtle [T]
- RDF/XML [T]
- RDF/JSON [T]
- TriG [M]
- RDFPatch [O]
- CSV [B]
- SPARQL/XML [S]
- SPARQL/JSON [S]
- TSV [S]
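The two lists above can be kept as simple lookup tables. This is only a sketch: the dictionary names and the formats_for helper are hypothetical, and the type letters are copied from the lists as written.

```python
# Type letters copied from the parser and serializer lists above.
PARSERS = {
    "N-Quads": "M", "N-Triples": "T", "Turtle": "T", "RDF/XML": "T",
    "RDF/JSON": "T", "RDFa": "T", "TriG": "M", "RDFPatch": "O",
    "SPARQL/XML": "S", "SPARQL/JSON": "S", "TSV": "S",
}
SERIALIZERS = {
    "N-Quads": "M", "N-Triples": "T", "N-Triples/Canon": "T", "Turtle": "T",
    "RDF/XML": "T", "RDF/JSON": "T", "TriG": "M", "RDFPatch": "O",
    "CSV": "B", "SPARQL/XML": "S", "SPARQL/JSON": "S", "TSV": "S",
}


def formats_for(registry, type_letter):
    """Return the format names in a registry that handle the given type."""
    return sorted(name for name, t in registry.items() if t == type_letter)


# Formats that appear in only one of the two lists.
parse_only = sorted(set(PARSERS) - set(SERIALIZERS))
serialize_only = sorted(set(SERIALIZERS) - set(PARSERS))

print(formats_for(PARSERS, "M"))  # ['N-Quads', 'TriG']
print(parse_only)                 # ['RDFa']
print(serialize_only)             # ['CSV', 'N-Triples/Canon']
```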
The parsing process is: Input x Parser -> Output
The serialization process is: Input x Serializer -> Output
The valid typings for these processes are:
source[T'] x parser[T',M'] -> sink[T,M]
source[Q'] x parser[Q',M'] -> sink[Q,M]
source[M'] x parser[M'] -> sink[M]
source[S'] x parser[S'] -> sink[S]
source[O'] x parser[O'] -> sink[O]
source[T] x serializer[T,M] -> sink[B']
source[Q] x serializer[Q,M] -> sink[B']
source[M] x serializer[M] -> sink[B']
source[S] x serializer[S,B] -> sink[B']
source[O] x serializer[O] -> sink[B']
Parse(source[t'], parser[u'], sink[v])
    check:
        # make sure that the types make sense
        t == u == v    # same underlying type; t' and u' are serialized, v is structured
        or
        (t', u', v) in:
            (T', [T', M'], [T, M])
            (Q', [Q', M'], [Q, M])
Serialize(source[t], serializer[u], sink[B'])
    check:
        # make sure that the types make sense
        t == u
        or
        (t, u) in:
            (T, M)
            (Q, M)
            (S, B)
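The two checks can be written as small predicates over type strings. This is a sketch under stated assumptions: the function names are hypothetical, types are passed as strings such as "T'" and "M", and the accepted type letters are restricted to those that appear in the list of valid typings.

```python
PARSEABLE = set("TQMSO")      # types that appear in the parsing typings
SERIALIZABLE = set("TQMSO")   # types that appear in the serialization typings


def check_parse(source: str, parser: str, sink: str) -> bool:
    """source and parser are serialized types ("T'"); sink is structured ("T")."""
    if not (source.endswith("'") and parser.endswith("'")) or sink.endswith("'"):
        return False
    t, u, v = source[:-1], parser[:-1], sink
    if t not in PARSEABLE:
        return False
    if t == u == v:
        return True
    # Triples or quads may also pass through a mixed parser or into a mixed sink.
    return t in ("T", "Q") and u in (t, "M") and v in (t, "M")


def check_serialize(source: str, serializer: str, sink: str = "B'") -> bool:
    """source and serializer are structured types; the sink is always B'."""
    if sink != "B'" or source not in SERIALIZABLE:
        return False
    if source == serializer:
        return True
    return (source, serializer) in {("T", "M"), ("Q", "M"), ("S", "B")}


print(check_parse("T'", "M'", "T"))   # True: triples read by a mixed parser
print(check_parse("Q'", "T'", "Q"))   # False: a triple parser cannot read quads
print(check_serialize("S", "B"))      # True: results written by a [B] serializer
print(check_serialize("Q", "T"))      # False: quads need a cast first
```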
Other types must use casting functions to participate in parsing and serializing. For example, to serialize quads into a triple format such as N-Triples, a function that drops the graph from each statement (yielding triples) can be used to cast the store (typed as a [Q]uad source) to a [T]riple source:
# (Using a more functional syntax)
Serialize(drop-graph(store[Q])[T], N-Triples[T]) -> IOHandle[B']
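A rough illustration of this cast follows. It is a sketch only: statements are plain tuples whose terms are assumed to be already in N-Triples syntax, the store is a Python list, and serialize_ntriples is a hypothetical stand-in for a real N-Triples serializer.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]


def drop_graph(quads: Iterable[Quad]) -> Iterator[Triple]:
    """Cast a [Q]uad source to a [T]riple source by discarding the graph."""
    for s, p, o, _graph in quads:
        yield (s, p, o)


def serialize_ntriples(triples: Iterable[Triple], out: TextIO) -> None:
    """Write one line per triple; terms are assumed to be pre-serialized."""
    for s, p, o in triples:
        out.write(f"{s} {p} {o} .\n")


# A stand-in for store[Q]: two statements in two different graphs.
store = [
    ("<http://example.org/a>", "<http://example.org/p>", '"1"', "<http://example.org/g1>"),
    ("<http://example.org/b>", "<http://example.org/p>", '"2"', "<http://example.org/g2>"),
]

handle = io.StringIO()
serialize_ntriples(drop_graph(store), handle)
print(handle.getvalue(), end="")
```

This sketch does not remove duplicates, so a triple that occurs in several graphs would be written once per graph.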
Some useful casting functions are:
- drop-graph(source[Q,M]) -> source[T]
- add-graph(source[T,M], iri[R]) -> source[Q]
- map-statement(source[T,Q,M], positionToNameMap) -> source[S]
- construct(source[S], template[t: T,Q,M]) -> source[t]
- insert(source[T,Q,M]) -> source[O]
- delete(source[T,Q,M]) -> source[O]
- filter(source[t], block) -> source[t]
- map(source[t], block, u) -> source[u]
- cast(source[B], t) -> source[t] # e.g. for guessing a format type from the filename extension
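Several of these casts can be sketched as generators over an iterator source. Everything here is illustrative: statements are plain tuples, a SPARQL result is a dict from variable name to term, template variables are written with a leading "?", and operations are (name, statement) pairs. The handling of mixed input in add_graph (statements that already have a graph keep it) is an assumption, as is the requirement that every template variable is bound.

```python
def add_graph(statements, graph):
    """source[T,M] -> source[Q]: attach a graph to statements that lack one."""
    for st in statements:
        yield st if len(st) == 4 else (*st, graph)


def map_statement(statements, position_to_name):
    """source[T,Q,M] -> source[S]: name the term found at each position."""
    for st in statements:
        yield {name: st[pos] for pos, name in position_to_name.items()}


def construct(results, template):
    """source[S] -> source[t]: fill a statement template from each result."""
    for row in results:
        yield tuple(row[t[1:]] if t.startswith("?") else t for t in template)


def insert(statements):
    """source[T,Q,M] -> source[O]: wrap each statement in an insert operation."""
    for st in statements:
        yield ("insert", st)


triples = [("<a>", "<p>", '"1"'), ("<b>", "<p>", '"2"')]

quads = list(add_graph(triples, "<g>"))
rows = list(map_statement(triples, {0: "s", 1: "p", 2: "o"}))
rebuilt = list(construct(rows, ("?s", "?p", "?o")))
ops = list(insert(triples))

print(quads[0])   # ('<a>', '<p>', '"1"', '<g>')
print(rows[0])    # {'s': '<a>', 'p': '<p>', 'o': '"1"'}
print(rebuilt == triples)
```

Because each cast consumes and yields an iterator, casts can be chained without materializing the intermediate data.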