Fetching RDF: Data Objects

RDF in a repository rarely comes as independent statements. The client does not know in advance what form the RDF about a resource takes. A query is good for locating resources, but extracting information may not be just a matter of the client listing the properties it requires.

For example, Dublin Core defines a number of properties, such as "title" and "creator", that can usefully describe digital content. Usually there will be several statements about the resource being described. Which properties are there on a given resource in this RDF repository?

Two issues arise:

Optional properties, where a property may or may not be supplied.
Structures, where the useful data is in a structure of bNodes.

In the first case, a query that lists all the Dublin Core properties will fail if any one of the properties is missing, because the graph pattern as a whole then fails to match.
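
The failure mode can be sketched in a few lines of plain Python, with triples held as tuples. This is only an illustration: the data, the document URI, and the helpers match_all and describe are hypothetical and are not part of Joseki or RDQL.

```python
# Toy triple store: (subject, predicate, object) tuples. Illustrative data only.
DC = "http://purl.org/dc/elements/1.1/"
DOC = "http://example.org/doc/1"  # hypothetical resource

triples = {
    (DOC, DC + "title", "An Example Document"),
    (DOC, DC + "creator", "John Smith"),
    # Deliberately no dc:date statement for this resource.
}

def match_all(subject, predicates):
    """Conjunctive pattern: every predicate must match, or there is no result."""
    row = {}
    for p in predicates:
        values = [o for (s, pp, o) in triples if s == subject and pp == p]
        if not values:
            return None  # one missing property fails the whole pattern
        row[p] = values[0]
    return row

def describe(subject):
    """Return whatever statements happen to exist about the subject."""
    return {(s, p, o) for (s, p, o) in triples if s == subject}

wanted = [DC + "title", DC + "creator", DC + "date"]
print(match_all(DOC, wanted))  # None, because dc:date is absent
print(len(describe(DOC)))      # 2
```

Asking for "whatever is there" succeeds where the fixed list of properties does not, which is the behaviour the client wants without having to probe property by property.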

In the second case, information such as a person's name may itself be a number of statements: in an RDF vCard, there is a formatted name and also the name as a structure:

<http://example.org/employee/1357>
    vcard:FN "John Smith" ;
    vcard:N
        [ vcard:givenName   "John" ;
          vcard:familyName  "Smith"
        ] .

We do not wish the client to have to determine which properties or structures are present. This would take a number of network operations.

When the vCard vocabulary was defined, there was the concept of a vCard "object". Joseki provides a framework for the server-side definition of what constitutes a unit of information: the data object. We provide a single operation, fetch, to retrieve such data objects.

The Fetch Operation

The fetch operation says "get all relevant RDF about this thing in the repository". The fetch operation is a query. It takes a single URI as argument and returns all RDF about that resource. What counts as "all RDF" is decided by the server, through various dynamically loaded modules, and the client gets a small RDF graph which it can inspect.

In the example, a vCard fetch module would return all the properties from the vCard namespace which have the resource as subject, and all the structures, such as that starting from vcard:N, that comprise the concept of a vCard.
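
As a rough sketch of what such a module might compute, the following plain Python collects the vCard-namespace statements about a resource and follows blank nodes into nested structures. It does not use the Joseki module API; the function fetch_vcard, the "_:" convention for blank nodes, the ex:manager property, and the namespace URI are assumptions made for illustration.

```python
# Assumed namespace URI for the RDF vCard vocabulary; adjust to the data in use.
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"
EMP = "http://example.org/employee/1357"

# Triples as tuples; blank nodes are written as strings starting with "_:".
graph = {
    (EMP, VCARD + "FN", "John Smith"),
    (EMP, VCARD + "N", "_:b0"),
    ("_:b0", VCARD + "givenName", "John"),
    ("_:b0", VCARD + "familyName", "Smith"),
    # Hypothetical non-vCard statement that should not be returned.
    (EMP, "http://example.org/ns#manager", "http://example.org/employee/42"),
}

def is_bnode(node):
    # Toy convention only: a literal starting with "_:" would be misread.
    return node.startswith("_:")

def fetch_vcard(graph, resource):
    """Collect vCard statements about resource, descending into blank nodes."""
    result, seen, todo = set(), set(), [resource]
    while todo:
        node = todo.pop()
        if node in seen:
            continue
        seen.add(node)
        for (s, p, o) in graph:
            if s == node and p.startswith(VCARD):
                result.add((s, p, o))
                if is_bnode(o):
                    todo.append(o)  # structure continues below this node
    return result

card = fetch_vcard(graph, EMP)
print(len(card))  # 4
```

The client receives the formatted name and the name structure in one graph, and the unrelated manager statement is left out.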

The exact RDF to return on a fetch operation depends on the configuration of the Joseki server and on what the publisher wishes to provide for each RDF repository (model) on the server. The RDF returned is publication-specific.

A fetch request over HTTP GET might look like:

GET /model?lang=fetch&r=http://example.org/resource HTTP/1.1
Host: example.org

Data objects have URIs (the full string including model and resource), so an application can make further RDF statements about the object. Because Joseki provides fetch as an HTTP GET operation, each data object has a well-formed URI, so it is possible to cache data objects and to link to them from other applications or documents.
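
A client might build the data object URI along the following lines. The helper name fetch_url is hypothetical. The request above writes the resource URI literally, whereas this sketch percent-encodes it, as a general-purpose HTTP client would; that the server accepts the encoded form is an assumption.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def fetch_url(model_url, resource_uri):
    """Build the URI of the data object for resource_uri in the given model."""
    # urlencode percent-encodes the resource URI so it is safe inside a query string.
    query = urlencode({"lang": "fetch", "r": resource_uri})
    return f"{model_url}?{query}"

url = fetch_url("http://example.org/model", "http://example.org/resource")
print(url)
# http://example.org/model?lang=fetch&r=http%3A%2F%2Fexample.org%2Fresource

# The same inputs always give the same string, so it can serve as a cache key
# or link target; decoding the query recovers the original resource URI.
print(parse_qs(urlsplit(url).query)["r"])  # ['http://example.org/resource']
```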

Sometimes, the subject resource is identified by a property/value pair. A fetch request can alternatively provide an identifying property and value:

GET /model?lang=fetch&p=http://example.org/ns#prop&v=literal HTTP/1.1
Host: example.org

GET /model?lang=fetch&p=http://example.org/ns#prop&o=http://host/x HTTP/1.1
Host: example.org

Note that if the value is a URI, the parameter name is o; if it is a plain literal string, the parameter name is v. More complex expressions to locate the resource or resources of interest should be done with RDQL.
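
The choice between o and v can be made mechanical on the client side. The sketch below is one way to do it; the function name fetch_by_property_url and its value_is_uri flag are hypothetical. It percent-encodes the values, which matters here because a raw "#" in a URL would otherwise start a fragment.

```python
from urllib.parse import urlencode

def fetch_by_property_url(model_url, prop, value, value_is_uri):
    """Build a fetch URL that identifies the subject by a property/value pair."""
    params = {"lang": "fetch", "p": prop}
    # URI values go in "o", plain literal strings in "v".
    params["o" if value_is_uri else "v"] = value
    return model_url + "?" + urlencode(params)

MODEL = "http://example.org/model"
PROP = "http://example.org/ns#prop"

by_literal = fetch_by_property_url(MODEL, PROP, "literal", value_is_uri=False)
by_uri = fetch_by_property_url(MODEL, PROP, "http://host/x", value_is_uri=True)

print(by_literal)
# http://example.org/model?lang=fetch&p=http%3A%2F%2Fexample.org%2Fns%23prop&v=literal
print(by_uri)
# http://example.org/model?lang=fetch&p=http%3A%2F%2Fexample.org%2Fns%23prop&o=http%3A%2F%2Fhost%2Fx
```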

Reference and Containment Links

At the data object level, there are two kinds of links. This is not an absolute classification, as it depends on context; this informal link characterisation is not meant as a rigid definition. There is only one kind of link in RDF itself (the statement or triple); the idea of data objects sits at a higher level.

Reference links connect a resource with other information. They are links out of the data object and do not imply that the link target (the object of the RDF statement) forms part of the data object.

Containment links define the connected part of the graph that comprises the data object subgraph. In the vCard example, these are the properties from the vcard namespace. In the FOAF vocabulary, a property like foaf:name is a containment link, the property foaf:knows is not a containment link, and foaf:depiction might be considered as either. This isn't an absolute classification and will depend on application context.

A fetch module should form a subgraph which includes nodes connected by containment links and does not include nodes referred to by reference links. Fetch modules are selected by the server configuration file and will be appropriate to the published RDF.
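
One way to picture this rule is a graph walk that follows only the properties a publisher has declared to be containment links. The sketch below is plain Python and not a Joseki fetch module; the data_object function, the ex: vocabulary, and the particular CONTAINMENT set are assumptions. It also takes one reading of the rule, leaving reference-link statements out entirely; a publisher could equally keep the outgoing statement while still not following it.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/ns#"  # hypothetical vocabulary
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"

# The publisher's choice of containment links (an assumption for this example).
CONTAINMENT = {FOAF + "name", EX + "address", EX + "city"}

graph = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, EX + "address", "_:addr"),
    ("_:addr", EX + "city", "Bristol"),
    (ALICE, FOAF + "knows", BOB),  # reference link: not followed
    (BOB, FOAF + "name", "Bob"),
}

def data_object(graph, root, containment):
    """Return the subgraph reachable from root through containment links only."""
    result, seen, todo = set(), set(), [root]
    while todo:
        node = todo.pop()
        if node in seen:
            continue
        seen.add(node)
        for (s, p, o) in graph:
            if s == node and p in containment:
                result.add((s, p, o))
                todo.append(o)  # the object belongs to the data object too
    return result

obj = data_object(graph, ALICE, CONTAINMENT)
print(len(obj))  # 3
```

Changing the CONTAINMENT set changes the shape of the data object without touching the traversal, which mirrors how the classification depends on application context.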