Ordered Content

An XML document has a tree structure where the children of each element are lexically ordered. Sometimes this ordering is significant, sometimes it is not. Gloze is designed from the point of view of retaining sufficient information to reconstruct the original document from the RDF, making it possible to round-trip from XML to RDF and back again. This doesn't mean recording the order in all cases. In many cases the XML schema contains enough ordering information to reconstruct the original sequence. Ambiguity is created by multiple occurrences of the same element, by mixed content, or by the appearance of the xs:any wild-card. Sequences of singly occurring xs:sequence are entirely unambigous, as are choices where only one of the choices can occur (so long as this occurs once). The compositor 'all', where all orderings are valid typically require sequencing (unless in the degenerate case where they only have a single element). The ordering of attributes is not significant.

All the examples in this section were generated with order=seq (-Dgloze.order=seq).

The following schema describes an unambiguous sequence containing an unambiguous choice.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://example.org/" xmlns="http://example.org/" elementFormDefault="qualified">

        <xs:element name="foobar">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element name="foo" type="xs:string" minOccurs="0" />
                                <xs:choice>
                                        <xs:element name="bar" type="xs:string"/>
                                        <xs:element name="baz" type="xs:string"/>
                                </xs:choice>
                        </xs:sequence>
                </xs:complexType>
        </xs:element>

</xs:schema>

The XML below requires no additional ordering information.

<?xml version="1.0" encoding="UTF-8"?>
<foobar xmlns="http://example.org/">
        <foo>foobar</foo>
        <bar>foobar</bar>
</foobar>

# Base: http://example.org/ordering.xml
@prefix ns1:     <http://example.org/> .
@prefix xs:      <http://www.w3.org/2001/XMLSchema> .
@prefix ns2:     <http://example.org/def/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xs_:     <http://www.w3.org/2001/XMLSchema#> .
@prefix :        <#> .

<>    ns1:foobar
              [ ns1:bar "foobar"^^xs_:string ;
                ns1:foo "foobar"^^xs_:string
              ] .

Where there is ambiguity in the schema for a given element content this is captured as an RDF sequence. We need to record the order in which particular statements appear. This is achieved by reifying the statement, giving us an object that can be added to an rdf:Seq.

The schema below permits multiple occurrences of the element 'bar', so ordering information needs to be recorded if it is significant.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://example.org/" xmlns="http://example.org/" elementFormDefault="qualified">

        <xs:element name="foobar">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element name="foo" type="xs:string"/>
                                <xs:element name="bar" type="xs:string" maxOccurs="unbounded"/>
                        </xs:sequence>
                </xs:complexType>
        </xs:element>

</xs:schema>
<?xml version="1.0" encoding="UTF-8"?>
<foobar xmlns="http://example.org/">
        <foo>foobar</foo>
        <bar>foobar</bar>
        <bar>foobar</bar>
</foobar>

The RDF mapping shows how ordering information is recorded alongside the standard RDF mapping. The standard mapping involves adding the property 'foo' with value "foobar". The subject of this statement is also an RDF sequence, and the first member of this sequence is the reification with rdf:predicate 'foo' and rdf:object "foobar". The two occurrences of 'bar'with identical values, "foobar" result in the assertion of a single property/value. Despite this, two reified statements are added to the sequence, one for each occurrence.

# Base: http://example.org/ordering1.xml
@prefix ns1:     <http://example.org/> .
@prefix xs:      <http://www.w3.org/2001/XMLSchema> .
@prefix ns2:     <http://example.org/def/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xs_:     <http://www.w3.org/2001/XMLSchema#> .
@prefix :        <#> .

_:b1  a       rdf:Seq ;
      rdf:_1  [ a       rdf:Statement ;
                rdf:object "foobar"^^xs_:string ;
                rdf:predicate ns1:foo ;
                rdf:subject _:b1
              ] ;
      rdf:_2  [ a       rdf:Statement ;
                rdf:object "foobar"^^xs_:string ;
                rdf:predicate ns1:bar ;
                rdf:subject _:b1
              ] ;
      rdf:_3  [ a       rdf:Statement ;
                rdf:object "foobar"^^xs_:string ;
                rdf:predicate ns1:bar ;
                rdf:subject _:b1
              ] ;
      ns1:bar "foobar"^^xs_:string ;
      ns1:foo "foobar"^^xs_:string .

<>    ns1:foobar _:b1 .

Ordering is transparent from an ontological perspective. One of the design goals was to allow users of the RDF mapping to ignore the sequence if it is not relevant. Ordering is treated as a data-structuring issue rather than an ontological one. The addition of a property to a resource is treated independently of adding it to the sequence; sequencing meta-data is overlaid on top of the existing unordered information model.


Generated on Mon Jun 18 16:02:38 2007 for Gloze by  doxygen 1.5.0