XML Data and Node Types

XML is the data source for XQuery and must be parsed into Hyracks data. Each node type defined in XPath and XQuery maps to a pointable defined in Apache VXQuery™.

XPath Node Types

Data Type                           Pointable Name              Data Size
Attribute Node (ANP)                AttributeNodePointable      1 + length
Document Node (DNP)                 DocumentNodePointable       1 + length
Element Node (ENP)                  ElementNodePointable        1 + length
Node Tree (NTP)                     NodeTreePointable           1 + length
Processing Instruction Node (PINP)  PINodePointable             1 + length
Comment Node (CNP)                  TextOrCommentNodePointable  1 + length
Text Node (TNP)                     TextOrCommentNodePointable  1 + length
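
The table above can be read as a lookup from node kind to pointable name. The following Python sketch illustrates that lookup with the standard library DOM. The pointable names are copied from the table; keying them by DOM node type and the helper name pointable_name are assumptions made for illustration and are not part of VXQuery.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Pointable names are taken from the table; the DOM node type keys are an
# illustrative assumption. The Node Tree (NTP) row has no DOM counterpart,
# so it is not listed here.
POINTABLE_BY_NODE_TYPE = {
    Node.ATTRIBUTE_NODE: "AttributeNodePointable",
    Node.DOCUMENT_NODE: "DocumentNodePointable",
    Node.ELEMENT_NODE: "ElementNodePointable",
    Node.PROCESSING_INSTRUCTION_NODE: "PINodePointable",
    Node.COMMENT_NODE: "TextOrCommentNodePointable",
    Node.TEXT_NODE: "TextOrCommentNodePointable",
}


def pointable_name(node):
    """Return the pointable name from the table for a DOM node (hypothetical helper)."""
    return POINTABLE_BY_NODE_TYPE[node.nodeType]


# Small document with an attribute, a text node, and a comment node.
doc = parseString('<book category="WEB">Learning XML<!-- note --></book>')
book = doc.documentElement
for node in (doc, book, book.getAttributeNode("category"), book.firstChild, book.lastChild):
    print(node.nodeName, "->", pointable_name(node))
```

Comment and text nodes resolve to the same name because the table assigns both to TextOrCommentNodePointable.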

XML Mapping

The XML mapping to Hyracks pointables is fairly straightforward. The following example shows how each node is mapped and saved into a byte array used by Hyracks.

Example XML File

The example XML file comes from the W3Schools XQuery tutorial.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Edited by XMLSpy® -->
<bookstore>

    <book category="COOKING">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    
    <book category="CHILDREN">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    
    <book category="WEB">
        <title lang="en">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <price>49.99</price>
    </book>
    
    <book category="WEB">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>

</bookstore>

Example Hyracks Mapping

The mapping is explained using shorthand for the example XML file above. The byte-level layout is not shown; instead, the pointable names are used to label each piece of information.

NodeTree {
    DocumentNode {bookstore}
        sequence (children) {
            ElementNode {book}
                sequence (attributes) {
                    AttributeNode {category}
                }
                sequence (children) {
                    ElementNode {title:Everyday Italian}
                        sequence (attributes) {
                            AttributeNode {lang}
                        }
                    ElementNode {author}
                    ElementNode {year}
                    ElementNode {price}
                }
            ElementNode {book}
                sequence (attributes) {
                    AttributeNode {category}
                }
                sequence (children) {
                    ElementNode {title:Harry Potter}
                        sequence (attributes) {
                            AttributeNode {lang}
                        }
                    ElementNode {author}
                    ElementNode {year}
                    ElementNode {price}
                }
            ElementNode {book}
                sequence (attributes) {
                    AttributeNode {category}
                }
                sequence (children) {
                    ElementNode {title:XQuery Kick Start}
                        sequence (attributes) {
                            AttributeNode {lang}
                        }
                    ElementNode {author}
                    ElementNode {author}
                    ElementNode {author}
                    ElementNode {author}
                    ElementNode {author}
                    ElementNode {year}
                    ElementNode {price}
                }
            ElementNode {book}
                sequence (attributes) {
                    AttributeNode {category}
                }
                sequence (children) {
                    ElementNode {title:Learning XML}
                        sequence (attributes) {
                            AttributeNode {lang}
                        }
                    ElementNode {author}
                    ElementNode {year}
                    ElementNode {price}
                }
        }
}
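
The shorthand above can be approximated mechanically. The following Python sketch walks an XML string with the standard library DOM and prints a similar outline. It is only an illustration of the nesting: it does not use VXQuery's parser, does not produce Hyracks bytes, lists element and attribute names only, and skips text and comment nodes. The function names shorthand, element_lines, and children_lines are hypothetical.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

INDENT = "    "


def element_lines(elem, depth):
    """Return shorthand lines for one element, its attributes, and its child elements."""
    pad = INDENT * depth
    lines = [pad + "ElementNode {" + elem.tagName + "}"]
    attrs = elem.attributes
    if attrs.length:
        lines.append(pad + INDENT + "sequence (attributes) {")
        for i in range(attrs.length):
            lines.append(pad + INDENT * 2 + "AttributeNode {" + attrs.item(i).name + "}")
        lines.append(pad + INDENT + "}")
    lines.extend(children_lines(elem, depth + 1))
    return lines


def children_lines(parent, depth):
    """Return a 'sequence (children)' block for the element children of parent."""
    children = [c for c in parent.childNodes if c.nodeType == Node.ELEMENT_NODE]
    if not children:
        return []
    pad = INDENT * depth
    lines = [pad + "sequence (children) {"]
    for child in children:
        lines.extend(element_lines(child, depth + 1))
    lines.append(pad + "}")
    return lines


def shorthand(xml_text):
    """Return the outline for a whole document, labeled by its root element as above."""
    root = parseString(xml_text).documentElement
    lines = ["NodeTree {", INDENT + "DocumentNode {" + root.tagName + "}"]
    lines.extend(children_lines(root, 2))
    lines.append("}")
    return "\n".join(lines)


# Trimmed version of the first book from the example file.
SAMPLE_XML = (
    '<bookstore><book category="COOKING">'
    '<title lang="en">Everyday Italian</title><year>2005</year>'
    "</book></bookstore>"
)
print(shorthand(SAMPLE_XML))
```

Unlike the hand-written shorthand, this sketch does not append the text content to the title element; it only reproduces the attribute and child sequences.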