Apache Muse - Parse XML Fragments with XmlUtils

The Muse runtime does a lot of XML processing in almost all of its tasks, and much of that processing power comes from a class called org.apache.muse.util.xml.XmlUtils. This class provides a long list of well-documented static methods for reading and writing XML fragments. And while the Muse framework and tooling try to help you avoid processing DOM trees as part of your resource implementation, the fact is that sometimes you just have to roll up your sleeves and tackle the XML yourself.

The XmlUtils class makes the DOM API much easier to use by reducing the amount of boilerplate code you have to write. This how-to shows you how to parse XML documents and fragments and extract information from them as painlessly as possible.

Note
The XmlUtils class also has methods for creating XML.

Normally, parsing through DOM trees is arduous because every part of the XML document is a node, including text, attributes, processing instructions, and comments; most of the time, users are just interested in the existence of child elements and what text (if any) is under them. To that end, the XmlUtils class has overloaded methods named getElement(), getElements(), and getElementText(). These methods allow you to zip through an XML fragment without using the org.w3c.dom.Node API to check the type of each node and handle special cases.
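To make the boilerplate concrete, here is a minimal sketch of the node-type checks you would otherwise write by hand against the plain org.w3c.dom API. It uses only JDK classes and is not the Muse implementation; the class PlainDomSketch and its methods parse(), firstChildElement(), and childText() are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper (not part of Muse): plain-DOM lookups for comparison only.
final class PlainDomSketch {

    // Parse an XML string into a namespace-aware DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI()/getLocalName()
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Return the first direct child element with the given name, or null if there is none.
    static Element firstChildElement(Element parent, QName name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip text, comments, and processing instructions
            }
            String uri = n.getNamespaceURI() == null ? "" : n.getNamespaceURI();
            if (name.getNamespaceURI().equals(uri) && name.getLocalPart().equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Return the trimmed text under the named child element, or null if the child is missing.
    static String childText(Element parent, QName name) {
        Element child = firstChildElement(parent, name);
        return child == null ? null : child.getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<test:Root xmlns:test=\"http://ws.apache.org/muse/test\">"
                   + "<!-- a comment node that must be skipped -->"
                   + "<test:FirstChild/>"
                   + "<test:SecondChild> hello </test:SecondChild>"
                   + "</test:Root>";
        Element root = parse(xml).getDocumentElement();
        QName secondName = new QName("http://ws.apache.org/muse/test", "SecondChild");
        System.out.println(childText(root, secondName)); // prints "hello"
    }
}
```

Every lookup needs the same loop and node-type check, which is the repetition that the getElement() family is meant to hide.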

Note
There is a separate how-to for loading XML from a file.

Here is some sample code that illustrates the parsing of XML fragments:

import javax.xml.namespace.QName;

import org.w3c.dom.*;

import org.apache.muse.util.xml.XmlUtils;

...

QName firstName = new QName("http://ws.apache.org/muse/test", "FirstChild", "test");
QName secondName = new QName("http://ws.apache.org/muse/test", "SecondChild", "test");

Document inputDoc = ...

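// Get the root element of the document
Element root = XmlUtils.getFirstElement(inputDoc);

// Find a child element by its qualified name
Element firstChild = XmlUtils.getElement(root, firstName);

// Read the text under a child element without fetching the element first
String secondChildText = XmlUtils.getElementText(root, secondName);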

System.out.println("Root element's name is: " + XmlUtils.getElementQName(root));
System.out.println("Second element's text is: " + secondChildText);

The getElementQName() method allows you to retrieve the fully-qualified name of any DOM Element. This may be helpful in cases where you are receiving an Element from some source other than getElement() or getElements().
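For comparison, the following sketch shows roughly what building a QName from an Element involves when you use only the JDK. It is an illustration under stated assumptions, not the Muse source: the class ElementNameSketch and its methods parseRoot() and qnameOf() are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper (not part of Muse): derive a QName from a DOM Element.
final class ElementNameSketch {

    // Parse an XML string with a namespace-aware parser and return its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // DOM reports "no namespace" and "no prefix" as null, but the QName
    // constructor rejects a null prefix, so both are mapped to empty strings.
    static QName qnameOf(Element element) {
        String uri = element.getNamespaceURI();
        String prefix = element.getPrefix();
        return new QName(uri == null ? XMLConstants.NULL_NS_URI : uri,
                         element.getLocalName(),
                         prefix == null ? XMLConstants.DEFAULT_NS_PREFIX : prefix);
    }

    public static void main(String[] args) {
        Element root = parseRoot("<test:Root xmlns:test=\"http://ws.apache.org/muse/test\"/>");
        System.out.println(qnameOf(root)); // prints "{http://ws.apache.org/muse/test}Root"
    }
}
```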

Two additional methods worth mentioning are getQName() and getQNameFromChild(). These methods are very helpful in WSRF-based applications because WSRF-based specifications often encode qualified names in XML using the following pattern:

<SomeElement xmlns:pfx="http://ws.apache.org/muse/test">pfx:SomeName</SomeElement>

Here, the value of the text under <SomeElement/> is meant to be the fully-qualified name {http://ws.apache.org/muse/test}SomeName. However, only the prefix and local part are specified in the text. We must resolve the prefix within the XML fragment in order to get the fully-qualified name. The aforementioned methods provide this for us.
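The resolution step itself can be sketched with the standard DOM Level 3 method Node.lookupNamespaceURI(), which searches the element and its ancestors for a matching namespace declaration. This is only an illustration of the idea using JDK classes, not how Muse implements it; the class QNameResolutionSketch and its methods parse() and resolveQName() are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper (not part of Muse): resolve "pfx:LocalName" text into a QName.
final class QNameResolutionSketch {

    // Parse an XML string into a namespace-aware DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace lookups to work
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Interpret the element's text as a prefixed name and resolve the prefix
    // against the namespace declarations in scope for that element.
    static QName resolveQName(Element element) {
        String text = element.getTextContent().trim();
        int colon = text.indexOf(':');
        String prefix = colon < 0 ? null : text.substring(0, colon);
        String localPart = colon < 0 ? text : text.substring(colon + 1);

        // A null prefix asks for the default namespace, if one is declared.
        String uri = element.lookupNamespaceURI(prefix);
        if (uri == null) {
            if (prefix != null) {
                throw new IllegalArgumentException("Unbound prefix: " + prefix);
            }
            return new QName(localPart); // no prefix and no default namespace
        }
        return prefix == null ? new QName(uri, localPart) : new QName(uri, localPart, prefix);
    }

    public static void main(String[] args) {
        String xml = "<SomeElement xmlns:pfx=\"http://ws.apache.org/muse/test\">"
                   + "pfx:SomeName</SomeElement>";
        Element element = parse(xml).getDocumentElement();
        System.out.println(resolveQName(element)); // prints "{http://ws.apache.org/muse/test}SomeName"
    }
}
```

The sketch fails loudly on an unbound prefix; whether the Muse methods behave the same way in that case is not covered here.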

getQName() returns the fully-qualified name represented by a given element's text, while getQNameFromChild() returns the same for one of the element's children. Both allow you to get the value you need without having to recurse up the DOM tree looking for a namespace declaration. Here is some sample code that illustrates their usage:

QName firstName = new QName("http://ws.apache.org/muse/test", "SomeElement", "test");

Document doc = ...
Element root = XmlUtils.getFirstElement(doc);
Element firstChild = XmlUtils.getElement(root, firstName);

QName firstChildValue1 = XmlUtils.getQName(firstChild);
QName firstChildValue2 = XmlUtils.getQNameFromChild(root, firstName);

//
// This should print 'true' because we just read the same value
// using two different methods
//
System.out.println(firstChildValue1.equals(firstChildValue2));