The StAX specification comprises two parts: a specification document titled “Streaming API For XML JSR-173 Specification” and a Javadoc describing the API. Both can be downloaded from the JSR-173 page. Since StAX is part of Java 6, the Javadocs can also be viewed online.
Probably one of the more obscure parts of the StAX specifications is the meaning of the setPrefix[3] method defined by XMLStreamWriter.
To understand how this method works, it is necessary to look at different parts of the specification:
The Javadoc of the setPrefix method.
The table shown in the Javadoc of the XMLStreamWriter class in Java 6[4].
Section 5.2.2, “Binding Prefixes” of the specification.
The example shown in section 5.3.2, “XMLStreamWriter” of the specification.
In addition, it is important to note the following facts:
The terms “defaulting prefixes” used in section 5.2.2 of the specification and “namespace repairing” used in the Javadocs of XMLStreamWriter are synonyms.
The methods writing namespace qualified information items, i.e. writeStartElement, writeEmptyElement and writeAttribute, all come in two variants: one that takes a namespace URI and a prefix as arguments and one that only takes a namespace URI, but no prefix.
The purpose of the setPrefix method is simply to define the prefixes that will be used by the variants of the writeStartElement, writeEmptyElement and writeAttribute methods that only take a namespace URI (and the local name). This becomes clear by looking at the table in the XMLStreamWriter Javadoc. Note that a call to setPrefix doesn't cause any output and it is still necessary to use writeNamespace to actually write the necessary namespace declarations. Otherwise the produced document will not be well formed with respect to namespaces.
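The interplay between setPrefix and writeNamespace can be illustrated with a minimal sketch. It assumes a writer that is not in repairing mode and the StAX implementation that ships with the JDK (or another conforming one); the class name SetPrefixDemo and the method render are hypothetical and only serve the illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; only the javax.xml.stream calls are part of StAX.
final class SetPrefixDemo {

    // Writes a small document and returns it as a string.
    static String render() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("root");
            // Binds "p" in the writer's namespace context; produces no output.
            writer.setPrefix("p", "urn:ns1");
            // Writes the actual xmlns:p declaration on the root element.
            writer.writeNamespace("p", "urn:ns1");
            // URI-only variant: the prefix is taken from the binding above.
            writer.writeEmptyElement("urn:ns1", "element1");
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

With both calls in place, the declaration for p appears on the root element and the document is well formed with respect to namespaces. Removing the writeNamespace call leaves the prefix in the output but drops the declaration.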
The Javadoc of the setPrefix method also clearly defines the scope of the prefix bindings defined using that method: a prefix bound using setPrefix remains valid until the invocation of writeEndElement corresponding to the last invocation of writeStartElement. While not explicitly mentioned in the specifications, it is clear that a prefix binding may be masked by another binding for the same prefix defined in a nested element.
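The scoping rule can be observed through the writer's getPrefix method. The following is a sketch only, under the assumption that the implementation follows the setPrefix Javadoc; the class name PrefixScopeDemo and the method observe are hypothetical.

```java
import java.io.StringWriter;
import java.util.Arrays;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; only the javax.xml.stream calls are part of StAX.
final class PrefixScopeDemo {

    // Returns the prefix bound to "urn:ns1" inside the child element
    // and again after that element has been closed.
    static String[] observe() {
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(new StringWriter());
            writer.writeStartElement("root");
            writer.writeStartElement("child");
            // The binding is scoped to the element opened last, i.e. child.
            writer.setPrefix("p", "urn:ns1");
            String inside = writer.getPrefix("urn:ns1");
            // Closing child ends the scope of the binding.
            writer.writeEndElement();
            String after = writer.getPrefix("urn:ns1");
            writer.writeEndElement();
            writer.close();
            return new String[] { inside, after };
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe()));
    }
}
```

The first lookup is expected to return the prefix p, while the second one should no longer find a binding for the namespace URI.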
An aspect that may cause confusion is the fact that in the example shown in section 5.3.2 of the specifications, the calls to setPrefix (and setDefaultNamespace) all appear immediately before a call to writeStartElement or writeEmptyElement. This may lead people to incorrectly believe that a prefix binding defined using setPrefix only applies to the next element written[5]. This interpretation is clearly in contradiction with the setPrefix Javadoc, unless one assumes that “the current START_ELEMENT / END_ELEMENT pair” means the element opened by a call to writeStartElement immediately following the call to setPrefix. This however would be a very arbitrary interpretation of the Javadoc[6].
The correctness of the comments in the previous paragraph can be checked using the following code snippet:
XMLOutputFactory f = XMLOutputFactory.newInstance();
XMLStreamWriter writer = f.createXMLStreamWriter(System.out);
writer.writeStartElement("root");
writer.setPrefix("p", "urn:ns1");
writer.writeEmptyElement("urn:ns1", "element1");
writer.writeEmptyElement("urn:ns1", "element2");
writer.writeEndElement();
writer.flush();
writer.close();
This produces the following output[7]:
<root><p:element1/><p:element2/></root>
Since the code doesn't call writeNamespace, the output is obviously not well formed with respect to namespaces, but it also clearly shows that the scope of the prefix binding for p extends to the end of the root element and is not limited to element1.
To avoid unexpected results and keep the code maintainable, it is in general advisable to keep the calls to setPrefix and writeNamespace aligned, i.e. to make sure that the scope (in XMLStreamWriter) of the prefix binding defined by setPrefix is compatible with the scope (in the produced document) of the namespace declaration written by the corresponding call to writeNamespace. This makes it necessary to write code like this:
writer.writeStartElement("p", "element1", "urn:ns1");
writer.setPrefix("p", "urn:ns1");
writer.writeNamespace("p", "urn:ns1");
As can be seen from this code snippet, keeping the two scopes in sync makes it necessary to use the writeStartElement variant which takes an explicit prefix. Note that this somewhat conflicts with the purpose of the setPrefix method; one may consider this as a flaw in the design of the StAX API.
Drawing the conclusions from the previous section and taking into account that XMLStreamWriter also has a “namespace repairing” mode, one can see that there are in fact three different ways to use XMLStreamWriter. These usage patterns correspond to the three bullets in section 5.2.2 of the StAX specification[8]:
In the “namespace repairing” mode (enabled by the javax.xml.stream.isRepairingNamespaces property), the writer takes care of all namespace bindings and declarations, with minimal help from the calling code. This will always produce output that is well-formed with respect to namespaces. On the other hand, this adds some overhead and the result may depend on the particular StAX implementation (though the result produced by different implementations will be equivalent).
In repairing mode the calling code should avoid writing namespaces explicitly and leave that job to the writer. There is also no need to call setPrefix, except to suggest a preferred prefix for a namespace URI. All variants of writeStartElement, writeEmptyElement and writeAttribute may be used in this mode, but the implementation can choose whatever prefix mapping it wants, as long as the output results in proper URI mapping for elements and attributes.
Only use the variants of the writer methods that take an explicit prefix together with the namespace URI. In this usage pattern, setPrefix is not used at all and it is the responsibility of the calling code to keep track of prefix bindings.
Note that this approach is difficult to implement when different parts of the output document will be produced by different components (or even different libraries). Indeed, when passing the XMLStreamWriter from one method or component to the other, it will also be necessary to pass additional information about the prefix mappings in scope at that moment, unless it is acceptable to let the called method write (potentially redundant) namespace declarations for all namespaces it uses.
Use setPrefix to keep track of prefix bindings and make sure that the bindings are in sync with the namespace declarations that have been written, i.e. always use setPrefix immediately before or immediately after each call to writeNamespace. Note that the code is still free to use all variants of writeStartElement, writeEmptyElement and writeAttribute; it only needs to make sure that the usage it makes of these methods is consistent with the prefix bindings in scope.
The advantage of this approach is that it makes it possible to write modular code: when a method receives an XMLStreamWriter object (to write part of the document), it can use the namespace context of that writer (i.e. getPrefix and getNamespaceContext) to determine which namespace declarations are currently in scope in the output document and to avoid redundant or conflicting namespace declarations. Note that in order to do so, such code will have to check for an existing prefix binding before starting to use a namespace.
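The check described for the third usage pattern can be sketched as a small helper. This is an illustration under stated assumptions, not a definitive implementation: the class ScopedNamespaceWriter and its methods startElement and render are hypothetical, the writer is assumed not to be in repairing mode, and the helper deliberately ignores the case where the preferred prefix is already bound to a different URI.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class; only the javax.xml.stream calls belong to StAX.
final class ScopedNamespaceWriter {

    // Starts an element in namespace ns. The namespace is declared only if
    // the writer has no prefix bound to it yet; otherwise the prefix already
    // in scope is reused and preferredPrefix is ignored.
    static void startElement(XMLStreamWriter w, String preferredPrefix,
            String ns, String localName) throws XMLStreamException {
        String prefix = w.getPrefix(ns);
        if (prefix != null) {
            w.writeStartElement(prefix, localName, ns);
        } else {
            w.writeStartElement(preferredPrefix, localName, ns);
            // Keep the writer's binding and the written declaration in sync.
            w.setPrefix(preferredPrefix, ns);
            w.writeNamespace(preferredPrefix, ns);
        }
    }

    // Writes a document in which two nested "components" use the helper.
    static String render() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            startElement(w, "p", "urn:ns1", "root");
            startElement(w, "x", "urn:ns1", "child");  // reuses prefix p
            w.writeEndElement();
            startElement(w, "q", "urn:ns2", "other");  // declares prefix q
            w.writeEndElement();
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

Because the helper consults getPrefix first, the child element reuses the declaration written on the root element instead of adding a second one, while the element in urn:ns2 brings its own declaration. A component that receives the writer needs no further information about the prefixes in scope.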
[3] For simplicity, we only discuss setPrefix here. The same remarks also apply to setDefaultNamespace.
[4] This table is not included in the Javadoc in the original StAX specification.
[5] Another factor that contributes to the confusion is that in SAX, prefix mappings are always generated before the corresponding startElement event and that their scope ends with the corresponding endElement event. This is so because the ContentHandler interface specifies that “all startPrefixMapping events will occur immediately before the corresponding startElement event, and all endPrefixMapping events will occur immediately after the corresponding endElement event”.
[6] Early versions of XL XP-J were based on this interpretation of the specifications, but this has been corrected. Versions conforming to the specifications support a special property called javax.xml.stream.XMLStreamWriter.isSetPrefixBeforeStartElement, which always returns Boolean.FALSE. This makes it easy to distinguish the non-conforming versions from the newer versions. Note that in contrast to what the usage of the reserved javax.xml.stream prefix suggests, this is a vendor specific property that is not supported by other implementations.
[7] This has been tested with Woodstox 3.2.9, SJSXP 1.0.1 and version 1.2.0 of the reference implementation.
[8] The content of this section is largely based on a reply posted by Tatu Saloranta on the Axiom mailing list. Tatu is the main developer of the Woodstox project.