This page documents the Simple API for XML (SAX) samples included with Xerces. Besides being useful programs in their own right, they serve as examples of how to program with the SAX API.
SAX samples:
Most of the SAX parser samples have a command line option that allows the user to specify a different SAX parser to use. In order to supply another SAX parser besides the default Xerces SAXParser, the parser must implement either the org.xml.sax.Parser or the org.xml.sax.XMLReader interface.
Pass the -Djavax.xml.parsers.SAXParserFactory=... option to the virtual machine in order to use a different SAX parser factory.
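As a rough illustration of how the factory option takes effect, the sketch below obtains an XMLReader through JAXP and prints the implementation class that was selected. The class name ReaderSelectionSketch is hypothetical and is not one of the Xerces samples.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of the Xerces samples.
final class ReaderSelectionSketch {

    /** Creates an XMLReader through whichever SAXParserFactory JAXP resolves. */
    static XMLReader newReader() throws Exception {
        // JAXP checks the javax.xml.parsers.SAXParserFactory system property
        // before falling back to the platform default factory.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader();
        System.out.println("Using parser: " + reader.getClass().getName());
    }
}
```

Running this with and without the system property set should show which factory was picked up, since the printed class name changes with the parser implementation.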
A sample SAX2 counter. This sample program illustrates how to register a SAX2 ContentHandler and receive the callbacks in order to print information about the document. The output of this program shows the parse time and the counts of elements, attributes, ignorable whitespace, and characters appearing in the document.
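The idea of registering a ContentHandler and counting callbacks can be sketched as follows. This is a minimal illustration, not the sample's actual source; the class CountingHandlerSketch and its field names are hypothetical, and it uses the JAXP default parser rather than a parser selected by name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical counting handler, not the Xerces sample itself.
final class CountingHandlerSketch extends DefaultHandler {
    long elements;
    long attributes;
    long characters;
    long ignorableWhitespace;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        elements++;
        attributes += attrs.getLength();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so add the lengths up.
        characters += length;
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        ignorableWhitespace += length;
    }

    /** Parses the given XML text and returns the handler holding the counts. */
    static CountingHandlerSketch count(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        CountingHandlerSketch handler = new CountingHandlerSketch();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        CountingHandlerSketch c = count("<a x='1' y='2'><b>hi</b></a>");
        System.out.println(c.elements + " elems, " + c.attributes + " attrs, "
                + c.ignorableWhitespace + " spaces, " + c.characters + " chars");
    }
}
```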
This class is useful as a "poor-man's" performance tester to compare the speed and accuracy of various SAX parsers. However, it is important to note that the first parse time of a parser will include both VM class load time and parser initialization that would not be present in subsequent parses with the same file.
Option | Description |
---|---|
-p name | Select parser by name. |
-x number | Select number of repetitions. |
-n \| -N | Turn on/off namespace processing. |
-np \| -NP | Turn on/off namespace prefixes. NOTE: Requires use of -n. |
-v \| -V | Turn on/off validation. |
-s \| -S | Turn on/off Schema validation support. NOTE: Not supported by all parsers. |
-f \| -F | Turn on/off Schema full checking. NOTE: Requires use of -s and not supported by all parsers. |
-m \| -M | Turn on/off memory usage report. |
-t \| -T | Turn on/off "tagginess" report. |
--rem text | Output user defined comment before next parse. |
-h | Display help screen. |
The speed and memory results from this program should NOT be used as the basis of parser performance comparison! Real analytical methods should be used. For better results, perform multiple document parses within the same virtual machine to remove class loading from parse time and memory usage.
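The advice about repeating parses inside one virtual machine can be sketched like this. It is only an illustration under simple assumptions: the class RepeatedParseSketch is hypothetical, the input is an in-memory string, and the timings remain rough figures rather than a proper benchmark.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing loop, not part of the Xerces samples.
final class RepeatedParseSketch {

    /** Parses the same document several times and returns each elapsed time in nanoseconds. */
    static long[] timeParses(String xml, int repetitions) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        long[] nanos = new long[repetitions];
        for (int i = 0; i < repetitions; i++) {
            long start = System.nanoTime();
            reader.parse(new InputSource(new StringReader(xml)));
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) throws Exception {
        long[] nanos = timeParses("<a><b>hi</b></a>", 5);
        for (int i = 0; i < nanos.length; i++) {
            // The first entry typically includes class loading and parser initialization.
            System.out.println("parse " + (i + 1) + ": " + nanos[i] + " ns");
        }
    }
}
```

Discarding the first measurement is one simple way to keep class loading out of the reported figures.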
The "tagginess" measurement gives a rough estimate of the percentage of markup versus content in the XML document. The percent tagginess of a document is equal to the minimum number of tag characters required for elements, attributes, and processing instructions divided by the total number of characters (characters, ignorable whitespace, and tag characters) in the document.
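The ratio described above can be written as a small helper. This is a sketch of the arithmetic only; the class TagginessSketch is hypothetical, and the way tag characters are counted in the example (3 for `<a>` plus 4 for `</a>`) is an assumption about what "minimum" means here.

```java
// Hypothetical helper showing only the arithmetic of the tagginess ratio.
final class TagginessSketch {

    /** Returns tag characters as a percentage of all characters in the document. */
    static double percentTagginess(long tagChars, long characters, long ignorableWhitespace) {
        long total = characters + ignorableWhitespace + tagChars;
        // Guard against an empty document to avoid dividing by zero.
        return total == 0 ? 0.0 : 100.0 * tagChars / total;
    }

    public static void main(String[] args) {
        // For <a>hi</a>: assumed 7 tag characters, 2 content characters, no ignorable whitespace.
        System.out.printf("%.1f%% tagginess%n", percentTagginess(7, 2, 0));
    }
}
```

With those assumed counts the ratio is 7 / 9, or roughly 77.8 percent.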
Not all parsers support every feature.
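Because feature support varies between parsers, a caller usually has to tolerate a rejected feature. The sketch below shows one way to do that with the standard SAX exceptions. The class FeatureToggleSketch is hypothetical, and the pairing of command line options with feature identifiers in the comments is an assumption; the two http://apache.org/... identifiers are Xerces-specific.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of the Xerces samples.
final class FeatureToggleSketch {

    /** Tries to set a feature and reports whether the parser accepted it. */
    static boolean trySetFeature(XMLReader reader, String featureId, boolean state) {
        try {
            reader.setFeature(featureId, state);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            System.err.println("warning: parser does not support feature (" + featureId + ")");
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        String[] features = {
            "http://xml.org/sax/features/namespaces",                      // assumed to match -n
            "http://xml.org/sax/features/validation",                      // assumed to match -v
            "http://apache.org/xml/features/validation/schema",            // assumed to match -s
            "http://apache.org/xml/features/validation/schema-full-checking" // assumed to match -f
        };
        for (String id : features) {
            System.out.println(id + " accepted: " + trySetFeature(reader, id, true));
        }
    }
}
```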
Provides a complete trace of SAX2 events for files parsed. This is useful for making sure that a SAX parser implementation faithfully communicates all information in the document to the SAX handlers.
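A much reduced version of such a trace might look like the sketch below, which records one line per callback. The class TraceHandlerSketch is hypothetical and covers only a handful of ContentHandler events; it uses a parser that is not namespace aware, so element names are taken from qName.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tracing handler covering a subset of the SAX2 events.
final class TraceHandlerSketch extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument()"); }
    @Override public void endDocument() { events.add("endDocument()"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        events.add("startElement(" + qName + ")");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement(" + qName + ")");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("characters(" + new String(ch, start, length) + ")");
    }

    @Override
    public void processingInstruction(String target, String data) {
        events.add("processingInstruction(" + target + ")");
    }

    /** Parses the XML text and returns the recorded events in callback order. */
    static List<String> trace(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        TraceHandlerSketch handler = new TraceHandlerSketch();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        trace("<a><b>x</b></a>").forEach(System.out::println);
    }
}
```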
Option | Description |
---|---|
-p name | Select parser by name. |
-n \| -N | Turn on/off namespace processing. |
-v \| -V | Turn on/off validation. |
-s \| -S | Turn on/off Schema validation support. NOTE: Not supported by all parsers. |
-h | Display help screen. |
A sample SAX2 writer. This sample program illustrates how to register a SAX2 ContentHandler and receive the callbacks in order to print a document that is parsed.
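Printing a document from ContentHandler callbacks can be sketched as below. This is a simplified illustration rather than the sample's source: the class WriterHandlerSketch is hypothetical, it ignores comments, DOCTYPE declarations, and namespaces, and it writes empty elements as a start and end tag pair.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical serializing handler, not the Xerces sample itself.
final class WriterHandlerSketch extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        out.append('<').append(qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            out.append(' ').append(attrs.getQName(i))
               .append("=\"").append(escape(attrs.getValue(i), true)).append('"');
        }
        out.append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(escape(new String(ch, start, length), false));
    }

    /** Replaces markup characters with entity references. */
    static String escape(String s, boolean inAttribute) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                case '"': sb.append(inAttribute ? "&quot;" : "\""); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Parses the XML text and returns it re-serialized from the SAX events. */
    static String write(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        WriterHandlerSketch handler = new WriterHandlerSketch();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write("<a x='1'><b>1 &lt; 2</b></a>"));
    }
}
```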
Option | Description |
---|---|
-p name | Select parser by name. |
-n \| -N | Turn on/off namespace processing. |
-v \| -V | Turn on/off validation. |
-s \| -S | Turn on/off Schema validation support. NOTE: Not supported by all parsers. |
-c \| -C | Turn on/off Canonical XML output. NOTE: This is not W3C canonical output. |
-h | Display help screen. |