Applications often need to transmit XML documents over the network using a socket stream. XML was not designed with this in mind: a document contains no explicit end-of-document marker, so the stream must end (i.e. the socket must close) before the parser can finish parsing the complete XML document.
Because creating socket connections is expensive, applications want to reuse the same stream for several documents, but without an end-of-document marker the parser cannot tell where one document stops and the next begins. The socket samples included with Xerces show how to overcome this common problem in a general way.
Socket samples:
This sample delays the input to the SAX parser to simulate reading data
from a socket where data is not always immediately available. An XML
parser should be able to parse the input and perform the necessary
callbacks as data becomes available. So this is a good way to test
any parser that implements the SAX2 XMLReader interface
to see if it can parse data as it arrives.
Note: This sample uses NSGMLS-like output of elements and attributes interspersed with information about how many bytes are being read at a time.
| Option | Description |
|---|---|
| -p name | Select SAX2 parser by name. |
| -n \| -N | Turn on/off namespace processing. |
| -v \| -V | Turn on/off validation. |
| -s \| -S | Turn on/off Schema validation support. NOTE: Not supported by all parsers. |
| -f \| -F | Turn on/off Schema full checking. NOTE: Requires use of -s and not supported by all parsers. |
| -h | Display help screen. |
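The idea behind the delayed-input sample can be sketched in a few lines. This is a minimal illustration, not the sample's actual source: `SlowInputStream` and `SlowInputDemo` are hypothetical names, the chunk size and delay are arbitrary, and the parser is whatever `SAXParserFactory` returns by default.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stream that pauses before each read and returns at most chunkSize bytes. */
class SlowInputStream extends FilterInputStream {
    private final int chunkSize;
    private final long delayMillis;

    SlowInputStream(InputStream in, int chunkSize, long delayMillis) {
        super(in);
        this.chunkSize = chunkSize;
        this.delayMillis = delayMillis;
    }

    @Override
    public int read() throws IOException {
        pause();
        return super.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        pause();
        // Hand out short reads, as a socket does when data trickles in.
        return super.read(buf, off, Math.min(len, chunkSize));
    }

    private void pause() throws IOException {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while simulating a slow socket", e);
        }
    }
}

class SlowInputDemo {
    /** Parses xml through the slow stream and returns element names in document order. */
    static List<String> parseSlowly(String xml, int chunkSize, long delayMillis) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                names.add(qName);
            }
        });
        InputStream in = new SlowInputStream(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), chunkSize, delayMillis);
        reader.parse(new InputSource(in));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseSlowly("<root><a/><b/></root>", 4, 10));
    }
}
```

A parser that copes with short reads produces the same callbacks as it would for a fast stream; only the timing differs.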
This sample provides a solution to the problem of 1) sending multiple XML documents over a single socket connection or 2) sending other types of data after the XML document without closing the socket connection.
The first situation is a problem because the XML specification does not allow a document to contain multiple root elements. Therefore a document stream must end (or at least appear to end) for the XML parser to accept it as the end of the document.
The second situation is a problem because the XML parser buffers the input stream in blocks of a specified size for performance reasons. This can cause the parser to read additional bytes of data beyond the end of the document. It also affects the first situation when consecutive documents are encoded in different character encodings.
The solution that this sample introduces wraps both the input and output stream on both ends of the socket. The stream wrappers introduce a protocol that allows arbitrary length data to be sent as separate, localized input streams. While the socket stream remains open, a separate input stream is created to "wrap" an incoming document and make it appear as if it were a standalone input stream.
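One way such a wrapping protocol could look is sketched below. This is an assumption-laden illustration rather than the sample's real wire format: the class names `FramedOutputStream`, `FramedInputStream`, and `FramingDemo` are hypothetical, and the packet layout (a 4-byte length followed by that many bytes, with a zero length marking the end of the logical stream) is chosen only for the example. `readAllBytes` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical writer: every write becomes [int length][bytes]; close() writes length 0. */
class FramedOutputStream extends OutputStream {
    private final DataOutputStream out;
    private boolean closed;

    FramedOutputStream(OutputStream out) {
        this.out = new DataOutputStream(out);
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (closed) throw new IOException("stream closed");
        if (len == 0) return; // a zero length is reserved for the end marker
        out.writeInt(len);
        out.write(buf, off, len);
    }

    /** Ends this logical stream only; the underlying stream stays open. */
    @Override
    public void close() throws IOException {
        if (closed) return;
        closed = true;
        out.writeInt(0);
        out.flush();
    }
}

/** Hypothetical reader: reports end-of-stream at the zero-length packet. */
class FramedInputStream extends InputStream {
    private final DataInputStream in;
    private int remaining; // unread bytes left in the current packet
    private boolean ended;

    FramedInputStream(InputStream in) {
        this.in = new DataInputStream(in);
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xFF);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (ended) return -1;
        if (remaining == 0) {
            remaining = in.readInt();
            if (remaining == 0) {
                ended = true;
                return -1;
            }
        }
        // Never request more than the current packet holds, so nothing past it is consumed.
        int n = in.read(buf, off, Math.min(len, remaining));
        if (n == -1) throw new EOFException("connection closed inside a packet");
        remaining -= n;
        return n;
    }

    /** Closing the wrapper leaves the underlying stream open for the next document. */
    @Override
    public void close() {
    }
}

class FramingDemo {
    /** Writes two payloads to one byte stream and reads them back as separate streams. */
    static String[] roundTrip(String first, String second) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        for (String doc : new String[] {first, second}) {
            OutputStream framed = new FramedOutputStream(wire);
            framed.write(doc.getBytes(StandardCharsets.UTF_8));
            framed.close();
        }
        InputStream shared = new ByteArrayInputStream(wire.toByteArray());
        String[] result = new String[2];
        for (int i = 0; i < result.length; i++) {
            result[i] = new String(new FramedInputStream(shared).readAllBytes(), StandardCharsets.UTF_8);
        }
        return result;
    }
}
```

Because the reading wrapper only ever asks the underlying stream for bytes that belong to the current packet, a parser that buffers aggressively still cannot consume data belonging to whatever follows the document.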
To use this sample, enter any number of filenames of XML documents as parameters to the program. For example:
This program will create a server and client thread that communicate
on a specified port number on the "localhost" address. When the client
connects to the server, the server sends each XML document specified
on the command line to the client in sequence, wrapping each document
in a WrappedOutputStream. The client uses a
WrappedInputStream to read the data and pass it to the
parser.
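The server and client arrangement can be approximated as follows. This sketch makes several simplifying assumptions that differ from the sample: it uses an ephemeral port instead of a specified one, takes documents as strings instead of filenames, and frames each document with a single length prefix (reading the whole document into memory) rather than streaming it through wrapper streams. `SharedSocketDemo` and `exchange` are hypothetical names, and a length of -1 is an invented marker meaning "no more documents".

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SharedSocketDemo {
    /** Sends every document over one connection; returns the root element name parsed from each. */
    static List<String> exchange(List<String> documents) throws Exception {
        List<String> roots = new ArrayList<>();
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Server thread: accept one client and send all documents on the same socket.
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept();
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    for (String doc : documents) {
                        byte[] bytes = doc.getBytes(StandardCharsets.UTF_8);
                        out.writeInt(bytes.length);
                        out.write(bytes);
                    }
                    out.writeInt(-1); // invented marker: no more documents
                    out.flush();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            sender.start();

            // Client: read one framed document at a time and give the parser only those bytes.
            try (Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 DataInputStream in = new DataInputStream(s.getInputStream())) {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                int length;
                while ((length = in.readInt()) >= 0) {
                    byte[] bytes = new byte[length];
                    in.readFully(bytes);
                    String[] root = new String[1];
                    factory.newSAXParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String local, String qName, Attributes atts) {
                            if (root[0] == null) root[0] = qName;
                        }
                    });
                    roots.add(root[0]);
                }
            }
            sender.join();
        }
        return roots;
    }
}
```

Each document reaches the parser as its own bounded input stream, so the parser sees a normal end of input while the socket itself stays open for the next document.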