Defines an interface for SAX and DOM serializers, a serializer factory and its configuration, the output format properties, and related interfaces.

Version:
Alpha $Revision$ $Date$
Author:
Assaf Arkin
Goals:

The Serializer Interfaces

{@link serialize.Serializer} defines the interface supported by a serializer. A serializer implements a mechanism for producing output from a series of SAX events or a DOM document, in a given format (also known as the output method). A serializer can be constructed directly or obtained from a factory. It may implement only the base functionality, or provide additional functionality suitable for the given output method (e.g. indentation in XML, page control in PDF).

A serializer is not thread safe and may not be used concurrently. However, a serializer may be recyclable, in which case it can be used to serialize any number of documents with the same output method.

Before serializing a document, the serializer must be set with the output stream or writer, and optionally with an {@link serialize.OutputFormat} specifying the output properties. Serializer implementations may support additional methods to control the way in which documents are serialized, or extend {@link serialize.OutputFormat} and offer additional output properties.

{@link serialize.Serializer} and {@link serialize.OutputFormat} provide the minimum functionality that all serializers must support and that an application may depend on. Both are based on the XSLT 1.0 specification.

To serialize a document, the application obtains a handle to the serializer, which can be a SAX 1 DocumentHandler, a SAX 2 ContentHandler, or a DOM Level 1/2 {@link serialize.DOMSerializer}. The application should obtain and use only one handle at any given time and may not reuse the handle to serialize multiple documents. It is illegal for the application to call two different handle-returning methods without resetting the serializer, or to use the same handle after resetting the serializer.
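
The first of these rules can be sketched as a small state guard. This is only an illustration of the rule, not code from the serialize package: the HandleGuard class, its Kind enum and its method names are all hypothetical. It assumes that every handle-returning method calls acquire() and that reset() clears the state. It does not detect a stale handle being used after a reset.

```java
// Hypothetical sketch, not part of the serialize package: tracks which
// kind of handle has been handed out since the last reset.
final class HandleGuard {
    enum Kind { NONE, DOCUMENT_HANDLER, CONTENT_HANDLER, DOM_SERIALIZER }

    private Kind current = Kind.NONE;

    // Called by each handle-returning method before it returns its handle.
    // Asking for the same kind again is allowed; asking for a different
    // kind without an intervening reset is rejected.
    void acquire(Kind requested) {
        if (current != Kind.NONE && current != requested) {
            throw new IllegalStateException(
                "Already using " + current + "; reset before requesting " + requested);
        }
        current = requested;
    }

    // Called when the serializer is reset; handles obtained earlier
    // must be discarded by the application.
    void reset() {
        current = Kind.NONE;
    }

    Kind current() {
        return current;
    }
}
```

With such a guard, requesting a DOM handle after a SAX handle fails until the serializer is reset, which mirrors the restriction described above.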

{@link serialize.SerializerFactory} provides a means of obtaining the default serializers available from a given implementation. At a minimum, an implementation should support XML, HTML and Text serializers. When additional serializers are available, the application may obtain them through the {@link serialize.SerializerFactory} or construct them directly.

Non-escaping and whitespace-preserving output control is offered for XML, HTML and similar output methods, but a serializer is not required to support it. Both can be set globally through {@link serialize.OutputFormat}, or directly while serializing SAX events through {@link serialize.SerializerHandler}. Serializers are not required to implement the {@link serialize.SerializerHandler} interface.

Usage Examples

Serialize a DOM document as XML:

  void printXML( Document doc, OutputStream stream, String encoding )
  {
      OutputFormat format;
      Serializer   ser;

      // Obtain a suitable output format for the XML method and
      // set the encoding.
      format = SerializerFactory.getOutputFormat( Method.XML );
      format.setEncoding( encoding );

      // Obtain a suitable serializer for the XML method and
      // set the output stream.
      ser = SerializerFactory.getSerializer( format );
      ser.setOutputStream( stream );

      // Use DOMSerializer to serialize the document
      ser.asDOMSerializer().serialize( doc );
  }
    

Serialize an empty HTML document using SAX events, reusing the same serializer:

  Serializer ser;

  // Obtain an HTML serializer once, use it multiple times.
  ser = SerializerFactory.getSerializer( Method.HTML );
  printEmptyHTML( ser, System.out );
  printEmptyHTML( ser, System.err );
  . . . 

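  void printEmptyHTML( Serializer ser, OutputStream os )
      throws SAXException
  {
      DocumentHandler handler;

      ser.setOutputStream( os );

      // Obtain the SAX 1 handle once and use it for the whole document.
      handler = ser.asDocumentHandler();
      handler.startDocument();
      handler.startElement( "html", new AttributeListImpl() );
      handler.endElement( "html" );
      handler.endDocument();

      // Reset the serializer so it can be reused; the handle obtained
      // above must not be used after this point.
      ser.reset();
  }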
    

The Properties File

An implementation will include a serializer properties file called serializer.properties located in the serialize package. The properties file lists all the default serializers supported by that implementation. Serializers that are not listed in the properties file may be constructed directly by the application.

The properties file contains a property serialize.methods that lists, as a comma-separated list, all the output methods supported by the implementation. For each method, a property serialize.[method] names the class of the {@link serialize.Serializer} implementation. The optional property serialize.format.[method] names the class of a suitable {@link serialize.OutputFormat} implementation.
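
As a rough illustration of this layout, the sketch below reads such a file with java.util.Properties and maps each listed method to its serializer class name. The SerializerPropertiesSketch class is hypothetical, and the class names in the example content (example.XMLSerializer and so on) are placeholders, not names taken from any implementation. How an actual implementation loads the file is not specified here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of the serialize package.
final class SerializerPropertiesSketch {

    // Example file content; all class names are made-up placeholders.
    static final String EXAMPLE =
          "serialize.methods=xml, html, text\n"
        + "serialize.xml=example.XMLSerializer\n"
        + "serialize.html=example.HTMLSerializer\n"
        + "serialize.text=example.TextSerializer\n"
        + "serialize.format.xml=example.XMLOutputFormat\n";

    // Parses properties text held in memory.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Maps each method listed in serialize.methods to the class name
    // given by serialize.[method]; methods without an entry are skipped.
    static Map<String, String> serializerClasses(Properties props) {
        Map<String, String> result = new LinkedHashMap<>();
        String methods = props.getProperty("serialize.methods", "");
        for (String method : methods.split(",")) {
            String name = method.trim();
            if (name.isEmpty()) {
                continue;
            }
            String className = props.getProperty("serialize." + name);
            if (className != null) {
                result.put(name, className);
            }
        }
        return result;
    }
}
```

The optional serialize.format.[method] entry can be looked up the same way; in the example content only the xml method has one.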