Java API for XML Processing
Release Notes

Version: 1.1ea

This document contains notes that may help you use this library more effectively.

XSLT Support

XSLT is supported in this release via the javax.xml.transform.TransformerFactory class. See the associated Javadoc for details on accessing basic functionality in an XSLT-processor-independent manner.
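
As a rough illustration of the processor-independent style described above, the following sketch applies a stylesheet to a small document using only javax.xml.transform classes. The class name XsltSketch, the stylesheet, and the sample input are hypothetical and exist only for this example; whichever XSLT processor the factory locates at run time does the actual work.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical stylesheet: writes a greeting containing the text of /person/name.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>Hello, <xsl:value-of select='/person/name'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws TransformerException {
        // The factory hides which XSLT processor is actually in use.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform("<person><name>Duke</name></person>"));
    }
}
```

Because the code refers only to the javax.xml.transform packages, swapping in a different processor should not require source changes.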

Parser

There are two factory classes for making parsers pluggable: javax.xml.parsers.SAXParserFactory and javax.xml.parsers.DocumentBuilderFactory. If you write to the JAXP API in the javax.xml.parsers, org.xml.sax, and org.w3c.dom packages, you can use the library independently of the underlying parser implementation.
To be notified of validation errors in an XML document, two things must happen.
1. Validation must be turned on. See the setValidating methods of javax.xml.parsers.DocumentBuilderFactory or javax.xml.parsers.SAXParserFactory.
2. An application-defined ErrorHandler must be set. See the setErrorHandler methods of javax.xml.parsers.DocumentBuilder or org.xml.sax.XMLReader.
The methods listed above are only some of the ways to get notification of validation errors.
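
The two steps can be sketched as follows for the DOM case. The class name ValidationSketch and the collecting handler are assumptions made for illustration; a real application would typically log or rethrow instead of collecting messages in a list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ValidationSketch {
    // Parses the document with validation on and returns the reported problems.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);                      // step 1: turn validation on

        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {      // step 2: application-defined handler
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            public void error(SAXParseException e) {
                // Validation errors are recoverable, so record them and keep parsing.
                problems.add("error: " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;                                  // well-formedness errors stop the parse
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document whose content does not match its internal DTD.
        String xml = "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>"
                   + "<note><from>x</from></note>";
        for (String problem : validate(xml)) {
            System.out.println(problem);
        }
    }
}
```

Without the handler, validation problems may go unnoticed by the application, which is why both steps are needed.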
Whenever you work with text encodings other than UTF-8 and UTF-16, you should put an encoding declaration at the very beginning of all your XML files (including DTDs). If you don't do this, the parser will not be able to determine the encoding being used, and will probably be unable to parse your document. A text declaration like <?xml version='1.0' encoding='euc-jp'?> says that the document uses the "euc-jp" encoding.
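
A minimal sketch of the effect of an encoding declaration is shown below. It uses ISO-8859-1 rather than euc-jp only because every Java runtime is required to support that charset; the class name EncodingSketch and the sample document are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EncodingSketch {
    static String parseLatin1() throws Exception {
        // The declaration at the very start names the encoding of the bytes that follow.
        String xml = "<?xml version='1.0' encoding='ISO-8859-1'?><word>caf\u00e9</word>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        // The parser receives raw bytes and relies on the declaration to decode them.
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseLatin1().length());
    }
}
```

If the declaration were omitted, the parser would assume UTF-8 or UTF-16 and would most likely reject the byte that encodes the accented character.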
The parser currently reports warnings, rather than errors, in cases where the declared and actual text encodings don't match. It may give those same warnings in the common case where the encoding name used internally to Java is not the one used in the document. If the declared encoding is truly an error, you'll usually see other errors (not warnings) being reported by the parser.
The parser currently does not report an error for content models which are not deterministic. Accordingly it may not behave well when given data which matches an "ambiguous" content model such as ((a,b)|(a,c)). DTDs with such models are in error, and must be restructured to be unambiguous. (In the example, (a,(b|c)) is an equivalent legal content model.)
If you are using JDK 1.1 with large numbers of symbols (more than can be counted in sixteen bits), you might encounter the message "panic: 16-bit string hash table overflow" as the Java VM aborts. The Java 2 SDK does not have this limitation.

Object Model

Conforming to the XML specification, the parser reports all whitespace to the DOM, even if it's meaningless. Many applications do not want to see such whitespace. You can remove it by invoking the Element.normalize method, which merges adjacent text nodes and also canonicalizes adjacent whitespace into a single space (unless the xml:space="preserve" attribute prevents it).
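
The merging behavior of normalize can be sketched as below. This example demonstrates only the merging of adjacent text nodes, which the DOM specification defines; the whitespace canonicalization described above is a property of this release's implementation and should not be assumed of other DOM implementations. The class name NormalizeSketch is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {
    // Builds an element with two adjacent text nodes, normalizes it,
    // and returns the single merged text value.
    static String mergedText() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .newDocument();
        Element p = doc.createElement("p");
        doc.appendChild(p);
        p.appendChild(doc.createTextNode("Hello, "));
        p.appendChild(doc.createTextNode("world"));

        p.normalize();  // adjacent text nodes become one

        if (p.getChildNodes().getLength() != 1) {
            throw new IllegalStateException("text nodes were not merged");
        }
        return p.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(mergedText());
    }
}
```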
Currently, attribute nodes may not have children. Access their values as strings instead of enumerating children.
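
For example, an attribute value can be read as a string through Element.getAttribute or Attr.getValue, without touching any child nodes of the attribute. The class name AttributeSketch and the sample element are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttributeSketch {
    static String readId(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();

        // Either call yields the value as a plain string.
        String viaElement = root.getAttribute("id");
        Attr attr = root.getAttributeNode("id");
        String viaAttr = (attr == null) ? "" : attr.getValue();

        if (!viaElement.equals(viaAttr)) {
            throw new IllegalStateException("the two access paths disagree");
        }
        return viaElement;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readId("<item id='42'/>"));
    }
}
```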
Currently, when documents are cloned, the clone will not have a clone of the associated ElementFactory or DocumentType.
The in-memory representation of text nodes has not been tuned to be efficient with respect to space utilization.

Other Issues

This software is a "Java Optional Package" for XML processing.
If you recompile the DOM implementation using versions of "javac" older than the Java 2 SDK version 1.2 you may run into a compiler bug. The symptom is a report of illegal access violations for some of the private classes inside the DOM implementation. This is because of incorrect code generated by the compiler. You should only compile these class files with a compiler that does not have this bug; you may also use the pre-compiled version in this release. There is no bytecode dependency on the Java 2 runtime; you may use these classes on JDK 1.1 systems also.
The Microsoft SDK 3.2 for Java (and presumably all earlier versions) has bugs similar to the one noted above. There are both compiler and JVM bugs; the JVM bugs prevent the correct byte codes (as produced by the Java 2 SDK) from working. This means that you can't compile or use this DOM code with Microsoft implementations of Java until Microsoft fixes these bugs, which have been reported to Microsoft.

Changes since JAXP RI (Reference Implementation) version 1.0.1

All previous releases (version 1.0.1 and earlier) used a parser implementation with a package hierarchy beginning with com.sun.xml. Between version 1.0.1 and the current release, the parser was donated to the Apache Software Foundation under the name "Crimson" and the packages were correspondingly renamed to org.apache.crimson. Migration from previous releases may involve renaming packages in your application. In addition, if your application uses SAX1, you may either convert it to use the preferred SAX2 org.xml.sax.XMLReader or obtain a SAX1 org.xml.sax.Parser from the javax.xml.parsers.SAXParser.getParser() method.
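
A migration to the SAX2 interface might look roughly like the sketch below, which obtains an org.xml.sax.XMLReader through JAXP instead of naming any implementation package. The class name Sax2Sketch and the element-collecting handler are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class Sax2Sketch {
    // Returns the qualified names of all elements in document order.
    static List<String> elementNames(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        XMLReader reader = parser.getXMLReader();   // SAX2 entry point

        final List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c/></a>"));
    }
}
```

Code written this way contains no reference to com.sun.xml or org.apache.crimson, so a later change of parser packages should not affect it.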
