SAX2DTM Design Notes

The current implementation is subject to change, and this class should be accessed only through its published interface methods. The following information is provided for debugging purposes only, as an aid to understanding how the class currently works.

This implementation stores information about each node in a series of arrays. Conceptually, the arrays can be thought of as either String Vectors or int Vectors, although they are implemented using internal classes. The m_chars array is conceptually a Vector of chars. The chief arrays of interest are shown in the following table:

Array Name          Array Type  Contents
m_exptype           int         An integer representing a unique value for a Node. The first 6 bits represent the Node type, as shown below. The next 10 bits represent an index into m_namespaceNames. The remaining 16 bits represent an index into m_locNamesPool. Start here. This Vector represents the list of Nodes.
m_locNamesPool      String      Local (prefixed) names. Field of m_expandedNameTable.
m_namespaceNames    String      Namespace URIs. Field of m_expandedNameTable.
m_dataOrQName       int         An index into either m_data or m_valuesOrPrefixes, as explained in the next table.
m_valuesOrPrefixes  String      Values and prefixes.
m_data              int         Entries here occur in pairs. The use of this array is explained in the next table.
m_chars             char        Characters used to form Strings, as explained in the next table.
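
The bit layout described for m_exptype can be illustrated with a small sketch. The class and method names below (ExpTypeSketch, pack, nodeType, namespaceIndex, localNameIndex) are hypothetical and are not part of SAX2DTM; the sketch only assumes the 6/10/16 bit split given in the table, and it is not meant to reflect how the class itself builds or reads these values.

```java
/** Hypothetical helper, not part of SAX2DTM: models the 6/10/16 bit split of an m_exptype entry. */
final class ExpTypeSketch {
    static final int LOCAL_NAME_BITS = 16;                          // low 16 bits: index into m_locNamesPool
    static final int NAMESPACE_BITS = 10;                           // next 10 bits: index into m_namespaceNames
    static final int TYPE_SHIFT = NAMESPACE_BITS + LOCAL_NAME_BITS; // node type sits above bit 26
    static final int LOCAL_NAME_MASK = (1 << LOCAL_NAME_BITS) - 1;  // 0xFFFF
    static final int NAMESPACE_MASK = (1 << NAMESPACE_BITS) - 1;    // 0x3FF

    /** Combines the three fields into one int (assumes each argument fits its field). */
    static int pack(int nodeType, int namespaceIndex, int localNameIndex) {
        return (nodeType << TYPE_SHIFT)
                | (namespaceIndex << LOCAL_NAME_BITS)
                | localNameIndex;
    }

    static int nodeType(int expType) {
        return expType >>> TYPE_SHIFT;
    }

    static int namespaceIndex(int expType) {
        return (expType >>> LOCAL_NAME_BITS) & NAMESPACE_MASK;
    }

    static int localNameIndex(int expType) {
        return expType & LOCAL_NAME_MASK;
    }

    public static void main(String[] args) {
        // 2 is the DOM node type code for an attribute; the two indexes are made-up sample values.
        int attr = pack(2, 5, 0x0012);
        System.out.printf("%08x type=%d ns=%d local=%d%n",
                attr, nodeType(attr), namespaceIndex(attr), localNameIndex(attr));
        // prints: 08050012 type=2 ns=5 local=18

        // The largest possible indexes spill into the two low bits of the leftmost byte.
        System.out.printf("%08x%n", pack(2, NAMESPACE_MASK, LOCAL_NAME_MASK));
        // prints: 0bffffff
    }
}
```

The second printed value shows why the leftmost byte of an attribute entry can range from 08 to 0b: the namespace index borrows the two low-order bits of that byte.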

The following table shows how the array values are used for each type of Node supported by this implementation. The m_exptype values are written in hexadecimal. An n represents an index into m_namespaceNames for the namespace URI associated with the attribute or element. The index actually occupies 10 bits: the rightmost two bits of the leftmost byte plus the 8 bits that follow. The eeee represents an index into m_locNamesPool for the value indicated in the table.

NodeType: Attr
    m_exptype: 08neeee - 0bneeee. eeee is the local name of the attribute.
    m_dataOrQName: No namespace: an index into m_valuesOrPrefixes pointing to the attribute value. Namespace: a negative number, the absolute value of which is an index into m_data.
    m_data: index: an int containing the index into m_valuesOrPrefixes for the Attr QName. index+1: an int containing the index into m_valuesOrPrefixes for the attribute value.

NodeType: Comment
    m_exptype: 20000000
    m_dataOrQName: An index into m_valuesOrPrefixes for the comment text.
    m_data: unused

NodeType: Document
    m_exptype: 24000000
    m_dataOrQName: 0
    m_data: unused

NodeType: Element
    m_exptype: 04neeee - 07neeee. eeee is the local name of the element.
    m_dataOrQName: No namespace: 0. Namespace: an index into m_valuesOrPrefixes pointing to the QName.
    m_data: unused

NodeType: Text
    m_exptype: 0C000000
    m_dataOrQName: An index into m_data.
    m_data: index: an int containing the starting subscript in m_chars for the text. index+1: an int containing the length of the text.

NodeType: ProcessingInstruction
    m_exptype: 1C0eeee. eeee is the target name.
    m_dataOrQName: An index into m_valuesOrPrefixes for the PI data.
    m_data: unused

NodeType: Namespace
    m_exptype: 34neeee. eeee is the namespace prefix.
    m_dataOrQName: An index into m_valuesOrPrefixes pointing to the namespace URI.
    m_data: unused
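
The Text and Attr lookups described above can be sketched as follows. This is only an illustration under stated assumptions: plain Java arrays stand in for the internal vector classes, the class NodeValueSketch and its methods are hypothetical, and the sample data is invented. The real class should still be accessed only through its published interface.

```java
/** Hypothetical sketch, not part of SAX2DTM: plain arrays stand in for the internal vectors. */
final class NodeValueSketch {

    /** Text node: m_dataOrQName indexes a (start, length) pair in m_data that points into m_chars. */
    static String textValue(int dataOrQName, int[] data, char[] chars) {
        int start = data[dataOrQName];
        int length = data[dataOrQName + 1];
        return new String(chars, start, length);
    }

    /** Attr node: a non-negative entry indexes the value directly; a negative entry redirects through m_data. */
    static String attrValue(int dataOrQName, int[] data, String[] valuesOrPrefixes) {
        if (dataOrQName >= 0) {
            return valuesOrPrefixes[dataOrQName];      // no namespace
        }
        int index = -dataOrQName;                      // absolute value is the m_data index
        return valuesOrPrefixes[data[index + 1]];      // second int of the pair: the value
    }

    /** Attr node with a namespace: the first int of the m_data pair locates the QName. */
    static String attrQName(int dataOrQName, int[] data, String[] valuesOrPrefixes) {
        int index = -dataOrQName;
        return valuesOrPrefixes[data[index]];
    }

    public static void main(String[] args) {
        // Invented sample data, laid out as the table describes.
        char[] chars = "Hello world".toCharArray();
        int[] data = {6, 5,    // pair for a Text node: start 6, length 5
                      0, 1};   // pair for a namespaced Attr: QName index, value index
        String[] valuesOrPrefixes = {"xlink:href", "#top", "plain"};

        System.out.println(textValue(0, data, chars));
        // prints: world
        System.out.println(attrQName(-2, data, valuesOrPrefixes) + " = "
                + attrValue(-2, data, valuesOrPrefixes));
        // prints: xlink:href = #top
        System.out.println(attrValue(2, data, valuesOrPrefixes));
        // prints: plain
    }
}
```

In this sample the namespaced attribute's pair starts at index 2 of the stand-in m_data array, so its m_dataOrQName entry is -2.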