The Xerces Native Interface (XNI) is a framework for communicating a "streaming" document information set and constructing generic parser configurations. XNI is part of the Xerces2 development effort, but it is important to note that the Xerces2 parser is just a standards-compliant reference implementation of the Xerces Native Interface. Other parsers can be written that conform to XNI without conforming to any particular standard.
An XML parser can be viewed as a pipeline in which information flows from a scanner to a validator to the parser. In this pipeline, one component (the scanner) acts as a source of events; the final component (the parser) is the final target of the events; and any components between the source and target are known as filters. Filter components are both targets for the information sent by the previous component in the pipeline and sources for the information that the filter chooses to propagate to the next component in the pipeline. The following diagram illustrates the layout of the pipeline in this kind of parser.
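The source, filter, and target roles described above can be sketched in a few lines of Java. This is a minimal illustration only: `DocEvents`, `DocSource`, `ToyScanner`, `WhitespaceFilter`, and `RecordingParser` are hypothetical names invented for the sketch and are not the actual XNI interfaces, and the "scanner" emits a fixed sequence of events instead of reading XML.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSketch {
    // Hypothetical stand-in for an XNI-style document handler (not the real interface).
    interface DocEvents {
        void startElement(String name);
        void characters(String text);
        void endElement(String name);
    }

    // A source pushes its events to whichever target has been registered.
    interface DocSource {
        void setTarget(DocEvents target);
    }

    // Source: stands in for the scanner by emitting a fixed event sequence.
    static class ToyScanner implements DocSource {
        private DocEvents target;

        public void setTarget(DocEvents target) { this.target = target; }

        void scan() {
            target.startElement("doc");
            target.characters("   ");
            target.characters("hello");
            target.endElement("doc");
        }
    }

    // Filter: a target for the previous stage and a source for the next one.
    static class WhitespaceFilter implements DocEvents, DocSource {
        private DocEvents next;

        public void setTarget(DocEvents target) { this.next = target; }

        public void startElement(String name) { next.startElement(name); }

        public void characters(String text) {
            // The filter chooses what to propagate: whitespace-only text is dropped.
            if (!text.trim().isEmpty()) {
                next.characters(text);
            }
        }

        public void endElement(String name) { next.endElement(name); }
    }

    // Final target: stands in for the parser by recording what reaches it.
    static class RecordingParser implements DocEvents {
        final List<String> log = new ArrayList<>();

        public void startElement(String name) { log.add("start:" + name); }
        public void characters(String text) { log.add("text:" + text); }
        public void endElement(String name) { log.add("end:" + name); }
    }

    // Wires scanner -> filter -> parser and returns the events the parser received.
    static List<String> run() {
        ToyScanner scanner = new ToyScanner();
        WhitespaceFilter filter = new WhitespaceFilter();
        RecordingParser parser = new RecordingParser();
        scanner.setTarget(filter);
        filter.setTarget(parser);
        scanner.scan();
        return parser.log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because each stage only knows the handler interface of its neighbor, a filter can be added or removed by changing the two `setTarget` calls, without touching the scanner or the parser.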
Parsing of DTDs can also be viewed as a pipeline. Since the DTD is referenced in the document instance by XML syntax (the DOCTYPE declaration), the DTD pipeline is triggered by the document scanner. XML Schema differs in this respect: there is no XML syntax that associates a Schema grammar with a document; instead, a special attribute in the document instance is used as a hint to the location of the grammar. The following diagram illustrates the layout of the DTD pipeline.
Note that the DTD scanner communicates directly with the validator. The validator receives the callbacks from the DTD scanner in order to create and populate the DTD grammar object. In this way, the validator acts as a "tee", propagating the DTD events to both the next stage in the pipeline and the DTD grammar object. This allows the validation stage in the pipeline to be completely removed from the parser configuration, if needed.
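The "tee" behavior can be illustrated with a small sketch. All names here (`DtdEvents`, `ToyGrammar`, `ToyDtdScanner`, `TeeValidator`, `RecordingParser`) are hypothetical and reduced to a single callback; the real DTD interfaces carry far more information. The sketch also shows the same scanner wired straight to the parser when no validator is supplied.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DtdTeeSketch {
    // Hypothetical, heavily reduced DTD callback interface.
    interface DtdEvents {
        void elementDecl(String name, String contentModel);
    }

    // Stands in for the DTD grammar object that the validator populates.
    static class ToyGrammar {
        final Map<String, String> contentModels = new LinkedHashMap<>();
    }

    // Stands in for the DTD scanner: emits two fixed element declarations.
    static class ToyDtdScanner {
        private DtdEvents target;

        void setTarget(DtdEvents target) { this.target = target; }

        void scan() {
            target.elementDecl("doc", "(title,para*)");
            target.elementDecl("title", "(#PCDATA)");
        }
    }

    // The "tee": each declaration goes to the grammar and on to the next stage.
    static class TeeValidator implements DtdEvents {
        final ToyGrammar grammar = new ToyGrammar();
        private DtdEvents next;

        void setTarget(DtdEvents next) { this.next = next; }

        public void elementDecl(String name, String contentModel) {
            grammar.contentModels.put(name, contentModel);
            if (next != null) {
                next.elementDecl(name, contentModel);
            }
        }
    }

    // Final target: records the names of the declarations that reach it.
    static class RecordingParser implements DtdEvents {
        final List<String> seen = new ArrayList<>();

        public void elementDecl(String name, String contentModel) { seen.add(name); }
    }

    // Runs the DTD pipeline; pass null to leave the validation stage out entirely.
    static List<String> run(TeeValidator validator) {
        ToyDtdScanner scanner = new ToyDtdScanner();
        RecordingParser parser = new RecordingParser();
        if (validator != null) {
            scanner.setTarget(validator);
            validator.setTarget(parser);
        } else {
            scanner.setTarget(parser);
        }
        scanner.scan();
        return parser.seen;
    }

    public static void main(String[] args) {
        TeeValidator validator = new TeeValidator();
        System.out.println(run(validator));
        System.out.println(validator.grammar.contentModels);
        System.out.println(run(null));
    }
}
```

In both wirings the parser receives the same declarations; only the side effect of populating the grammar differs.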
The XML document information is defined by the
interface and the DTD information is defined by the
(Note: As of 10 Apr 2001, the DTD interfaces are subject to change
based on user feedback.)
This set of interfaces, together with its supporting interfaces and classes,
comprises the XNI Core. However, the XNI Core defines only what document
and DTD information is communicated; it does not define the semantics
for configuring the parser pipeline.
In the XNI world, a parser object used by an application is merely an API generator (e.g. building DOM trees or calling SAX handlers). The components and configuration information for that parser are defined within a parser configuration object. With this approach, different parser configurations can be used with the existing parser instances without duplicating code.
The parser configuration object, defined by the
interface, that is used by the application is comprised of a series of
components. The parser configuration assembles the parsing pipeline
components, transmits settings to each component, and controls their
actions. The following diagram shows a general parser configuration
and its components. (No ordering or direct connection between
components should be implied.)
The workings of the parser configuration object are unknown to the parser. The parser is only able to set features and properties on the configuration, set the XNI handlers to receive the document information, and initiate a parse. Typically the parser object itself will be registered as the target of XNI events produced from the parser configuration when a document is parsed, but it doesn't have to be. The following diagram illustrates this situation.
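The division of labor between parser and configuration might be sketched as follows. This is an illustration under stated assumptions, not the XNI API: `ToyConfiguration`, `SingleElementConfiguration`, `EventListParser`, `DocEvents`, and the feature identifier `urn:example:features/upper-case-names` are all invented for the example. The parser only sets features, registers itself as the handler, and initiates the parse; what the configuration does internally is invisible to it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ConfigurationSketch {
    // Hypothetical feature identifier, invented for this sketch.
    static final String UPPER_CASE = "urn:example:features/upper-case-names";

    // Hypothetical, heavily reduced document handler.
    interface DocEvents {
        void startElement(String name);
        void endElement(String name);
    }

    // The only operations the parser may use on a configuration.
    interface ToyConfiguration {
        void setFeature(String id, boolean state);
        void setProperty(String id, Object value);
        void setDocumentHandler(DocEvents handler);
        void parse(String input);
    }

    // One possible configuration; its inner workings are hidden from the parser.
    static class SingleElementConfiguration implements ToyConfiguration {
        private final Map<String, Boolean> features = new HashMap<>();
        private final Map<String, Object> properties = new HashMap<>();
        private DocEvents handler;

        public void setFeature(String id, boolean state) { features.put(id, state); }
        public void setProperty(String id, Object value) { properties.put(id, value); }
        public void setDocumentHandler(DocEvents handler) { this.handler = handler; }

        public void parse(String input) {
            // A real configuration would drive a scanner pipeline here; this one
            // simply treats the input as the name of a single empty element.
            boolean upper = features.getOrDefault(UPPER_CASE, false);
            String name = upper ? input.toUpperCase(Locale.ROOT) : input;
            handler.startElement(name);
            handler.endElement(name);
        }
    }

    // The parser as "API generator": turns handler callbacks into a list of strings.
    static class EventListParser implements DocEvents {
        private final ToyConfiguration config;
        private final List<String> events = new ArrayList<>();

        EventListParser(ToyConfiguration config) { this.config = config; }

        void setFeature(String id, boolean state) { config.setFeature(id, state); }

        List<String> parse(String input) {
            events.clear();
            config.setDocumentHandler(this); // register the parser as the event target
            config.parse(input);             // initiate the parse
            return new ArrayList<>(events);
        }

        public void startElement(String name) { events.add("start:" + name); }
        public void endElement(String name) { events.add("end:" + name); }
    }

    public static void main(String[] args) {
        EventListParser parser = new EventListParser(new SingleElementConfiguration());
        System.out.println(parser.parse("doc"));
        parser.setFeature(UPPER_CASE, true);
        System.out.println(parser.parse("doc"));
    }
}
```

Any other class implementing `ToyConfiguration` could be passed to `EventListParser` unchanged, which is the point of keeping the parser ignorant of the configuration's internals.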
Features and properties are provided via the extensible mechanism found in SAX2. Features are boolean settings on the parser configuration while properties are object settings. There are a number of SAX2 core features and properties but XNI parser components are free to define new ones. All of the features and properties are managed by the parser configuration, though.
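For readers unfamiliar with the SAX2 mechanism, the following sketch shows it through the JAXP and SAX classes bundled with the JDK, rather than through XNI itself. The namespaces feature and the lexical-handler property are standard SAX2 identifiers; `urn:example:features/no-such-feature` is a made-up identifier used only to show how an unrecognized setting is rejected. Checked exceptions are wrapped in `IllegalStateException` purely to keep the example short.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class SaxSettingsSketch {
    // Standard SAX2 identifiers: features are booleans, properties are objects.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String LEXICAL_HANDLER = "http://xml.org/sax/properties/lexical-handler";
    // Hypothetical identifier that no parser is expected to recognize.
    static final String UNKNOWN = "urn:example:features/no-such-feature";

    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sets a boolean feature and reads it back.
    static boolean enableNamespaces(XMLReader reader) {
        try {
            reader.setFeature(NAMESPACES, true);
            return reader.getFeature(NAMESPACES);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sets an object-valued property: here, a handler for lexical events.
    static void installLexicalHandler(XMLReader reader) {
        try {
            reader.setProperty(LEXICAL_HANDLER, new DefaultHandler2());
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the reader refuses a feature identifier it does not know.
    static boolean rejectsUnknownFeature(XMLReader reader) {
        try {
            reader.setFeature(UNKNOWN, true);
            return false;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        System.out.println("namespaces enabled: " + enableNamespaces(reader));
        installLexicalHandler(reader);
        System.out.println("unknown feature rejected: " + rejectsUnknownFeature(reader));
    }
}
```

Because identifiers are plain URI strings, a component can introduce a new feature or property without any change to the interfaces used to set it.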
TODO: Expand on how features and properties are set, when, and by whom.
The parser configuration implements the
interface and each component implements the
interface. For this configuration system to work, the parser
configuration must adhere to the following guidelines:
The parser configuration must call the reset method on each configurable component.
This call allows each component to query the state of only
those features and properties that are important to the operation
of the component.
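A minimal sketch of this reset protocol, assuming a much smaller API than the real one, is shown below. `SettingsSource`, `Configurable`, `ToyScanner`, `ToyValidator`, and `ToyConfiguration` are hypothetical names; the two feature identifiers are the standard SAX2 ones. In this sketch, features that have never been set simply read as false.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ResetSketch {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Hypothetical read-only view of the settings held by the configuration.
    interface SettingsSource {
        boolean getFeature(String id);
    }

    // Hypothetical configurable component.
    interface Configurable {
        void reset(SettingsSource settings);
    }

    // Queries only the feature it cares about.
    static class ToyScanner implements Configurable {
        boolean namespaces;

        public void reset(SettingsSource settings) {
            namespaces = settings.getFeature(NAMESPACES);
        }
    }

    // Queries a different feature and ignores the rest.
    static class ToyValidator implements Configurable {
        boolean validating;

        public void reset(SettingsSource settings) {
            validating = settings.getFeature(VALIDATION);
        }
    }

    // The configuration owns all settings and pushes a reset to every component.
    static class ToyConfiguration implements SettingsSource {
        private final Map<String, Boolean> features = new HashMap<>();
        private final List<Configurable> components = new ArrayList<>();

        void addComponent(Configurable component) { components.add(component); }

        void setFeature(String id, boolean state) { features.put(id, state); }

        public boolean getFeature(String id) {
            return features.getOrDefault(id, false); // unset features read as false here
        }

        void resetComponents() {
            for (Configurable component : components) {
                component.reset(this);
            }
        }
    }

    public static void main(String[] args) {
        ToyConfiguration config = new ToyConfiguration();
        ToyScanner scanner = new ToyScanner();
        ToyValidator validator = new ToyValidator();
        config.addComponent(scanner);
        config.addComponent(validator);
        config.setFeature(VALIDATION, true);
        config.resetComponents();
        System.out.println("scanner.namespaces = " + scanner.namespaces);
        System.out.println("validator.validating = " + validator.validating);
    }
}
```

The components never store a reference to the full settings table; each copies the values it needs during `reset`, so a component picks up changed settings only when the configuration resets it again.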