XML Schema Tutorial

Introduction

Commons XML Schema is a general-purpose schema model that can be used when a Java object tree representation of an XML schema is required. This short tutorial explains how Commons XML Schema can be used.

Structure and Dependencies

The core Commons XML Schema classes have no third-party dependencies. However, the project depends on the XMLUnit and JUnit libraries for unit testing, and the Maven build uses the StAX API libraries to access the javax.xml.namespace.QName class (which is not part of JDK 1.4). Also, serialization relies on the DOM serialization mechanism, hence the JDK has to be 1.4 or later.

The structure of the Commons XML Schema model is quite straightforward. It has a strict, specification-bound hierarchy of classes that represents each and every schema component. It is not based on an interface-implementation model, so it does not allow for extensions or alternative implementations. However, the schema specification is quite stable and complete, hence a change is unlikely, which makes Commons XML Schema sufficient for almost all schema handling needs.

Reading a Schema

The reader for the XML Schema model is the XmlSchemaCollection class (org.apache.ws.commons.schema.XmlSchemaCollection). Its read method, called on an XmlSchemaCollection instance, returns an XmlSchema object that represents the whole schema. The returned XmlSchema instance can be used to access the types and elements of the schema by their qualified names.

The read method takes a validation event handler as a parameter, which is intended for passing in custom validation procedures. However, this handler has no effect on the reading of the schema yet; it is not a feature in this release of Commons XML Schema. The following code fragment shows how a file can be read through the XmlSchemaCollection.

 
InputStream is = new FileInputStream(fileName);
XmlSchemaCollection schemaCol = new XmlSchemaCollection();
XmlSchema schema = schemaCol.read(new StreamSource(is), null);

Note that null is passed for the validating handler since it has no effect yet.
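
The fragment above reads from a file, but the read call takes a javax.xml.transform.Source, so the schema text can also come from memory, which is handy in unit tests. The following is a minimal sketch using only JDK classes; the class name InMemorySchemaSource, the sample schema, and the system ID are hypothetical and chosen for illustration. The resulting StreamSource would be passed to schemaCol.read in place of new StreamSource(is).

```java
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: builds a StreamSource from schema text held in memory.
class InMemorySchemaSource {

    // A tiny illustrative schema; the namespace and element name are made up.
    static final String SCHEMA_TEXT =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
        + " targetNamespace=\"http://example.com/orders\">"
        + "<xs:element name=\"order\" type=\"xs:string\"/>"
        + "</xs:schema>";

    // Wraps the schema text in a StreamSource and records a system ID,
    // which XML tooling commonly uses as the base for relative references.
    static StreamSource fromString(String xsd, String systemId) {
        StreamSource source = new StreamSource(new StringReader(xsd));
        source.setSystemId(systemId);
        return source;
    }

    public static void main(String[] args) {
        StreamSource source = fromString(SCHEMA_TEXT, "file:///schemas/orders.xsd");
        System.out.println(source.getSystemId());
    }
}
```

Whether the system ID is needed depends on the schema; it is assumed here to matter mainly when the schema includes or imports other documents by relative location.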

Navigating the Schema Model

Once the XmlSchema object is obtained, navigating the model is also quite straightforward. All top-level elements and types are available through the schema object, either as org.apache.ws.commons.schema.XmlSchemaObjectTable instances or, where an item can be referenced by a QName, directly by name. For example, if the qualified name of an element is known, the getElementByName method can be used to extract the XmlSchemaElement object directly from the schema object. The following code sample shows how such direct methods can be used to extract schema objects.

   XmlSchemaType schemaType = schema.getTypeByName(TYPE_QNAME);
   XmlSchemaElement elem = schema.getElementByName(ELEMENT_QNAME);

Note that TYPE_QNAME and ELEMENT_QNAME represent QName objects.
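
As a minimal sketch of how the QName objects used in the lookups above might be defined, the following class builds them from a namespace URI and a local name. The class name SchemaNames, the namespace, and the local names are hypothetical; in practice they would match the target namespace and the top-level names declared in the schema being read.

```java
import javax.xml.namespace.QName;

// Hypothetical holder for the QName constants used in the lookups above.
class SchemaNames {

    // Assumed target namespace of the schema being navigated.
    static final String TARGET_NS = "http://example.com/orders";

    // Qualified name of a top-level type: namespace URI plus local name.
    static final QName TYPE_QNAME = new QName(TARGET_NS, "OrderType");

    // Qualified name of a top-level element in the same namespace.
    static final QName ELEMENT_QNAME = new QName(TARGET_NS, "order");

    public static void main(String[] args) {
        // QName.toString() renders a name as {namespaceURI}localPart.
        System.out.println(TYPE_QNAME);
        System.out.println(ELEMENT_QNAME);
    }
}
```

Two QName instances are equal when both the namespace URI and the local part match, so a lookup only succeeds if the namespace is spelled exactly as in the schema's targetNamespace attribute.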

Printing the Schema Model

Printing the model, once the XmlSchema model has been modified or constructed in memory, is also quite straightforward. The schema object has a write method that accepts an output stream.

The following code fragment shows how to write the schema into the System output stream.

schema.write(System.out);

Conclusion

Commons XML Schema is quite a versatile piece of code that can be used to manipulate and generate XML Schemas. It has minimal dependencies and can be used inside another project with ease.