Apache Struts API Documentation: Package org.apache.struts.digester

Overview

Package

Class

Tree

Deprecated

Index

Help

PREV PACKAGE NEXT PACKAGE

FRAMES NO FRAMES

Package org.apache.struts.digester

The Digester package provides for rules-based processing of arbitrary XML documents.

See:
Description

Class Summary

CallMethodRule Rule implementation that calls a method on the top (parent) object, passing arguments collected from subsequent CallParamRule rules or from the body of this element.

CallParamRule Rule implementation that saves a parameter from either an attribute of this element, or from the element body, to be used in a call generated by a surrounding CallMethodRule rule.

Digester A Digester processes an XML input stream by matching a series of element nesting patterns to execute Rules that have been added prior to the start of parsing.

ObjectCreateRule Rule implementation that creates a new object and pushes it onto the object stack.

Rule Concrete implementations of this class implement actions to be taken when a corresponding nested pattern of XML elements has been matched.

SetNextRule Rule implementation that calls a method on the (top-1) (parent) object, passing the top object (child) as an argument.

SetPropertiesRule Rule implementation that sets properties on the object at the top of the stack, based on attributes with corresponding names.

SetPropertyRule Rule implementation that sets an individual property on the object at the top of the stack, based on attributes with specified names.

SetTopRule Rule implementation that calls a method on the top (parent) object, passing the (top-1) (child) object as an argument.

Package org.apache.struts.digester Description

The Digester package provides for rules-based processing of arbitrary XML documents.

[Introduction] [Configuration Properties] [The Object Stack] [Element Matching Patterns] [Processing Rules] [Usage Example]

Introduction

In many application environments that deal with XML-formatted data, it is useful to be able to process an XML document in an "event driven" manner, where particular Java objects are created (or methods of existing objects are invoked) when particular patterns of nested XML elements have been recognized. Developers familiar with the Simple API for XML Parsing (SAX) approach to processing XML documents will recognize that the Digester provides a higher level, more developer-friendly interface to SAX events, because most of the details of navigating the XML element hierarchy are hidden -- allowing the developer to focus on the processing to be performed.

In order to use a Digester, the following basic steps are required:

Create a new instance of the org.apache.struts.digester.Digester class. Previously created Digester instances may be safely reused, as long as you have completed any previously requested parse, and you do not try to utilize a particular Digester instance from more than one thread at a time.
Set any desired configuration properties that will customize the operation of the Digester when you next initiate a parse operation.
Push any desired initial object(s) onto the Digester's object stack.
Register all of the element matching patterns for which you wish to have processing rules fired when this pattern is recognized in an input document. You may register as many rules as you like for any particular pattern. If there is more than one rule for a given pattern, the rules will be executed in the order that they were listed.
Call the digester.parse() method, passing a reference to the XML document to be parsed in one of a variety of forms. See the Digester.parse() documentation for details. Note that you will need to be prepared to catch any IOException or SAXException that is thrown by the parser, or any runtime expression that is thrown by one of the processing rules.

Digester Configuration Properties

A org.apache.struts.digester.Digester instance contains several configuration properties that can be used to customize its operation. These properties must be configured before you call one of the parse() variants, in order for them to take effect on that parse.

Property Description

debug An integer defining the amount of debugging output that will be written to System.out() as the parse progresses. This is useful when tracking down where parsing problems are occurring. The default value of zero means no debugging output will be generated -- increasing values generally cause the generation of more verbose and detailed debugging information.

validating A boolean that is set to true if you wish to validate the XML document against a Document Type Definition (DTD) that is specified in its DOCTYPE declaration. The default value of false requests a parse that only detects "well formed" XML documents, rather than "valid" ones.

Property	Description
debug	An integer defining the amount of debugging output that will be written to `System.out()` as the parse progresses. This is useful when tracking down where parsing problems are occurring. The default value of zero means no debugging output will be generated -- increasing values generally cause the generation of more verbose and detailed debugging information.
validating	A boolean that is set to `true` if you wish to validate the XML document against a Document Type Definition (DTD) that is specified in its `DOCTYPE` declaration. The default value of `false` requests a parse that only detects "well formed" XML documents, rather than "valid" ones.

In addition to the scalar properties defined above, you can also register a local copy of a Document Type Definition (DTD) that is referenced in a DOCTYPE declaration. Such a registration tells the XML parser that, whenever it encounters a DOCTYPE declaration with the specified public identifier, it should utilize the actual DTD content at the registered system identifier (a URL), rather than the one in the DOCTYPE declaration.

For example, the Struts framework controller servlet uses the following registration in order to tell Struts to use a local copy of the DTD for the Struts configuration file. This allows usage of Struts in environments that are not connected to the Internet, and speeds up processing even at Internet connected sites (because it avoids the need to go across the network).

    digester.register
      ("-//Apache Software Foundation//DTD Struts Configuration 1.0//EN",
       "/org/apache/struts/resources/struts-config_1_0.dtd");

As a side note, the system identifier used in this example is the path that would be passed to java.lang.ClassLoader.getResource() or java.lang.ClassLoader.getResourceAsStream(). The actual DTD resource is loaded through the same class loader that loads all of the Struts classes -- typically from the struts.jar file.

The Object Stack

One very common use of org.apache.struts.digester.Digester technology is to dynamically construct a tree of Java objects, whose internal organization, as well as the details of property settings on these objects, are configured based on the contents of the XML document. In fact, the primary reason that the Digester package was created was to facilitate the way that the Struts controller servlet configures itself based on the contents of your application's struts-config.xml file.

To facilitate this usage, the Digester exposes a stack that can be manipulated by processing rules that are fired when element matching patterns are satisfied. The usual stack-related operations are made available, including the following:

clear() - Clear the current contents of the object stack.
peek() - Return a reference to the top object on the stack, without removing it.
pop() - Remove the top object from the stack and return it.
push() - Push a new object onto the top of the stack.

A typical design pattern, then, is to fire a rule that creates a new object and pushes it on the stack when the beginning of a particular XML element is encountered. The object will remain there while the nested content of this element is processed, and it will be popped off when the end of the element is encountered. As we will see, the standard "object create" processing rule supports exactly this functionalility in a very convenient way.

Several potential issues with this design pattern are addressed by other features of the Digester functionality:

How do I relate the objects being created to each other? - The Digester supports standard processing rules that pass the top object on the stack as an argument to a named method on the next-to-top object on the stack (or vice versa). This rule makes it easy to establish parent-child relationships between these objects. One-to-one and one-to-many relationships are both easy to construct.
How do I retain a reference to the first object that was created? As you review the description of what the "object create" processing rule does, it would appear that the first object you create (i.e. the object created by the outermost XML element you process) will disappear from the stack by the time that XML parsing is completed, because the end of the element would have been encountered. To deal with this, the normal approach is to push a reference to some application global object onto the stack before the parse begins, and arrange that a parent-child relationship be created (by appropriate processing rules) between this manually pushed object and the one that is dynamically created. In this way, the pushed object will retain a reference to the dynamically created object (and therefore all of its children) after the parse finishes.

Element Matching Patterns

A primary feature of the org.apache.struts.digester.Digester parser is that the Digester automatically navigates the element hierarchy of the XML document you are parsing for you, without requiring any developer attention to this process. Instead, you focus on deciding what functions you would like to have performed whenver a certain arrangement of nested elements is encountered in the XML document being parsed. The mechanism for specifying such arrangements are called element matching patterns.

A very simple element matching pattern is a simple string like "a". This pattern is matched whenever an <a> top-level element is encountered in the XML document, no matter how many times it occurs. Note that nested <a> elements will not match this pattern -- we will describe means to support this kind of matching later.

The next step up in matching pattern complexity is "a/b". This pattern will be matched when a <b> element is found nested inside a top-level <a> element. Again, this match can occur as many times as desired, depending on the content of the XML document being parsed. You can use multiple slashes to define a hierarchy of any desired depth that will be matched appropriately.

For example, assume you have registered processing rules that match patterns "a", "a/b", and "a/b/c". For an input XML document with the following contents, the indicated patterns will be matched when the corresponding element is parsed:

  <a>         -- Matches pattern "a"
    <b>       -- Matches pattern "a/b"
      <c/>    -- Matches pattern "a/b/c"
      <c/>    -- Matches pattern "a/b/c"
    </b>
    <b>       -- Matches pattern "a/b"
      <c/>    -- Matches pattern "a/b/c"
      <c/>    -- Matches pattern "a/b/c"
      <c/>    -- Matches pattern "a/b/c"
    </b>
  </a>

It is also possible to match a particular XML element, no matter how it is nested (or not nested) in the XML document, by using the "*" wildcard character in your matching pattern strings. For example, an element matching pattern of "*/a" will match an <a> element at any nesting position within the document.

It is quite possible that, when a particular XML element is being parsed, the pattern for more than one registered processing rule will be matched (either because you registered more than one processing rule with the same matching pattern, or because one more more exact pattern matches and wildcard pattern matches are satisfied by the same element. When this occurs, the corresponding processing rules will all be fired, in the order that they were initially registered with the Digester.

Processing Rules

The previous section documented how you identify when you wish to have certain actions take place. The purpose of processing rules is to define what should happen when the patterns are matched.

Formally, a processing rule is a Java class that subclasses the org.apache.struts.digester.Rule interface. Each Rule implements one or more of the following event methods that are called at well-defined times when the matching patterns corresponding to this rule trigger it:

begin() - Called when the beginning of the matched XML element is encountered. A data structure containing all of the attributes corresponding to this element are passed as well.
body() - Called when nested content (that is not itself XML elements) of the matched element is encountered. Any leading or trailing whitespace will have been removed as part of the parsing process.
end() - Called when the ending of the matched XML element is encountered. If nested XML elements that matched other processing rules was included in the body of this element, the appropriate processing rules for the matched rules will have already been completed before this method is called.
finish() - Called when the parse has been completed, to give each rule a chance to clean up any temporary data they might have created and cached.

As you are configuring your digester, you can call the addRule() method to register a specific element matching pattern, along with an instance of a Rule class that will have its event handling methods called at the appropriate times, as described above. This mechanism allows you to create Rule implementation classes dynamically, to implement any desired application specific functionality.

In addition, a set of processing rule implementation classes are provided, which deal with many common programming scenarios. These classes include the following:

ObjectCreateRule - When the begin() method is called, this rule instantiates a new instance of a specified Java class, and pushes it on the stack. The class name to be used is defaulted according to a parameter passed to this rule's constructor, but can optionally be overridden by a classname passed via the specified attribute to the XML element being processed. When the end() method is called, the top object on the stack (presumably, the one we added in the begin() method) will be popped, and any reference to it (within the Digester) will be discarded.
SetPropertiesRule - When the begin() method is called, the digester uses the standard Java Reflection API to identify any JavaBeans property setter methods (on the object at the top of the digester's stack) who have property names that match the attributes specified on this XML element, and then call them individually, passing the corresponding attribute values. A very common idiom is to define an object create rule, followed by a set properties rule, with the same element matching pattern. This causes the creation of a new Java object, followed by "configuration" of that object's properties based on the attributes of the same XML element that created this object.
SetPropertyRule - When the begin() method is called, the digester calls a specified property setter (where the property itself is named by an attribute) with a specified value (where the value is named by another attribute), on the object at the top of the digester's stack. This is useful when your XML file conforms to a particular DTD, and you wish to configure a particular property that does not have a corresponding attribute in the DTD.
SetNextRule - When the begin() method is called, the digester analyzes the next-to-top element on the stack, looking for a property setter method for a specified property. It then calls this method, passing the object at the top of the stack as an argument. This rule is commonly used to establish one-to-many relationships between the two objects, with the method name commonly being something like "addChild".
SetTopRule - When the begin() method is called, the digester analyzes the top element on the stack, looking for a property setter method for a specified property. It then calls this method, passing the next-to-top object on the stack as an argument. This rule would be used as an alternative to a SetNextRule, with a typical method name "setParent", if the API supported by your object classes prefers this approach.
CallMethodRule - This rule sets up a method call to a named method of the top object on the digester's stack, which will actually take place when the end() method is called. You configure this rule by specifying the name of the method to be called, the number of arguments it takes, and (optionally) the Java class name(s) defining the type(s) of the method's arguments. The actual parameter values, if any, will typically be accumulated from the body content of nested elements within the element that triggered this rule, using the CallParamRule discussed next.
CallParamRule - This rule identifies the source of a particular numbered (zero-relative) parameter for a CallMethodRule within which we are nested. You can specify that the parameter value be taken from a particular named attribute, or from the nested body content of this element.

You can create instances of the standard Rule classes and register them by calling digester.addRule(), as described above. However, because their usage is so common, shorthand registration methods are defined for each of the standard rules, directly on the Digester class. For example, the following code sequence:

    Rule rule = new SetNextRule(digester, "addChild",
                                "com.mycompany.mypackage.MyChildClass");
    digester.addRule("a/b/c", rule);

can be replaced by:

    digester.addSetNext("a/b/c", "addChild",
                        "com.mycompany.mypackage.MyChildClass");

Usage Examples

Processing The Struts Configuration File

As stated earlier, the primary reason that the org.apache.struts.digester.Digester package exists is because the Struts controller servlet itself needed a robust, flexible, easy to extend mechanism for processing the contents of the struts-config.xml configuration that describes nearly every aspect of a Struts-based application. Because of this, the controller servlet contains a comprehensive, real world, example of how the Digester can be employed for this type of a use case. See the initDigester() method of class org.apache.struts.action.ActionServlet for the code that creates and configures the Digester to be used, and the initMapping() method for where the parsing actually takes place.

The following discussion highlights a few of the matching patterns and processing rules that are configured, to illustrate the use of some of the Digester features. First, let's look at how the Digester instance is created and initialized:

    Digester digester = new Digester();
    digester.push(this);
    digester.setDebug(detail);
    digester.setValidating(true);

We see that a new Digester instance is created, and is configured to use a validating parser. Validation will occur against the struts-config_1_0.dtd DTD that is included with Struts (as discussed earlier). In order to provide a means of tracking the configured objects, the controller servlet instance itself will be added to the digester's stack.

    digester.addObjectCreate("struts-config/global-forwards/forward",
                             forwardClass, "className");
    digester.addSetProperties("struts-config/global-forwards/forward");
    digester.addSetNext("struts-config/global-forwards/forward",
                        "addForward",
                        "org.apache.struts.action.ActionForward");
    digester.addSetProperty
      ("struts-config/global-forwards/forward/set-property",
       "property", "value");

The rules created by these lines are used to process the global forward declarations. When a <forward> element is encountered, the following actions take place:

A new object instance is created -- the ActionForward instance that will represent this definition. The Java class name defaults to that specified as an initialization parameter (which we have stored in the String variable forwardClass), but can be overridden by using the "className" attribute (if it is present in the XML element we are currently parsing). The new ActionForward instance is pushed onto the stack.
The properties of the ActionForward instance (at the top of the stack) are configured based on the attributes of the <forward> element.
Nested occurrences of the <set-property> element cause calls to additional property setter methods to occur. This is required only if you have provided a custom implementation of the ActionForward class with additional properties that are not included in the DTD.
The addForward() method of the next-to-top object on the stack (i.e. the controller servlet itself) will be called, passing the object at the top of the stack (i.e. the ActionForward instance) as an argument. This causes the global forward to be registered, and as a result of this it will be remembered even after the stack is popped.
At the end of the <forward> element, the top element (i.e. the ActionForward instance) will be popped off the stack.

Later on, the digester is actually executed as follows:

    InputStream input =
      getServletContext().getResourceAsStream(config);
    ...
    try {
        digester.parse(input);
        input.close();
    } catch (SAXException e) {
        ... deal with the problem ...
    }

As a result of the call to parse(), all of the configuration information that was defined in the struts-config.xml file is now represented as collections of objects cached within the Struts controller servlet, as well as being exposed as servlet context attributes.

Parsing Body Text In XML Files

The Digester module also allows you to process the nested body text in an XML file, not just the elements and attributes that are encountered. The following example is based on an assumed need to parse the web application deployment descriptor (/WEB-INF/web.xml) for the current web application, and record the configuration information for a particular servlet. To record this information, assume the existence of a bean class with the following method signatures (among others):

  package com.mycompany;
  public class ServletBean {
    public void setServletName(String servletName);
    public void setServletClass(String servletClass);
    public void addInitParam(String name, String value);
  }

We are going to process the web.xml file that declares the controller servlet in a typical Struts-based application (abridged for brevity in this example):

  <web-app>
    ...
    <servlet>
      <servlet-name>action</servlet-name>
      <servlet-class>org.apache.struts.action.ActionServlet<servlet-class>
      <init-param>
        <param-name>application</param-name>
        <param-value>org.apache.struts.example.ApplicationResources<param-value>
      </init-param>
      <init-param>
        <param-name>config</param-name>
        <param-value>/WEB-INF/struts-config.xml<param-value>
      </init-param>
    </servlet>
    ...
  </web-app>

Next, lets define some Digester processing rules for this input file:

  digester.addObjectCreate("web-app/servlet",
                           "com.mycompany.ServletBean");
  digester.addCallMethod("web-app/servlet/servlet-name", "setServletName", 0);
  digester.addCallMethod("web-app/servlet/servlet-class",
                         "setServletClass", 0);
  digester.addCallMethod("web-app/servlet/init-param",
                         "addInitParam", 2);
  digester.addCallParam("web-app/servlet/init-param/param-name", 0);
  digester.addCallParam("web-app/servlet/init-param/param-value", 1);

Now, as elements are parsed, the following processing occurs:

<servlet> - A new com.mycompany.ServletBean object is created, and pushed on to the object stack.
<servlet-name> - The setServletName() method of the top object on the stack (our ServletBean) is called, passing the body content of this element as a single parameter.
<servlet-class> - The setServletClass() method of the top object on the stack (our ServletBean) is called, passing the body content of this element as a single parameter.
<init-param> - A call to the addInitParam method of the top object on the stack (our ServletBean) is set up, but it is not called yet. The call will be expecting two String parameters, which must be set up by subsequent call parameter rules.
<param-name> - The body content of this element is assigned as the first (zero-relative) argument to the call we are setting up.
<param-value> - The body content of this element is assigned as the second (zero-relative) argument to the call we are setting up.
</init-param> - The call to addInitParam() that we have set up is now executed, which will cause a new name-value combination to be recorded in our bean.
<init-param> - The same set of processing rules are fired again, causing a second call to addInitParam() with the second parameter's name and value.
</servlet> - The element on the top of the object stack (which should be the ServletBean we pushed earlier) is popped off the object stack.