#########################################################################
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#########################################################################

== overview

The files in this directory are intended as an example of how to use
the Apache Digester to parse "document-markup" style xml. They also serve
as an example of how to subclass the main Digester class in order to
extend its functionality.

By "document-markup" xml, we mean input like XHTML, where the data is valid
xml and where some elements contain interleaved text and child elements.
For example, "Hi, this is some document-style xml."

Topics covered:
* how to subclass Digester
* how to process markup-style xml

== compiling and running

* to compile:
    mvn compile

* to build the jar artifact:
    mvn package

* to run:
    mvn verify

Alternatively, you can set up your CLASSPATH appropriately and run the
example directly.

== Notes

The primary use of the Digester is to process xml configuration files.
Such files do not typically interleave text and child elements in the
style encountered with document markup. The standard Digester behaviour
is therefore to accumulate all text within an xml element's body (of which
there is expected to be only one "segment") and present it to a Rule or
user method as a single string.

This significantly simplifies the implementation of Rule classes for the
primary Digester goal of parsing configuration files. However, merging all
text within an element into a single string loses information that is
needed to correctly parse "document-markup" xml: the position of each
text segment relative to the child elements around it.

This example shows one method of extending the Digester class to resolve
this issue.

At some point, the ability to process "document-markup" style xml may be
built into the standard Digester class.