Overview

Forrest uses three major components in its processing:

  1. Readers - read source documents
  2. Input Plugins - convert source documents to the internal Forrest format
  3. Output Plugins - convert internal format documents to the required output format

Different implementations of each component allow different processing to take place. For example, different readers retrieve the source document using different methods, such as HTTP, FTP or SSH. Different input plugins allow different types of source document to be processed, and different output plugins provide different output formats, such as HTML, PDF or OpenOffice.org.

In the following sections we will examine how the Forrest controller selects which reader, input and output plugins to use for any particular request.

Selecting Readers

The reader used to read a source document is determined by the protocol in the location URL. Note that this has nothing to do with the request URL; the location URL is defined in the locationmap for the Forrest content object. So, for example, if the source location of a file is defined as file://foo/bar.html, the reader defined to handle "file:" requests will be used.

To define a reader for a specific protocol we add a bean definition to our content object's forrestContext.xml file. The ID of the bean should be the name of the protocol we wish to process with this reader. For example, a bean with the ID "file" handles locations such as file://foo/bar.html.
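
As a minimal sketch, a reader for the "file:" protocol might be registered as follows. This assumes that forrestContext.xml uses Spring-style bean definitions; the class name is a hypothetical placeholder for whichever reader implementation you use.

```xml
<!-- Sketch only: the bean ID is the protocol name ("file"). -->
<!-- The class name is hypothetical, not a confirmed Forrest class. -->
<bean id="file" class="org.example.forrest.reader.ExampleFileReader"/>
```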

What Source Type?

A reader creates a source document object that is used in the next stage of processing. This source document is an object that extends AbstractSourceDocument. A method getType returns a string identifying the type of document represented. This string is used in the next stage of processing, which is selecting an Input plugin to convert the source document to our internal format.

A DocumentFactory is provided by core that attempts to identify the source document type. This factory is used by the default readers provided by core. However, in some cases, such as when an XML document does not provide a DTD definition that can be used to identify the document type, you will need to create a custom reader that returns the correct document type. In this case the utility class DefaultSourceDocument will probably be useful (see the setType method).

Selecting Input Plugins

Each input plugin processes a single type of source document. Each document is identified by a "type", which is set by the reader that reads the document. In many cases this will be a MIME type, but where no suitably specific MIME type exists, as is often the case for an XML document, an arbitrary string can be used.

When selecting an input plugin, the document's type is used to look up the correct plugin. To facilitate this lookup we again use the ID of the input plugin's bean as defined in the forrestContext.xml file. For example, an input plugin bean with the ID "bar" will process documents of type bar.
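
A minimal sketch of such a definition is shown here, again assuming Spring-style bean syntax in forrestContext.xml. The class name is hypothetical; only the ID "bar" comes from the description above.

```xml
<!-- Sketch only: the bean ID matches the document type reported by the reader. -->
<!-- The class name is a hypothetical placeholder for an input plugin implementation. -->
<bean id="bar" class="org.example.forrest.plugin.ExampleBarInputPlugin"/>
```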

Selecting an Output Plugin

An output plugin is selected by looking at the request URL. This URL is matched against the "pattern" property of each bean that implements an output plugin. The first bean found whose pattern matches the request URL will be used to process the output. In addition to the pattern property, an output plugin specifies an XSLT file that is used to convert the internal format to the desired output format.

For example, an output plugin whose pattern matches request URLs ending in either ".html" or ".htm" will be used to process those requests.
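
One way such a plugin might be declared is sketched here. Several details are assumptions: that forrestContext.xml uses Spring-style bean definitions, that the pattern is a regular expression, and that the stylesheet is supplied through a property named xsltURL. The bean ID, class name and stylesheet path are hypothetical; only the "pattern" property is named in the description above.

```xml
<!-- Sketch only: ID, class name, xsltURL property and path are hypothetical. -->
<bean id="htmlOutputPlugin" class="org.example.forrest.plugin.ExampleXsltOutputPlugin">
  <!-- Assumed to be a regular expression: matches URLs ending in .html or .htm -->
  <property name="pattern" value=".*\.html?"/>
  <!-- Stylesheet that converts the internal format to HTML -->
  <property name="xsltURL" value="/path/to/internal-to-html.xsl"/>
</bean>
```

Because the first matching bean is used, it is probably safest to declare plugins with more specific patterns before those with more general ones.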
