Readers are the first stage in the processing pipeline. They read
the source document for subsequent processing. Note that a
reader may not actually read a file; it may generate the file's content
dynamically. This document describes how to create a new reader.
A reader is responsible for creating the initial source content. Forrest provides
a set of generic readers that can retrieve a document from the network or from the
filesystem. These documents will often need to be processed by an input plugin to
create a document in Forrest's internal format.
Chained Readers
Chained readers are the type of reader you will most often use.
A chained reader provides a means of pre- and/or post-processing a
document retrieved from some other reader. It is commonly used when you
need to retrieve a document whose type cannot be identified from the raw
source alone.
A chained reader is defined in forrestContext.xml as follows:
We can then define a chain of readers in the locationmap
like this:
Java Readers
If you want to do complex processing in order to generate the
source document, you can create a Java class to carry out the necessary
processing. To do so, extend org.apache.forrest.core.reader.AbstractReader
and implement public AbstractSourceDocument
read(IController controller, URI requestURI, final Location location).
For example:
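A minimal sketch of such a reader might look like the following. So that the example compiles on its own, the Forrest types named above are replaced with simplified stand-ins; their members (such as getSourceURI() and getContent()), the ProductDocument and ProductReader classes, and the product:1 source URI are all assumptions made for illustration. Only the read(...) signature is taken from the text above.

```java
import java.net.URI;

// Simplified stand-ins for the Forrest types named in the text.
// Assumption: the real classes ship with Forrest and have richer APIs.
interface IController {}

class Location {
    private final URI sourceURI;
    Location(URI sourceURI) { this.sourceURI = sourceURI; }
    URI getSourceURI() { return sourceURI; }
}

abstract class AbstractSourceDocument {
    private final String content;
    AbstractSourceDocument(String content) { this.content = content; }
    String getContent() { return content; }
}

abstract class AbstractReader {
    public abstract AbstractSourceDocument read(IController controller,
            URI requestURI, final Location location);
}

// Hypothetical concrete document type holding the generated content.
class ProductDocument extends AbstractSourceDocument {
    ProductDocument(String content) { super(content); }
}

// Hypothetical reader that generates its content rather than reading a file.
class ProductReader extends AbstractReader {
    @Override
    public AbstractSourceDocument read(IController controller,
            URI requestURI, final Location location) {
        // For an opaque URI such as "product:1" this yields "1".
        String productId = location.getSourceURI().getSchemeSpecificPart();
        // A real reader would query a database here; we just build a stub.
        return new ProductDocument("<product id=\"" + productId + "\"/>");
    }
}
```

In this sketch, calling read(...) with a location whose source URI is product:1 returns a document whose content is a single product element carrying the ID 1.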
If your reader needs to do some initialisation before it executes
the read method, override public
void init(). This method is called once during
creation of the reader object.
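As a rough illustration of overriding init(), the sketch below prepares an in-memory lookup table once, before any read takes place. The AbstractReader shown is a stand-in with read(...) left out for brevity, and CatalogueReader, nameOf(...) and the sample data are hypothetical names introduced only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Forrest's AbstractReader (assumption); read(...) is omitted
// here to keep the focus on initialisation.
abstract class AbstractReader {
    public void init() {}
}

// Hypothetical reader that loads its catalogue once, up front.
class CatalogueReader extends AbstractReader {
    private final Map<String, String> productNames = new HashMap<>();

    @Override
    public void init() {
        // One-off setup (opening connections, loading lookup tables)
        // goes here rather than in read(...).
        productNames.put("1", "Example product");
    }

    // Helper a read(...) implementation could use; returns null if unknown.
    String nameOf(String productId) {
        return productNames.get(productId);
    }
}
```

Keeping setup in init() means the read method can assume its resources are already in place each time it is called.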
Register The Reader
Once you have built your reader, you need to register it with Forrest. This is done
by adding an entry to forrestContext.xml. The entry should
look something like this:
Note that the bean's id is the same as the scheme used in the locationmap to identify a source
location. Forrest uses this to look up the correct reader for any given location.
The scheme can be any string that represents the type of reader we are building. For
example, if we are building a reader that will provide documents representing products in
a catalogue database then we may choose a scheme of "product" (note that we don't
include the ':'). This will then be used
in the locationmap to indicate a product document. For example, we may have a location
defined as:
This entry means that a request for http://localhost:8888/test/product.html
will result in a request to the reader assigned the product scheme. To
actually read the document Forrest will call the read(...)
method. This method will process the request in whatever
way is appropriate for that particular scheme. In this case it will communicate with
the database to retrieve the product with ID 1.
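The scheme-based lookup described above can be pictured with a small sketch. It is not Forrest's implementation: it uses a plain map in place of whatever lookup Forrest performs, and SchemeRegistry, register(...) and the product:1 source URI are assumptions chosen to mirror the example in the text.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry mapping a scheme (without the ':') to a reader.
// Readers are modelled as simple functions from source URI to content.
class SchemeRegistry {
    private final Map<String, Function<URI, String>> readers = new HashMap<>();

    void register(String scheme, Function<URI, String> reader) {
        readers.put(scheme, reader);
    }

    String read(URI source) {
        // For "product:1", getScheme() returns "product".
        Function<URI, String> reader = readers.get(source.getScheme());
        if (reader == null) {
            throw new IllegalArgumentException(
                    "No reader registered for scheme: " + source.getScheme());
        }
        return reader.apply(source);
    }
}
```

With a reader registered under "product", a source of product:1 is dispatched to that reader, which can use the scheme-specific part ("1") as the product ID; a source with an unregistered scheme is rejected.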