Understanding Apache Cocoon

This document is intended for both users and developers and presents an overall picture of Apache Cocoon.

What you should know:

  • XML, XML Namespaces
  • Basics of XPath, XSLT
  • Java language
  • Servlets, HTTP

What you need not know:

  • Cocoon 1

Cocoon 1:

  • The Cocoon project was founded in January 1999 by Stefano Mazzocchi as an open source project under the Apache Software Foundation.
  • It started as a simple servlet for XSL styling of XML content.
  • It was based on the DOM Level 1 API. This choice turned out to be quite limiting for speed and memory efficiency.
  • It used the reactor pattern to connect components. This allowed the reaction instructions to be placed inside the documents. Though appealing, it caused difficulties in managing highly dynamic web sites.
  • It allowed context overlap to happen by having processing instructions in documents and stylesheets.

Cocoon 2:

  • A separate codebase that incorporates the lessons learned from Cocoon 1.
  • Designed for execution speed, memory efficiency and scalability to very large documents by switching the processing model from DOM to SAX.
  • Centralizes the management functions by specifying processing pipelines in a sitemap (an XML file), replacing the embedded processing instruction model.
  • Better support for pre-compilation, pre-generation and caching for better performance.

Basic problem to be solved:

Basic mechanisms for processing XML documents:

  • Dispatching of requests through Matchers
  • Generation of XML documents (from content, logic, relational databases, objects or any combination) through Generators
  • Transformation of XML documents (to other XML, objects or any combination) through Transformers
  • Aggregation of XML documents through Aggregators
  • Rendering of XML through Serializers
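
The generate, transform and render steps listed above can be pictured as a chain of functions. The following is only a minimal sketch: the class name PipelineSketch and its methods are hypothetical and are not part of Cocoon's API, and plain strings stand in for the streams of SAX events that Cocoon passes between components.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

/** Hypothetical, simplified model of a processing pipeline; not Cocoon's API. */
final class PipelineSketch {
    private final Supplier<String> generator;               // produces XML
    private final List<UnaryOperator<String>> transformers; // XML to XML
    private final Function<String, String> serializer;      // XML to final output

    PipelineSketch(Supplier<String> generator,
                   List<UnaryOperator<String>> transformers,
                   Function<String, String> serializer) {
        this.generator = generator;
        this.transformers = transformers;
        this.serializer = serializer;
    }

    /** Runs the generator, then each transformer in order, then the serializer. */
    String process() {
        String xml = generator.get();
        for (UnaryOperator<String> transformer : transformers) {
            xml = transformer.apply(xml);
        }
        return serializer.apply(xml);
    }

    /** Builds a toy pipeline with one generator, two transformers, one serializer. */
    static PipelineSketch demo() {
        List<UnaryOperator<String>> steps = List.of(
            xml -> xml.replace("hello", "Hello, Cocoon"),
            xml -> xml.replace("title>", "h1>"));
        return new PipelineSketch(
            () -> "<page><title>hello</title></page>",
            steps,
            xml -> "<html><body>" + xml + "</body></html>");
    }

    public static void main(String[] args) {
        System.out.println(demo().process());
    }
}
```

Matching and aggregation are left out of this sketch to keep it short.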

Sequence of Interactions

Pipeline

Core:

  • Avalon framework for logging, configuration, threading, context etc.
  • Caching mechanism
  • Pipeline handling
  • Program generation, compilation, loading and execution.
  • Base classes for generation, transformation, serialization, components.
  • ...

Components:

  • Specific generators
  • Specific transformers
  • Specific matchers
  • Specific serializers
  • ...

Logicsheets:

  • sitemap.xsl
  • xsp.xsl
  • esql.xsl
  • request.xsl
  • response.xsl
  • ...

  • ...

An XSP page is an XML page that meets the following requirements:

  • The document root must be <xsp:page>.
  • The <xsp:page> element must declare the language as an attribute.
  • The <xsp:page> element must declare the xsp namespace as an attribute.
  • To be useful, an XSP page must also contain at least one <xsp:logic> and one <xsp:expr> element.
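
    <?xml version="1.0"?>
    <xsp:page language="java" xmlns:xsp="http://apache.org/xsp">
      <xsp:logic>
        static private int counter = 0;
        private synchronized int count() { return counter++; }
      </xsp:logic>
      <page>
        <p>I have been requested <xsp:expr>count()</xsp:expr> times.</p>
      </page>
    </xsp:page>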

A generator uses an XSP page to generate an XML document.
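
To make the idea concrete, the counter page above can be thought of as source for a small Java program. The class below is only a hand-written illustration of that idea, not the code Cocoon actually generates: the class name CounterPageSketch, the generate() method and the <page> and <p> wrapper elements are assumptions, and the output is built as a string rather than as SAX events.

```java
/** Hypothetical stand-in for a program generated from the counter page. */
final class CounterPageSketch {
    // State taken from the page's logic; shared by all requests.
    private static int counter = 0;

    // Made static here (an assumption) so the sketch needs no instance.
    static synchronized int count() {
        return counter++;
    }

    /** Plays the role of the generated code that emits the page content. */
    static String generate() {
        return "<page><p>I have been requested " + count() + " times.</p></page>";
    }

    public static void main(String[] args) {
        // Each call stands for one request to the page.
        System.out.println(generate());
        System.out.println(generate());
    }
}
```

Because the counter is static, its value survives across requests for as long as the class stays loaded.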

Embedded logic:

  • Code is embedded in the XML page
  • No separation of content and logic
  • Okay for small examples but terrible for large systems

Included logicsheet:

  • Code is in a separate logicsheet (an XSL file)
  • Effective separation of content and logic
  • Preferred way to create XSPs

Logicsheet as tag library:

  • The logicsheet is packaged as a reusable tag library and registered with Cocoon in the cocoon.xconf file.
  • The tag library has a namespace, declared in the original logicsheet and matched by an xmlns:... attribute on <xsp:page>.
  • Effective separation of content, logic and management

The sitemap contains configuration information for a Cocoon engine:

  • list of matchers
  • list of generators
  • list of transformers
  • list of readers
  • list of serializers
  • list of selectors
  • list of processing pipelines with match patterns
  • ...

The sitemap is an XML file that conforms to a sitemap DTD.

The sitemap can be edited to add new elements.

The sitemap is translated into a program, which is then compiled into an executable unit.

A Matcher attempts to match a URI against a specified pattern in order to dispatch the request to a specific processing pipeline.

Different types of matchers:

  • wildcard matcher
  • regexp matcher

More matchers can be added without modifying Cocoon.

Matchers make it possible to assign a specific processing pipeline to a group of URIs.
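
The dispatching role of a wildcard matcher can be sketched in a few lines of Java. This is an illustration only: the class WildcardDispatchSketch and the pipeline names are hypothetical, and the pattern semantics used here ("*" stays within one path segment, "**" may cross "/") are an assumption of the sketch rather than a statement of how Cocoon's matcher is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical dispatcher illustrating wildcard matching; not Cocoon's code. */
final class WildcardDispatchSketch {
    // Parallel lists; registration order decides which match wins.
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<String> pipelines = new ArrayList<>();

    /** Assumed semantics: "*" matches within one segment, "**" across segments. */
    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                boolean isDouble = i + 1 < wildcard.length()
                        && wildcard.charAt(i + 1) == '*';
                regex.append(isDouble ? "(.*)" : "([^/]*)");
                if (isDouble) {
                    i++; // skip the second '*'
                }
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    void add(String wildcard, String pipelineName) {
        patterns.add(compile(wildcard));
        pipelines.add(pipelineName);
    }

    /** Returns the first pipeline whose pattern matches the URI, or null. */
    String dispatch(String uri) {
        for (int i = 0; i < patterns.size(); i++) {
            if (patterns.get(i).matcher(uri).matches()) {
                return pipelines.get(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        WildcardDispatchSketch sitemap = new WildcardDispatchSketch();
        sitemap.add("*.html", "html-pipeline");
        sitemap.add("docs/**", "docs-pipeline");
        System.out.println(sitemap.dispatch("index.html"));
        System.out.println(sitemap.dispatch("docs/guide/intro.xml"));
    }
}
```

A regexp matcher would follow the same shape, with the pattern used as given instead of being translated from wildcards.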

Sitemap entries for different types of matchers


Pipeline entries in sitemap file

...

A Generator is used to create an XML structure from an input source (file, directory, stream, etc.).

Different types of generators:

  • file generator
  • directory generator
  • XSP generator
  • JSP generator
  • Request generator
  • ...

More generators can be added without modifying Cocoon.
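
What a custom generator has to do can be sketched with the standard SAX API from the JDK: it turns some input into a stream of SAX events and hands them to the next component. The class NameListGeneratorSketch and its methods are hypothetical, and the sketch does not use Cocoon's own generator base classes or show how a generator is registered in the sitemap.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical generator that turns a list of names into SAX events. */
final class NameListGeneratorSketch {
    /** Emits <names><name>...</name>...</names> to the given consumer. */
    static void generate(List<String> names, ContentHandler consumer)
            throws SAXException {
        AttributesImpl noAttributes = new AttributesImpl();
        consumer.startDocument();
        consumer.startElement("", "names", "names", noAttributes);
        for (String name : names) {
            consumer.startElement("", "name", "name", noAttributes);
            consumer.characters(name.toCharArray(), 0, name.length());
            consumer.endElement("", "name", "name");
        }
        consumer.endElement("", "names", "names");
        consumer.endDocument();
    }

    /** Runs the generator against a handler that records element names. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        try {
            generate(List.of("a.xml", "b.xml"), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attributes) {
                    seen.add(qName);
                }
            });
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

No document tree is built at any point, which is the property that motivated the move from DOM to SAX.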

Sitemap entries for different types of generators

...

Sample generator entries in a pipeline


A Generator turns an XML document, after applying appropriate transformations, into a compiled program whose output is an XML document.

An XSP generator applies all the logicsheets specified in the source XML file before generating the program.

Generators cache the compiled programs for better runtime efficiency.
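
The caching idea can be sketched as a map from source name to compiled program. Everything in this example is an assumption made for illustration: the class ProgramCacheSketch, the use of a last-modified timestamp to decide when to recompile, and the compiler function are hypothetical and do not describe Cocoon's actual cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache of compiled programs keyed by source name. */
final class ProgramCacheSketch<P> {
    /** A compiled program plus the source timestamp it was built from. */
    private static final class Entry<P> {
        final long sourceLastModified;
        final P program;

        Entry(long sourceLastModified, P program) {
            this.sourceLastModified = sourceLastModified;
            this.program = program;
        }
    }

    private final Map<String, Entry<P>> cache = new ConcurrentHashMap<>();
    private final Function<String, P> compiler;

    ProgramCacheSketch(Function<String, P> compiler) {
        this.compiler = compiler;
    }

    /** Returns the cached program, recompiling only if the source is newer. */
    P get(String source, long lastModified) {
        Entry<P> entry = cache.compute(source, (key, old) -> {
            if (old != null && old.sourceLastModified >= lastModified) {
                return old; // still up to date, reuse it
            }
            return new Entry<>(lastModified, compiler.apply(key));
        });
        return entry.program;
    }

    public static void main(String[] args) {
        ProgramCacheSketch<String> cache =
                new ProgramCacheSketch<>(source -> "compiled:" + source);
        System.out.println(cache.get("page.xsp", 100L));
    }
}
```

The expensive step (generation and compilation) runs once per version of the source; later requests only pay for a map lookup.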

A Transformer is used to map an input XML structure into another XML structure.

Different types of transformers:

  • XSLT Transformer
  • Log Transformer
  • SQL Transformer
  • I18N Transformer
  • ...

Log Transformer is a good debugging tool.

More transformers can be added without modifying Cocoon.
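
The mapping performed by an XSLT transformation step can be tried outside Cocoon with the standard JAXP API (javax.xml.transform) that ships with the JDK. This sketch is not Cocoon's XSLT Transformer; the class XsltStepSketch, the stylesheet and the <page> and <title> input elements are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical single transformation step using plain JAXP. */
final class XsltStepSketch {
    // Illustrative stylesheet: maps <page><title> to an HTML-like heading.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/page\">"
      + "<html><body><h1><xsl:value-of select=\"title\"/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies STYLESHEET to the given XML and returns the result as text. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<page><title>Hello</title></page>"));
    }
}
```

In a pipeline, several such steps can follow one another, each consuming the output of the previous one.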

Sitemap entries for different types of transformers

...

A sample transformer entry in a pipeline


A Serializer is used to render an input XML structure into some other format (not necessarily XML).

Different types of serializers:

  • HTML Serializer
  • FOP Serializer
  • Text Serializer
  • XML Serializer
  • ...

More serializers can be added without modifying Cocoon.
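
As an illustration of rendering into a non-XML format, the sketch below consumes SAX events and keeps only the character data, which is roughly the job of a text serializer. It uses the JDK's SAX parser to produce the events; the class TextSerializerSketch is hypothetical and does not extend any Cocoon serializer class.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical text serializer: keeps character data, drops all markup. */
final class TextSerializerSketch extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Character data may arrive in several chunks, so append each one.
        text.append(ch, start, length);
    }

    /** Parses the XML to obtain SAX events and renders them as plain text. */
    static String serialize(String xml) {
        TextSerializerSketch handler = new TextSerializerSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.text.toString();
    }

    public static void main(String[] args) {
        System.out.println(serialize("<p>Hello <b>world</b></p>"));
    }
}
```

In Cocoon the events would come from the preceding pipeline component rather than from a parser; the parser is used here only to keep the example self-contained.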

Sitemap entries for different types of serializers

...

A sample serializer entry in a pipeline


The sitemap configuration allows dynamic setup of processing pipelines, each consisting of a generator, multiple transformers and a serializer.

Requests are dispatched to a pipeline based on the request URI and the pipeline's matching pattern (either with wildcards or as a regexp).

The pipeline is set up in the generated file sitemap_xmap.java. This file is regenerated (possibly asynchronously) every time sitemap.xmap is modified.

Logicsheets are XSL files with an associated namespace.

They are the primary mechanism for adding program logic (code) to XSPs.

They need to be registered in the configuration file cocoon.xconf.

The generator uses logicsheets to transform the XML structure before generating the program.

Cocoon comes with a number of built-in logicsheets:

  • request.xsl
  • response.xsl
  • session.xsl
  • cookie.xsl
  • esql.xsl
  • log.xsl
  • ...

Log.xsl structure

    ... variable and xsp:logic statements ...
    if (getLogger() != null) getLogger().debug("");
    ...

A sample use

    Test Message

Cocoon is highly configurable. Assuming Cocoon is deployed as a servlet in a servlet container, the main configuration files are as follows (directory locations assume the Tomcat servlet container):

  • sitemap.xmap: the sitemap file. By default, located in the $TOMCAT_HOME/webapps/cocoon directory.
  • cocoon.xconf: configuration file containing the logicsheet registrations. It also specifies the sitemap.xmap location and other such parameters. By default, located in the $TOMCAT_HOME/webapps/cocoon directory.
  • web.xml: servlet deployment descriptor. It specifies the location of cocoon.xconf, the log file location and other such parameters. Located in the $TOMCAT_HOME/webapps/cocoon/WEB-INF directory.
  • cocoon.roles: mapping file between the names of core Cocoon components and their implementation classes. For example, if you want to use a parser other than the default one, you need to modify this file.

Cocoon produces execution log entries for debugging/auditing.

  • The amount of data logged is controlled by the log-level parameter in the web.xml file. The default is DEBUG (maximum data).
  • By default, the log file is: $TOMCAT_HOME/webapps/cocoon/WEB-INF/logs/cocoon.log.

Cocoon keeps the generated .java files in a directory tree starting at (by default):
$TOMCAT_HOME/webapps/work/localhost_8080%2Fcocoon/org/apache/cocoon/www.

You can find sitemap_xmap.java here.

Files created by LogTransformer are kept (by default) in the $TOMCAT_HOME directory.

Download Tomcat from the Apache site.

Download the Cocoon sources from Apache CVS. [Commands assume a UNIX Bourne shell]

Build the sources as per the instructions in the Install file.

Move the cocoon.war file to the $TOMCAT_HOME/webapps directory.

Start the servlet engine. Type the URL http://localhost:8080/cocoon into your browser. You should see the Cocoon welcome message.

Consult the Install file if you face problems.