Apache Forrest 1.0 - Working Draft

Abstract

Apache Forrest 1.0 is the first specification of Apache Forrest (the product), intended to be the reference point for developers and users on the road to making a first stable Apache Forrest release.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is under development, and will not be considered final until Apache Forrest 1.0 is released.

Warning: This document is under development. It intends to describe the internal machinery that is planned for the near future. It does not represent the current state.

This is the First Public Working Draft of the Forrest 1.0 specification. This document has been produced by the Apache Forrest Project developer community as part of the Apache Forrest Project. The authors of this document are the Apache Forrest Project participants.

Publication as a Working Draft does not imply endorsement by the Apache Forrest Project Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Comments on this document are welcome. Please send issues to the public developers mailing list. It is appropriate to send discussion email to this address. Please note that any comments you make will be publicly archived and available, so do not send information that you would not want to see distributed, such as private data.

1. Introduction

1.1 Reading the Specification

The specification has been written with various modes of presentation in mind. In case of a discrepancy, the online electronic version is considered the authoritative version of the document.

This document uses the terms must, must not, required, shall, shall not, recommended, should, should not, may, and optional in accord with [RFC2119].

1.2 How the Specification is Organized

Chapters are arranged by topic.

1.3 Documentation Conventions

Throughout this document, the following namespace prefixes and corresponding namespace identifiers are used:

xhtml2: The XHTML 2.0 namespace
my: Any user defined namespace

This is only a convention; any namespace prefix may be used in practice.

The following typographical conventions are used to present technical material in this document.

Examples are set off typographically:

Example: Example item

Example Item

References to external documents appear as follows: [Sample Reference] with links to the references section of this document.

Sample Reference: Reference - linked to from above.

The following typesetting convention is used for non-normative commentary:

Note:

A gentle explanation to readers.

Editorial note: Editorial Note Name
Editorial commentary, not intended for final publication.

Issue (sample-implementation-issue):

Issue-Name

A specific issue for which input from implementors is requested, for example as part of the Candidate Recommendation phase.

Resolution:

None recorded.

Editorial note: TODOs
Complete the other sections below: Handling raw unprocessed files, Internationalisation, Configuration files, Project directory tree.
Review the "Processing pipeline" section.
Commence the "Processing overview" section.
Review the "Forrest:templates" section.
Define the schema for config files.
Don't number the sections below until later.

Apache Forrest Core

Processing pipeline

This is the conceptual processing pipeline offered by Apache Forrest.

All references to actual source directories and to internal or external formats are generic; the specifics are defined elsewhere in this spec.

Step 1: Resolver (content)

Forrest has a single conceptual source space that can initially be thought of as a single directory: the current xdocs directory. Every file that is outside of this directory needs to be resolved by a locationmap, so that Forrest always sees the source space as a single directory.

This source space contains files that have a filename and an extension. There should be only one file with a given name in a given directory; that file is the main source of the transformation.

For each URL that is requested, there shall be only one source file resolved, which will be our main source.
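The one-source-per-URL rule above can be illustrated with a minimal Python sketch. It is not Forrest's implementation: the function names `locate` and `resolve`, the dictionary-based locationmap, and the `xdocs` default root are all assumptions made for illustration, and the real locationmap syntax is not defined in this draft.

```python
from pathlib import PurePosixPath


def locate(source_path, locationmap, default_root="xdocs"):
    """Map a path in the conceptual source space to a physical location.

    locationmap is a hypothetical dict of {path prefix: physical root}.
    Paths without a matching prefix live in the default directory.
    """
    for prefix, root in locationmap.items():
        if source_path.startswith(prefix):
            return root + source_path[len(prefix):]
    return f"{default_root}/{source_path}"


def resolve(request_path, source_files):
    """Return the single main source for a requested URL path.

    source_files lists paths relative to the source space. A LookupError
    is raised when zero or more than one file matches, which enforces
    the "only one source file resolved" rule.
    """
    requested = PurePosixPath(request_path)
    stem = requested.name.split(".")[0]  # name without type/format parts
    matches = [
        f for f in source_files
        if PurePosixPath(f).parent == requested.parent
        and PurePosixPath(f).name.split(".")[0] == stem
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one source for {request_path!r}, "
            f"found {len(matches)}"
        )
    return matches[0]
```

In this sketch, a directory that holds both `index.xml` and `index.txt` is treated as an error rather than silently picking one of them.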

Step 2: Xifier (content)

Transform the main source to the intermediate format, XHTML2.

The various input formats are handled by specific input plugins, all transforming to XHTML2. Input plugins for HTML and XHTML are available by default.

This step also has an opportunity to aggregate data from other sources and to do various pre-processing.

Multiple formats can be requested of the same source. The requested filename takes the following form:

name.type.format

Example:

myfile.content.html 
myfile.javadocs.html 
myfile.html
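A minimal Python sketch of how a request following the name.type.format pattern might be split into its parts. The `RequestName` type and `parse_request` function are hypothetical names, and the sketch assumes that the name itself contains no dots, which this draft does not state.

```python
from typing import NamedTuple, Optional


class RequestName(NamedTuple):
    name: str
    type: Optional[str]  # None when the request omits the type part
    format: str


def parse_request(filename):
    """Split 'name.type.format' or 'name.format' into its parts."""
    parts = filename.split(".")
    if len(parts) == 2:
        name, fmt = parts
        return RequestName(name, None, fmt)
    if len(parts) == 3:
        return RequestName(*parts)
    raise ValueError(f"cannot parse request filename: {filename!r}")
```

With the examples above, `parse_request("myfile.javadocs.html")` yields the type `javadocs`, while `parse_request("myfile.html")` yields no type at all.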

Step 3: Filter (content)

Adds navigation, metadata, extra content, functionality, and transformations to the content stream.

The filtering stages should use different filtering files, so as to not produce markup that is not needed by the view.

Navigation is the addition of the 'tab' and 'linkmap' information to the stream.

Metadata about a page can be added, like the date, page size, etc.

Nuggets of information can be added, based on the URL and on the contents of the main source. For example, newsfeeds about the page being processed.

Structure is being built, nuggets are being placed, and hooks are being provided.

Fbits (defined below) should be inserted only as placeholders that the view can populate with the actual functionality.

Filtering on the main content can be done, such as automatic creation of links based on the site.xml linkmap; or footnote processing.

These filtering steps are done by plugins (filtering plugins).
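The filtering stage described above can be pictured as a chain of small functions, each adding one kind of information to the content stream. The following Python sketch is only an illustration under stated assumptions: the stream is modelled as a plain dictionary rather than XML, and the filter names, dictionary keys, and placeholder values are all hypothetical.

```python
from functools import reduce


def navigation_filter(doc):
    # Placeholder data; a real filter would read the tabs and the linkmap.
    return {**doc, "navigation": {"tabs": [], "linkmap": []}}


def metadata_filter(doc):
    # Adds page metadata, here only the size of the content in characters.
    return {**doc, "metadata": {"size": len(doc["content"])}}


def fbit_placeholder_filter(doc):
    # Inserts an fbit as a placeholder only; the view populates it later.
    return {**doc, "fbits": doc.get("fbits", []) + [{"name": "searchbox"}]}


def apply_filters(doc, filters):
    """Run the document through each filtering plugin in order."""
    return reduce(lambda current, f: f(current), filters, doc)
```

Each filter returns a new dictionary instead of modifying its input, so a view that does not need a given filter can simply leave it out of the list.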

Step 4: Windower (presentation)

A particular window defines the visual organisation of the xml stream. The actual look-and-feel of that window is dependent on a subsequent theme.

At this stage, output plugins start to operate.

Based on the view specified, the content is transformed into a format that contains presentation information. Example formats are html, fo (formatting objects) and svg.

Note that this part adds functionality implementation to the content. For example, a search item can be displayed, or widgets can be used. These are fbits, or functionality bits, and are different from nuggets, which are extra content.

Note that fbits are view-dependent, so that a view can decide to implement them or not.

Examples of current fbits are the search pane, the page format selector, etc.

The configuration of these bits is done with the new generic skin-configuration format and a new templating language.
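Because fbits are view-dependent, a view can be thought of as a table of the fbits it chooses to implement. A minimal Python sketch, assuming placeholders are dictionaries with a `name` key and that unimplemented fbits are silently skipped; neither assumption comes from this draft, and `render_fbits` is a hypothetical name.

```python
def render_fbits(placeholders, implementations):
    """Populate fbit placeholders with the functionality a view provides.

    implementations maps an fbit name to a callable that renders it.
    Placeholders without an implementation in this view are skipped.
    """
    rendered = []
    for placeholder in placeholders:
        renderer = implementations.get(placeholder["name"])
        if renderer is not None:
            rendered.append(renderer(placeholder))
    return rendered


# A hypothetical html view that implements only the search pane.
html_view = {"searchbox": lambda fbit: "<form class='search'/>"}
```

Here an html view renders the search pane but ignores a `fontsize` placeholder, while another view could supply both or neither.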

Step 5: Themer (presentation)

The structure and content are now ready for a theme to be applied. Theming adds colors and general appearance information. For html, for example, this is the css files or the color information from the skin configuration.

Themes are view-dependent, but one theme can be written for many view derivatives.

Step 6: Serializer (presentation)

The presentation is transformed to the actual final format with output plugins. For example, an fo presentation can be output as xhtml, pdf, rtf, doc, ps, etc.

Processing overview

Editorial note: overview

Commence this section. Describe how the whole thing operates and define the terminology of each part.

See the various emails on the forrest dev mailing list, especially the thread "Re: Defining Views Terminology".

The source for the main content is located via the Resolver and the XML stream is generated. The type of input is detected and a certain input plugin handles the source, transforming it to XHTML2. This Xifier step builds the initial document structure and provides the initial content nuggets. The Forrest:view defines additional content, functionality, and style for the request. The Forrest core defines a default view for each output format (xhtml2, fo, etc.). A project can override those contracts or provide additional contracts, and page-specific views can also be provided.
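The layering of core, project, and page-specific views can be sketched as a lookup with fallbacks. This Python sketch assumes a precedence of page-specific view, then project view, then core default; the draft says that projects can override the core and that page-specific views exist, but it does not spell out this order. The function name `select_view` and the dictionary shapes are hypothetical.

```python
def select_view(page, output_format, core_views,
                project_views=None, page_views=None):
    """Pick the view for a request, most specific first (assumed order).

    core_views and project_views map an output format to a view;
    page_views maps a (page, output format) pair to a view.
    """
    project_views = project_views or {}
    page_views = page_views or {}
    if (page, output_format) in page_views:
        return page_views[(page, output_format)]
    if output_format in project_views:
        return project_views[output_format]
    return core_views[output_format]
```

A request for a format that only the core defines falls through to the core default, so a project needs to declare only the views it wants to change.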

Forrest:templates

Editorial note: templates
Review this section.

Forrest:templates (or f:t for short) is a templating language to create views.

Definition of Forrest:View = content (nuggets) + functionality (fbit) + design (style)

Nuggets are the incoming content. They are pure items of content, without any information about functionality, style, and so on.

The fbits can (but do not need to) use nuggets to implement (or populate) the actual functionality needed in the requested view.

fbit containing nuggets (it contains, for example, captions, which are i18n nuggets):
<forrest:fbit name="fontsize"/>

pure fbit (no content, just functionality; probably very rare, since even this example tag would contain i18n text: close-schliessen-cerrar):
<forrest:fbit name="close-window"/>

pure fbit *using* nuggets (for example, profiling data for the actual view):
<forrest:fbit name="searchbox" type="sport"/>

Normally the nuggets and fbits will be implemented in an overall design. The design is a container concept: fbits and nuggets are stored in a graphical container (template + hooks). Templates can be used to create the overall design of the document in different media (xhtml, fo, ...).

This design state should only use registered contracts for fbits and nuggets but still have absolute control over the style.
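As an illustration of how a tool might read the fbit declarations shown earlier in this section, the following Python sketch lists the fbits found in a template fragment. The namespace URI, the wrapping `forrest:view` element, and the function name `list_fbits` are assumptions made for this example; this draft does not define the template syntax beyond the `forrest:fbit` examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the draft does not define one.
FORREST_NS = "urn:example:forrest-templates"

# Hypothetical template wrapping the three fbit examples from this section.
TEMPLATE = f"""
<forrest:view xmlns:forrest="{FORREST_NS}">
  <forrest:fbit name="fontsize"/>
  <forrest:fbit name="close-window"/>
  <forrest:fbit name="searchbox" type="sport"/>
</forrest:view>
"""


def list_fbits(template_xml):
    """Return the attributes of every fbit element, in document order."""
    root = ET.fromstring(template_xml.strip())
    return [dict(el.attrib) for el in root.iter(f"{{{FORREST_NS}}}fbit")]
```

A view implementation could compare such a list against its registered contracts to check that the design uses only known fbits.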

Handling raw unprocessed files

Editorial note: raw
Add content.

Internationalisation

Editorial note: inter
Add content.

Configuration files

Editorial note: config
Add content. There is one recent email thread, which Anil started following ApacheCon EU 2005. There is also the design discussion in the "events" SVN.

Project directory tree

Editorial note: project-tree
Define the project directory layout. There is one email thread in Section E and various other discussions.

A. Schemas for Forrest 1.0 config files

Editorial note: config-schema
Define RELAX NG grammars.

B. References

B.1 Normative References

[HTML]: "HTML 4.01 Specification", D. Raggett, A. Le Hors, I. Jacobs, 24 December 1999.
Latest version available at: http://www.w3.org/TR/html401
[RFC2119]: "RFC2119: Key words for use in RFCs to Indicate Requirement Levels", S. Bradner, March 1997.
Available at: http://www.ietf.org/rfc/rfc2119.txt
[RFC2396]: "RFC2396: Uniform Resource Identifiers (URI): Generic Syntax", T. Berners-Lee, R. Fielding, L. Masinter, August 1998.
This document updates RFC1738 and RFC1808.
Available at: http://www.ietf.org/rfc/rfc2396.txt
[RFC2854]: "RFC2854: The text/html Media Type", D. Connolly, L. Masinter, June 2000.
Available at: http://www.ietf.org/rfc/rfc2854.txt
[RFC3023]: "RFC3023: XML Media Types", M. Murata, S. St.Laurent, D. Kohn, January 2001.
This document obsoletes [RFC2376].
Available at: http://www.ietf.org/rfc/rfc3023.txt
[RFC3066]: "RFC3066: Tags for the Identification of Languages", H. Alvestrand, January 2001.
Available at: http://www.ietf.org/rfc/rfc3066.txt
[RFC3236]: "RFC3236: The 'application/xhtml+xml' Media Type", M. Baker, P. Stark, January 2002.
Available at: http://www.ietf.org/rfc/rfc3236.txt
[XML]: "Extensible Markup Language (XML) 1.0 Specification (Second Edition)", T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, 6 October 2000.
Latest version available at: http://www.w3.org/TR/REC-xml
[XMLNS]: "Namespaces in XML", T. Bray, D. Hollander, A. Layman, 14 January 1999.
XML namespaces provide a simple method for qualifying names used in XML documents by associating them with namespaces identified by URI.
Latest version available at: http://www.w3.org/TR/REC-xml-names
[XMLC14N]: "Canonical XML Version 1.0", J. Boyer, 15 March 2001.
This document describes a method for generating a physical representation, the canonical form, of an XML document.
Latest version available at: http://www.w3.org/TR/xml-c14n

B.2 Informative References

XHTML Modularization: Modularization of XHTML, M. Altheim, et al., 2001. W3C Recommendation available at http://www.w3.org/TR/xhtml-modularization/.
XML Events: XML Events - An events syntax for XML, Steven Pemberton, T. V. Raman, Shane P. McCarron, 2003. W3C Recommendation available at: http://www.w3.org/TR/xml-events/.

C. Changes (Non-Normative)

To be added.

D. Acknowledgements (Non-Normative)

This document was produced with the participation of current Apache Forrest Project members and of all participants in the public discussion.

E. Development Notes (Non-Normative)

Decisions and threads

[RT] RAW content

Raw content is to be obtained like this:

for single files one can ask for myfile.source.extension
for whole directories, the thing has to be declared in some sitewide metadata
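The single-file case can be sketched in Python as follows. The sketch assumes that a request for myfile.source.extension maps to the unprocessed file myfile.extension; that mapping is one reading of the note above and is not confirmed by this draft. The function name `raw_source_for` is hypothetical.

```python
def raw_source_for(request):
    """Return the raw file for 'name.source.extension', else None.

    Assumes the raw file is stored as 'name.extension'.
    """
    parts = request.split(".")
    if len(parts) == 3 and parts[1] == "source":
        return f"{parts[0]}.{parts[2]}"
    return None
```

Requests that do not carry the `source` marker return `None` here and would go through the normal processing pipeline instead.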

[RT] Directory structure and configuration

To be discussed