How projects-old.apache.org Works
=================================

The aim is to provide a framework that allows new indexes and
projects to be added with a minimum of effort. Once they are added,
the entire site should reflect the addition, with all the required
links and pages created correctly.

Basic Process
-------------

The basic process, outlined below, may initially seem overly
complicated, but it provides a very flexible framework that can be
easily extended.

The aim of the process is to take a set of DOAP files, each
representing a project within the ASF, and produce an xdoc per project
and an xdoc per index. The actual conversion of these xdocs into html
is done via the standard xdocs2 build process.

The process is as follows.

1. Parse files.xml. For each 'location' URL found, attempt to fetch the
file into a temporary file (doap_temp.rdf). We then extract the
project name from this temporary file and run xmllint on it, storing
the output under a filename built from the project name. This allows
projects to use common filenames without collisions being an issue.

2. Read each index to build the transformations required. At the end
of this step we will have an XSL stylesheet (transform.xsl) that
describes the information the indexes require to be extracted from
every DOAP file. We do it this way to allow indexes to be
added/removed without requiring any templates to be altered.

3. Process the DOAP files. Each DOAP file is transformed via
'doap2xdocs.xsl' into an xdoc and is also processed via
'build_indexes.xsl' to extract the index information into
'data_indexes.xml' which is used in step #4.

4. Process each index into an xdoc. This is a two-step affair because
the instructions for an index can specify both the template to use and
many settings for the generated layout, so a stylesheet tailored to
the index has to be generated first. The first step is therefore to
generate a temporary stylesheet by processing the index instructions
via the template (either the one specified in the settings or the
default). The second step uses that temporary stylesheet to generate
the xdocs for the index. The temporary stylesheet is then deleted,
provided the transformation is successful.

All of the above is done in the correct order by the projects.pl
script, found in the top level directory.
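
Step 1 can be illustrated with a minimal Python sketch. It is not the
code used by projects.pl; it only shows the idea of naming each stored
file after the project rather than after its source filename. The
helper names and the exact naming rule in output_filename are
assumptions made for illustration. The namespace is the standard DOAP
one.

```python
import re
import xml.etree.ElementTree as ET

DOAP_NS = "http://usefulinc.com/ns/doap#"  # standard DOAP namespace


def project_name(doap_text):
    """Return the text of the doap:name child of the first doap:Project."""
    root = ET.fromstring(doap_text)
    # The Project element may be the root or nested inside rdf:RDF.
    if root.tag == f"{{{DOAP_NS}}}Project":
        project = root
    else:
        project = root.find(f".//{{{DOAP_NS}}}Project")
    if project is None:
        raise ValueError("no doap:Project element found")
    name = project.findtext(f"{{{DOAP_NS}}}name")
    if not name or not name.strip():
        raise ValueError("doap:Project has no name")
    return name.strip()


def output_filename(name):
    """Hypothetical naming rule: lower-case the name and replace runs of
    other characters with '_'. The real script may use a different rule."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"{slug}.rdf"
```

Because the filename depends only on the project name, two projects
that both publish a file called doap.rdf still end up in separate
files.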

Requirements
------------

In order to generate the HTML pages you will need the following
installed or checked out on your system:

 - Perl (and .pl scripts need to be executable)

 - https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects

 - https://svn.apache.org/repos/infra/infrastructure/trunk/tools

 - one of: xalan, msxsl, xsltproc must be on the path
 
Generating HTML
---------------

Once you have both repositories checked out, you simply run
the xdocs2 build script. Presently the only choice is the Perl script,
build.pl. This can be run directly, or via the Ant build.xml script,
which sets up the necessary environment.

Xdocs2 is a generic XML to HTML generator which takes as its
argument the configuration file used to control the
transformation. This file is called projects.xml for
projects-old.apache.org. The command would therefore be

  <path to tools repository>/xdocs2/scripts/build.pl <path to projects repository>/projects.xml

If all is well this will result in the html being generated in the
html directory of the projects repository.
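
As a rough pre-flight check before running the build, the following
Python sketch looks for one of the XSLT processors named in the
requirements and assembles the command line shown above. The function
names are hypothetical, and it is an assumption that the build locates
the processor under exactly these executable names.

```python
import shutil

# Processor names taken from the requirements list.
XSLT_PROCESSORS = ("xalan", "msxsl", "xsltproc")


def find_xslt_processor(which=shutil.which):
    """Return the first XSLT processor found on the PATH, or None."""
    for name in XSLT_PROCESSORS:
        if which(name):
            return name
    return None


def build_command(tools_dir, projects_dir):
    """Assemble the build.pl invocation as an argument list."""
    return [
        f"{tools_dir}/xdocs2/scripts/build.pl",
        f"{projects_dir}/projects.xml",
    ]
```

The list returned by build_command can be passed to
subprocess.run(cmd, check=True) once find_xslt_processor has returned
a name.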

Index Data
----------

The index data is gathered into a single file that is then fed into
every single index stylesheet, thus all the index data is available to
every index. Every "index item" has (at a minimum) a 'name' and a
'link' element. The 'link' element will always be the same as the
'name' supplied for the project within the index data and is used to
allow the project element to be easily found. The actual data used by
the index is contained within the 'name' element.

A working knowledge of XSLT will be useful when adding new indexes.

The 'project/id' element is the name that is used throughout the
process to identify the project. The DOAP, xdocs and HTML filenames
will all be based on this value. For example, if the id is 'apr' then
the generated xdocs filename will be apr.xml and the final output HTML
file will be apr.html.

<indexData>
  <project>
    <id>ecs</id>
    <name>Apache ECS</name>
    <link>/projects/ecs.html</link>
    <desc>API for generating elements for various markup languages</desc>
  </project>
  <alpha>
    <name>E</name>
    <link>Apache ECS</link>
  </alpha>
  <language>
    <name>Java</name>
    <link>Apache ECS</link>
  </language>
  <category>
    <name>library</name>
    <link>Apache ECS</link>
  </category>
  <release>
    <name>2003-07-10:1.4.2</name>
    <link>Apache ECS</link>
    <created>2003-07-10</created>
    <releaseName>ecs</releaseName>
    <version>1.4.2</version>
  </release>
</indexData>

Node       Description
---------- -------------------------------------------------------------------
name       data to be indexed
link       name of project that the data relates to
id         short name for project (only valid in project nodes)
desc       description of project (only valid in project nodes)
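
To make the relationship between 'name' and 'link' concrete, here is a
small Python sketch that reads an indexData document laid out like the
sample above, groups the index items by index type, and resolves an
item's 'link' back to its project. The real process does this in XSLT;
the function names here are hypothetical and the layout is assumed to
match the sample.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def load_index_data(xml_text):
    """Split an indexData document into projects and index items.

    Returns (projects, items): projects maps project name -> dict of its
    child elements; items maps index type (the element name, such as
    'alpha') -> list of (name, link) pairs.
    """
    root = ET.fromstring(xml_text)
    projects = {}
    items = defaultdict(list)
    for node in root:
        fields = {child.tag: (child.text or "").strip() for child in node}
        if node.tag == "project":
            projects[fields["name"]] = fields
        else:
            items[node.tag].append((fields["name"], fields["link"]))
    return projects, dict(items)


def project_for(projects, link):
    """Resolve an index item's 'link' to the project record it names."""
    return projects[link]
```

With the sample data, the 'alpha' item named 'E' carries the link
'Apache ECS', which resolves to the project whose id is 'ecs'.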

Indexes
-------

Each index is specified in a single file within the 'indexes'
directory. The file contains all the settings for the index, including
the data to be extracted from the DOAP files and the text to appear on
the finalised page. The files are simple XML.

Details of the index configuration file can be found in the README in
the indexes directory.

There are only a limited number of index formats available at present,
but adding additional formats should be straightforward. All formats
have the filename 'index_xxx.xsl' and are found within the templates
directory.

These format files are identified by the index configuration files
(using the 'template' element in the output section) and they are used
to generate an index specific temporary stylesheet that is used by the
master file.

The master file is 'index2xdocs.xsl' and contains all the necessary
logic to produce the xdocs for the index. It uses the file
'projects.xsl' for a number of utility templates, though these may be
merged with the main file at some point. As there are so many
variables that can be set for each index, the master file is also
processed to create a temporary stylesheet specific to the index. One
parameter that is passed at this stage is the filename of the
temporary file that is being used to store the generated data from the
format file. This file is then included in the final step.

The final step is to generate the xdoc document using the temporary
files and the index data.

Where to find the files
=======================
Apart from the files in this directory tree, the process also makes use of XML/XSLT files under:

https://svn.apache.org/repos/infra/infrastructure/trunk/tools/xdocs2

It also uses Perl modules stored in
https://svn.apache.org/repos/infra/infrastructure/trunk/tools/support

Automated generation of files
=============================
The files are regenerated every few hours by a cron job running under apsite on minotaur.

The cron job runs the following file:

~apsite/wrkdir/bin/build.sh

which is stored in SVN at:

https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/bin/build.sh