UIMA DocBook Framework
==========================

The DocBook tooling used to produce the UIMA DocBooks is based on work done
by the Velocity project. It started out as a framework to render
documentation for the Velocity project (http://jakarta.apache.org/velocity/)
and somehow ended up as a generic framework to render DocBook documents
using Java, driven by ant.

The Velocity developers wrote:

  While DocBook format seems to be ubiquitous these days, to my surprise
  there were not many generic frameworks around that could render all kinds
  of formats using Java and be easily customizable. Projects either use
  heavily customized and hacked style sheets or a mix of Java and other
  applications.

  Adjusting such a rendering framework to the needs of the Velocity project
  wasn't easy, so at some point, we decided to redo this (almost) from
  scratch.

License Information
===================

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.

Author information
------------------

This framework and documentation were originally written by the Jakarta
Velocity Developers. It has been extensively modified by the UIMA developers.
If you have questions, have found a bug, or have enhancements to suggest,
please contact the UIMA developers through the UIMA Development Mailing
list at uima-dev@incubator.apache.org

Why another framework for rendering docbook?
============================================

The Velocity project has used a simple HTML-based format called XDOC for its
documentation for a very long time. However, XDOC is not really popular
outside the Apache world (and not even inside it). It renders into HTML but
into no other formats (unless you count a set of alpha- and beta-level Maven
plugins), and tool support for the format is not really there.

When an XML-based format for documentation is considered, DocBook seems to
be a natural choice. So we decided to take a stab at rendering the existing
Velocity docs that are end-user specific (Users Guide, Developers Guide,
Reference and the like) through DocBook.

What we wanted was a framework that:

* Renders multiple documents into multiple formats with a uniform look,
  without having to copy a large number of stylesheets, images and other
  supporting files around.

* Uses the standard DocBook XML and XSL zip files available for download.
  Many of the open source DocBook frameworks use heavily hacked versions,
  and we want to be able to keep up with releases without having to patch
  the released files every time.

* Uses current versions of the DocBook reference files, the libraries and
  supporting tools.

* Renders all formats without connecting to the Internet. Using the Apache
  XML resolver it should be possible to use the framework completely
  standalone. See
  http://xml.apache.org/commons/components/resolver/resolver-article.html
  for an explanation.

* Has some documentation so you understand what happens when a format gets
  rendered and how. It can be customized easily (if you consider
  customizing complex XSL style sheets 'easy' :-) )

* Is 100% pure Java. No external programs are needed or called.

* Is ant-driven and platform independent.

UIMA developers added these:

* Supports XInclude, so you can break up your book into individual
  svn-controlled chapters and XInclude them from a master file into a book

* Supports the DocBook "olink" mechanism

* Supports multiple DocBooks

* Shares images among the html and htmlsingle formats (images are often the
  most memory-consuming part of the document)

What you need
=============

* A Java Runtime. All testing has been done using the Sun JSDK 1.5.0, but
  Java 1.4.2 has also been used quite extensively.

* Apache Ant version 1.6 or better. The build script uses the macrodef task,
  which was introduced in ant 1.6. Get it from http://ant.apache.org/

* If you want to render images, Java Advanced Imaging (see README.FIRST)

Everything else should be included in this package.

How it is used
==============

The documents to render are located in src/docbook. Each book is in one
subdirectory. The name of the subdirectory is up to you, but this name
appears in many other places (by convention). The name should be unique to
this installation of this package.

The subdirectory can contain one or more files, and an images subdirectory.
Image subdirectory names must be unique for this installation of this
package, because they are all put into one target directory (to permit
sharing of docbook images).

If you have more than one file, you can use a modular approach where one
file (by convention of the same name as the book subdirectory, followed by
".xml") is the "master file" and includes the docbook element, and uses
XIncludes to include other parts (typically chapters).

Parallel to the book subdirectories is another set of directories under
src/olink. These contain, for each book, the generated olink databases
needed for cross-reference linking using the DocBook olink mechanism. See
the Olink section for more details. These olink files should be saved in
the source svn tree because they are needed for rebuilding the targets.

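As an illustration of the modular layout described above, a master file
might look roughly like the following sketch. The book name "mybook", the
title and the chapter file names are hypothetical and not taken from the
UIMA sources; the document type declaration is deliberately left out here.
Only the XInclude namespace and the xi:include element are standard.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical master file: src/docbook/mybook/mybook.xml -->
<!-- (document type declaration omitted in this sketch) -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>My Book</title>

  <!-- Each chapter lives in its own file next to the master file,
       so it can be edited and versioned separately. -->
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml"/>
</book>
```

Each included file would then contain a single chapter element as its root,
so that the assembled book is a valid DocBook document.
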
If you run ant in the base directory of this distribution, it should build
a target directory in target/ for each directory in src/docbook. In each of
these directories, you'll find three outputs: the chunked html, the single
html (named book-name.html) and the pdf (named book-name.pdf).

In addition, the target directory has

  - a shared images directory (the images associated with docbook itself
    are shared among all books)
  - a css directory for each book

Zip files are not created here; a further packaging step can create these
if needed.

Running Ant from Eclipse
========================

If you right-click the build-docbook.xml file and select
"Run As" -> "Ant Build...", a dialog comes up where you can add properties.
You can use this to add or edit the "book_name" property and set it to,
e.g., tools; this would build the tools book. This setting is sticky (it is
remembered by the Eclipse framework), and you can rerun it with a single
click on the run-external-tools icon.

If you run ant with no arguments in the base directory, it runs the
build.xml ant script, which builds all the UIMA documentation.

Notes
=====

* Changing the paper size

  The docbook framework renders its pages in "Letter" format (8.5 x 11
  inches). This allows printing on both Letter and A4 paper. If you want
  to format the PDF documentation for A4, you can use the 'paper.type'
  parameter:

    ant -Dpaper.type=A4

  will render the documentation in A4.

Add a new DocBook "book"
========================

Create a new subdirectory inside src/docbook. Your new docbook document
goes in there. If you need images for your document, they go into
src//images//image001.png_etc. Inside your document, they should be
referenced as '..images/etc.' because images are kept one level higher in
the target for sharing the docbook images.

In the main build.xml file, you must add a call to the framework for your
new document:

  (1)

(1) This is the subdirectory in src/docbook

If you want to do just one chapter of a book, you can. Do it like this:

When you add a new document to the framework, you must make sure that its
referenced DocBook DTD files can be resolved by the Catalog resolver.
Currently, the resolver knows about DocBook 4.4, so your document
declaration should be

If you use another doctype definition, the framework will still work, but
it will connect to the Internet to get the definition files every time you
run the build process.

How to write / edit Docbook source
==================================

If you are comfortable with Eclipse, obtain the XMLBuddy plugin for it and
use it. It integrates all the docbook elements and attributes into the
Eclipse auto-complete framework. It also includes any "entities" that you
might define or declare. The UIMA Docbook source makes extensive use of
these: an entity used this way shows an error indicator if you misspell
it, and entities also participate in the auto-complete framework.

And, when you're all done, click on the menu "XML" -> "format" and the
source is nicely formatted for you.

You can also use XMLMind (see the bottom of this document).

How it works (in depth)
=======================

Take a look at the MANIFEST file first to get an idea of what is in this
package and what the various files are supposed to do.

ant files
---------

The build.xml file contains only the driver targets for rendering the
documentation. The actual work is done through targets defined in
build-docbook.xml. This file normally should not be changed; if you have
to, please let the developers know, so we can incorporate your changes
and/or bug fixes into the main distribution.

build-docbook.xml contains three main targets: pdf, html and htmlsingle.
Each is responsible for rendering one format.

Most settings are made in the project.properties file, and there should be
no need to change these properties.

DocBook reference files
-----------------------

This framework uses the DocBook XML and XSL distribution archives without
any changes to them. The reference files are located in docbook/zip and
are expanded into the docbook/ directory before the rendering process. The
file names must be reflected in the docbook.xml.version and
docbook.xsl.version properties in the project.properties file. If you want
to use, e.g., a newer XSL version, you can put it into docbook/zip and
update project.properties to reflect the change.

XML Resolver
------------

The framework uses the Apache XML commons resolver to avoid accessing the
Internet for Catalog files. The resolver is configured through the
CatalogManager.properties and xml-catalog.xml files in the base directory
of the distribution. If you update, e.g., the Docbook XML version, you must
also update the catalog file to match the new version.

Docbook Source files
--------------------

The sources for each DocBook document to render should be in src/docbook.
Each document has its own subdirectory and gets rendered separately.
Adding a new document is described in 'Add a new DocBook "book"' above.

Stylesheets and Driver files
----------------------------

For each of the formats used by the framework, a stylesheet driver file
exists in src/styles. These files are pdf.xsl, html.xsl and htmlsingle.xsl.

The driver files are intended to reference the actual style sheet
customization and to add some framework-specific elements through
filtering. This two-step process has been chosen because html and
htmlsingle are very similar, and it makes no sense to maintain two sets of
stylesheet customizations that are virtually identical.

Before usage, these files are copied to target/tmp using an ant filter set.
This allows you to use the following replacements in the driver files:

  @file.prefix@     - Prefix for loading a file through the XSL processor.
                      Is 'file://' for Unix and 'file:///' for Windows
                      (defined in the build-docbook.xml ant file).
  @docbook.xml@     - Location of the DocBook XML files
  @docbook.xsl@     - Location of the DocBook XSL style sheets
  @olink_file@      - where generated olink db is put (usually the
                      book_name)
  @type@            - pdf, html, or htmlsingle - used to form names in
                      olink dbs
  @paper.type@      - paper size for pdf output
  @src.dir@         - Location of ${basedir}/src
  @tmp.dir@         - Location of ${basedir}/target/tmp
  @html.target.dir@ - used only for chunked pdf - the base target directory

Please note that *only* the driver file is filtered! If you have path
adjustments to make, you must do them in the driver file! Please look at
the provided driver files in src/styles to see how the filter set is used.

StyleSheet customizations
-------------------------

These customizations are located in subdirectories in src/styles.
Currently there are only two: pdf and html (html and htmlsingle use the
same set of customizations). They are referenced from the driver files as
src/styles//custom.xsl.

PDF StyleSheet information
--------------------------

In the footer, the and elements of the DocBook document are displayed.
Each document should have these fields defined.

Titlepages
----------

PDF and HTML use custom title pages. These are located in the respective
src/styles subdirectories as titlepage.xml template definitions. The build
process renders these files using the DocBook XSL template/titlepage.xsl
stylesheet into the target/tmp/ directory as -titlepage.xsl (e.g.
target/tmp/pdf-titlepage.xsl). This style sheet then must be included in
the style sheet driver file (see the driver files in src/styles, e.g.
pdf.xsl).

The current title pages reference a project logo, which must be located as
'logo.png' in the src/images directory. It gets rendered on both the HTML
and PDF title pages.

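To make the interplay of filter tokens, customization layer and generated
title page more concrete, here is a rough sketch of what an unfiltered PDF
driver file could look like. It is an assumption for illustration only, not
a copy of the pdf.xsl shipped in src/styles: the path of the customization
file and the exact set of imports are guesses. The tokens are the ones
listed under "Stylesheets and Driver files", and fo/docbook.xsl is the
standard entry point of the DocBook XSL distribution for FO output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical driver file sketch (before ant filtering). -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Stock DocBook FO stylesheet from the expanded XSL distribution. -->
  <xsl:import href="@file.prefix@@docbook.xsl@/fo/docbook.xsl"/>

  <!-- Format-specific customization layer; the path is an assumption.
       Imported later, so its templates take precedence. -->
  <xsl:import href="@file.prefix@@src.dir@/styles/pdf/custom.xsl"/>

  <!-- Title page stylesheet generated into target/tmp by the build. -->
  <xsl:include href="@file.prefix@@tmp.dir@/pdf-titlepage.xsl"/>

  <!-- Paper size passed through from the ant property. -->
  <xsl:param name="paper.type">@paper.type@</xsl:param>

</xsl:stylesheet>
```

After filtering, every @token@ is replaced by its value, so the copy in
target/tmp contains only absolute file URLs. Because only the driver file
is filtered, all path-dependent references are kept in it rather than in
the customization files.
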
HTML CSS
--------

There is support for a CSS file in the html and htmlsingle render process.
This file is defined in the HTML customization file
(src/styles/html/custom.xsl) and must be located in the src/css/html
directory. It is copied into a css subdirectory in the target and must be
referenced as css/. See the HTML customization file and build-docbook.xml
for how this is done. Currently, only a single style sheet is supported for
both html and htmlsingle.

Acknowledgements
----------------

DocBook is a fairly complex format, and using and customizing the available
XSL style sheets is not really straightforward. So by googling left and
right and looking at other open source DocBook rendering frameworks, we
tried to model similarities and sometimes actually copied some of the
ideas.

This DocBook framework is literally standing on the shoulders of other
projects, in particular:

* The DocBook Format by Norman Walsh; (C) 1999-2006 by Norman Walsh, OASIS
  and O'Reilly, especially all the documentation that is available from
  http://www.docbook.org/

* The DocBook FAQ maintained by Dave Pawson. We wouldn't have survived
  without that. (http://www.dpawson.co.uk/docbook/)

* DocBook XSL: The Complete Guide by Bob Stayton. This is an invaluable
  reference to the DocBook style sheets. Find it online at
  http://sagehill.net/ or buy the E-book.

* The DocBook Project located at http://docbook.sourceforge.net/. They
  maintain the XSL style sheets used to transform DocBook into other
  formats and also link to the docbook mailing list archives.

* The Apache XML commons resolver from
  http://xml.apache.org/commons/components/resolver/

Ideas on how to render elements, how to arrange things and how to do more
obscure things like title pages or using CSS to render HTML, I've taken
(sometimes literally by cut'n'paste) from

* The Spring Framework documentation. This is how we got hooked on the
  idea that Velocity should have DocBook documentation, too.
  Their DocBook framework is really nice; however, for my needs it proved
  to be 'not exactly what I was looking for' (see above). Spring is IMHO an
  example that good documentation makes all the difference between a
  successful and popular project and 'the others'. Thanks a lot, Spring
  guys!

* The "ant and docbook" styler suite by Dawid Weiss, available from
  http://www.cs.put.poznan.pl/dweiss/xml/projects/ant-docbook-styler/index.xml .
  We stole his CSS style sheet almost verbatim. Thanks a lot, Dawid!

* The Maven sdocbook plugin by Siegfried Goeschl, Per Olesen and Carlos
  Sanchez, available at the SourceForge Maven plugin page at
  http://maven-plugins.sourceforge.net/maven-sdocbook-plugin/

* The XMLmind XML Editor from http://www.xmlmind.com/xmleditor/
  This is a cross-platform, pure Java editor that not only runs well on
  Linux, Windows and MacOS but also offers DocBook WYSIWYG support and has
  a free version. And if you pay for it, you get the source code for it
  too.