DocBook Filters - Read and write DocBook XML using OpenOffice.org
The goal is to explore the possibility of using OpenOffice.org as a WYSIWYG editor for XML content. The principle is to edit structured documents using styles. These styles are then transformed to XML tags on export.
This page shows how to enable and use the DocBook filters. It also shows the location of the stylesheets so that users can download and use the latest transformations.
Initially the project used OpenOffice.org sections to enforce nesting of DocBook sections. Feedback has shown that authors wish to use the common word processing styles such as Heading1, Heading2, etc. The stylesheets and templates shipped with OOo1.1 Beta use OpenOffice.org sections. Links are available below for stylesheets which use OOo headings and instructions on how to install these stylesheets. Stylesheets currently only support DocBook Articles. Book and Chapter support is planned.
Currently the stylesheets are packaged in a JAR. The content of the JAR is explained later in this document. It is planned that a future OOo release will be able to use the import and export XSLTs directly.
To see what is available and get an impression of how it works, look at:
- A Sample DocBook Document
- The corresponding Writer (.sxw) file after importing the docbook file.
Note : Nested sections are supported. See the applied or custom styles in the Stylist for the DocBook tags that are supported.
Note : See also Eric Bellot's OOo2sdbk
How to Enable DocBook in OpenOffice.org 1.1 Beta
The DocBook filters are installed during OpenOffice.org setup by selecting "Custom Installation" and in the "Optional Components" section, selecting the "Mobile Device Filters".
To enable the DocBook filter, the following file must be edited:
<OOo Install Dir>/share/registry/data/org/openoffice/Office/TypeDetection.xcu
- Search for the line <node oor:name="DocBook File" oor:op="replace">
- Before the corresponding </node> tag add the following lines:
    <prop oor:name="Installed" oor:type="xs:boolean">
      <value>true</value>
    </prop>
- Similarly for XHTML except search for the line <node oor:name="XHTML File" oor:op="replace">
- Similarly for Flat XML File except search for the line <node oor:name="Flat XML File" oor:op="replace">
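After editing the file, it can help to confirm that the property landed inside the intended node. The following is a minimal sketch of such a check, not part of OpenOffice.org: has_installed_flag is a hypothetical helper name, and the sketch assumes that each opening <node ...> tag and each <prop ...> tag sits on a single line of the file.

```shell
#!/bin/sh
# Sketch: succeed if the named filter node contains the Installed=true property.
# has_installed_flag is a hypothetical helper, not an OpenOffice.org tool.
has_installed_flag() {
    xcu_file=$1     # path to TypeDetection.xcu
    node_name=$2    # e.g. "DocBook File"
    awk -v name="$node_name" '
        # Start tracking at the opening tag of the requested filter node.
        !inside && index($0, "<node oor:name=\"" name "\"") { inside = 1; depth = 0 }
        inside {
            line = $0
            if (index(line, "oor:name=\"Installed\"")) expecting = 1
            # The first <value> after the Installed prop decides the result.
            if (expecting && index(line, "<value>")) {
                found = (index(line, "<value>true</value>") > 0)
                expecting = 0
            }
            # Track nesting so that scanning stops at the matching </node>.
            depth += gsub(/<node[ >]/, "", line)
            depth -= gsub(/<\/node>/, "", line)
            if (depth <= 0) { inside = 0; expecting = 0 }
        }
        END { exit (found ? 0 : 1) }
    ' "$xcu_file"
}
```

For example, has_installed_flag TypeDetection.xcu "DocBook File" && echo enabled prints "enabled" only when the DocBook node carries the property. The same call with "XHTML File" or "Flat XML File" checks the other two filters.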
Once the filter has been enabled, run OOo to see DocBook available in the OpenOffice.org "File of Type" combo-box in both the Open and Save As dialogs. Hint : Type D in the "File of Type" combo-box, possibly more than once.
NOTE : In order for the Java based filters to work correctly, a Java Runtime Environment needs to be specified during setup. JRE1.4 or greater is recommended as it contains an XML parser (Crimson) and an XSLT processor (Xalan). JRE1.3 can also be used, if a parser, e.g. Xerces or Crimson, and the Xalan XSLT processor are made available.
Using OpenOffice.org to create and edit DocBook XML
Creating a DocBook Template
It is now possible to use a predefined template to supply the DocBook styles in OpenOffice.org. To do this:
- Download the associated template
- Open the template in OOo
- Press F11 and choose 'All Styles', so that all available DocBook styles are displayed.
Review the UserGuide for information on using the filter.
Sections/Headings
How to change the Stylesheet
The docbook filter uses the XMerge framework's XSLT processing functionality. The docbook.jar contains a set of two XSLT style-sheets, one for transforming from docbook to OpenOffice and one for transforming from OpenOffice to docbook.
The file also contains a converter.xml file in the META-INF directory that contains information describing the supported mime-types, the style-sheet names and the XMerge plugin that it uses.
To make changes:
- Create a temp directory.
- mkdir temp
- Unpack the jar to the temp directory:
- (cd temp && jar -xvf ../docbook.jar)
- This will produce the following files in the temp directory:
- META-INF/MANIFEST.MF
- META-INF/converter.xml
- sofftodocbook.xsl
- docbooktosoff.xsl
The two stylesheets can now be edited as required, or replaced with the latest copies downloaded from here.
Repacking the jar:
- cd temp
- jar -cvf ../docbook.jar *
- Copy the jar to <OOo installation>/program/class directory
- Restart OpenOffice.org
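The unpack, replace and repack steps above can be sketched as one shell function. This is only an illustration under stated assumptions: update_docbook_jar is a hypothetical helper name, the JDK jar tool is assumed to be on the PATH, and the replacement stylesheets are assumed to sit in a directory of their own as *.xsl files. The function keeps a backup copy of the original jar next to it.

```shell
#!/bin/sh
# Sketch only: update_docbook_jar is a hypothetical helper, not part of OOo.
# It mirrors the manual steps: unpack, swap in new stylesheets, repack.
update_docbook_jar() {
    jar_file=$1      # path to the existing docbook.jar
    xsl_dir=$2       # directory holding the replacement *.xsl files

    # Resolve absolute paths, because the work happens in a temp directory.
    jar_dir=$(cd "$(dirname "$jar_file")" && pwd) || return 1
    jar_abs=$jar_dir/$(basename "$jar_file")
    xsl_abs=$(cd "$xsl_dir" && pwd) || return 1
    work_dir=$(mktemp -d) || return 1

    cp "$jar_abs" "$jar_abs.bak" || return 1    # keep a backup of the original

    (
        cd "$work_dir" || exit 1
        jar -xf "$jar_abs" || exit 1            # unpack into the temp directory
        cp "$xsl_abs"/*.xsl . || exit 1         # overwrite the stylesheets
        jar -cf "$jar_abs" * || exit 1          # repack everything
    )
    status=$?
    rm -rf "$work_dir"
    return "$status"
}
```

A call such as update_docbook_jar ./docbook.jar ./new-stylesheets would rebuild the jar in place; copying it to the OOo installation and restarting OpenOffice.org remain manual steps.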
For example, to use the new XSLT stylesheets, replace the stylesheets in temp with the downloaded versions. The converter.xml file contains information about the file conversions that the filter supports. This information is used by XMerge when a conversion has been requested.
    <converters>
      <converter type="staroffice/sxw" version="1.0">
        <converter-display-name>XSLT Transformation sxw</converter-display-name>
        <converter-description>Converter which performs xslt transformations</converter-description>
        <converter-vendor>OpenOffice.org</converter-vendor>
        <converter-class-impl>org.openoffice.xmerge.converter.xml.xslt.PluginFactoryImpl</converter-class-impl>
        <converter-xslt-serialize>sofftodocbookheadings.xsl</converter-xslt-serialize>
        <converter-xslt-deserialize>docbooktosoffheadings.xsl</converter-xslt-deserialize>
        <converter-target type="application/x-docbook" />
      </converter>
    </converters>
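Because converter.xml names the stylesheets that XMerge loads, a renamed or missing stylesheet file is an easy mistake to make when repacking. The sketch below checks an unpacked jar directory for the two files named in converter.xml. Both function names are hypothetical, the check assumes the file declares a single converter, and it uses plain text matching rather than a real XML parser.

```shell
#!/bin/sh
# Sketch: both helpers are hypothetical names, not part of XMerge or OOo.

# Print the text content of a simple element such as <name>text</name>.
xml_element_text() {
    # $1 = XML file, $2 = element name
    tr -d '\r\n' < "$1" |
        sed -n "s|.*<$2>[[:space:]]*\([^<]*[^<[:space:]]\)[[:space:]]*</$2>.*|\1|p"
}

# Check that the stylesheets named in META-INF/converter.xml exist in the
# unpacked jar directory. Returns non-zero if any of them is missing.
check_converter_stylesheets() {
    dir=$1
    missing=0
    for element in converter-xslt-serialize converter-xslt-deserialize; do
        name=$(xml_element_text "$dir/META-INF/converter.xml" "$element")
        if [ -n "$name" ] && [ -f "$dir/$name" ]; then
            echo "ok: $element -> $name"
        else
            echo "missing: $element -> ${name:-not declared}"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_converter_stylesheets temp before repacking reports each declared stylesheet and whether the file is present in the temp directory.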
Currently Supported Tags
Click here to obtain a list of currently supported DocBook tags.
ToDo
- Increase coverage of supported DocBook tags
- XML Entity support. Entity references are lost currently. It may be possible to preserve them by treating them as fields.
- ArticleInfo - The initial plan was to use document properties, but there are too many tags, so this will probably become another section.
- Images. (We have placeholders for required information).
- Hyperlinks.
- Chapters
- References
Limitations
These are limitations which should be highlighted but are not blockers. In fact, sufficient interest in this project should drive requirements for enhancements.
- OOo gives the user no guidance on which styles (tags) to use when. Authors must create and follow guidelines.
- No validation for export.
Open Issues
- Nested tags in text spans are not supported, e.g.
    <menuchoice><guimenu>File</guimenu><guimenuitem>New</guimenuitem></menuchoice>
  as in "Use File -> New".
- Importing comments (the parser ignores them; it may be necessary to use a comment tag).
- Nested lists are split into three separate lists in SO and have to be exported as three separate lists. This breaks the round trip.
- Styles are defined in the import stylesheet.
- Would like to be able to import into an existing template.