Using DocBook Lite

Contents

1. Introduction
2. XML
3. For best results...
4. The Physical Structure of a Book
5. The Logical Structure of a Book
6. Block elements
7. Inline elements
8. Tables
9. Indexterms
10. Out-of-flow Text
11. Reference Pages
12. See Also...

1. Introduction

This document describes the use of DocBook Lite, a document type definition (DTD) that defines and shapes the markup of O'Reilly books. The DTD declares a set of elements (containers of content) and entities (stand-ins for content). Its notation restricts the kinds of elements and data that each element can hold. For example, a <chapter> can contain a <para> (paragraph), but cannot contain a <book>.

DocBook Lite is an application of the XML markup language rules. XML doesn't require that you use a DTD, but we find that enforcing structure is vital to our ability to produce and repurpose books efficiently. Originally, we used the full DocBook application maintained by the OASIS SGML/XML standards group, but have since refined it to a small subset with its own DTD. It has a few additions for some types of books, but mostly the markup should be compatible with the full DocBook.

2. XML

The eXtensible Markup Language (XML) is a specification for how markup should work in a document. Using codes embedded in text, it defines structure and properties for the parts of a document. The basic philosophy of XML is that every unique part of a document should be clearly labelled and its position should be unambiguous. If a document satisfies the minimal rules of XML, it is said to be well-formed. A document that is not well-formed has syntax or other kinds of errors that need to be fixed before the document can be processed.

The rules for a well-formed XML document differ from those of HTML pages. HTML is less strict about syntax, and most browsers will not complain about poor style or errors. XML is much less forgiving, so be warned.
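
To make the difference concrete, here is a small sketch of our own (the figure and its file name are invented for illustration). The commented-out fragment contains three mistakes that an HTML browser would probably tolerate but that an XML parser must reject. The fragment below it is the well-formed equivalent.

```xml
<!-- Not well-formed: the id value is unquoted, the empty graphic
     element is never closed, and the para has no end tag.

       <figure id=fig1>
         <title>A sample figure</title>
         <graphic fileref="figs/sample.gif">
       </figure>
       <para>Some text.
-->

<!-- Well-formed: quoted attribute values, a self-closed empty
     element, and an end tag for every start tag. -->
<figure id="fig1">
  <title>A sample figure</title>
  <graphic fileref="figs/sample.gif"/>
</figure>
<para>Some text.</para>
```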

DocBook Lite adds another level of control to the document. It restricts the kinds of elements and structures that make up a book. Specially tailored to O'Reilly's style, the DTD ensures that the book has maximum quality and information value by the time it reaches production. A document that conforms to our DTD is described as valid. A document that is not valid has incorrect markup that needs to be fixed before it can be accurately processed by our tools.
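
The distinction between well-formed and valid can be seen in a fragment like the following, which is an illustration of ours rather than markup from a real book. Every tag is closed and properly nested, so a parser accepts it as well-formed, but a validator should reject it because, as noted in the introduction, a <chapter> cannot contain a <book>.

```xml
<!-- Well-formed, but not valid against the DTD -->
<chapter id="sample-chapter">
  <title>A Sample Chapter</title>

  <!-- A book is not allowed inside a chapter -->
  <book>
    <title>A Misplaced Book</title>
  </book>
</chapter>
```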

To test the validity of a document, you need to use a program called a validator. Sometimes this is built into the text editor you're using. For example, Arbortext's Adept editor has a validation option. Other editors may not include a validating parser because they assume that any document you open is already valid and they constrain your actions so that you cannot introduce mistakes. There are also stand-alone validators that read a document and list errors for you. We use nsgmls to validate books.

nsgmls is written and maintained by James Clark and is available for free from his web page.

3. For best results...

Just because you're using XML doesn't mean the document is marked up as well as it can be. Even a valid document can have mistakes and problems that lower the quality of the book and slow down production. For example, you may use the wrong element name, which will pass the DTD test but not make sense to a human. The following code fragment shows the correct markup for a term-definition list:

<variablelist>

 <varlistentry>
  <term>monkey</term>
  <listitem><para>A cute, furry mammal that climbs in trees.</para></listitem>
 </varlistentry>

 <varlistentry>
  <term>koala</term>
  <listitem><para>A cute, furry mammal that climbs in trees.</para></listitem>
 </varlistentry>

</variablelist>

Sometimes, people choose to do this with another kind of list:

<itemizedlist>
 <listitem><para>Monkey - A cute, furry mammal that climbs in trees.</para></listitem>
 <listitem><para>Koala - A cute, furry mammal that climbs in trees.</para></listitem>
</itemizedlist>

It's easier to type it the second way, but then you lose some information, like the fact that it's supposed to be a mapping of terms to definitions. Although you may think you're saving time with this shortcut, it will add delays later when production staff have to transform the list into its proper markup.

Another common mistake authors make is to assume that presentational markup is just as good as semantic markup. In other words, saying how something looks is as good as saying what it is. This is contrary to the philosophy of XML, and also will cause problems for your book later on. For example, consider the inline markup for a Web address, or URL. In print, a URL appears in italic like this:

For more information, you really ought to check out the W3C's website at http://www.w3.org/.

The correct way to mark up this passage is like this:

<para>For more information, you <emphasis>really</emphasis> 
ought to check out the W3C's website at <systemitem
class="url">http://www.w3.org/</systemitem>.</para>

This is called semantic markup because the "really" and the URL are labelled according to their meaning, not their appearance. The next snippet shows the incorrect, presentational markup:

For more information, you <emphasis>really</emphasis>
ought to check out the W3C's website at
<emphasis>http://www.w3.org/</emphasis>.

The author thought that because <emphasis> causes its contents to be formatted in italic, it's okay to label everything that comes out in italic as a <emphasis>. So what? Well, it becomes a problem when you want to repurpose the document in HTML. Instead of the URL coming out as a hyperlink, it's merely formatted in italic. When everything is marked up presentationally, it's impossible to reuse the content in a different context.

4. The Physical Structure of a Book

By physical structure, we mean the files and directories used to contain the pieces of an XML document. An O'Reilly book is typically stored in a single directory. Each chapter, preface, and appendix exists in a separate file, and there is a "master" file that contains the top (root) element for the book. The following table lists the kinds of files and their naming conventions:

Table 1. File naming conventions

element         filename    purpose
<book>          book.xml    Contains the DOCTYPE declaration, declares local entities in the internal subset, holds metadata, contains file reference entities to chapters and other external elements.
<copyrightpg>   copy.xml    Contains the copyright page with legal info.
<preface>       ch00.xml    Contains the preface.
<chapter>       chxx.xml    Contains a chapter; xx is the number of the chapter (e.g. 01, 04, 11).
<appendix>      appx.xml    Contains an appendix; x is the letter of the appendix (e.g. a, b, c).
<bibliography>  biblio.xml  A chapter-level section containing bibliographic citations.
<glossary>      gloss.xml   A chapter-level section containing glossary definitions.
<part>          partx.xml   Contains a part (the first page only); x is the number of the part (e.g. 1, 2, 3).
<colophon>      colo.xml    A section at the end of the book describing details of the book's production.

So a typical directory listing for a book would look something like this:

$> ls /work/java/java.qref/xml

    appa.xml      ch01.xml      ch06.xml      part1.xml
    appb.xml      ch02.xml      ch07.xml      part2.xml
    appc.xml      ch03.xml      ch08.xml
    book.xml      ch04.xml      colo.xml
    ch00.xml      ch05.xml      copy.xml

There is one master file for the book (book.xml); three files for appendixes A, B, and C; one file for the preface; eight files for chapters 1-8; two files for parts I and II; and a file each for the copyright page and colophon. Note that the part files do not contain the chapters, even though the <part> elements logically contain <chapter>s.

5. The Logical Structure of a Book

Logical structure is defined by the book's markup once the files have been assembled, which is how the parser sees the document. The hierarchy of the book (chapters, sections, sub-sections, etc.) is constructed using elements called divisions. A division is a container that usually has a <title> and contains other divisions or block elements. We will look at top-level containers (<book>, <chapter>, <preface>, <appendix>, and <part>), and intermediate-level divisions (<sect1>, <sect2>, <sect3>, <sect4>, <simplesect>, and <partintro>).

5.1. ID Attributes

To facilitate cross references, you should use IDs in major hierarchical elements such as sections, chapters, and appendixes. For example, a chapter might have an ID like this:

<chapter id="intro-chapter">

Technically, the only requirement for an ID attribute is that its value be unique, since in XML no two elements can have the same ID. However, we ask that you use ID values that are easy to read and that indicate the type of element and its subject. For example, id="shapes-section" is better than id="A6-4.2".
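
As a sketch of how such IDs are used, the hypothetical fragment below gives a chapter and one of its sections readable IDs, then points to the section with an <xref> element, whose linkend attribute names the ID of its target. The titles and ID values are made up for this example.

```xml
<chapter id="shapes-chapter">
  <title>Shapes</title>

  <sect1 id="shapes-section">
    <title>Circles and Squares</title>
    <para>Blah blah blah...</para>
  </sect1>
</chapter>

<!-- Elsewhere in the book: linkend must match the target's id exactly -->
<para>Polygons are introduced in <xref linkend="shapes-section"/>.</para>
```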

When the book is in production, and the order of elements is solidified, we will sometimes run a program that inserts ID attributes where they were previously missing, or replaces those that make sense only to the author. These generated IDs follow a pattern that helps production staff trace links to their targets, even if the target ID is missing or misspelled. The following table lists these ID patterns:

Table 2. Autogenerated ID Attributes

element                        pattern                                 example
appendix                       prefix-APP-letter                       MONKEYS-APP-B
chapter                        prefix-CH-num                           MONKEYS-CH-3
preface                        prefix-PREF                             MONKEYS-PREF
part                           prefix-PART-num                         MONKEYS-PART-2
sect1 (A-head or section)      prefix-CH/APP-num/letter-SECT-num       MONKEYS-CH-3-SECT-2
sect2 (B-head or sub-section)  prefix-CH/APP-num/letter-SECT-num.num   MONKEYS-CH-3-SECT-2.4
figure                         prefix-CH/APP-num/letter-FIG-num        MONKEYS-PREF-FIG-24
table                          prefix-CH/APP-num/letter-TABLE-num      MONKEYS-APP-C-TABLE-11
example                        prefix-CH/APP-num/letter-EX-num         MONKEYS-CH-12-EX-8
indexterm                      prefix-terms                            MONKEYS-perilous-furballs

The prefix is a short code you pick to represent your book in IDs. It's not required, but we use it to eliminate any confusion when handling files separately, since there is nothing within each file to indicate which book it came from.

The number for any element type starts at 1. For elements inside chapters or appendixes, the numbering starts at the beginning of the chapter or appendix and continues to increment until its very end. The exception is sections, whose numbering resets to 1 with each parent section or chapter.
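
Our reading of these numbering rules, applied to a hypothetical chapter 3 of a book with the prefix MONKEYS, is sketched below. The titles and file names are invented, and the IDs in a real book come from the production program, so treat this as an illustration rather than its exact output.

```xml
<chapter id="MONKEYS-CH-3">
  <title>Habitats</title>

  <sect1 id="MONKEYS-CH-3-SECT-1">
    <title>Forests</title>

    <!-- First figure in the chapter -->
    <figure id="MONKEYS-CH-3-FIG-1">
      <title>A rain forest canopy</title>
      <graphic fileref="figs/canopy.gif"/>
    </figure>

    <sect2 id="MONKEYS-CH-3-SECT-1.1">
      <title>Tropical forests</title>
      <para>Blah blah blah...</para>
    </sect2>
  </sect1>

  <sect1 id="MONKEYS-CH-3-SECT-2">
    <title>Mountains</title>
    <para>Blah blah blah...</para>

    <!-- Sub-section numbering resets to 1 under the new parent section -->
    <sect2 id="MONKEYS-CH-3-SECT-2.1">
      <title>Hot springs</title>

      <!-- Figure numbering does not reset; it continues through the chapter -->
      <figure id="MONKEYS-CH-3-FIG-2">
        <title>A steaming pool</title>
        <graphic fileref="figs/pool.gif"/>
      </figure>
    </sect2>
  </sect1>
</chapter>
```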

5.2. The master book file

This file is the root of the whole book. Therefore, it contains the important XML declaration at the top, as well as a directive to declare the DTD. You would modify the second string value to the actual location of the DTD on your system. Next are entity declarations. These can only reside inside the square brackets. First are the entities that point to the files in your book. Then follow any special entities you want to define. The <bookinfo> element will later be filled with metadata, but for now it can be left empty. Finally, the entity references for book files are listed in order at the bottom.

Example 1. Contents of book.xml

<?xml version="1.0"?>
<!DOCTYPE book PUBLIC 
    "-//ORA//DBLITE 1.1//EN" "/usr/local/prod/sgml/dblite/new.dtd"
[

<!-- Declare external entities -->

<!ENTITY ch00     SYSTEM "ch00.xml">
<!ENTITY ch01     SYSTEM "ch01.xml">
<!ENTITY ch02     SYSTEM "ch02.xml">
<!ENTITY ch03     SYSTEM "ch03.xml">
<!ENTITY ch04     SYSTEM "ch04.xml">
<!ENTITY ch05     SYSTEM "ch05.xml">
<!ENTITY ch06     SYSTEM "ch06.xml">
<!ENTITY ch07     SYSTEM "ch07.xml">
<!ENTITY ch08     SYSTEM "ch08.xml">
<!ENTITY ch09     SYSTEM "ch09.xml">
<!ENTITY part1    SYSTEM "part1.xml">
<!ENTITY part2    SYSTEM "part2.xml">
<!ENTITY appa     SYSTEM "appa.xml">
<!ENTITY appb     SYSTEM "appb.xml">
<!ENTITY appc     SYSTEM "appc.xml">
<!ENTITY copy     SYSTEM "copy.xml">
<!ENTITY colo     SYSTEM "colo.xml">

<!-- Declare text entities -->

<!ENTITY ascii    "<acronym>ASCII</acronym>">
<!ENTITY html     "<acronym>HTML</acronym>">
<!ENTITY sgml     "<acronym>SGML</acronym>">
<!ENTITY xml      "<acronym>XML</acronym>">
<!ENTITY w3url    "http://www.w3.org/">

]>

<book>
  <title>Learning &xml;</title>

  <bookinfo>

      <!-- Marketing information and metadata, 
              to be filled out in production. -->

  </bookinfo>

  <!-- External entity refs -->

  &ch00;
  &copy;
  &part1;
  &part2;
  &colo;

</book>

In this example, none of the external entity references for chapters and appendixes appear in book.xml. We will see later that the chapters and appendixes are referenced only inside the files for the parts in which they belong, to maintain the proper hierarchy.

5.3. The preface

The preface is contained entirely within one file called ch00.xml. It looks like this:

Example 2. Contents of ch00.xml

<preface id="XML-PREF">
  <title>Preface</title>

          <!-- preamble (no A-head) -->

  <simplesect>
    <para>Welcome to &xml;. Put on your geek hat, twirl the
    propeller, and get ready for a wacky ride!</para>

    <para>Blah blah blah blah blah...</para>
  </simplesect>

          <!-- the first section (A-head) -->

  <sect1 id="XML-PREF-SECT-1">
    <title>Why &xml;?</title>

    <para>&xml; is a markup language development kit.  Blah blah
    blah...</para>

    <para>Blah blah blah blah blah...</para>
  </sect1>

          <!-- the second section -->

  <sect1 id="XML-PREF-SECT-2">
    <title>What's inside</title>

    <para><xref linkend="XML-PART-1"/> starts off with an
    introduction to some basic &xml; areas that any author ought to be
    familiar with. Blah blah blah...</para>
  </sect1>

          <!-- Other sections... -->

</preface>

5.4. Part Pages

A <part> is an element that contains chapters or appendixes, dividing the book into major categories. At O'Reilly, we separate chapters into different files by convention, so the part is spread across several files. The front page of the part doesn't belong in any particular chapter, so it gets its own file:

Example 3. Contents of part1.xml

<part id="XML-PART-1">
  <title>Basic Concepts</title>

  <partintro>
    <para>In this part of the book, we focus on easy material.
    Blah blah blah...</para>
  </partintro>

  <!-- External entity references -->

  &ch01;
  &ch02;
  &ch03;
  &ch04;

</part>

No entity declarations are necessary because they were made in the book.xml file and carried over. The <partintro> is an intermediate-level division that contains all elements between the <title> and the chapter-level children of the <part>.

5.5. The Chapter

Chapters follow the same basic pattern as a preface:

Example 4. Contents of ch01.xml

<chapter id="XML-CH-1">
  <title>&xml; Basics</title>

          <!-- preamble (no A-head) -->

  <simplesect>
    <para>In this chapter, we cover the
    fundamentals of markup and document structure.  Blah blah blah...</para>

    <para>Blah blah blah blah blah...</para>
  </simplesect>

          <!-- the first section (A-head) -->

  <sect1 id="XML-CH-1-SECT-1">
    <title>What is Markup?</title>

    <para>There's a lot of stuff you can
    do with markup.  Blah blah blah...</para>

    <para>Blah blah blah blah blah...</para>
  </sect1>

          <!-- the second section -->

  <sect1 id="XML-CH-1-SECT-2">
    <title>A historical perspective</title>

    <para>It's useful to see how &xml;
    fits in the long line of markup languages. Blah blah blah...</para>

          <!-- a sub-section -->

    <sect2 id="XML-CH-1-SECT-2.1">
      <title>The earliest days</title>

      <para>Back in the olden days of
      digital text, work was very hard. We had to flip knife switches
      to program computers and &ascii;
      was the only character set in town. Blah blah blah...</para>

      <para>Blah blah blah blah blah...</para>

          <!-- a sub-sub-section -->

      <sect3 id="XML-CH-1-SECT-2.1.1">
        <title>How I had to walk to work barefoot, uphill
        both ways</title>

        <para>Blah blah blah blah blah...</para>

        <figure id="XML-CH-1-FIG-1">
          <title>Picture of my blistered feet</title>
          <graphic fileref="figs/soretoes.gif"/>
        </figure>

      </sect3>
    </sect2>
  </sect1>

          <!-- Other sections... -->

</chapter>

The hierarchy of sections is important. A <sect1> contains <sect2>s, which can contain <sect3>s, etc. Also important: we now require that elements after the chapter <title>, but before the first <sect1>, be enclosed in a <simplesect>.

Appendixes are structured much like chapters, except that they are contained in an <appendix> element instead of a <chapter>, and their IDs follow the appendix pattern (prefix-APP-letter) rather than the chapter pattern.
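
A minimal sketch of an appendix file (appa.xml) follows, assuming the same XML prefix used in the chapter example and the appendix ID pattern from the table of autogenerated IDs. The titles and text are placeholders.

```xml
<appendix id="XML-APP-A">
  <title>Quick Reference</title>

          <!-- preamble (no A-head) -->

  <simplesect>
    <para>This appendix summarizes the markup covered in the book.
    Blah blah blah...</para>
  </simplesect>

          <!-- the first section (A-head) -->

  <sect1 id="XML-APP-A-SECT-1">
    <title>Element Summary</title>

    <para>Blah blah blah blah blah...</para>
  </sect1>

</appendix>
```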

5.6. Glossary

A glossary is a collection of definitions for terms used in the book. A <glossary> element surrounds the whole thing, and is usually at the same level as a chapter. Each definition is contained in a <glossentry> element with one <glossterm>, containing the term being defined, and a <glossdef>, containing the definition. The <glossdef> is optional; an entry that only redirects the reader to another term does not need one. A <glossentry> can also contain any number of <glosssee> and <glossseealso> elements, which redirect the reader's attention to another term. Stylistically, a <glossentry> should not contain both a <glossdef> and a <glosssee>, but a <glossdef> and a <glossseealso> are okay.

A glossary looks like this:

Example 5. Contents of gloss.xml

<glossary id="the-glossary">

  <!-- SECTION: XML-SPECIFIC TERMS -->

  <glossdiv>
    <title>XML Terms</title>
    <glossentry>
      <glossterm>absolute location term</glossterm>
      <glossdef>
        <para>A term that completely identifies the location of a
          resource via XPointer. A unique ID attribute assigned to an
          element can be used as an absolute location
          term.</para>
      </glossdef>
      <glossseealso>relative location term</glossseealso>
      <glossseealso>XPath</glossseealso>
      <glossseealso>XPointer</glossseealso>
    </glossentry>

    <glossentry>
      <glossterm>actuation</glossterm>
      <glossdef>
        <para>How a link in a document is triggered. For example, a
          link to an imported graphic automatically includes a graphic
          in the document, and a link to a URL resource requires a
          signal from a human.</para>
      </glossdef>
    </glossentry>
  </glossdiv>

  <!-- SECTION: OTHER TERMS -->

  <glossdiv>
    <title>Other Terms</title>
    <glossentry>
      <glossterm>albatross</glossterm>
      <glosssee>birds</glosssee>
    </glossentry>
  </glossdiv>
</glossary>

5.7. Bibliography

Coming soon...

6. Block elements

A block element is any element that starts a new line when formatted and contains inline elements. We have seen three already: <para>, <title>, and <figure>. Following is a table of block elements with examples:

Table 3. Common block elements

element purpose example
<para> Paragraph.
<para>The quick, brown fox jumped over the lazy dog. The
quick, brown fox jumped over the lazy dog. The quick, brown fox jumped
over the lazy dog.</para>
<title> Title.
<title>The Benefits of Laughter</title>
<comment>

Comment that is not meant for the audience, and outside of the flow of text. Usually, a communication between the author, editor, reviewers, copyeditors, etc.

<comment>This section is too short
and needs more examples. [Ellen]</comment>
<blockquote> Quotation.
<blockquote>
  <para>"They who can give up essential 
  liberty to purchase a little temporary 
  safety, deserve neither liberty nor safety," 
  spat Ben Franklin.</para>
  <para>"Yes," replied Mark Twain, "but
  loyalty to petrified opinion never broke a 
  chain or freed a human soul."</para>
</blockquote>
<itemizedlist>

A list of items where order doesn't matter.

<itemizedlist>
  <listitem><para>dogsled</para></listitem>
  <listitem><para>hang-glider</para></listitem>
  <listitem><para>roller blades</para></listitem>
</itemizedlist>
<orderedlist>

A list of items where order is important.

<orderedlist>
  <listitem><para>Get a bowl.</para></listitem>
  <listitem><para>Pour the cereal.</para></listitem>
  <listitem><para>Add the milk.</para></listitem>
  <listitem><para>Eat.</para></listitem>
</orderedlist>
<variablelist>

A list that contains terms and their definitions.

<variablelist>
  <varlistentry><term>Snickers</term>
  <listitem><para>Peanuts in nougat
    covered in chocolate.</para></listitem></varlistentry> 
  <varlistentry><term>Payday</term>
  <listitem><para>A peanut cluster cemented with caramel and 
    delicious sticky stuff.</para></listitem></varlistentry> 
</variablelist>
<programlisting>

A piece of computer code or example markup where whitespace and other formatting must be preserved.

<programlisting>public void init(ServletConfig config) 
                                     throws ServletException {
  super.init(config);
  String greeting = getInitParameter("greeting");
}</programlisting>
<screen>

A representation of data displayed on a computer screen. (Whitespace and other formatting are preserved.)

<screen>&gt; ls -l
total 6860
-r--r--r--   1 jliggett ora         2570 Mar 27 19:38 BOOKFILES
-rw-rw-r--   1 jliggett ora       103283 Mar 27 19:38 BOOKIDS
-rw-rw-r--   1 bsalter  ora         2692 Mar 28 14:43 Makefile
drwxrwxr-x   2 bsalter  ora          512 Aug 10 14:22 RCS/
-rw-rw-r--   1 jliggett ora           39 Jul 26 17:39 README.txt</screen>
<literallayout>

Traditional text with special linebreaks to be preserved.

<literallayout>to be yourself, in a world that 
     tries,           night and day,       to make you 
just like everybody else, is to fight 
 the greatest battle there ever is 
 to fight,                and never stop fighting   
   e. e. cummings</literallayout>
<figure>

A graphic with a title.

<figure id="FOO-APP-E-FIG-19">
  <title>The garden variety eggplant</title>
  <graphic fileref="figs/eggplant.eps"/>
</figure>
<example>

Anything that serves as an example and requires a title. (Use an <informalexample> if you don't need a title.)

<example id="BAZ-CH-14-EX-5">
  <title>Contents of the file <filename>blather.cfg</filename></title>
  <programlisting>CATS	= -c/usr/local/prod/sgml/CATALOG
DSLCAT	= -c/usr/local/sp/dsssl/catalog
DECL	= /usr/local/sp/pubtext/xml.dcl
SRCHURL = xsrch.htm
STYLE	= dbwrap.dsl
VALOPTS	= -sv -wxml</programlisting>
</example>

7. Inline elements

In contrast to block elements, inline elements do not force a line break, but coexist peacefully with their siblings inside a block element. There are two basic types: those that contain data, and those that don't. The first group is used to label one or more words as a special kind of object, or deserving of special processing. Those in the second group function as markers in the text, anchoring a cross reference or marking some other kind of positional data. The following table lists inline elements and their function.

Table 4. Inline Elements

element purpose example
<abbrev> An abbreviated term.
She's in <abbrev>bldg</abbrev> 42.
<accel> A shortcut.
Type <accel>Ctl-s</accel> to search for
a term.
<acronym>

Mark text as being an acronym.

<acronym>ASCII</acronym>
<action>

A user interface action like a mouse click.

Click <action>mouse
button-3</action> for a pop-up menu.
<application>

The name of a computer software program.

We can convert any document written in
<application>Microsoft Word</application>.
<citation>

The source of a quote or piece of information.

<citation>Bill Gates</citation> 
once said <quote>256 kB of RAM ought to be good enough
for anybody.</quote>
<citetitle>

The name of a book or article.

<citetitle>The Hobbit</citetitle>
<classname>

An identifier for a class in some programming language.

The <classname>string</classname>
class has six public methods.
<classref>

A cross reference to a class, with special formatting such as displaying the class name. (Used mainly in Java books.)

<command>

Any command one would type in a computer terminal.

To send the file to the printer, use the 
<command>lpr</command> command.
<computeroutput>

Text that would be output by a computer program.

When we run the script we get the result
<computeroutput>file not found</computeroutput>.
<email>

An email address. (Note that there may be some conflict with the <systemitem> element.)

Send questions to
<email>tools@oreilly.com</email>.
<emphasis>

Give special emphasis to a word or phrase. Usually this formats as italic, but default formatting can be overridden with a role attribute such as role="bold".

This step is <emphasis>very</emphasis> important
<envar> An environment variable.
Set the variable <envar>EDITOR</envar>
to <literal>emacs</literal>.
<filename> Tags a word as being a filename.
Be sure to read
<filename>readme.txt</filename>.
<firstterm>

The first time an important term is mentioned.

A <firstterm>squib</firstterm>
is someone born to a wizard family but who can't do magic.
<foreignphrase> Words from another language.
There's nothing wrong with borrowing code 
<foreignphrase>per se</foreignphrase>.
<footnoteref>

A marker that imports a <footnote> where there would otherwise be a redundant footnote definition.

The Eiffel Tower is huge<footnote id="ABC-CH-4-FN-2"><para>Although, compared to a breadbox, any
building is huge.</para></footnote>. So is a redwood 
tree<footnoteref linkend="ABC-CH-4-FN-2"/>.
<graphic>

An icon or picture to be imported into the document.

Examples marked with a disk icon 
<graphic fileref="figs/icon.eps"/> are on
the companion disk.
<function>

The name of a function, method, or subroutine.

The function
<function>alpha_sort</function> can be made more
efficient.
<guibutton>

A clickable control (e.g. a button) in a graphical interface.

Select the
<guibutton>print</guibutton> button to get 
hardcopy.
<guimenu>

A menu or submenu in a graphical interface.

Close the program by selecting 
<guimenuitem>exit</guimenuitem> from the
<guimenu>file</guimenu> menu.
<guimenuitem>

An item in a menu or submenu in a graphical interface.

Close the program by selecting 
<guimenuitem>exit</guimenuitem> from the
<guimenu>file</guimenu> menu.
<keycap>

A character to be represented as a key on a keyboard.

Pressing <keycap>s</keycap>
will save the buffer to a file.
<keysym>
<lineannotation>

An annotation appearing inside a <screen> or <programlisting>.

for( int $i=0; $i&lt;10; $i++ ) {
  <lineannotation>body of loop</lineannotation>
}
<literal>

A token or string that is part of a computer program or script, which should be formatted in constant width.

If the parameter's value is 
<literal>YELLOW</literal> your subroutine will
explode.
<option>

A code to apply an optional parameter to a command, program, or function.

The command synopsis is 
<command>rm <option>-i</option> *.txt</command>.
<optional>

Designates some text as an optional item.

The stylesheet specification is optional: 
<command>formatfiles <optional>stylesheet</optional> in.xml</command>.
<parameter>

The name of a parameter for a function, method, or subroutine.

In the <function>factorial</function>
function, there is only one parameter,
<parameter>num</parameter>.
<prompt>

A word meant to appear as a prompt in a computer display.

At the prompt 
<prompt>Data?</prompt>, type in your age in
hexadecimal.
<quote>

Quoted text.

Our motto is 
<foreignphrase>Caveat Emptor</foreignphrase>, 
which means <quote>we hope you like it!</quote>
<replaceable>

Marks the data as a replaceable item, a value to be filled in.

...where <replaceable>w</replaceable>
is the width.
<returnvalue>

Data that has been returned from a program or function.

The <function>reverse_string</function>
gives the value <returnvalue>tesolcmoorb</returnvalue>.
<sgmltag>

The name of an SGML or XML element.

The <sgmltag>P</sgmltag>
element adds space above and below.
<structfield>

The name of a field in a data structure.

<structfield>name</structfield>
is a fixed array of bytes.
<structname>

The name of a data structure.

To add a record, we must create a new 
<structname>PartStruct</structname>.
<subscript>

Text that should be rendered in subscript (smaller and below the baseline).

The molecule
H<subscript>2</subscript>O has many strange
properties.
<superscript>

Text that should be rendered in superscript (smaller and above the midline).

Einstein revolutionized physics with 
the simple equation E=mc<superscript>2</superscript>.
<symbol>

A special symbol or token.

The mutant gene <symbol>BLu-6</symbol>
is responsible for Smurfs' vivid azure hue.
<systemitem>

Designates data as a special item having to do with computers or networks. Most common use is to encode a URL.

<systemitem class="url">http://www.oreilly.com</systemitem>
<type>

A variable or constant data type.

The function returns a value of type 
<type>boolean</type>.
<userinput>

Text entered by a human into a computer terminal.

At the command line, type
<userinput>telnet bubba.beerguzzlin.org</userinput>
<wordasword>

A word used as an example.

By <wordasword>snake</wordasword>, 
I mean <quote>dirty, stinkin' varmint</quote>.
<xref>

A cross reference to some element in the book. The required linkend attribute contains the id of the element being referenced.

For more information, refer to
<xref linkend="XYZ-CH-4"/>.

8. Tables

The tables used in DocBook Lite are a slimmed-down version of the CALS table model, a popular markup format for tables. There are two outer elements for tables: <table>, which requires a title, and <informaltable>, which does not. These elements contain a <tgroup> element with an attribute cols that specifies the number of columns in the table.

The table has a head, body, and foot, contained in <thead>, <tbody>, and <tfoot> elements, respectively. Only the body is required. The head and body contain a set of rows, each a <row> element. The foot contains text that will appear just below the table, usually within the lines.

A <row> element contains some number of <entry>s, each corresponding to a table cell. The entry may either contain mixed content text, or a block element such as a paragraph. The following is an example of a simple titled table:

<table id="ABC-CH-1-TABLE-5">
  <title>States and Their Capitals</title>
  <tgroup cols="2">
    <thead>
      <row>
        <entry>State</entry>
        <entry>Capital</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>New York</entry>
        <entry>Albany</entry>
      </row>
      <row>
        <entry>Massachusetts</entry>
        <entry>Boston</entry>
      </row>
      <row>
        <entry>Hawaii</entry>
        <entry>Honolulu</entry>
      </row>
    </tbody>
  </tgroup>
</table>

To span a row, add to the <entry> element an attribute morerows="N", where N is the number of rows to span beyond the current row. For example, to make an entry that spans 3 rows, use morerows="2". For each of the following rows that are spanned, leave out an <entry> element, since the spanning cell will inhabit that space.
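
For instance, in the following sketch (with contents invented for illustration) the first entry spans two rows, so the second row supplies only one <entry>.

```xml
<informaltable>
  <tgroup cols="2">
    <tbody>
      <row>
        <!-- Spans this row and the one after it -->
        <entry morerows="1">New England</entry>
        <entry>Massachusetts</entry>
      </row>
      <row>
        <!-- No first entry here: the spanning cell occupies that space -->
        <entry>Vermont</entry>
      </row>
      <row>
        <entry>Pacific</entry>
        <entry>Hawaii</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
```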

To span columns, it's a bit more complicated (we didn't try to fix the weird CALS way of doing it). First, you have to name the columns. Second, you need to create named spans. Finally, you reference the spans within the table cells. Here's an example:

<informaltable>
  <tgroup cols="3">
    <colspec colnum="1" colname="c1"/>
    <colspec colnum="3" colname="c3"/>
    <spanspec spanname="span13" namest="c1" nameend="c3"/>
    <tbody>
      <row>
        <entry>A</entry>
        <entry>B</entry>
        <entry>C</entry>
      </row>
      <row>
        <entry spanname="span13">D</entry>
      </row>
      <row>
        <entry>E</entry>
        <entry>F</entry>
        <entry>G</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>

9. Indexterms

We generate indexes for books automatically, with the data originating in <indexterm> elements interspersed throughout the book. An indexterm holds all the information necessary for a single entry in an index, including the primary, secondary, and tertiary terms, references to other entries (see, see also), how to sort the term, and whether it should cross a range of pages. An indexterm typically looks like this:

<indexterm id="ixt-blather-frobozz-zmic">
  <primary>blather</primary>
  <secondary>frobozz</secondary>
  <tertiary sortas="@">zmic</tertiary>
  <seealso>fuj</seealso>
</indexterm>

The term in this example is a tertiary-level term "zmic", which appears under the secondary term "frobozz", under the primary term "blather". It will be sorted as if it began with the character "@", which will pull it to the top of the secondary term's listing. The entry will display "see also fuj". Here's how the entry might look in the finished index:

blaam 24-26
blather
  abba 12, 15
  crufty 99-105, 411
  frobozz 75
    zmic 10 (see also fuj)          <-- the term
    asca 19, 22
    gumm 82-88
    splat 470
    zingle (see grooby)
  gurgle 99, 111
bmm (see kluk)
bravy
  scoot 45-48
  yodle 91
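
The "zingle (see grooby)" line in the listing above might come from a term like the following. This is only a sketch: it assumes DocBook Lite keeps the <see> element from full DocBook, used in the same position as <seealso> in the earlier example.

<indexterm>
  <primary>blather</primary>
  <secondary>frobozz</secondary>
  <tertiary>zingle</tertiary>
  <!-- assumed element: sends the reader to another entry instead of listing pages -->
  <see>grooby</see>
</indexterm>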

Indexterms can appear inside a wide variety of elements, but they typically appear inside paragraphs, lists, tables, or sections. They are forbidden from appearing in titles, and should only rarely appear inside program listings.
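
For instance, an indexterm placed inside a paragraph might look like the sketch below. The sentence and the terms are made up for illustration; the term sits right next to the text it indexes.

<para>Before each dive, inspect the hatch
seals<indexterm>
  <primary>hatch</primary>
  <secondary>seals</secondary>
</indexterm> for cracks and signs of wear.</para>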

To create a term that spans a segment of text, use two <indexterm> elements: one with class="startofrange" and an id attribute, and one with class="endofrange" whose startref attribute matches that id. For example:

<!-- start of the range -->
<indexterm id="idx-fooby" class="startofrange">
  <primary>fooby</primary>
</indexterm>

<!-- content to be indexed -->
<sect1>
  <title>Programming Your Fooby</title>
  <para>Blah blah blah...</para>
  ...
</sect1>

<!-- end of the range -->
<indexterm class="endofrange" startref="idx-fooby"/>

10. Out-of-flow Text

Out-of-flow text is handled in several ways. Sidebars are for short discussions that don't belong in the general flow, aren't suitable for their own section, and can easily be encapsulated as a one-page aside. Admonitions (warnings, cautions, tips, and so on) are like sidebars but attract attention to themselves with more dramatic formatting and often an icon. Footnotes are shorter notes that have only a weak connection to the text and are moved out of the way to the bottom of the page.

10.1. Sidebars

A sidebar functions like a section, but cannot contain sections within itself. Any other block content is allowed. A sidebar can appear at any level in the document below the chapter level. For example:

<sect2>
  <title>Bathyscaph Care and Maintenance</title>
  <para>The hull of your submersible chamber is warranted
  for seven years against seal ruptures and corrosion of fittings.
  With proper care, you can extend the usable lifetime 
  considerably. Barnacles are the most common cause of
  metal fatigue and gasket deterioration (see the sidebar for
  tips in removing these pests).</para>

  <sidebar>
    <title>Scraping Barnacles</title>
    <para>You'll need a wire brush and a solution of equal parts
    vinegar and water. Pour the solution over the barnacles and let it
    sit for several hours. When the barnacle shells are soft, scrub
    them vigorously with the brush...</para>
    ...
  </sidebar>
  ...
</sect2>

10.2. Admonitions

To catch a reader's attention about a serious consideration, use an admonition. DocBook provides a whole bunch: caution, important, note, tip, and warning. In the absence of a title, either a default will be used (e.g. "WARNING!") or an icon will catch the reader's attention. Here's an example:

<caution>
  <para>Make sure your craft has reached the surface
  before unsealing the hatch. Otherwise, high-pressure 
  water will flood the compartment.</para>
</caution>

An admonition can appear in any section but cannot contain sections. Try to keep its content simple, using only paragraphs and lists if possible.
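
As a rough sketch of an admonition that supplies its own title and sticks to paragraphs and lists (the wording is invented, and it assumes DocBook Lite keeps <title> inside admonitions and the <itemizedlist> element from full DocBook):

<warning>
  <title>Before You Dive</title>
  <para>Check each of the following before leaving the surface:</para>
  <itemizedlist>
    <listitem><para>The hatch is sealed.</para></listitem>
    <listitem><para>The ballast tanks are full.</para></listitem>
  </itemizedlist>
</warning>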

10.3. Footnotes

A <footnote> is coded as a block that interrupts a paragraph. It can look a little odd:

<para>When on Mars, be sure to 
visit the great volcano Olympus Mons<footnote>
  <para>It happens to be the tallest mountain in the Solar
  System, so bring your best hiking shoes.</para>
</footnote>...</para>

When the same footnote applies to different places in a document, you can use a <footnoteref> element to reference it. The following example shows how:

<table>
  <tgroup cols="2">
    <tbody>
      <row>
        <entry>apple</entry>
        <entry>red</entry>
      </row>
      <row>
        <entry>banana<footnote id="warning">
          <para>Peel it first!</para>
          </footnote></entry>
        <entry>yellow</entry>
      </row>
      <row>
        <entry>grape</entry>
        <entry>purple</entry>
      </row>
      <row>
        <entry>orange<footnoteref 
          linkend="warning"/></entry>
        <entry>orange</entry>
      </row>
    </tbody>
  </tgroup>
</table>

10.4. Endnotes

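In some cases, you want a note's text to appear in another section or chapter rather than at the bottom of the page. The <endnote> element serves both purposes: an empty <endnote> with a linkend attribute marks the reference point in the text, and an <endnote> with the matching id holds the body of the note wherever you want it to appear. For example: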

<para>We suspect that lightning often appears in hues other 
  than white. This hypothesis is supported by
  Dr. Indigo Riceway<endnote linkend="note-riceway"/>.</para>
...
<endnote id="note-riceway">
  <para>Riceway wrote about green lightning in his
  book...</para>
</endnote>

11. Reference Pages

One of the more complex block elements is <refentry>. It is used to encode a compact set of information about a command, application, or other technical entity. There are several different ways to format a reference entry, so we provide a role attribute to let you choose the one that best fits your book.

There are three main types supported:

11.1. Default Reference Pages

Here are three examples of common reference pages used in DocBook Lite:

Figure 1. A refentry from Samba, Appendix C

Example 1. How the above refentry is coded

<refentry>
 <refmeta>
  <refmiscinfo class="allowable values">YES, NO</refmiscinfo>
  <refmiscinfo class="default">NO</refmiscinfo>
 </refmeta>
 <refnamediv>
  <refname>alternate permissions = boolean</refname>
 </refnamediv>
 <refsynopsisdiv>
  <para>Obsolete. Has no effect in Samba 2. Files will be shown as
   read-only if the owner can't write them. In Samba 1.9 and
   earlier, setting this option would set the DOS filesystem read-only
   attribute on any file the user couldn't read. This in turn
   required the <literal>delete readonly</literal> 
   option.</para>
 </refsynopsisdiv>
</refentry>

Figure 2. A refentry from MySQL and mSQL, Chapter 21

Example 2. How the above refentry is coded

<refentry>
 <refmeta>
  <refentrytitle>DBI::do</refentrytitle>
 </refmeta>
 <refnamediv>
  <refname>DBI::do</refname>
 </refnamediv>
 <refsynopsisdiv>
  <synopsis>$rows_affected  = $db-&gt;do($statement);
$rows_affected  = $db-&gt;do($statement, \%unused);
$rows_affected  = 
    $db-&gt;do($statement, \%unused, @bind_values);</synopsis>
  <para><literal>DBI::do</literal> directly performs a
  non-<literal>SELECT</literal> SQL statement and 
  returns the number of rows affected by the statement. This is 
  faster than a <literal>DBI::prepare/DBI::execute</literal>
  pair, which requires two function calls. The 
  first argument is the SQL statement itself. The
  second argument is unused in DBD::mSQL and DBD::mysql, 
  but can hold a reference to a hash of attributes for other 
  DBD modules. The final argument is an array of values used 
  to replace `placeholders,' which are indicated with a
  `?' in the statement. The values of the array are
  substituted for the placeholders from left to right. As an
  additional bonus, <literal>DBI::do</literal> will 
  automatically quote string values before substitution.</para>
 </refsynopsisdiv>
 <refsect1>
  <title>Example</title>
  <programlisting>use DBI;
my $db = DBI-&gt;connect('DBI:mSQL:mydata',undef,undef);

my $rows_affected = 
  $db-&gt;do("UPDATE mytable SET name='Joe' WHERE name='Bob'");
print "$rows_affected Bob's were changed to Joe's\n";

my $rows_affected2 = 
  $db-&gt;do("INSERT INTO mytable (name) VALUES (?)",
        {}, ("Sheldon's Cycle"));
# After quoting and substitution, the statement:
# INSERT INTO mytable (name) VALUES ('Sheldon\'s Cycle')
# was sent to the database server.</programlisting>
 </refsect1>
</refentry>

Figure 3. A refentry from Apache: The Definitive Guide, Chapter 14

Example 3. How the above refentry is coded

<refentry>
 <refmeta>
  <refentrytitle>ap_pstrndup</refentrytitle>
 </refmeta>
 <refnamediv>
  <refname>ap_pstrndup</refname>
  <refpurpose>duplicate a string in a pool with 
    limited length</refpurpose>
 </refnamediv>
 <refsynopsisdiv>
  <synopsis>
    char *ap_pstrndup(pool *p, const char *s, int n)</synopsis>
  <para>Allocates <literal>n</literal>+1 bytes 
  of memory and copies up to <literal>n</literal> 
  characters from <literal>s</literal>,
  <literal>NULL</literal>-terminating the result. 
  The memory is destroyed when the pool is destroyed. Returns 
  a pointer to the new block of memory, or 
  <literal>NULL</literal> if <literal>s</literal>
  is <literal>NULL</literal>.</para>
 </refsynopsisdiv>
</refentry>

11.2. Nutshell Type Reference Pages

Nutshell books use a particular kind of reference structure that looks like this:

Figure 1. A refentry from Unix in a Nutshell, Chapter 5

Example 1. How the above refentry is coded

<nutlist longestterm="unseten">

 <nutentry><term>alias</term>
  <nutsynopsis><literal>alias</literal> 
  [<replaceable>name</replaceable> 
  [<replaceable>command</replaceable>]]</nutsynopsis>
  <nutentrybody>
   <para>Assign <emphasis>name</emphasis> 
    as the shorthand name, or alias, for 
    <emphasis>command</emphasis>.  If
    <emphasis>command</emphasis> is omitted, print the 
    alias for <emphasis>name</emphasis>; 
    if <emphasis>name</emphasis> is also
    omitted, print all aliases.  Aliases can be defined on the command
    line, but they are more often stored in 
    <literal>.cshrc</literal>
    so that they take effect after login.  (See [cross ref deleted]
    earlier in this chapter.)  Alias definitions can reference
    command-line arguments, much like the history list.  Use
    <literal>\!*</literal> to refer to all command-line
    arguments, <literal>\!^</literal> for the first argument,
    <literal>\!$</literal> for the last, etc.  An alias
    <emphasis>name</emphasis> can be any valid Unix 
    command; however, you lose the original command's meaning unless 
    you type <emphasis>\name</emphasis>.  
    See also <emphasis
    role="bold">unalias</emphasis>.</para>

    <refsect2><title>Examples</title> 
    <para>Set the size for <literal>xterm</literal> 
    windows under the X Window System:</para>
    <programlisting>alias R 'set noglob; eval `resize`; 
      unset noglob'</programlisting>
    <para>Show aliases that contain the string
     <emphasis>ls</emphasis>:</para>
    <programlisting>alias | grep ls</programlisting>
    <para>Run <literal>nroff</literal> 
    on all command-line arguments:</para>
    <programlisting>alias ms 'nroff -ms \!*'</programlisting>
    <para>Copy the file that is named as the first argument:</para>
    <programlisting>alias back 'cp \!^ \!^.old'</programlisting>
    <para>Use the regular <literal>ls</literal>, 
    not its alias:</para>
    <programlisting>% <userinput>\ls</userinput></programlisting>
   </refsect2>
  </nutentrybody>
 </nutentry>

</nutlist>

11.3. Java Type Reference Pages

This example shows how Java <refentry>s look:

Figure 1. A refentry from Java Fundamental Classes in a Nutshell, Chapter 32

Example 1. How the above refentry is coded

<refentry role="java" 
    id="java.io.dataoutputstream">
 <refmeta>
  <refmiscinfo class="version">Java 1.0</refmiscinfo>
  <refmiscinfo class="package">java.io</refmiscinfo>
  <refmiscinfo class="flags">PJ1.1</refmiscinfo>
 </refmeta>
 <refnamediv>
  <refname>DataOutputStream</refname>
 </refnamediv>

 <refsect1 role="intro">
  <para>This class is a subclass of
   <literal>FilterOutputStream</literal> that allows 
   you to write Java primitive data types in a portable binary 
   format. Create a <literal>DataOutputStream</literal> 
   by specifying the <literal>OutputStream</literal> 
   that is to be filtered in the call to the constructor. 
   <literal>DataOutputStream</literal> has methods
   that output only primitive types; use
   <literal>ObjectOutputStream</literal> to output object
   values. </para>
 </refsect1>

 <refsynopsisdiv>
  <classsynopsis keyword="class">
   <modifiers>public</modifiers>
   <classname>DataOutputStream</classname>
   <extends><classref package="java.io" 
     class="FilterOutputStream"/></extends>
   <implements><classref 
     package="java.io" class="DataOutput"/></implements>

   <members><title>Public Constructors</title>
    <membergroup>
     <funcprototype revision="" role="method" flags="">
      <funcdef><modifiers>public</modifiers>
       <function>DataOutputStream</function>
      </funcdef>
      <paramdef>
       <type><classref role="includePkg" 
         package="java.io" class="OutputStream"/></type>
       <parameter>out</parameter>
      </paramdef>
     </funcprototype>
    </membergroup>
   </members>

   <members><title>Public Instance Methods</title>
    <membergroup>
     <funcprototype revision="" role="method" flags="">
      <funcdef><modifiers>public final</modifiers> 
       <type>int</type>
       <function>size</function>
      </funcdef>
     </funcprototype>
    </membergroup>
   </members>

   <members><title>Methods Implementing 
     <classref package="java.io" class="DataOutput"/></title>
    <membergroup>
     <funcprototype revision="" role="method" flags=" synchronized">
      <funcdef><modifiers>public</modifiers> <type>void</type>
       <function>write</function>
      </funcdef>
      <paramdef><type>int</type> <parameter>b</parameter></paramdef>
      <throws><classref package="java.io" class="IOException"/></throws>
     </funcprototype>
    </membergroup>
    <membergroup>
     <funcprototype revision="" role="method" flags=" synchronized">
      <funcdef><modifiers>public</modifiers> <type>void</type>
       <function>write</function>
      </funcdef>
      <paramdef>
       <type>byte[&thinsp;]</type> <parameter>b</parameter></paramdef>
      <paramdef><type>int</type> <parameter>off</parameter></paramdef>
      <paramdef><type>int</type> <parameter>len</parameter></paramdef>
      <throws><classref package="java.io" class="IOException"/></throws>
     </funcprototype>
    </membergroup>
    <membergroup>
     <funcprototype revision="" role="method" flags="">
      <funcdef><modifiers>public final</modifiers> <type>void</type>
       <function>writeBoolean</function>
      </funcdef>
      <paramdef><type>boolean</type> <parameter>v</parameter></paramdef>
      <throws><classref package="java.io" class="IOException"/></throws>
     </funcprototype>
    </membergroup>
    <membergroup>
     <funcprototype revision="" role="method" flags="">
      <funcdef><modifiers>public final</modifiers> <type>void</type>
       <function>writeByte</function>
      </funcdef>
      <paramdef><type>int</type> <parameter>v</parameter></paramdef>
      <throws><classref package="java.io" class="IOException"/></throws>
     </funcprototype>
    </membergroup>
    <membergroup>
     <funcprototype revision="" role="method" flags="">
      <funcdef><modifiers>public final</modifiers> <type>void</type>
       <function>writeInt</function>
      </funcdef>
      <paramdef><type>int</type> <parameter>v</parameter></paramdef>
      <throws><classref package="java.io" class="IOException"/></throws>
     </funcprototype>
    </membergroup>
   </members>

   <members><title>Public Methods Overriding <classref package="java.io"
     class="FilterOutputStream"/></title>
    <membergroup>
     <funcprototype revision="" role="method" flags="">
      <funcdef><modifiers>public</modifiers> <type>void</type>
       <function>flush</function>
      </funcdef>
      <throws><classref package="java.io" class="IOException"/></throws>
     </funcprototype>
    </membergroup>
   </members>

   <members>
    <title>Protected Instance Fields</title>
    <membergroup>
     <funcprototype revision="" role="field" flags="">
      <funcdef><modifiers>protected</modifiers> <type>int</type>
       <function>written</function>
      </funcdef>
     </funcprototype>
    </membergroup>
   </members>
  </classsynopsis>
 </refsynopsisdiv>

 <refsect1>
  <title>Hierarchy</title>
  <para>
   <literal><classref package="java.lang" class="Object"/> &rarr;
   <classref role="includePkg" package="java.io"
   class="OutputStream"/> &rarr; <classref package="java.io"
   class="FilterOutputStream"/> &rarr; <classref role="includePkg"
   package="java.io" class="DataOutputStream"/> (<classref
   package="java.io" class="DataOutput"/>)</literal></para>
 </refsect1>
</refentry>

12. See Also...

For a simpler markup language better suited for general-purpose documentation, try out SimpleDoc.


Written by:
Erik Ray eray@oreilly.com