Using the luceneSupport optional tool

The luceneSupport plugin is an optional tool that lets you use Apache Lucene to perform full-text indexing and searching of the contents of text columns.

The mainline API documentation for Apache Lucene is a useful starting point for understanding Lucene's capabilities.

The luceneSupport plugin can be used only after a database has been fully upgraded to Release 10.11 or higher. (See "Upgrading a database" in the Derby documentation for more information.) The plugin cannot be used on a database that is at Release 10.10 or lower.

Terminology

The following concepts are important to an understanding of the luceneSupport plugin.

  • Analyzer: An analyzer is an implementation of org.apache.lucene.analysis.Analyzer. It extracts indexable terms from a block of text. The same analyzer should be used to index the text and to query it. An analyzer may perform language-specific tasks such as stemming and filtering. More information on analyzers can be found in the Lucene API documentation. Users can extend the existing Lucene analyzers or write their own custom analyzers.
  • Filtering: Filtering is the language-specific task of throwing away insignificant words such as articles and conjunctions.
  • Query-parsing: Query-parsing is the process of interpreting a Lucene query string. Lucene has its own query language. By extending the default Lucene QueryParser class, users can enhance the Lucene query language or replace it with some other query language.
  • Score: The score measures how well a query matches a block of text (a text column value). The higher the score, the better the match. The score is a float value. There is no minimum or maximum value.
  • Stemming: Stemming is the language-specific task of reducing related words to their common root. For instance, an English stemmer might map all of the following words onto the common root "house": "house", "houses", "housed", and "housing".
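
The filtering and stemming steps described above can be illustrated with a toy sketch in plain Java. This is not a Lucene Analyzer and does not use the Lucene API; the class name ToyAnalyzer, the stop-word list, and the suffix list are all hypothetical and chosen only so that the "house" example works. Note that this naive stemmer reduces all four words to the shared key "hous" rather than to the dictionary word "house"; what matters for matching is that related words map to the same term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analyzer: illustrative only, NOT an org.apache.lucene.analysis.Analyzer.
final class ToyAnalyzer {
    // Hypothetical stop-word list: a few English articles and conjunctions.
    private static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "the", "and", "or", "but"));

    // Hypothetical suffix list, checked in order. Real stemmers are far more careful.
    private static final String[] SUFFIXES = { "ing", "ed", "es", "s", "e" };

    // Stemming: strip the first matching suffix, keeping at least three letters.
    static String stem(String word) {
        for (String suffix : SUFFIXES) {
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Split text into lowercase terms, filter out stop words, then stem the rest.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            terms.add(stem(token));
        }
        return terms;
    }

    public static void main(String[] args) {
        // "the" and "and" are filtered; "house" and "housing" share one term.
        System.out.println(analyze("The house and the housing"));
    }
}
```

Because the same analyze method would be applied both to the indexed text and to the query text, a query for "houses" would match a column value containing "housing". This is why the same analyzer should be used for indexing and for querying.
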
Classpath for running the luceneSupport optional tool

Before you run the luceneSupport optional tool, make sure that your classpath/modulepath contains the following jar files:

  • derbyshared.jar
  • derbytools.jar
  • derby.jar
  • derbyoptionaltools.jar
  • core: The core Lucene machinery. For Lucene 4.5.0, this is lucene-core-4.5.0.jar.
  • analyzers-common: The common Lucene analyzers. For Lucene 4.5.0, this is lucene-analyzers-common-4.5.0.jar.
  • queryparser: The basic Lucene logic for query-parsing. For Lucene 4.5.0, this is lucene-queryparser-4.5.0.jar.

The Lucene jar files are included in the source tree; alternatively, you can download them from the Apache Lucene project.
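
As a quick sanity check before loading the tool, a small program can report which of the jar files listed above are absent from the classpath. The following is a minimal sketch, assuming Lucene 4.5.0 jar names and a classpath-based (not modulepath-based) launch. The class name LuceneClasspathCheck and the method missingJars are hypothetical helpers, not part of Derby or Lucene.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reports required jar files missing from a classpath string.
final class LuceneClasspathCheck {
    // Jar names from the list above, assuming Lucene 4.5.0.
    static final List<String> REQUIRED_JARS = Arrays.asList(
        "derbyshared.jar",
        "derbytools.jar",
        "derby.jar",
        "derbyoptionaltools.jar",
        "lucene-core-4.5.0.jar",
        "lucene-analyzers-common-4.5.0.jar",
        "lucene-queryparser-4.5.0.jar");

    // Returns the required jar names that do not appear as classpath entries.
    static List<String> missingJars(String classpath) {
        Set<String> present = new HashSet<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            // Compare by file name only, ignoring the directory part.
            present.add(new File(entry).getName());
        }
        List<String> missing = new ArrayList<>();
        for (String jar : REQUIRED_JARS) {
            if (!present.contains(jar)) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingJars(System.getProperty("java.class.path", ""));
        if (missing.isEmpty()) {
            System.out.println("All required jar files were found on the classpath.");
        } else {
            System.out.println("Missing from the classpath: " + missing);
        }
    }
}
```

If you use a different Lucene version, the three Lucene jar names in REQUIRED_JARS would need to be adjusted to match.
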

Loading and unloading the luceneSupport optional tool

In a database protected by SQL authorization, only the database owner can issue the commands which load and unload the Lucene plugin. (See "Database Owner" in the Derby Security Guide for more information.)

Loading the plugin looks very much like loading any other optional tool. You call the SYSCS_UTIL.SYSCS_REGISTER_TOOL system procedure in a statement like the following:

call syscs_util.syscs_register_tool( 'luceneSupport', true );

This command creates the LUCENESUPPORT schema, which contains the following objects:

  • CREATEINDEX: A database procedure for indexing text columns.
  • UPDATEINDEX: A database procedure for refreshing an index built by CREATEINDEX.
  • DROPINDEX: A database procedure for dropping an index built by CREATEINDEX.
  • LISTINDEXES: A table function for listing the indexes created by CREATEINDEX.

Removing the plugin also looks much like unloading other optional tools. Call the SYSCS_UTIL.SYSCS_REGISTER_TOOL system procedure in a statement like the following:

call syscs_util.syscs_register_tool( 'luceneSupport', false );

This command does the following:

  • Drops Lucene directories: Deletes the directories which were created to hold the Lucene indexes
  • Drops schema objects: Drops all schema objects created by CREATEINDEX commands
  • Drops LUCENESUPPORT: Drops the LUCENESUPPORT schema and all schema objects which it contains

See the Derby documentation for information about the SYSCS_UTIL.SYSCS_REGISTER_TOOL system procedure.
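
From a Java application, the load and unload statements shown above could be issued over JDBC. The following is a minimal sketch, assuming the caller already holds an open java.sql.Connection to the database (as the database owner, if SQL authorization is enabled). The class name LuceneSupportTool and its methods are hypothetical; only the CALL statement itself comes from this section.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper that loads or unloads the luceneSupport optional tool.
final class LuceneSupportTool {
    // Builds the CALL statement shown above: true loads the tool, false unloads it.
    static String registerToolSql(boolean register) {
        return "call syscs_util.syscs_register_tool( 'luceneSupport', " + register + " )";
    }

    // Executes the statement on a connection supplied by the caller.
    static void setLoaded(Connection conn, boolean register) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(registerToolSql(register));
        }
    }

    public static void main(String[] args) {
        // Print the two statements without connecting to any database.
        System.out.println(registerToolSql(true));
        System.out.println(registerToolSql(false));
    }
}
```

A caller would typically invoke LuceneSupportTool.setLoaded(conn, true) once after the database has been upgraded, and setLoaded(conn, false) only when the indexes are no longer needed, because unloading deletes the Lucene directories and drops the related schema objects.
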

Encryption and the luceneSupport tool

The luceneSupport tool cannot be used on an encrypted database. Users who need full-text indexing of encrypted data should store the database in an encrypted directory or on an encrypted device.

Lucene versions

The community has tested the luceneSupport tool against the following versions of Lucene. Other versions of Lucene may or may not work.

  • 4.5.0
  • 4.7.1
  • 4.8.1
  • 4.9.0

Derby cannot make any guarantees about the compatibility of two different versions of Lucene. Users should bear the following in mind:

  • No time travel: An error is raised if you try to use an earlier version of Lucene to read an index created by a later version of Lucene.
  • Bounce your indexes: When you change versions of Lucene, it is always safest to call LUCENESUPPORT.UPDATEINDEX on all of your existing Lucene indexes.
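
The two rules above can be expressed as a small version comparison. The following sketch is illustrative only and is not the check that the tool itself performs; the class name LuceneVersionRule and its methods are hypothetical, and version strings are assumed to be dotted numbers such as "4.5.0".

```java
// Hypothetical illustration of the version rules; not Derby's or Lucene's own check.
final class LuceneVersionRule {
    // Compares dotted numeric versions; missing components count as zero.
    static int compareVersions(String left, String right) {
        String[] a = left.split("\\.");
        String[] b = right.split("\\.");
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // "No time travel": true when the running Lucene is older than the index writer.
    static boolean violatesNoTimeTravel(String indexVersion, String runtimeVersion) {
        return compareVersions(runtimeVersion, indexVersion) < 0;
    }

    // "Bounce your indexes": true when the versions differ, so UPDATEINDEX is advisable.
    static boolean shouldRefresh(String indexVersion, String runtimeVersion) {
        return compareVersions(runtimeVersion, indexVersion) != 0;
    }

    public static void main(String[] args) {
        System.out.println(violatesNoTimeTravel("4.9.0", "4.5.0"));
        System.out.println(shouldRefresh("4.5.0", "4.9.0"));
    }
}
```
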