Apache Lucene - Overview

Apache Lucene

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications.

Apache Lucene is an open source project available for free download.

Lucene News

6 November 2009 - Lucene Java 2.9.1 available

This release fixes bugs from 2.9.0, including one serious bug whereby BooleanQuery could silently fail to retrieve certain matching documents.

There are also some minor API changes, including a Version parameter added to QueryParser and the contrib Analyzers, so that version-dependent defaults are consistent across classes, as well as the un-deprecation of certain methods (we were too zealous in a few cases!).
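The idea behind a Version parameter can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written for this example (they are not Lucene's QueryParser, Analyzer, or Version classes), and `strictMode` is an invented setting; the sketch only shows how passing the same version constant to several components keeps their defaults aligned.

```java
/** Hypothetical model of version-dependent defaults; these are not Lucene's classes. */
final class VersionDefaultsSketch {

    /** Release identifiers in ascending order (names are illustrative). */
    enum Version {
        V2_4, V2_9;

        /** True if this version is the same as, or newer than, the other one. */
        boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    /** A component whose default depends on the version the caller asks it to match. */
    static final class Parser {
        final boolean strictMode; // invented setting whose default changed in V2_9

        Parser(Version matchVersion) {
            this.strictMode = matchVersion.onOrAfter(Version.V2_9);
        }
    }

    /** A second component that derives the same default from the same parameter. */
    static final class Analyzer {
        final boolean strictMode;

        Analyzer(Version matchVersion) {
            this.strictMode = matchVersion.onOrAfter(Version.V2_9);
        }
    }

    public static void main(String[] args) {
        Version match = Version.V2_4; // ask every component for the older defaults
        Parser parser = new Parser(match);
        Analyzer analyzer = new Analyzer(match);
        System.out.println("parser strict: " + parser.strictMode
                + ", analyzer strict: " + analyzer.strictMode);
    }
}
```

Because both components read the same constant, an application can upgrade the library and still keep the old behavior everywhere until it deliberately changes that one value.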

Otherwise the changes are all bug fixes and documentation improvements.

This release is fully compatible with 2.9.0. We strongly recommend upgrading to 2.9.1 if you are using 2.9.0. Furthermore, because some additional APIs were deprecated in 2.9.1, to ensure a clean ("JAR drop in") upgrade to 3.0 you'll need to ensure your code compiles against 2.9.1 without deprecation warnings.

See CHANGES for details.

Binary and source distributions are available here.

Maven artifacts are available here.

07 Oct. 2009 - Lucene at US ApacheCon November 2-6, 2009

ApacheCon US is once again in the Bay Area and Lucene is coming along for the ride! The Lucene community has planned two full days of talks, plus a meetup and the usual bevy of training. With a well-balanced mix of first-time and veteran ApacheCon speakers, the Lucene track at ApacheCon US promises to have something for everyone. Be sure not to miss:

Training:

Thursday, Nov. 5th

Friday, Nov. 6th (these sessions are all available via FREE ApacheCon US live video streams!)

25 September 2009 - Lucene Java 2.9.0 available

This release has many improvements since release 2.4.1, including:

  • Per segment searching and caching (can lead to much faster reopen among other things)
  • Near real-time search capabilities added to IndexWriter
  • New Query types
  • Smarter, more scalable multi-term queries (wildcard, range, etc.)
  • A freshly optimized Collector/Scorer API
  • Improved Unicode support and the addition of Collation contrib
  • A new Attribute based TokenStream API
  • A new QueryParser framework in contrib, with a core QueryParser replacement implementation included.
  • Scoring is now optional when sorting by Field, or using a custom Collector, gaining sizable performance when scores are not required.
  • New analyzers (PersianAnalyzer, ArabicAnalyzer, SmartChineseAnalyzer)
  • New fast-vector-highlighter for large documents
  • Lucene now includes high-performance handling of numeric fields. Such fields are indexed with a trie structure, enabling simple to use and much faster numeric range searching without having to externally pre-process numeric values into textual values.
See CHANGES for details.
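The last item in the list above mentions trie-structured numeric fields. The sketch below is a simplified, self-contained model of that general idea and not Lucene's implementation: each value is indexed at several precisions (its prefix after dropping 4, 8, and 12 low bits), so a range can be covered by a handful of coarse prefix terms in the middle and exact terms only at the edges. The 16-bit value size, the step of 4 bits, and all class and method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual model of trie-style numeric range search; not Lucene's implementation. */
final class TrieRangeSketch {
    static final int VALUE_BITS = 16;     // assumption: values lie in [0, 65535]
    static final int PRECISION_STEP = 4;  // assumption: drop 4 bits per precision level

    /** Terms indexed for one value: its prefix at every precision level, as {shift, prefix}. */
    static List<long[]> indexedTerms(long value) {
        List<long[]> terms = new ArrayList<long[]>();
        for (int shift = 0; shift < VALUE_BITS; shift += PRECISION_STEP) {
            terms.add(new long[] {shift, value >>> shift});
        }
        return terms;
    }

    /** Splits the inclusive range [min, max] into sub-ranges {shift, lowPrefix, highPrefix}. */
    static List<long[]> splitRange(long min, long max) {
        List<long[]> out = new ArrayList<long[]>();
        if (min > max) {
            return out; // empty range matches nothing
        }
        for (int shift = 0; ; shift += PRECISION_STEP) {
            long diff = 1L << (shift + PRECISION_STEP);
            long mask = ((1L << PRECISION_STEP) - 1L) << shift;
            boolean hasLower = (min & mask) != 0L;   // lower edge is not aligned to the next level
            boolean hasUpper = (max & mask) != mask; // upper edge is not aligned to the next level
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            if (shift + PRECISION_STEP >= VALUE_BITS || nextMin > nextMax) {
                // Coarsest level reached, or nothing is left in the middle: emit the rest here.
                out.add(new long[] {shift, min >>> shift, max >>> shift});
                break;
            }
            if (hasLower) {
                out.add(new long[] {shift, min >>> shift, (min | mask) >>> shift});
            }
            if (hasUpper) {
                out.add(new long[] {shift, (max & ~mask) >>> shift, max >>> shift});
            }
            min = nextMin; // continue with the aligned middle part at the next, coarser level
            max = nextMax;
        }
        return out;
    }

    /** A value matches if any of its indexed terms falls inside any sub-range of the same level. */
    static boolean matches(long value, List<long[]> ranges) {
        for (long[] term : indexedTerms(value)) {
            for (long[] r : ranges) {
                if (term[0] == r[0] && term[1] >= r[1] && term[1] <= r[2]) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Number of distinct terms a query over these sub-ranges would have to visit. */
    static long termCount(List<long[]> ranges) {
        long count = 0;
        for (long[] r : ranges) {
            count += r[2] - r[1] + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        List<long[]> ranges = splitRange(0x0123, 0x0456);
        System.out.println("terms visited: " + termCount(ranges)
                + ", values in range: " + (0x0456 - 0x0123 + 1));
    }
}
```

The trade-off in this model is a few extra terms per indexed value in exchange for far fewer terms per range query.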

While we generally try to maintain full backwards compatibility between major versions, Lucene 2.9 has a variety of breaks that are spelled out in the 'Changes in backwards compatibility policy' section of CHANGES. We recommend that you recompile your application with Lucene 2.9 rather than attempting to "drop" it in. Recompiling will alert you to anything you need to fix if you are affected by one of the backwards compatibility breaks.

Binary and source distributions are available here.

Maven artifacts are available here.

9 March 2009 - Lucene Java 2.4.1 available

This release contains fixes for bugs found in 2.4.0, including one data loss bug (LUCENE-1452) where in certain situations binary fields would be truncated to 0 bytes.

See CHANGES for details.

2.4.1 does not contain any new features, API or file format changes, which makes it fully compatible with 2.4.0.

Binary and source distributions are available here.

Maven artifacts are available here.

09 February 2009 - Lucene at ApacheCon Europe 2009 in Amsterdam

Lucene will be extremely well represented at ApacheCon EU 2009 in Amsterdam, Netherlands this March 23-27, 2009:

8 October 2008 - Lucene Java 2.4.0 available

This release has many improvements since release 2.3.2, including:

  • New InstantiatedIndex (contrib/instantiated): RAM-based index that enables much faster searching than RAMDirectory.
  • New IndexWriter constructors now default autoCommit to false.
  • New commit() method in IndexWriter lets you control when changes are made visible and permanent in the index.
  • A machine or OS crash, or power loss, while IndexWriter is writing to an index will no longer corrupt the index.
  • TimeLimitedCollector adds timeout to searches.
  • Delete documents by Query in IndexWriter.
  • Pure boolean indexing (no frequencies, positions, or payloads are indexed) using Field.setOmitTf().
  • A new Directory implementation, NIOFSDirectory, using java.nio's APIs to allow multiple threads to read from the same open file without locking.
  • IndexWriter.expungeDeletes() reclaims disk space from deleted documents by merging away segments that have deletions.
  • All filters now return a DocIdSet instead of java.util.BitSet, making filters more efficient and flexible.
  • Searching with a Filter is more efficient: now the filter is applied to a document before scoring is done.
  • IndexReader can be opened with new readOnly=true mode, which gives better performance in a multi-threaded environment.
See CHANGES for details.
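The commit() item in the list above describes a model in which changes become visible only at an explicit commit point. The following is a minimal sketch of that idea using a hypothetical in-memory class; it is not Lucene's IndexWriter, and every name in it is invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory model of commit-controlled visibility; not Lucene's IndexWriter. */
final class CommitSketch {
    private final List<String> pending = new ArrayList<String>();       // buffered, not yet visible
    private volatile List<String> committed = Collections.emptyList();  // snapshot that readers see

    /** Buffers a document; readers do not see it until commit() is called. */
    void addDocument(String doc) {
        pending.add(doc);
    }

    /** Publishes all buffered documents as a new immutable snapshot. */
    void commit() {
        List<String> next = new ArrayList<String>(committed);
        next.addAll(pending);
        pending.clear();
        committed = Collections.unmodifiableList(next);
    }

    /** Discards buffered documents, leaving the last committed snapshot untouched. */
    void rollback() {
        pending.clear();
    }

    /** Returns the last committed snapshot. */
    List<String> visibleDocuments() {
        return committed;
    }

    public static void main(String[] args) {
        CommitSketch writer = new CommitSketch();
        writer.addDocument("doc-1");
        System.out.println("before commit: " + writer.visibleDocuments());
        writer.commit();
        System.out.println("after commit: " + writer.visibleDocuments());
    }
}
```

In this model, work that is lost before a commit leaves readers with the previous snapshot, which is the property that makes an explicit commit point useful.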

Lucene 2.4.0 includes index format changes that are not readable by older versions of Lucene. Lucene 2.4.0 can both read and update older Lucene indexes. Adding to an index with an older format will cause it to be converted to the newer format.

Binary and source distributions are available here.

Maven artifacts are available here.

06 May 2008 - Lucene Java 2.3.2 available

This release contains fixes for bugs found in 2.3.1.

See CHANGES.txt for a detailed listing of changes.

2.3.2 does not contain any new features, API or file format changes, which makes it fully compatible with 2.3.0 and 2.3.1.

Binary and source distributions are available here.

Maven artifacts are available here.

23 February 2008 - Lucene Java 2.3.1 available

This release contains fixes for serious bugs in 2.3.0 that could cause index corruption in autoCommit=false mode, or when multiple threads are adding documents and some documents have fields with term vectors enabled while others do not. The autoCommit option was added to IndexWriter with release 2.2.0. If not explicitly set to false, the IndexWriter runs in autoCommit=true mode.

See CHANGES.txt for a detailed listing of changes.

2.3.1 does not contain any new features, API or file format changes, which makes it fully compatible with 2.3.0.

We encourage everyone who is currently using Lucene Java 2.3.0 to upgrade to 2.3.1 to prevent possible index corruption!

Binary and source distributions are available here.

Maven artifacts are available here.

24 January 2008 - Lucene Java 2.3.0 available

This release has many improvements since release 2.2, including:

  • Significantly improved indexing performance
  • Segment merging in background threads
  • Refreshable IndexReaders
  • Faster StandardAnalyzer and improved Token API
  • TermVectorMapper to customize how term vectors are loaded
  • Live backups (without pausing indexing) with SnapshotDeletionPolicy
  • CheckIndex tool to test and recover a corrupt index
  • Pluggable MergePolicy and MergeScheduler
  • "Partial" optimize(int maxNumSegments) method
  • New contrib module for working with Wikipedia content

In addition, Lucene 2.3.0 has many performance improvements, bug fixes, etc. See CHANGES.txt for details.

Lucene 2.3.0 includes index format changes that are not readable by older versions of Lucene. Lucene 2.3.0 can both read and update older Lucene indexes. Adding to an index with an older format will cause it to be converted to the newer format.

Binary and source distributions are available here.

Maven artifacts are available here.

23 January 2008 - Lucene at ApacheCon Europe

Lucene projects will be well represented at ApacheCon Europe in Amsterdam this year. Please join us at one or more of the following sessions:

24 December 2007 - Nightly Snapshots available in the Apache Maven Snapshot Repository

We are now publishing nightly artifacts to the Maven Snapshot Repository. The current version is '2.3-SNAPSHOT'.

The artifacts include:

  • Binary jars
  • Sources
  • Javadocs
You can find separate artifacts for the core, demo, and the different contrib modules.

Merry Christmas!

26 August 2007 - Lucene at ApacheCon Atlanta

Lucene will once again be well represented at ApacheCon USA in Atlanta this November 12-16, 2007.

The following talks and trainings are scheduled for this year's conference:

19 June 2007 - Release 2.2 available

This release has many improvements since release 2.1. New major features:

In addition, Lucene 2.2 has many performance improvements, bug fixes, etc. See CHANGES.txt for details.

Lucene 2.2 includes index format changes that are not readable by older versions of Lucene. Lucene 2.2 can both read and update older Lucene indexes. Adding to an index with an older format will cause it to be converted to the newer format.

Binary and source distributions are available here.

18 February 2007 - Lucene at ApacheCon Europe

Lucene Java and related Lucene projects will have extensive representation at ApacheCon Europe in Amsterdam this year. For the 2007 session, Yonik Seeley will be giving the Full-Text Search with Lucene talk at 10:30 am on May 2nd. Immediately following, Grant Ingersoll will be presenting Advanced Lucene at 11:30. Grant will also be leading a full day tutorial session on May 1st titled Lucene Boot Camp.

Lucene-related talks include Solr committer Bertrand Delacrétaz's talk titled Beyond full-text searches with Solr and Lucene and Hadoop committer Owen O'Malley's Introduction to Hadoop.

Registration is now open on the ApacheCon website.

17 February 2007 - Release 2.1 available

This release has many improvements since release 2.0, including new features, performance improvements, bug fixes, etc. See CHANGES.txt for details.

Lucene 2.1 includes index format changes that are not readable by older versions of Lucene. Lucene 2.1 can both read and update older Lucene indexes. Adding to an index with an older format will cause it to be converted to the newer format.

Binary and source distributions are available here.

3 January 2007 - Nightly Source builds available

Nightly source builds of the current development version of Lucene are now available at http://people.apache.org/builds/lucene/java/nightly/. Files are named lucene-DATE-src.tar.gz where DATE is the date of the build.
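The naming scheme above can be turned into a small helper for scripts that fetch a given night's build. This is only a sketch: the announcement does not say how DATE is formatted, so the ISO `yyyy-MM-dd` form used here is an assumption, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Builds nightly source archive names from the pattern lucene-DATE-src.tar.gz. */
final class NightlyBuildName {
    static final String BASE_URL = "http://people.apache.org/builds/lucene/java/nightly/";

    // Assumption: DATE is written as yyyy-MM-dd; the announcement does not specify the format.
    static final DateTimeFormatter DATE_FORMAT = DateTimeFormatter.ISO_LOCAL_DATE;

    /** File name for the build made on the given date. */
    static String fileName(LocalDate buildDate) {
        return "lucene-" + DATE_FORMAT.format(buildDate) + "-src.tar.gz";
    }

    /** Full URL under the nightly directory for the given date. */
    static String url(LocalDate buildDate) {
        return BASE_URL + fileName(buildDate);
    }

    public static void main(String[] args) {
        System.out.println(url(LocalDate.of(2007, 1, 3)));
    }
}
```

If the real files use a different date layout, only `DATE_FORMAT` needs to change.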