=== Lucene Status Report: March, 2009 ===

TLP

Jukka Zitting was added to the PMC. The PMC also accepted a software grant to add PyLucene, a Python-based port of Lucene, to the Lucene family.

LUCENE JAVA

Lucene Java is a search-engine toolkit. Development has been active and we are working towards the release of 2.9. Version 2.4.1 was released on March 9, 2009.

SOLR

Solr is a full-text search server. Development and the community are active. Solr is working towards the release of 1.4.

NUTCH

Nutch is a web-search engine: crawler, indexer and search runtime. Nutch is in the process of releasing version 1.0.

LUCY

After some public debate on the (lack of) progress on the project, we will keep a close eye on it over the next six months. People have expressed an interest in seeing it continue, and some progress has been made to that end.

LUCENE.NET (incubating)

The .NET community is picking up some steam and has begun looking into graduation from the incubator.

MAHOUT

Apache Mahout is working towards building a suite of scalable machine learning libraries for text and data mining. Some progress has been made on adding more clustering algorithms, as well as the perceptron and winnow algorithms. We are working on the processes for a 0.1 release and expect to do that release soon.

PyLucene

PyLucene was donated by the Open Source Applications Foundation. Andi Vajda and Mike McCandless are the initial committers on the project. PyLucene is working towards its first release as an Apache-hosted project.

TIKA

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. The first candidate for the 0.3 release is already in place and the release should be pushed out in March. Metadata handling and metadata frameworks like XMP have been a source of much discussion, but so far no clear consensus has been reached on whether or how the metadata features in Tika should be extended. A wiki was created for Tika.