=== Lucene Status Report: March, 2010 ===

TLP

The TLP is considering some restructuring of subprojects per Board suggestions in December. Solr and Lucene are merging committers (there is already quite a bit of overlap) and development efforts, but maintaining separate user lists and artifacts. Mahout and Tika have both begun discussions on becoming TLPs and all signs are positive for such a move, but there is no board resolution to consider yet.

The TLP has elected to sponsor incubation of the Lucene Connectors Framework. The project is now underway in the incubator. We expect this project will become a TLP as well.

Added Mark Miller as a PMC Member.

LUCENE JAVA

Lucene Java is a search-engine toolkit. Development has been active and we have released 2.9.2 and 3.0.1.

Added Chris Male as a committer.

SOLR

Solr is a full text search server. Development and the community are active. Community is working toward a 1.5 release.

NUTCH

Nutch is a web-search engine: crawler, indexer and search runtime. Bug fixes and other improvements have been flying by, with many of the issues being addressed by new Nutch committer Julien Nioche. Work has been performed to integrate Tika parsing into Nutch (in addition to the existing work to integrate Tika's mime detection functionality). Community is working towards a 1.1 release.

Added Julien Nioche as a committer.

LUCY

Lucy is a loose C port of Lucene targeted at dynamic language bindings. Basic thread support for the object system was completed. The community decided to transition from C89 to a dialect defined by the intersection of C99 and C++.

LUCENE.NET

Lucene.NET is a .NET based port of Lucene Java. Development and the community are active. Community is working towards a 2.9.2 release.

Added Michael Garski as a committer.

MAHOUT

Apache Mahout is working towards building a suite of scalable machine learning libraries for text and data mining. Development is active and we are working towards a 0.3 release.
The Mahout community has begun discussing becoming a TLP and will likely request such a move after the 0.3 release is final.

Added Drew Farris as a committer.
Added Benson Margulies as a committer.

Open Relevance Project

The Open Relevance Project is a new project aimed at providing Lucene and others with tools for judging the quality of search and machine learning approaches. We added support for a third test collection (the TREC9 filtering corpus), added documentation, and improved use with Lucene's benchmarking package.

PyLucene

PyLucene is a Python integration of Lucene Java. Development is active. PyLucene 3.0.1-1 and 2.9.2-1 were released this quarter.

TIKA

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Progress has been steady, with 2 issues remaining in JIRA before the 0.7 release, which will likely happen within the next month or so.