2011-07-21

Apache Nutch

Apache Nutch is an open source web-search software project. Stemming from Apache Lucene, it now builds on Apache Solr, adding web-specific components such as a crawler, a link-graph database, and parsing support handled by Apache Tika for HTML and an array of other document formats. Apache Nutch can run on a single machine, but gains much of its strength from running in a Hadoop cluster. The system can be enhanced (e.g. other document formats can be parsed) using a highly flexible, easily extensible and thoroughly maintained plugin infrastructure.

Programming language: Java

Releases (name, date, version):
Apache Nutch 1.5.1, 2012-07-10, 1.5.1
Apache Nutch 2.0, 2012-07-07, 2.0
Apache Nutch 1.5, 2012-06-07, 1.5
Apache Nutch 1.4, 2011-04-11, 1.4
Apache Nutch 1.3, 2011-06-07, 1.3
nutch-1.0 (branch-1.0), 2009-03-23, 1.0
nutch-0.9 (branch-0.9), 2007-04-01, 0.9
nutch-0.8.1 (branch-0.8), 2006-09-24, 0.8.1
nutch-0.8 (branch-0.8), 2006-06-25, 0.8
nutch-0.7.2 (branch-0.7), 2006-03-31, 0.7.2

Nutch PMC