Revision Log
NUTCH-758 Set subversion eol-style to "native".
NUTCH-634 Upgrade Nutch to Hadoop 0.17.1.
NUTCH-598 - Remove deprecated use of ToolBase. Use generics in Hadoop API.
NUTCH-580 Remove deprecated hadoop api calls (FS)
NUTCH-510 - IndexMerger delete working dir. Contributed by Enis.
Upgrade to Hadoop 0.11.2 and Lucene 2.1.0 releases.
NUTCH-400 update headers
NUTCH-383: upgrade to Hadoop 0.7.1 and Lucene 2.0.0. NUTCH-373: replace DeleteDuplicates with a version that implements both parts of the algorithm. Add JUnit test.
This patch addresses two issues:
* NUTCH-242: The code to activate URL normalization and filtering has been refactored and extracted into CrawlDbFilter and LinkDbFilter. These two concerns (normalization and filtering) have been made independent. Command line options have been modified to reflect these changes.
* NUTCH-143: All command-line tools have been modified to return meaningful OS exit codes. At the moment this uses a modified copy of Hadoop's ToolBase, which will be removed when HADOOP-488 is fixed and Nutch upgrades to Hadoop 0.6.0.
All JUnit tests pass.
NUTCH-341 - if -workingdir is specified, always create a unique subdir. Also, use unique directory names to allow multiple IndexMergers to run simultaneously.
NUTCH-309 : Added logging code guards
NUTCH-303 : Make use of the Commons Logging API and use log4j as the default implementation
Change parameters passed to Hadoop's FileSystem from (now-deprecated) java.io.File to (new) org.apache.hadoop.fs.Path.
NUTCH-221, removed deprecated Lucene API usage
Undo unintentional changes made in r381751. Thanks, Jerome, for catching this!
Adding DOAP for Nutch. Contributed by Chris Mattmann.
NUTCH-193: MapReduce and NDFS code moved to new project, Hadoop. See bug report for details.
removed unused imports
Apply patches from NUTCH-169 (remove static NutchConf). Submitted by: Marko Bauhardt, Stefan Groschupf, Jerome Charron.
Merge mapred branch to trunk & remove it.
Moving Nutch from the Incubator to Lucene.
Add ability to set Lucene's term index interval from config.
Initial import of Nutch to Apache.