Revision Log
NUTCH-758 Set subversion eol-style to "native".
NUTCH-641 IndexSorter incorrectly copies stored fields.
NUTCH-598 - Remove deprecated use of ToolBase. Use generics in Hadoop API.
NUTCH-604 Upgrade to Lucene 2.3.0.
NUTCH-536 - Reduce number of warnings in nutch core.
Upgrade to Lucene 2.2.0 and Hadoop 0.12.3.
NUTCH-400 update headers
NUTCH-383: upgrade to Hadoop 0.7.1 and Lucene 2.0.0. NUTCH-373: replace DeleteDuplicates with a version that implements both parts of the algorithm. Add JUnit test.
This patch addresses two issues:
* NUTCH-242: The code to activate URL normalization and filtering has been refactored and extracted into CrawlDbFilter and LinkDbFilter. These two concerns (normalization and filtering) have been made independent. Command-line options have been modified to reflect these changes.
* NUTCH-143: All command-line tools have been modified to return meaningful OS exit codes. At the moment this uses a modified copy of Hadoop's ToolBase, which will be removed when HADOOP-488 is fixed and Nutch upgrades to Hadoop 0.6.0.
All JUnit tests pass.
NUTCH-193: MapReduce and NDFS code moved to new project, Hadoop. See bug report for details.
Apply patches from NUTCH-169 (remove static NutchConf). Submitted by: Marko Bauhardt, Stefan Groschupf, Jerome Charron.
Add index sorter & ability to stop searching after N hits.