Log of /lucene/nutch/trunk/src/java/org/apache/nutch/indexer/DeleteDuplicates.java

Revision 823614
Modified Fri Oct 9 17:02:32 2009 UTC (6 weeks, 4 days ago) by ab
File length: 17232 byte(s)
Diff to previous 722475 (colored)
NUTCH-758 Set subversion eol-style to "native".

Revision 722475
Modified Tue Dec 2 14:41:09 2008 UTC (11 months, 3 weeks ago) by kubes
File length: 17232 byte(s)
Diff to previous 678533 (colored)
NUTCH-662: Upgrade Nutch to use Lucene 2.4

Revision 678533
Modified Mon Jul 21 19:20:21 2008 UTC (16 months ago) by ab
File length: 17185 byte(s)
Diff to previous 669300 (colored)
NUTCH-634 Upgrade Nutch to Hadoop 0.17.1.

Revision 669300
Modified Wed Jun 18 21:34:17 2008 UTC (17 months, 1 week ago) by ab
File length: 16929 byte(s)
Diff to previous 638779 (colored)
Avoid NPE when processing empty / corrupted indexes.

Revision 638779
Modified Wed Mar 19 10:34:14 2008 UTC (20 months, 1 week ago) by ab
File length: 16904 byte(s)
Diff to previous 593263 (colored)
NUTCH-598 - Remove deprecated use of ToolBase. Use generics in Hadoop API.

Revision 593263
Modified Thu Nov 8 19:13:37 2007 UTC (2 years ago) by dogacan
File length: 16694 byte(s)
Diff to previous 591791 (colored)
NUTCH-494 - FindBugs: CrawlDbReader and DeleteDuplicates.

Revision 591791
Modified Sun Nov 4 15:38:35 2007 UTC (2 years ago) by kubes
File length: 16694 byte(s)
Diff to previous 559754 (colored)
NUTCH-552 - Upgrade Nutch to Hadoop 0.15.x.

Revision 559754
Modified Thu Jul 26 08:44:33 2007 UTC (2 years, 4 months ago) by dogacan
File length: 16684 byte(s)
Diff to previous 532105 (colored)
NUTCH-525 - DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment. Contributed by Vishal Shah.

Revision 532105
Modified Tue Apr 24 22:13:53 2007 UTC (2 years, 7 months ago) by ab
File length: 16684 byte(s)
Diff to previous 495397 (colored)
Prevent NPE when working with small, possibly empty indexes.

Revision 495397
Modified Thu Jan 11 22:00:51 2007 UTC (2 years, 10 months ago) by ab
File length: 16376 byte(s)
Diff to previous 495392 (colored)
Fix NUTCH-420 - DeleteDuplicates depended on the order of IndexDoc
processing.

Revision 495392
Modified Thu Jan 11 21:51:20 2007 UTC (2 years, 10 months ago) by ab
File length: 16359 byte(s)
Diff to previous 473936 (colored)
Upgrade to Hadoop 0.10.1. HTTPClient is now a dependency - move it
to lib/ and remove it as a plugin.

Also add native Linux libraries for Hadoop compression, plus corresponding
logic in bin/nutch.

Hadoop uses larger buffers now - explicitly set large heap size for
JUnit tests. All tests should pass now.

Revision 473936
Modified Sun Nov 12 11:37:02 2006 UTC (3 years ago) by siren
File length: 16309 byte(s)
Diff to previous 464654 (colored)
NUTCH-400 update headers

Revision 464654
Added Mon Oct 16 20:38:57 2006 UTC (3 years, 1 month ago) by ab
File length: 16121 byte(s)
NUTCH-383: upgrade to Hadoop 0.7.1 and Lucene 2.0.0.

NUTCH-373: replace DeleteDuplicates with a version that implements both
parts of the algorithm. Add JUnit test.
