Revision Log
NUTCH-758 Set subversion eol-style to "native".
NUTCH-662: Upgrade Nutch to use Lucene 2.4
NUTCH-634 Upgrade Nutch to Hadoop 0.17.1.
NUTCH-580 Remove deprecated hadoop api calls (FS)
NUTCH-552 - Upgrade Nutch to Hadoop 0.15.x.
Upgrade to Lucene 2.2.0 and Hadoop 0.12.3.
NUTCH-400 Update headers.
Change parameters passed to Hadoop's FileSystem from (now-deprecated) java.io.File to (new) org.apache.hadoop.fs.Path.
NUTCH-193: MapReduce and NDFS code moved to new project, Hadoop. See bug report for details.
Apply patches from NUTCH-169 (remove static NutchConf). Submitted by: Marko Bauhardt, Stefan Groschupf, Jerome Charron.
Merge mapred branch to trunk & remove it.
Store checksums for all files written and verify them on read. CRCs are stored for every 512 bytes of data, so that randomly accessed data may be verified. Errors are reported to the filesystem implementation. Local file errors cause files to be moved to a bad file directory, so that bad disk areas are not reused. NDFS file errors should cause blocks to be moved to a bad block directory on the datanode, forcing the use of replicas of the bad blocks with no loss of data. This is not yet implemented for NDFS.
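The per-chunk scheme described in the entry above can be illustrated with a minimal sketch. It is not the code from this commit: the class name ChunkedChecksum, the method names, and the choice of java.util.zip.CRC32 are assumptions made for illustration. Only the 512-byte chunk size comes from the log message. The sketch shows why one CRC per chunk lets a random-access read verify just the chunks it touches.

```java
import java.util.zip.CRC32;

// Hypothetical illustration only; names are not taken from the Nutch source.
public class ChunkedChecksum {
    // One CRC per 512 bytes of data, as stated in the log message.
    public static final int BYTES_PER_SUM = 512;

    // Compute one CRC32 value per 512-byte chunk (the last chunk may be shorter).
    public static long[] checksums(byte[] data) {
        int chunks = (data.length + BYTES_PER_SUM - 1) / BYTES_PER_SUM;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_SUM;
            int len = Math.min(BYTES_PER_SUM, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Verify only the chunks overlapping [pos, pos + len), as a random-access read would.
    public static boolean verifyRange(byte[] data, long[] sums, int pos, int len) {
        if (len <= 0) {
            return true; // nothing was read, so nothing to verify
        }
        int first = pos / BYTES_PER_SUM;
        int last = (pos + len - 1) / BYTES_PER_SUM;
        CRC32 crc = new CRC32();
        for (int i = first; i <= last; i++) {
            int off = i * BYTES_PER_SUM;
            int n = Math.min(BYTES_PER_SUM, data.length - off);
            crc.reset();
            crc.update(data, off, n);
            if (crc.getValue() != sums[i]) {
                return false; // a real filesystem would report the error here
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1300];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        long[] sums = checksums(data);
        System.out.println(verifyRange(data, sums, 600, 100));
        data[700] ^= 1; // simulate a corrupted byte inside the second chunk
        System.out.println(verifyRange(data, sums, 600, 100));
    }
}
```

In this sketch a read of 100 bytes at offset 600 recomputes the CRC of the second chunk only, so the cost of verification depends on the size of the read rather than the size of the file.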
First working version of MapReduce-based dedup.
Get search working on NDFS-resident, MapReduce-created crawl.
New class to permit reading of Lucene indexes stored in NDFS. Writing is not yet supported, since Lucene (in only one place!) requires random access when writing indexes, and NDFS does not support random access when writing files.