Interface | Description |
---|---|
IndexingFilter | Extension point for indexing. |
IndexWriter | |

Class | Description |
---|---|
CleaningJob | The class scans CrawlDB looking for entries with status DB_GONE (404) or DB_DUPLICATE and sends delete requests to indexers for those documents. |
CleaningJob.DBFilter | |
CleaningJob.DeleterReducer | |
IndexerMapReduce | |
IndexerOutputFormat | |
IndexingFilters | Creates and caches IndexingFilter implementing plugins. |
IndexingFiltersChecker | Reads and parses a URL and runs the indexers on it. |
IndexingJob | Generic indexer which relies on the plugins implementing IndexWriter. |
IndexWriters | Creates and caches IndexWriter implementing plugins. |
NutchDocument | A NutchDocument is the unit of indexing. |
NutchField | This class represents a multi-valued field with a weight. |
NutchIndexAction | A NutchIndexAction is the new unit of indexing holding the document and action information. |
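
The NutchField entry in the class table describes a multi-valued field with a weight. The following is a minimal sketch of that idea only, assuming a hypothetical `WeightedField` class; it is not the NutchField source and does not reproduce its actual method names or signatures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is NOT the NutchField implementation.
public class WeightedFieldSketch {

    /** A field that holds several values and a single weight shared by all of them. */
    static final class WeightedField {
        private final List<Object> values = new ArrayList<>();
        private float weight;

        WeightedField(float weight) {
            this.weight = weight;
        }

        /** Appends one more value to the field. */
        void add(Object value) {
            values.add(value);
        }

        /** Returns a read-only view of the values, in insertion order. */
        List<Object> getValues() {
            return Collections.unmodifiableList(values);
        }

        float getWeight() {
            return weight;
        }

        void setWeight(float weight) {
            this.weight = weight;
        }
    }

    public static void main(String[] args) {
        // One field, two values, one weight applied to the field as a whole.
        WeightedField anchors = new WeightedField(1.0f);
        anchors.add("Apache Nutch");
        anchors.add("web crawler");
        anchors.setWeight(2.5f);
        System.out.println(anchors.getValues().size() + " values, weight " + anchors.getWeight());
    }
}
```

The weight is kept per field rather than per value, matching the one-line description in the table; whether the real class exposes it the same way should be checked against the NutchField Javadoc.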

Exception | Description |
---|---|
IndexingException | |

Copyright © 2014 The Apache Software Foundation