org.apache.nutch.indexer.feed
Class FeedIndexingFilter
java.lang.Object
org.apache.nutch.indexer.feed.FeedIndexingFilter
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, IndexingFilter, FieldPluggable, Pluggable
public class FeedIndexingFilter
- extends Object
- implements IndexingFilter
- Since:
- NUTCH-444
An IndexingFilter implementation that pulls the relevant extracted Metadata fields out of RSS feeds and into the index.
- Author:
- dogacan, mattmann
Method Summary
NutchDocument | filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index.
org.apache.hadoop.conf.Configuration | getConf()
void | setConf(org.apache.hadoop.conf.Configuration conf)
Sets the Configuration object used to configure this IndexingFilter.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
dateFormatStr
public static final String dateFormatStr
- See Also:
- Constant Field Values
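A date-format constant like this is typically passed to java.text.SimpleDateFormat. The following is a minimal sketch of that use, not the filter's actual code. The pattern string, the class name FeedDateFormatSketch, and the choice of UTC are all assumptions made for illustration; the real value of dateFormatStr is the one listed under Constant Field Values.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not part of Nutch.
class FeedDateFormatSketch {

    // Assumed pattern, for illustration only. The real value of
    // FeedIndexingFilter.dateFormatStr is listed under Constant Field Values.
    static final String ASSUMED_DATE_FORMAT = "yyyyMMddHHmm";

    // Formats a timestamp (milliseconds since the epoch) with the assumed pattern.
    // UTC and Locale.ROOT are fixed here so the result does not depend on the host.
    static String format(long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat(ASSUMED_DATE_FORMAT, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(format(0L));
    }
}
```

SimpleDateFormat is not thread-safe, so the sketch creates a new instance per call instead of sharing one.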
FeedIndexingFilter
public FeedIndexingFilter()
filter
public NutchDocument filter(NutchDocument doc,
Parse parse,
org.apache.hadoop.io.Text url,
CrawlDatum datum,
Inlinks inlinks)
throws IndexingException
- Extracts the relevant fields:
- FEED_AUTHOR
- FEED_TAGS
- FEED_PUBLISHED
- FEED_UPDATED
- FEED
and sends them to the Indexer for indexing within the Nutch index.
- Throws:
IndexingException
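The copy step described for filter can be sketched without Nutch on the classpath. In this sketch plain maps stand in for NutchDocument and for the parse Metadata, and the keys are placeholders named after the constants listed above. The real key strings, the class name FeedFieldCopySketch, and the multi-valued handling are assumptions, not the filter's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: plain maps replace NutchDocument and the parse Metadata.
class FeedFieldCopySketch {

    // Placeholder keys named after the listed constants; the real string
    // values are defined by Nutch and are not reproduced here.
    static final List<String> FEED_FIELDS = Arrays.asList(
            "FEED_AUTHOR", "FEED_TAGS", "FEED_PUBLISHED", "FEED_UPDATED", "FEED");

    // Copies every value of each feed field from the parse metadata into the
    // document and returns the same document instance.
    static Map<String, List<String>> filter(Map<String, List<String>> doc,
                                            Map<String, List<String>> parseMeta) {
        for (String field : FEED_FIELDS) {
            List<String> values = parseMeta.get(field);
            if (values == null) {
                continue; // field absent from this feed entry
            }
            for (String value : values) {
                doc.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, List<String>> meta = new LinkedHashMap<>();
        meta.put("FEED_AUTHOR", Arrays.asList("alice"));
        meta.put("FEED_TAGS", Arrays.asList("search", "crawler"));
        meta.put("unrelated", Arrays.asList("ignored")); // not a feed field

        Map<String, List<String>> doc = filter(new LinkedHashMap<>(), meta);
        System.out.println(doc);
    }
}
```

Returning the document that was passed in mirrors the filter signature, which takes a NutchDocument and returns one.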
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by:
- getConf in interface org.apache.hadoop.conf.Configurable
- Returns:
- the Configuration object used to configure this IndexingFilter.
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Sets the Configuration object used to configure this IndexingFilter.
- Specified by:
- setConf in interface org.apache.hadoop.conf.Configurable
- Parameters:
- conf - The Configuration object used to configure this IndexingFilter.
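The setConf/getConf pair follows a store-and-return contract, sketched below with stand-in types. java.util.Properties replaces Hadoop's Configuration so the example compiles on its own. The interface ConfigurableSketch, the class ConfiguredFilterSketch, and the key example.key are hypothetical names, not part of Nutch or Hadoop.

```java
import java.util.Properties;

// Stand-in for org.apache.hadoop.conf.Configurable; Properties replaces
// Hadoop's Configuration so the sketch compiles without Hadoop.
interface ConfigurableSketch {
    void setConf(Properties conf);
    Properties getConf();
}

// Hypothetical filter showing the store-and-return contract.
class ConfiguredFilterSketch implements ConfigurableSketch {
    private Properties conf;

    @Override
    public void setConf(Properties conf) {
        this.conf = conf; // keep the reference handed in by the caller
    }

    @Override
    public Properties getConf() {
        return conf; // null until setConf has been called
    }

    public static void main(String[] args) {
        ConfiguredFilterSketch filter = new ConfiguredFilterSketch();
        Properties conf = new Properties();
        conf.setProperty("example.key", "example-value"); // hypothetical key
        filter.setConf(conf);
        System.out.println(filter.getConf().getProperty("example.key"));
    }
}
```

In this sketch getConf returns null until setConf has been called, so the caller is expected to configure the filter before using it.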
Copyright © 2013 The Apache Software Foundation