org.apache.nutch.crawl
Class Generator.Selector

java.lang.Object
  extended by org.apache.nutch.crawl.Generator.Selector
All Implemented Interfaces:
Closeable, JobConfigurable, Mapper<Text,CrawlDatum,FloatWritable,Generator.SelectorEntry>, Partitioner<FloatWritable,Writable>, Reducer<FloatWritable,Generator.SelectorEntry,FloatWritable,Generator.SelectorEntry>
Enclosing class:
Generator

public static class Generator.Selector
extends Object
implements Mapper<Text,CrawlDatum,FloatWritable,Generator.SelectorEntry>, Partitioner<FloatWritable,Writable>, Reducer<FloatWritable,Generator.SelectorEntry,FloatWritable,Generator.SelectorEntry>

Selects entries due for fetch.


Constructor Summary
Generator.Selector()
           
 
Method Summary
 void close()
           
 void configure(JobConf job)
           
 int getPartition(FloatWritable key, Writable value, int numReduceTasks)
          Partitions by host, domain, or IP.
 void map(Text key, CrawlDatum value, OutputCollector<FloatWritable,Generator.SelectorEntry> output, Reporter reporter)
          Selects and inverts the subset due for fetch.
 void reduce(FloatWritable key, Iterator<Generator.SelectorEntry> values, OutputCollector<FloatWritable,Generator.SelectorEntry> output, Reporter reporter)
          Collects entries until the limit is reached.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Generator.Selector

public Generator.Selector()
Method Detail

configure

public void configure(JobConf job)
Specified by:
configure in interface JobConfigurable

close

public void close()
Specified by:
close in interface Closeable

map

public void map(Text key,
                CrawlDatum value,
                OutputCollector<FloatWritable,Generator.SelectorEntry> output,
                Reporter reporter)
         throws IOException
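Selects and inverts the subset due for fetch. The input is keyed by URL (Text); the output is keyed by a FloatWritable, with each selected entry carried in the Generator.SelectorEntry value.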

Specified by:
map in interface Mapper<Text,CrawlDatum,FloatWritable,Generator.SelectorEntry>
Throws:
IOException
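
The select-and-invert step can be illustrated without Hadoop or Nutch on the classpath. The following is a minimal sketch, not the actual implementation: Datum, Entry, Scored, and selectAndInvert are hypothetical stand-ins, and the "due" test (fetch time not later than the current time) is an assumption about how selection might be decided.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SelectInvertSketch {
    /** Hypothetical stand-in for a crawl record: next fetch time and a score. */
    public static final class Datum {
        public final long fetchTime;
        public final float score;
        public Datum(long fetchTime, float score) {
            this.fetchTime = fetchTime;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for a selector entry: the URL moves into the value. */
    public static final class Entry {
        public final String url;
        public final Datum datum;
        public Entry(String url, Datum datum) {
            this.url = url;
            this.datum = datum;
        }
    }

    /** Inverted output pair: the float score is the key, the entry is the value. */
    public static final class Scored {
        public final float key;
        public final Entry value;
        public Scored(float key, Entry value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Keeps only records assumed due (fetchTime <= now) and re-keys them by score. */
    public static List<Scored> selectAndInvert(Map<String, Datum> input, long now) {
        List<Scored> out = new ArrayList<>();
        for (Map.Entry<String, Datum> e : input.entrySet()) {
            Datum d = e.getValue();
            if (d.fetchTime > now) {
                continue; // not yet due: skipped, nothing is emitted
            }
            out.add(new Scored(d.score, new Entry(e.getKey(), d)));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Datum> input = new LinkedHashMap<>();
        input.put("http://a.example/", new Datum(100L, 0.5f));
        input.put("http://b.example/", new Datum(300L, 0.9f));
        for (Scored s : selectAndInvert(input, 200L)) {
            System.out.println(s.key + " -> " + s.value.url);
        }
    }
}
```

Keying the output by a float rather than by URL is what allows the downstream sort to order candidates by that value.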

getPartition

public int getPartition(FloatWritable key,
                        Writable value,
                        int numReduceTasks)
Partitions by host, domain, or IP.

Specified by:
getPartition in interface Partitioner<FloatWritable,Writable>
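
Partitioning by host can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the class's code: partitionFor is a hypothetical helper, only the host case is shown (domain and IP modes are omitted), and the hash-modulo scheme is an assumed, commonly used way to map a key to a reduce task.

```java
import java.net.URI;
import java.util.Locale;

public class HostPartitionSketch {
    /**
     * Maps a URL to a partition in [0, numReduceTasks) based on its host,
     * so that all URLs of one host land in the same partition.
     * URI.create throws IllegalArgumentException for malformed input.
     */
    public static int partitionFor(String url, int numReduceTasks) {
        String host = URI.create(url).getHost();
        // Fall back to the whole URL when no host can be extracted.
        String key = (host != null) ? host.toLowerCase(Locale.ROOT) : url;
        // Mask the sign bit so the modulo result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        System.out.println(partitionFor("http://example.org/a", reducers));
        System.out.println(partitionFor("http://example.org/b", reducers));
    }
}
```

Grouping every URL of a host into one partition lets a single reduce task see all of that host's candidates together.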

reduce

public void reduce(FloatWritable key,
                   Iterator<Generator.SelectorEntry> values,
                   OutputCollector<FloatWritable,Generator.SelectorEntry> output,
                   Reporter reporter)
            throws IOException
Collects entries until the limit is reached.

Specified by:
reduce in interface Reducer<FloatWritable,Generator.SelectorEntry,FloatWritable,Generator.SelectorEntry>
Throws:
IOException
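
The collect-until-limit pattern can be sketched generically. The helper below is hypothetical: collectUntilLimit and its limit parameter are illustrative names, and how the real class derives its limit is not described on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LimitCollectSketch {
    /** Emits values in iteration order and stops once 'limit' values were collected. */
    public static <T> List<T> collectUntilLimit(Iterator<T> values, long limit) {
        List<T> out = new ArrayList<>();
        while (out.size() < limit && values.hasNext()) {
            out.add(values.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        System.out.println(collectUntilLimit(urls.iterator(), 3));
    }
}
```

Because the limit is checked before each value is taken, values beyond the limit are never consumed from the iterator.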


Copyright © 2012 The Apache Software Foundation