org.apache.nutch.indexer.field
Class AnchorFields.Collector

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.indexer.field.AnchorFields.Collector
All Implemented Interfaces:
Closeable, Configurable, JobConfigurable, Mapper<Text,Writable,Text,ObjectWritable>, Reducer<Text,ObjectWritable,Text,FieldWritable>
Enclosing class:
AnchorFields

public static class AnchorFields.Collector
extends Configured
implements Mapper<Text,Writable,Text,ObjectWritable>, Reducer<Text,ObjectWritable,Text,FieldWritable>

Collects inlinks and creates FieldWritable objects from them. Inlinks are sorted by descending score before being collected.


Constructor Summary
AnchorFields.Collector()
           
 
Method Summary
 void close()
           
 void configure(JobConf conf)
          Configures the job.
 void map(Text key, Writable value, OutputCollector<Text,ObjectWritable> output, Reporter reporter)
          Wraps values in ObjectWritable.
 void reduce(Text key, Iterator<ObjectWritable> values, OutputCollector<Text,FieldWritable> output, Reporter reporter)
          Aggregates and sorts inlinks.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AnchorFields.Collector

public AnchorFields.Collector()

Method Detail

configure

public void configure(JobConf conf)
Configures the job. Sets the maximum number of inlinks to collect and whether the resulting fields are tokenized and stored.

Specified by:
configure in interface JobConfigurable
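
The three settings described above (an inlink limit plus tokenize and store flags) can be pictured with a minimal sketch. It deliberately uses java.util.Properties as a stand-in for JobConf so that it runs without Hadoop on the classpath. The class name, property keys, and default values are all hypothetical and are not taken from the Nutch source.

```java
import java.util.Properties;

/**
 * Illustrative holder for the settings that configure() is described as
 * reading. Not a Nutch class; keys and defaults below are assumptions.
 */
final class CollectorSettingsSketch {
    // Hypothetical property keys, chosen only for this example.
    static final String MAX_INLINKS_KEY = "example.anchor.max.inlinks";
    static final String TOKENIZE_KEY = "example.anchor.tokenize";
    static final String STORED_KEY = "example.anchor.stored";

    final int maxInlinks;
    final boolean tokenize;
    final boolean stored;

    private CollectorSettingsSketch(int maxInlinks, boolean tokenize, boolean stored) {
        this.maxInlinks = maxInlinks;
        this.tokenize = tokenize;
        this.stored = stored;
    }

    /** Reads each setting, falling back to an assumed default when the key is absent. */
    static CollectorSettingsSketch from(Properties props) {
        int max = Integer.parseInt(props.getProperty(MAX_INLINKS_KEY, "1000"));
        boolean tokenize = Boolean.parseBoolean(props.getProperty(TOKENIZE_KEY, "true"));
        boolean stored = Boolean.parseBoolean(props.getProperty(STORED_KEY, "false"));
        return new CollectorSettingsSketch(max, tokenize, stored);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(MAX_INLINKS_KEY, "50");
        CollectorSettingsSketch s = from(props);
        System.out.println(s.maxInlinks + " " + s.tokenize + " " + s.stored);
    }
}
```

Reading the values once, when the job is configured, keeps the per-record map and reduce calls free of configuration lookups.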

close

public void close()
Specified by:
close in interface Closeable

map

public void map(Text key,
                Writable value,
                OutputCollector<Text,ObjectWritable> output,
                Reporter reporter)
         throws IOException
Wraps values in ObjectWritable.

Specified by:
map in interface Mapper<Text,Writable,Text,ObjectWritable>
Throws:
IOException

reduce

public void reduce(Text key,
                   Iterator<ObjectWritable> values,
                   OutputCollector<Text,FieldWritable> output,
                   Reporter reporter)
            throws IOException
Aggregates and sorts the inlinks, then converts them to FieldWritable objects, up to the configured maximum number.

Specified by:
reduce in interface Reducer<Text,ObjectWritable,Text,FieldWritable>
Throws:
IOException
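
The sort-then-truncate step described for reduce can be sketched in plain Java. This is only an illustration of the technique under stated assumptions, not the Nutch implementation: ScoredAnchor and TopInlinksSketch are hypothetical stand-ins, and the real method works on ObjectWritable values and emits FieldWritable objects through an OutputCollector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an inlink with a score; not a Nutch class. */
final class ScoredAnchor {
    final String anchorText;
    final float score;

    ScoredAnchor(String anchorText, float score) {
        this.anchorText = anchorText;
        this.score = score;
    }
}

final class TopInlinksSketch {
    /** Returns at most maxInlinks anchors, ordered by descending score. */
    static List<ScoredAnchor> topByScore(List<ScoredAnchor> inlinks, int maxInlinks) {
        // Copy first so the caller's list is left untouched.
        List<ScoredAnchor> sorted = new ArrayList<>(inlinks);
        // Highest score first; List.sort is stable, so ties keep input order.
        sorted.sort(Comparator.comparingDouble((ScoredAnchor a) -> a.score).reversed());
        // Clamp the limit to the range [0, sorted.size()].
        int limit = Math.min(Math.max(maxInlinks, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }

    public static void main(String[] args) {
        List<ScoredAnchor> inlinks = new ArrayList<>();
        inlinks.add(new ScoredAnchor("home", 0.2f));
        inlinks.add(new ScoredAnchor("apache nutch", 0.9f));
        inlinks.add(new ScoredAnchor("crawler", 0.5f));
        for (ScoredAnchor a : topByScore(inlinks, 2)) {
            System.out.println(a.anchorText + " " + a.score);
        }
    }
}
```

Sorting before truncating means that when a page has more inlinks than the limit, the ones dropped are those with the lowest scores.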


Copyright © 2006 The Apache Software Foundation