org.apache.nutch.crawl
Class URLPartitioner
java.lang.Object
org.apache.nutch.crawl.URLPartitioner
- All Implemented Interfaces:
- JobConfigurable, Partitioner<Text,Writable>
public class URLPartitioner
- extends Object
- implements Partitioner<Text,Writable>
Partitions URLs by host, domain name, or IP address, depending on the value of the
parameter 'partition.url.mode', which can be 'byHost', 'byDomain', or 'byIP'.
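The three modes can be pictured as three ways of choosing the string that URLs are grouped by. The following is a minimal sketch of that idea, not the Nutch implementation: the class `PartitionKeySketch` and its methods are hypothetical, and the domain rule (last two dot-separated labels) is a deliberately naive assumption that mishandles suffixes such as `co.uk`.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.Locale;

/** Illustrative only; hypothetical helper, not part of Nutch. */
final class PartitionKeySketch {

    /** Returns the string that URLs would be grouped by under the given mode. */
    static String keyFor(String url, String mode) {
        String host = URI.create(url).getHost();
        if (host == null) {
            // Assumption: with no host component, fall back to the whole URL.
            return url;
        }
        host = host.toLowerCase(Locale.ROOT);
        switch (mode) {
            case "byHost":
                return host;
            case "byDomain":
                return naiveDomain(host);
            case "byIP":
                try {
                    // Needs a DNS lookup unless the host is already an IP literal.
                    return InetAddress.getByName(host).getHostAddress();
                } catch (UnknownHostException e) {
                    throw new IllegalStateException("Cannot resolve " + host, e);
                }
            default:
                throw new IllegalArgumentException("Unknown partition mode: " + mode);
        }
    }

    /** Naive assumption: the domain is the last two dot-separated labels. */
    static String naiveDomain(String host) {
        String[] labels = host.split("\\.");
        int n = labels.length;
        return n <= 2 ? host : labels[n - 2] + "." + labels[n - 1];
    }

    public static void main(String[] args) {
        System.out.println(keyFor("http://www.example.com/a.html", "byHost"));
        System.out.println(keyFor("http://www.example.com/a.html", "byDomain"));
    }
}
```

Under 'byHost', `www.example.com` and `news.example.com` would land in different groups; under 'byDomain' they would share one. 'byIP' is the only mode that depends on name resolution at partition time.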
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PARTITION_MODE_KEY
public static final String PARTITION_MODE_KEY
- See Also:
- Constant Field Values
PARTITION_MODE_HOST
public static final String PARTITION_MODE_HOST
- See Also:
- Constant Field Values
PARTITION_MODE_DOMAIN
public static final String PARTITION_MODE_DOMAIN
- See Also:
- Constant Field Values
PARTITION_MODE_IP
public static final String PARTITION_MODE_IP
- See Also:
- Constant Field Values
URLPartitioner
public URLPartitioner()
configure
public void configure(JobConf job)
- Specified by:
configure
in interface JobConfigurable
close
public void close()
getPartition
public int getPartition(Text key,
Writable value,
int numReduceTasks)
- Hash by host, domain name, or IP address, depending on the configured partition mode.
- Specified by:
getPartition
in interface Partitioner<Text,Writable>
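To illustrate how a hashed key can be mapped to one of `numReduceTasks` partitions, here is a minimal sketch. It uses the sign-masking idiom common in hash partitioners; whether `URLPartitioner` computes its result exactly this way is not stated on this page, so treat the class `HashPartitionSketch` and its method as hypothetical.

```java
/** Illustrative only; hypothetical helper, not part of Nutch. */
final class HashPartitionSketch {

    /**
     * Maps a grouping key (for example a host name) to a partition index
     * in the range [0, numReduceTasks).
     */
    static int partitionFor(String key, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            throw new IllegalArgumentException("numReduceTasks must be positive");
        }
        // Clear the sign bit so the remainder is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        // Equal keys always map to the same partition.
        System.out.println(partitionFor("www.example.com", reducers)
                == partitionFor("www.example.com", reducers));
    }
}
```

Because the result depends only on the key, all URLs that share a host, domain, or IP are sent to the same reduce task.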
Copyright © 2011 The Apache Software Foundation