org.apache.nutch.tools.arc
Class ArcInputFormat
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<Text,BytesWritable>
org.apache.nutch.tools.arc.ArcInputFormat
- All Implemented Interfaces:
- InputFormat<Text,BytesWritable>
public class ArcInputFormat
- extends FileInputFormat<Text,BytesWritable>
An input format that reads arc files.
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ArcInputFormat
public ArcInputFormat()
getRecordReader
public RecordReader<Text,BytesWritable> getRecordReader(InputSplit split,
JobConf job,
Reporter reporter)
throws IOException
- Returns the RecordReader for reading the arc file.
- Specified by:
getRecordReader in interface InputFormat<Text,BytesWritable>
- Specified by:
getRecordReader in class FileInputFormat<Text,BytesWritable>
- Parameters:
split - The InputSplit of the arc file to process.
job - The job configuration.
reporter - The progress reporter.
- Throws:
IOException
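- Example:
The following is a minimal sketch, not taken from the Nutch sources, of how this input format might be plugged into a job that uses the org.apache.hadoop.mapred API. The class names ArcJobSketch and RecordSizeMapper, the job name, and the input and output paths are hypothetical. This page does not describe what the Text key and BytesWritable value contain, so the mapper only records the byte length of each value. It assumes the Hadoop and Nutch jars are on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.nutch.tools.arc.ArcInputFormat;

/** Hypothetical driver: a map-only job that reads arc files. */
public class ArcJobSketch {

  /** Hypothetical mapper: emits each record's key with the length of its value. */
  public static class RecordSizeMapper extends MapReduceBase
      implements Mapper<Text, BytesWritable, Text, LongWritable> {

    @Override
    public void map(Text key, BytesWritable value,
        OutputCollector<Text, LongWritable> output, Reporter reporter)
        throws IOException {
      // getLength() is the number of valid bytes in the value buffer.
      output.collect(key, new LongWritable(value.getLength()));
    }
  }

  /** Builds the job configuration without submitting the job. */
  public static JobConf configure(Path input, Path output) {
    JobConf job = new JobConf(ArcJobSketch.class);
    job.setJobName("arc-record-sizes"); // hypothetical job name

    FileInputFormat.setInputPaths(job, input);
    FileOutputFormat.setOutputPath(job, output);

    // The framework calls ArcInputFormat.getRecordReader once per input split.
    job.setInputFormat(ArcInputFormat.class);

    job.setMapperClass(RecordSizeMapper.class);
    job.setNumReduceTasks(0); // map-only: mapper output is written directly
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    return job;
  }

  public static void main(String[] args) throws IOException {
    // args[0]: directory of arc files, args[1]: output directory (both assumed)
    JobClient.runJob(configure(new Path(args[0]), new Path(args[1])));
  }
}
```

Application code normally does not call getRecordReader directly. Registering the class with JobConf.setInputFormat is enough, and the split computation comes from the inherited FileInputFormat methods listed above.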
Copyright © 2011 The Apache Software Foundation