org.apache.nutch.tools.arc
Class ArcInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<Text,BytesWritable>
      extended by org.apache.nutch.tools.arc.ArcInputFormat
All Implemented Interfaces:
InputFormat<Text,BytesWritable>

public class ArcInputFormat
extends FileInputFormat<Text,BytesWritable>

An input format that reads arc files.
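
The following is a minimal sketch, not taken from the Nutch sources, of how a job might be wired to read its input through this class. It assumes the Hadoop `org.apache.hadoop.mapred` API and the Nutch jar are on the classpath. The class name `ArcJobSetup`, the method `configure`, the job name, and the `arcDir` parameter are hypothetical names chosen for illustration.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.nutch.tools.arc.ArcInputFormat;

// Hypothetical helper class; not part of Nutch.
public class ArcJobSetup {

  /**
   * Builds a JobConf whose input is read through ArcInputFormat.
   *
   * @param arcDir directory (or single file) holding the arc files; an
   *               illustrative parameter, not something Nutch defines
   */
  public static JobConf configure(String arcDir) {
    JobConf job = new JobConf();
    job.setJobName("arc-read-sketch"); // arbitrary name for this sketch

    // addInputPath is one of the static methods inherited from FileInputFormat.
    FileInputFormat.addInputPath(job, new Path(arcDir));

    // Records reach the mapper as <Text, BytesWritable> pairs.
    job.setInputFormat(ArcInputFormat.class);
    return job;
  }
}
```

A mapper paired with this configuration would be declared with `Text` input keys and `BytesWritable` input values, matching the type parameters of this class. The mapper, output types, and output path still have to be set before the job can be submitted.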


Field Summary
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
LOG
 
Constructor Summary
ArcInputFormat()
           
 
Method Summary
 RecordReader<Text,BytesWritable> getRecordReader(InputSplit split, JobConf job, Reporter reporter)
          Returns the RecordReader for reading the arc file.
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ArcInputFormat

public ArcInputFormat()
Method Detail

getRecordReader

public RecordReader<Text,BytesWritable> getRecordReader(InputSplit split,
                                                        JobConf job,
                                                        Reporter reporter)
                                                 throws IOException
Returns the RecordReader for reading the arc file.

Specified by:
getRecordReader in interface InputFormat<Text,BytesWritable>
Specified by:
getRecordReader in class FileInputFormat<Text,BytesWritable>
Parameters:
split - The InputSplit of the arc file to process.
job - The job configuration.
reporter - The progress reporter.
Throws:
IOException - If an I/O error occurs while creating the reader.
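
As a rough illustration of calling this method directly, outside a running job, the sketch below builds a single split covering a whole arc file and iterates over the reader it returns. It assumes the standard `org.apache.hadoop.mapred` classes (`FileSplit`, `RecordReader`, `Reporter.NULL`). The class `ArcReadSketch` and the method `countRecords` are hypothetical, and treating the whole file as one split is a simplification made for this example.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.nutch.tools.arc.ArcInputFormat;

// Hypothetical helper class; not part of Nutch.
public class ArcReadSketch {

  /** Counts the records the reader yields for one arc file. */
  public static long countRecords(Path arcFile, JobConf job) throws IOException {
    // Build one split spanning the entire file (assumption of this sketch).
    FileSystem fs = arcFile.getFileSystem(job);
    long length = fs.getFileStatus(arcFile).getLen();
    FileSplit split = new FileSplit(arcFile, 0L, length, new String[0]);

    // Reporter.NULL discards progress updates, which is fine outside a task.
    RecordReader<Text, BytesWritable> reader =
        new ArcInputFormat().getRecordReader(split, job, Reporter.NULL);

    long records = 0;
    try {
      // The reader supplies reusable key and value objects.
      Text key = reader.createKey();
      BytesWritable value = reader.createValue();
      while (reader.next(key, value)) {
        records++;
      }
    } finally {
      reader.close();
    }
    return records;
  }
}
```

Inside a normal MapReduce job the framework makes this call itself for each split, so user code rarely needs to invoke getRecordReader by hand.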


Copyright © 2011 The Apache Software Foundation