XmlCollectionWithTagInputFormat (VXQuery 0.6 API)

java.lang.Object
- org.apache.hadoop.mapreduce.InputFormat<K,V>
  - org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
    - org.apache.hadoop.mapreduce.lib.input.TextInputFormat
      - org.apache.vxquery.hdfs2.XmlCollectionWithTagInputFormat

public class XmlCollectionWithTagInputFormat
extends org.apache.hadoop.mapreduce.lib.input.TextInputFormat

Reads records that are delimited by a specific begin/end tag.
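
To illustrate what "delimited by a specific begin/end tag" means, here is a minimal, self-contained sketch in plain Java. It is not the VXQuery implementation and does not use Hadoop; the class `TagRecordSketch` and the method `extractRecords` are hypothetical names chosen for this example. It assumes the whole document fits in memory and that the start tag is matched literally, so a start tag carrying attributes would not match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the VXQuery API.
public class TagRecordSketch {

    // Returns every substring that begins at startTag and ends at the
    // next occurrence of endTag (both tags included in the record).
    public static List<String> extractRecords(String xml, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int begin = xml.indexOf(startTag, from);
            if (begin < 0) {
                break; // no further start tag
            }
            int end = xml.indexOf(endTag, begin + startTag.length());
            if (end < 0) {
                break; // unterminated record: dropped in this sketch
            }
            end += endTag.length();
            records.add(xml.substring(begin, end));
            from = end; // continue scanning after the record just emitted
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<lib><book><t>A</t></book><note/><book><t>B</t></book></lib>";
        // Prints the two <book>...</book> blocks, one per line.
        for (String record : extractRecords(xml, "<book>", "</book>")) {
            System.out.println(record);
        }
    }
}
```

Content outside the tag pair (here `<lib>` and `<note/>`) is skipped, which is the behaviour one would expect from a tag-delimited record format.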

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`static class`	`XmlCollectionWithTagInputFormat.XmlRecordReader`	Record reader that reads through a given XML document and outputs XML blocks as records, as specified by the end tag
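
As a rough picture of how a reader of this kind might walk through a document, the following sketch scans a byte stream for the start tag, then buffers bytes until the end tag is seen. It is an assumption-laden illustration, not the source of `XmlRecordReader`: the class `StreamingTagReaderSketch` and its methods are hypothetical, it reads from a plain `InputStream` rather than an HDFS split, and it assumes UTF-8 input.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the VXQuery XmlRecordReader.
public class StreamingTagReaderSketch {
    private final InputStream in;
    private final byte[] startTag;
    private final byte[] endTag;

    public StreamingTagReaderSketch(InputStream in, String startTag, String endTag) {
        this.in = in;
        this.startTag = startTag.getBytes(StandardCharsets.UTF_8);
        this.endTag = endTag.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the next start-tag..end-tag block, or null when the stream
    // holds no further complete record.
    public String nextRecord() {
        try {
            if (!readUntilMatch(startTag, null)) {
                return null;
            }
            ByteArrayOutputStream record = new ByteArrayOutputStream();
            record.write(startTag, 0, startTag.length);
            if (!readUntilMatch(endTag, record)) {
                return null; // end of input before the end tag
            }
            return new String(record.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Consumes bytes until tag has been seen; copies them to sink if given.
    // The simple restart on mismatch is exact only when the tag's first byte
    // does not recur inside the tag, which holds for literals such as
    // "<book>" or "</book>" where '<' appears only at position 0.
    private boolean readUntilMatch(byte[] tag, ByteArrayOutputStream sink) throws IOException {
        int matched = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (sink != null) {
                sink.write(b);
            }
            if (b == (tag[matched] & 0xFF)) {
                matched++;
                if (matched == tag.length) {
                    return true;
                }
            } else {
                matched = (b == (tag[0] & 0xFF)) ? 1 : 0;
            }
        }
        return false;
    }
}
```

A caller would wrap the input in a stream, for example `new StreamingTagReaderSketch(new java.io.ByteArrayInputStream(bytes), "<book>", "</book>")`, and call `nextRecord()` until it returns `null`. Reading byte by byte keeps memory use bounded by the largest single record rather than by the file size.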

Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter

Field Summary

Fields
Modifier and Type	Field	Description
`static String`	`ENDING_TAG`
`static String`	`STARTING_TAG`

Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE

Constructor Summary

Constructors
Constructor	Description
`XmlCollectionWithTagInputFormat()`

Method Summary

Methods
Modifier and Type	Method	Description
`org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>`	`createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)`

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.TextInputFormat
isSplitable

Field Detail

STARTING_TAG
```
public static String STARTING_TAG
```

ENDING_TAG
```
public static String ENDING_TAG
```

Constructor Detail

XmlCollectionWithTagInputFormat
```
public XmlCollectionWithTagInputFormat()
```

Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
        org.apache.hadoop.mapreduce.TaskAttemptContext context)

Overrides: createRecordReader in class org.apache.hadoop.mapreduce.lib.input.TextInputFormat
