public class VectorizedParquetRecordReader extends ParquetRecordReaderBase implements org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Modifier and Type | Field and Description
---|---
`protected org.apache.parquet.schema.MessageType` | `fileSchema`
`static org.slf4j.Logger` | `LOG`
`protected org.apache.parquet.schema.MessageType` | `requestedSchema`
`protected long` | `totalRowCount` — The total number of rows this RecordReader will eventually read.
Fields inherited from class `ParquetRecordReaderBase`: `file`, `filtedBlocks`, `jobConf`, `projectionPusher`, `reader`, `schemaSize`, `serDeStats`, `skipTimestampConversion`
Constructor and Description
---
`VectorizedParquetRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapred.JobConf conf)`
`VectorizedParquetRecordReader(org.apache.hadoop.mapred.InputSplit oldInputSplit, org.apache.hadoop.mapred.JobConf conf)`
Modifier and Type | Method and Description
---|---
`void` | `close()`
`org.apache.hadoop.io.NullWritable` | `createKey()`
`org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch` | `createValue()`
`long` | `getPos()`
`float` | `getProgress()`
`void` | `initialize(org.apache.hadoop.mapreduce.InputSplit oldSplit, org.apache.hadoop.mapred.JobConf configuration)`
`boolean` | `next(org.apache.hadoop.io.NullWritable nullWritable, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch vectorizedRowBatch)`
Methods inherited from class `ParquetRecordReaderBase`: `getFiltedBlocks`, `getSplit`, `getStats`, `setFilter`
public static final org.slf4j.Logger LOG
protected org.apache.parquet.schema.MessageType fileSchema
protected org.apache.parquet.schema.MessageType requestedSchema
protected long totalRowCount
public VectorizedParquetRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapred.JobConf conf)
public VectorizedParquetRecordReader(org.apache.hadoop.mapred.InputSplit oldInputSplit, org.apache.hadoop.mapred.JobConf conf)
public void initialize(org.apache.hadoop.mapreduce.InputSplit oldSplit, org.apache.hadoop.mapred.JobConf configuration) throws IOException, InterruptedException
Throws: `IOException`, `InterruptedException`
public boolean next(org.apache.hadoop.io.NullWritable nullWritable, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch vectorizedRowBatch) throws IOException
Specified by: `next` in interface `org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>`
Throws: `IOException`
public org.apache.hadoop.io.NullWritable createKey()
Specified by: `createKey` in interface `org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>`
public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createValue()
Specified by: `createValue` in interface `org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>`
public long getPos() throws IOException
Specified by: `getPos` in interface `org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>`
Throws: `IOException`
public void close() throws IOException
Specified by: `close` in interface `org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>`
Throws: `IOException`
public float getProgress() throws IOException
Specified by: `getProgress` in interface `org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>`
Throws: `IOException`
Copyright © 2021 The Apache Software Foundation. All rights reserved.