public class VectorizedOrcAcidRowBatchReader extends Object implements org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Modifier and Type | Class and Description |
---|---|
protected static interface | VectorizedOrcAcidRowBatchReader.DeleteEventRegistry: an interface that can determine which rows have been deleted from a given vectorized row batch. |
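To make the idea of a delete-event registry concrete, here is a minimal, self-contained sketch of filtering a batch of row IDs against a set of deleted IDs. The names `DeleteFilterSketch`, `DeletedRows`, and `selectLiveRows` are hypothetical and are not part of the Hive API; the sketch only assumes that a batch can expose its surviving rows as an array of selected positions, in the spirit of the `selected` array on `VectorizedRowBatch`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in; this is NOT the Hive DeleteEventRegistry interface.
final class DeleteFilterSketch {

    /** Hypothetical registry: answers whether a delete event exists for a row ID. */
    interface DeletedRows {
        boolean isDeleted(long rowId);
    }

    /** Returns the positions of the rows that survive, in their original order. */
    static int[] selectLiveRows(long[] rowIds, DeletedRows registry) {
        int[] selected = new int[rowIds.length];
        int live = 0;
        for (int i = 0; i < rowIds.length; i++) {
            if (!registry.isDeleted(rowIds[i])) {
                selected[live++] = i; // keep the position, not the row itself
            }
        }
        return Arrays.copyOf(selected, live);
    }

    public static void main(String[] args) {
        Set<Long> deleted = new HashSet<>(Arrays.asList(11L, 13L));
        int[] live = selectLiveRows(new long[] {10, 11, 12, 13, 14}, deleted::contains);
        System.out.println(Arrays.toString(live)); // prints [0, 2, 4]
    }
}
```

Recording surviving positions instead of copying rows leaves the column data untouched, which is why a selection array is a natural fit for vectorized batches.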
Modifier and Type | Field and Description |
---|---|
protected Object[] | partitionValues |
protected float | progress |
Constructor and Description |
---|
VectorizedOrcAcidRowBatchReader(OrcSplit inputSplit, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter, org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch> baseReader, VectorizedRowBatchCtx rbCtx, boolean isFlatPayload): constructor used by LLAP IO. |
Modifier and Type | Method and Description |
---|---|
void | close() |
org.apache.hadoop.io.NullWritable | createKey() |
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch | createValue() |
long | getPos() |
float | getProgress() |
boolean | next(org.apache.hadoop.io.NullWritable key, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch value): There are 2 types of schema from the baseReader that this handles. |
void | setBaseAndInnerReader(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch> baseReader) |
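The methods above follow the usual `org.apache.hadoop.mapred.RecordReader` calling pattern: create one key and one value, reuse them across calls to `next`, and close the reader when `next` returns false. The sketch below illustrates that pattern without depending on Hadoop or Hive. `BatchReader`, `Batch`, and `CountingReader` are hypothetical stand-ins written for this example; only the method names `createKey`, `createValue`, `next`, and `close` are taken from the summary above.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ReadLoopSketch {

    /** Stand-in that mirrors a subset of the RecordReader method names. */
    interface BatchReader<K, V> extends Closeable {
        K createKey();
        V createValue();
        boolean next(K key, V value) throws IOException;
    }

    /** Hypothetical batch: a reusable buffer plus a count of valid rows. */
    static final class Batch {
        final long[] values = new long[4];
        int size;
    }

    /** Hypothetical in-memory reader serving the numbers 0..total-1 in batches. */
    static final class CountingReader implements BatchReader<Object, Batch> {
        private final long total;
        private long nextValue;
        boolean closed;

        CountingReader(long total) { this.total = total; }

        public Object createKey() { return new Object(); }

        public Batch createValue() { return new Batch(); }

        public boolean next(Object key, Batch value) {
            value.size = 0; // the caller's batch object is refilled in place
            while (value.size < value.values.length && nextValue < total) {
                value.values[value.size++] = nextValue++;
            }
            return value.size > 0; // false once no rows are left
        }

        public void close() { closed = true; }
    }

    /** Drains the reader and returns the number of rows seen; always closes it. */
    static <K> long countRows(BatchReader<K, Batch> reader) {
        long rows = 0;
        try (BatchReader<K, Batch> r = reader) {
            K key = r.createKey();
            Batch value = r.createValue();
            while (r.next(key, value)) {
                rows += value.size;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(countRows(new CountingReader(10))); // prints 10
    }
}
```

Reusing a single value object across calls avoids allocating a new batch per call, so a caller that needs to retain rows must copy them out before the next call to `next`.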
protected float progress
protected Object[] partitionValues
public VectorizedOrcAcidRowBatchReader(OrcSplit inputSplit, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter, org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch> baseReader, VectorizedRowBatchCtx rbCtx, boolean isFlatPayload) throws IOException
Throws: IOException
public void setBaseAndInnerReader(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch> baseReader)
public boolean next(org.apache.hadoop.io.NullWritable key, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch value) throws IOException
There are 2 types of schema from the baseReader that this handles. In the case the data was written to a transactional table from the start, every row is decorated with transaction-related info. Otherwise, the rows are original rows that carry only user data, and the column values that make up the RecordIdentifier are assigned each time the table is read in a way that needs to project VirtualColumn.ROWID.
Major compaction will attach these values to each row permanently. It's critical that these generated column values are assigned exactly the same way by each read of the same row and by the Compactor. See CompactorMR and OrcRawRecordMerger.OriginalReaderPairToCompact for the Compactor read path. (Longer term, the Compactor should use this class.)
This only decorates original rows with metadata if something above is requesting these values or if there are delete events to apply.
Specified by: next in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Returns: false when there is no more data, that is, when value is empty
Throws: IOException
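The requirement that generated values be assigned the same way on every read can be illustrated with a small sketch. It assumes, purely for illustration, that a generated row ID depends only on the row's position, so no state is shared between reads. `SyntheticRowIdSketch` and `assignRowIds` are hypothetical names, and this is not a description of the scheme Hive actually uses.

```java
import java.util.Arrays;

final class SyntheticRowIdSketch {

    /**
     * Assigns IDs from position alone: the ID of a row is the number of rows
     * that precede it. Nothing here depends on when or how often a read occurs.
     */
    static long[] assignRowIds(long rowsBeforeBatch, int batchSize) {
        long[] ids = new long[batchSize];
        for (int i = 0; i < batchSize; i++) {
            ids[i] = rowsBeforeBatch + i;
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two independent reads of the same three rows yield identical IDs.
        long[] firstRead = assignRowIds(1024, 3);
        long[] secondRead = assignRowIds(1024, 3);
        System.out.println(Arrays.toString(firstRead));           // prints [1024, 1025, 1026]
        System.out.println(Arrays.equals(firstRead, secondRead)); // prints true
    }
}
```

Because the function is pure, a reader and a compactor that agree on row positions will agree on the IDs, which is the property the description above calls critical.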
public org.apache.hadoop.io.NullWritable createKey()
Specified by: createKey in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createValue()
Specified by: createValue in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
public long getPos() throws IOException
Specified by: getPos in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Throws: IOException
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Specified by: close in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Throws: IOException
public float getProgress() throws IOException
Specified by: getProgress in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Throws: IOException
Copyright © 2022 The Apache Software Foundation. All rights reserved.