Interface | Description |
---|---|
AcidInputFormat<KEY extends org.apache.hadoop.io.WritableComparable,VALUE> | The interface required for input formats that want to support ACID transactions. |
AcidInputFormat.AcidRecordReader<K,V> | RecordReaders returned by AcidInputFormat working in row-at-a-time mode should implement this interface. |
AcidInputFormat.RawReader<V> | |
AcidInputFormat.RowReader<V> | |
AcidOutputFormat<K extends org.apache.hadoop.io.WritableComparable,V> | An extension for OutputFormats that want to implement ACID transactions. |
AcidUtils.Directory | |
AcidUtils.HdfsDirSnapshot | DFS dir listing. |
AcidUtils.ParsedDirectory | |
BatchToRowInputFormat | |
ColumnarSplit | An interface that, when implemented, should return the estimated size of the columnar projections that will be read from the split. |
CombineHiveInputFormat.AvoidSplitCombination | A marker interface used to identify the formats where combine split generation is not applicable. |
ContentSummaryInputFormat | ContentSummaryInputFormat provides an interface to let the input format itself compute the content summary for a given input path. |
FlatFileInputFormat.SerializationContext<S> | An implementation of SerializationContext is responsible for looking up the Serialization implementation for the given RecordReader. |
HiveOutputFormat<K,V> | HiveOutputFormat describes the output-specification for Hive's operators. |
HivePartitioner<K2,V2> | Partition keys by their Object.hashCode(). |
InputFormatChecker | Check for validity of the input files. |
LlapAwareSplit | Split that is aware that it could be executed in LLAP. |
LlapCacheOnlyInputFormatInterface | Marker interface for LLAP IO. |
LlapCacheOnlyInputFormatInterface.VectorizedOnly | For input formats that can only accept LLAP caching with vectorization turned on. |
LlapWrappableInputFormatInterface | Marker interface for LLAP IO. |
RecordUpdater | API for supporting updating records. |
ReworkMapredInputFormat | |
RowPositionAwareVectorizedRecordReader | |
SelfDescribingInputFormatInterface | Marker interface to indicate a given input format is self-describing and can perform schema evolution itself. |
StatsProvidingRecordReader | If a file format internally gathers statistics (like ORC) then it can expose the statistics through this interface. |
StatsProvidingRecordWriter | If a file format internally gathers statistics (like ORC) while writing then it can expose the statistics through this record writer interface. |
StorageFormatDescriptor | Subclasses represent a storage format for the CREATE TABLE ... |
StreamingOutputFormat | Marker interface for streaming output formats. |
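The HivePartitioner interface above (like DefaultHivePartitioner in the class list below) partitions keys by their Object.hashCode(). A minimal self-contained sketch of that idea, outside Hadoop, so the Partitioner interface and JobConf wiring are omitted; the class and method names here are illustrative, not Hive's:

```java
public class HashCodePartitionSketch {
    // Mask off the sign bit so the modulo result is never negative,
    // then bucket the key by its hashCode.
    public static int getPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Every key maps to a stable partition in [0, numPartitions).
        System.out.println(getPartition("order_42", 4));
        System.out.println(getPartition("order_42", 4)); // same key, same partition
    }
}
```

Because the partition depends only on the key's hashCode, the same key always routes to the same reducer, which is the property partitioned writes rely on.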
Class | Description |
---|---|
AbstractStorageFormatDescriptor | |
AcidDirectory | AcidDirectory provides ACID directory layout information: which directories and files to read. |
AcidInputFormat.DeltaFileMetaData | |
AcidInputFormat.DeltaMetaData | |
AcidInputFormat.Options | Options for controlling the record readers. |
AcidOutputFormat.Options | Options to control how the files are written. |
AcidUtils | Utilities that are shared by all of the ACID input and output formats. |
AcidUtils.AcidOperationalProperties | Current syntax for creating full acid transactional tables is any one of the following 3 ways: create table T (a int, b int) stored as orc tblproperties('transactional'='true'). |
AcidUtils.AnyIdDirFilter | |
AcidUtils.BucketMetaData | Represents bucketId and copy_N suffix. |
AcidUtils.FileInfo | A simple wrapper class that stores the information about a base file and its type. |
AcidUtils.HdfsDirSnapshotImpl | |
AcidUtils.IdFullPathFiler | Full recursive PathFilter version of IdPathFilter (filtering files for a given writeId and stmtId). |
AcidUtils.IdPathFilter | |
AcidUtils.MetaDataFile | General facility to place a metadata file into a dir created by acid/compactor write. |
AcidUtils.OrcAcidVersion | Logic related to versioning acid data format. |
AcidUtils.ParsedBase | In addition to AcidUtils.ParsedBaseLight, this knows if the data is in raw format, i.e. |
AcidUtils.ParsedBaseLight | Since version 3 but prior to version 4, the format of a base is "base_X" where X is a writeId. |
AcidUtils.ParsedDelta | In addition to AcidUtils.ParsedDeltaLight, this knows if the data is in raw format, i.e. |
AcidUtils.ParsedDeltaLight | This encapsulates info obtained from the file path. |
AcidUtils.TableSnapshot | |
AvroStorageFormatDescriptor | |
BatchToRowReader<StructType,UnionType> | A record reader wrapper that converts a VRB reader into an OI-based reader. |
BatchToRowReader.VirtualColumnHandler | Wrapper class to map a virtual column to a handler defined by subclasses of BatchToRowReader. |
BucketIdentifier | Stores bucket and writeId of the bucket files. |
BucketizedHiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | BucketizedHiveInputFormat serves a similar function to HiveInputFormat, but its getSplits() always groups splits from one input file into one wrapper split. |
BucketizedHiveInputSplit | HiveInputSplit encapsulates an InputSplit with its corresponding inputFormatClass. |
BucketizedHiveRecordReader<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | BucketizedHiveRecordReader is a wrapper on a list of RecordReaders. |
CodecPool | A global compressor/decompressor pool used to save and reuse (possibly native) compression/decompression codecs. |
CombineHiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | CombineHiveInputFormat is a parameterized InputFormat which looks at the path name and determines the correct InputFormat for that path name from mapredPlan.pathToPartitionInfo(). |
CombineHiveInputFormat.CombineHiveInputSplit | CombineHiveInputSplit encapsulates an InputSplit with its corresponding inputFormatClassName. |
CombineHiveRecordReader<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | CombineHiveRecordReader. |
DefaultHivePartitioner<K2,V2> | Partition keys by their Object.hashCode(). |
FlatFileInputFormat<T> | Deprecated |
FlatFileInputFormat.RowContainer<T> | A work-around until HADOOP-1230 is fixed. |
FlatFileInputFormat.SerializationContextFromConf<S> | An implementation of FlatFileInputFormat.SerializationContext that reads the Serialization class and specific subclass to be deserialized from the JobConf. |
HdfsUtils | Common FileSystem utilities around FileId. |
HdfsUtils.HdfsFileStatusWithoutId | |
HiveBinaryOutputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | HiveBinaryOutputFormat writes out the values consecutively without any separators. |
HiveContextAwareRecordReader<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | This class prepares an IOContext, and provides the ability to perform a binary search on the data. |
HiveFileFormatUtils | A util class for various Hive file format tasks. |
HiveFileFormatUtils.FileChecker | |
HiveFileFormatUtils.NullOutputCommitter | |
HiveIgnoreKeyTextOutputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | HiveIgnoreKeyTextOutputFormat replaces key with null before feeding the <key, value> to TextOutputFormat.RecordWriter. |
HiveIgnoreKeyTextOutputFormat.IgnoreKeyWriter<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | |
HiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | HiveInputFormat is a parameterized InputFormat which looks at the path name and determines the correct InputFormat for that path name from mapredPlan.pathToPartitionInfo(). |
HiveInputFormat.HiveInputSplit | HiveInputSplit encapsulates an InputSplit with its corresponding inputFormatClass. |
HiveInputFormat.HiveInputSplitComparator | |
HiveKey | HiveKey is a simple wrapper on Text which allows us to set the hashCode easily. |
HiveKey.Comparator | A Comparator optimized for HiveKey. |
HiveNullValueSequenceFileOutputFormat<K,V> | A HiveOutputFormat that writes SequenceFiles with the content saved in the keys, and null in the values. |
HiveOutputFormatImpl<K extends org.apache.hadoop.io.WritableComparable<K>,V extends org.apache.hadoop.io.Writable> | Hive does not use OutputFormats in a conventional way, but constructs and uses the defined OutputFormat for each table from FileSinkOperator. |
HivePassThroughOutputFormat<K,V> | This pass-through class is used to wrap OutputFormat implementations so that new OutputFormats not derived from HiveOutputFormat get through the checker. |
HivePassThroughRecordWriter<K extends org.apache.hadoop.io.WritableComparable<?>,V extends org.apache.hadoop.io.Writable> | |
HiveRecordReader<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | HiveRecordReader is a simple wrapper on RecordReader. |
HiveSequenceFileInputFormat<K extends org.apache.hadoop.io.LongWritable,V extends BytesRefArrayWritable> | HiveSequenceFileInputFormat. |
HiveSequenceFileOutputFormat<K,V> | A HiveOutputFormat that writes SequenceFiles. |
IgnoreKeyTextOutputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | Deprecated: use HiveIgnoreKeyTextOutputFormat instead. |
IgnoreKeyTextOutputFormat.IgnoreKeyWriter<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | |
IOConstants | |
IOContext | IOContext basically contains the position information of the current key/value. |
IOContextMap | NOTE: before LLAP branch merge, there's no LLAP code here. |
IOPrepareCache | IOPrepareCache is used to cache pre-query io-related objects. |
JsonFileStorageFormatDescriptor | A storage format descriptor class to support "STORED AS JSONFILE" syntax. |
NonSyncDataInputBuffer | A thread-not-safe version of Hadoop's DataInputBuffer, which removes all synchronized modifiers. |
NonSyncDataOutputBuffer | A thread-not-safe version of Hadoop's DataOutputBuffer, which removes all synchronized modifiers. |
NullRowsInputFormat | NullRowsInputFormat outputs null rows, maximum 100. |
NullRowsInputFormat.DummyInputSplit | |
NullRowsInputFormat.NullRowsRecordReader | |
NullScanFileSystem | The bogus filesystem that makes Hive not read files for nullscans via lies and deceit. |
OneNullRowInputFormat | OneNullRowInputFormat outputs one null row. |
OneNullRowInputFormat.OneNullRowRecordReader | |
ORCFileStorageFormatDescriptor | |
OriginalDirectory | Basic implementation of AcidUtils.Directory. |
ParquetFileStorageFormatDescriptor | |
PositionDeleteInfo | |
ProxyLocalFileSystem | This class works around existing issues in LocalFileSystem. |
RCFile | RCFiles, short for Record Columnar File, are flat files consisting of binary key/value pairs, which share much similarity with SequenceFile. |
RCFile.KeyBuffer | KeyBuffer is the key of each record in RCFile. |
RCFile.Reader | Read KeyBuffer/ValueBuffer pairs from an RCFile. |
RCFile.ValueBuffer | ValueBuffer is the value of each record in RCFile. |
RCFile.Writer | Write KeyBuffer/ValueBuffer pairs to an RCFile. |
RCFileInputFormat<K extends org.apache.hadoop.io.LongWritable,V extends BytesRefArrayWritable> | RCFileInputFormat. |
RCFileOutputFormat | RCFileOutputFormat. |
RCFileRecordReader<K extends org.apache.hadoop.io.LongWritable,V extends BytesRefArrayWritable> | RCFileRecordReader. |
RCFileStorageFormatDescriptor | |
RecordIdentifier | Gives the Record identifier information for the current record. |
RecordIdentifier.StructInfo | RecordIdentifier is passed along the operator tree as a struct. |
SchemaAwareCompressionInputStream | SchemaAwareCompressionInputStream adds the ability to inform the compression stream what column is being read. |
SchemaAwareCompressionOutputStream | SchemaAwareCompressionOutputStream adds the ability to inform the compression stream of the current column being compressed. |
SchemaInferenceUtils | |
SequenceFileInputFormatChecker | SequenceFileInputFormatChecker. |
SequenceFileStorageFormatDescriptor | |
SingleFileSystem | Implements an abstraction layer to show files in a single directory. |
SingleFileSystem.ABFS | |
SingleFileSystem.ABFSS | |
SingleFileSystem.ADL | |
SingleFileSystem.FILE | |
SingleFileSystem.GS | |
SingleFileSystem.HDFS | |
SingleFileSystem.O3FS | |
SingleFileSystem.OFS | |
SingleFileSystem.PFILE | |
SingleFileSystem.S3A | |
SkippingTextInputFormat | SkippingTextInputFormat is a header/footer aware input format. |
StorageFormatFactory | |
SymbolicInputFormat | |
SymlinkTextInputFormat | Symlink file is a text file which contains a list of filename / dirname. |
SymlinkTextInputFormat.SymlinkTextInputSplit | This input split wraps the FileSplit generated from TextInputFormat.getSplits(), while setting the original link file path as job input path. |
SyntheticFileId | |
TeradataBinaryFileInputFormat | https://cwiki.apache.org/confluence/display/Hive/TeradataBinarySerde. |
TeradataBinaryFileOutputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable> | https://cwiki.apache.org/confluence/display/Hive/TeradataBinarySerde. |
TeradataBinaryRecordReader | The TeradataBinaryRecordReader reads the record from Teradata binary files. |
TextFileStorageFormatDescriptor | |
ZeroRowsInputFormat | Same as OneNullRowInputFormat, but with 0 rows. |
ZeroRowsInputFormat.ZeroRowsRecordReader | |
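RCFile in the class list above stores rows in row groups laid out column by column. A hypothetical sketch of that column-grouping step in plain Java (no Hive/Hadoop types; the class and method names are illustrative, not part of RCFile's API):

```java
import java.util.Arrays;

public class ColumnGroupSketch {
    // Transpose a row group so each column's values sit contiguously,
    // the core idea behind RCFile's record-columnar layout.
    public static String[][] toColumnGroups(String[][] rows) {
        int cols = rows[0].length;
        String[][] columns = new String[cols][rows.length];
        for (int r = 0; r < rows.length; r++)
            for (int c = 0; c < cols; c++)
                columns[c][r] = rows[r][c];
        return columns;
    }

    public static void main(String[] args) {
        String[][] rows = {{"1", "alice"}, {"2", "bob"}};
        System.out.println(Arrays.deepToString(toColumnGroups(rows)));
        // [[1, 2], [alice, bob]]
    }
}
```

Grouping values by column is what lets a reader that only needs a projection of columns skip the bytes of the columns it does not read.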
Enum | Description |
---|---|
AcidUtils.AcidBaseFileType | |
AcidUtils.Operation | |
BucketCodec | This class makes sense of RecordIdentifier.getBucketProperty(). |
IOContext.Comparison | |
RecordIdentifier.Field | This is in support of VirtualColumn.ROWID. Contains metadata about each field in RecordIdentifier that needs to be part of ROWID, which is represented as a struct RecordIdentifier.StructInfo. |
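BucketCodec above decodes the int returned by RecordIdentifier.getBucketProperty(). Hive's actual bit layout is not reproduced here; the following is a hypothetical bit-packing sketch of the general technique, with invented field widths and names chosen only for illustration:

```java
public class BucketPropertySketch {
    // Hypothetical layout: 8-bit version | 12-bit bucketId | 12-bit statementId.
    // The real BucketCodec versions use different widths and positions.
    public static int encode(int version, int bucketId, int statementId) {
        return (version << 24) | (bucketId << 12) | statementId;
    }

    public static int bucketId(int property) {
        return (property >>> 12) & 0xFFF;
    }

    public static int statementId(int property) {
        return property & 0xFFF;
    }

    public static void main(String[] args) {
        int p = encode(1, 42, 7);
        System.out.println(bucketId(p) + " " + statementId(p)); // 42 7
    }
}
```

Packing several small fields into one int keeps the bucket property cheap to store per row, at the cost of a codec class like this one to interpret it.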
Copyright © 2023 The Apache Software Foundation. All rights reserved.