public class HiveAccumuloTableInputFormat extends Object implements org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,AccumuloHiveRow>
Modifier and Type | Field and Description |
---|---|
protected org.apache.accumulo.core.client.mapred.AccumuloRowInputFormat |
accumuloInputFormat |
protected HiveAccumuloHelper |
helper |
protected AccumuloPredicateHandler |
predicateHandler |
Constructor and Description |
---|
HiveAccumuloTableInputFormat() |
Modifier and Type | Method and Description |
---|---|
protected void |
addIterators(org.apache.hadoop.mapred.JobConf conf,
List<org.apache.accumulo.core.client.IteratorSetting> iterators) |
protected void |
configure(org.apache.hadoop.mapred.JobConf conf,
org.apache.accumulo.core.client.Instance instance,
org.apache.accumulo.core.client.Connector connector,
AccumuloConnectionParameters accumuloParams,
ColumnMapper columnMapper,
List<org.apache.accumulo.core.client.IteratorSetting> iterators,
Collection<org.apache.accumulo.core.data.Range> ranges)
Configure the underlying AccumuloInputFormat
|
protected void |
fetchColumns(org.apache.hadoop.mapred.JobConf conf,
Set<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> cfCqPairs) |
protected ColumnMapper |
getColumnMapper(org.apache.hadoop.conf.Configuration conf) |
protected HashSet<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> |
getPairCollection(List<ColumnMapping> columnMappings)
Create col fam/qual pairs from pipe separated values, usually from config object.
|
org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,AccumuloHiveRow> |
getRecordReader(org.apache.hadoop.mapred.InputSplit inputSplit,
org.apache.hadoop.mapred.JobConf jobConf,
org.apache.hadoop.mapred.Reporter reporter)
Setup accumulo input format from conf properties.
|
org.apache.hadoop.mapred.InputSplit[] |
getSplits(org.apache.hadoop.mapred.JobConf jobConf,
int numSplits) |
protected String |
getTableName(org.apache.accumulo.core.client.mapred.RangeInputSplit split)
Reflection to work around Accumulo 1.5 and 1.6 incompatibilities.
|
protected void |
setConnectorInfo(org.apache.hadoop.mapred.JobConf conf,
String user,
org.apache.accumulo.core.client.security.tokens.AuthenticationToken token) |
protected void |
setInputTableName(org.apache.hadoop.mapred.JobConf conf,
String tableName) |
protected void |
setMockInstance(org.apache.hadoop.mapred.JobConf conf,
String instanceName) |
protected void |
setRanges(org.apache.hadoop.mapred.JobConf conf,
Collection<org.apache.accumulo.core.data.Range> ranges) |
protected void |
setScanAuthorizations(org.apache.hadoop.mapred.JobConf conf,
org.apache.accumulo.core.security.Authorizations auths) |
protected void |
setTableName(org.apache.accumulo.core.client.mapred.RangeInputSplit split,
String tableName)
Sets the table name on a RangeInputSplit, accounting for change in method name.
|
protected void |
setZooKeeperInstance(org.apache.hadoop.mapred.JobConf conf,
String instanceName,
String zkHosts,
boolean isSasl) |
protected org.apache.accumulo.core.client.mapred.AccumuloRowInputFormat accumuloInputFormat
protected AccumuloPredicateHandler predicateHandler
protected HiveAccumuloHelper helper
public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf jobConf, int numSplits) throws IOException
getSplits
in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,AccumuloHiveRow>
IOException
public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,AccumuloHiveRow> getRecordReader(org.apache.hadoop.mapred.InputSplit inputSplit, org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.mapred.Reporter reporter) throws IOException
getRecordReader
in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,AccumuloHiveRow>
inputSplit
- jobConf
- reporter
- IOException
protected ColumnMapper getColumnMapper(org.apache.hadoop.conf.Configuration conf) throws IOException, TooManyAccumuloColumnsException
protected void configure(org.apache.hadoop.mapred.JobConf conf, org.apache.accumulo.core.client.Instance instance, org.apache.accumulo.core.client.Connector connector, AccumuloConnectionParameters accumuloParams, ColumnMapper columnMapper, List<org.apache.accumulo.core.client.IteratorSetting> iterators, Collection<org.apache.accumulo.core.data.Range> ranges) throws org.apache.accumulo.core.client.AccumuloSecurityException, org.apache.accumulo.core.client.AccumuloException, SerDeException, IOException
conf
- Job configurationinstance
- Accumulo instanceconnector
- Accumulo connectoraccumuloParams
- Connection information to the Accumulo instancecolumnMapper
- Configuration of Hive to Accumulo columnsiterators
- Any iterators to be configured server-sideranges
- Accumulo ranges on for the queryorg.apache.accumulo.core.client.AccumuloSecurityException
org.apache.accumulo.core.client.AccumuloException
SerDeException
IOException
protected void setMockInstance(org.apache.hadoop.mapred.JobConf conf, String instanceName)
protected void setZooKeeperInstance(org.apache.hadoop.mapred.JobConf conf, String instanceName, String zkHosts, boolean isSasl) throws IOException
IOException
protected void setConnectorInfo(org.apache.hadoop.mapred.JobConf conf, String user, org.apache.accumulo.core.client.security.tokens.AuthenticationToken token) throws org.apache.accumulo.core.client.AccumuloSecurityException
org.apache.accumulo.core.client.AccumuloSecurityException
protected void setInputTableName(org.apache.hadoop.mapred.JobConf conf, String tableName)
protected void setScanAuthorizations(org.apache.hadoop.mapred.JobConf conf, org.apache.accumulo.core.security.Authorizations auths)
protected void addIterators(org.apache.hadoop.mapred.JobConf conf, List<org.apache.accumulo.core.client.IteratorSetting> iterators)
protected void setRanges(org.apache.hadoop.mapred.JobConf conf, Collection<org.apache.accumulo.core.data.Range> ranges)
protected void fetchColumns(org.apache.hadoop.mapred.JobConf conf, Set<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> cfCqPairs)
protected HashSet<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getPairCollection(List<ColumnMapping> columnMappings)
columnMappings
- The list of ColumnMappings for the given queryprotected String getTableName(org.apache.accumulo.core.client.mapred.RangeInputSplit split) throws IOException
IOException
for any reflection related exceptionssplit
- A RangeInputSplitIOException
protected void setTableName(org.apache.accumulo.core.client.mapred.RangeInputSplit split, String tableName) throws IOException
IOException
split
- The RangeInputSplit to operate ontableName
- The name of the table to setIOException
Copyright © 2017 The Apache Software Foundation. All rights reserved.