public final class Utilities extends Object
Modifier and Type | Class and Description |
---|---|
static class |
Utilities.CollectionPersistenceDelegate |
static class |
Utilities.CommonTokenDelegate
Need to serialize org.antlr.runtime.CommonToken
|
static class |
Utilities.DatePersistenceDelegate
DatePersistenceDelegate.
|
static class |
Utilities.EnumDelegate
Java 1.5 workaround.
|
static class |
Utilities.ListDelegate |
static class |
Utilities.MapDelegate |
static class |
Utilities.PathDelegate |
static class |
Utilities.ReduceField
ReduceField:
KEY: record key
VALUE: record value
|
static class |
Utilities.SetDelegate |
static class |
Utilities.SQLCommand<T> |
static class |
Utilities.StreamStatus
StreamStatus.
|
static class |
Utilities.TimestampPersistenceDelegate
TimestampPersistenceDelegate.
|
Modifier and Type | Field and Description |
---|---|
static int |
carriageReturnCode |
static int |
ctrlaCode |
static TableDesc |
defaultTd |
static String |
HADOOP_LOCAL_FS
The objects in the reducer are composed of these top-level fields.
|
static String |
HIVE_ADDED_JARS |
static String |
INDENT |
static String |
INPUT_NAME |
static String |
MAP_PLAN_NAME |
static String |
MAPNAME |
static String |
MAPRED_MAPPER_CLASS |
static String |
MAPRED_REDUCER_CLASS |
static String |
MERGE_PLAN_NAME |
static int |
newLineCode |
static String |
NSTR |
static String |
nullStringOutput |
static String |
nullStringStorage |
static Random |
randGen |
static String |
REDUCE_PLAN_NAME |
static List<String> |
reduceFieldNameList |
static String |
REDUCENAME |
static ThreadLocal<com.esotericsoftware.kryo.Kryo> |
runtimeSerializationKryo |
static ThreadLocal<com.esotericsoftware.kryo.Kryo> |
sparkSerializationKryo |
static char |
sqlEscapeChar |
static String |
suffix |
static int |
tabCode |
Modifier and Type | Method and Description |
---|---|
static String |
abbreviate(String str,
int max)
convert "From src insert blah blah" to "From src insert ...
|
static ClassLoader |
addToClassPath(ClassLoader cloader,
String[] newPaths)
Add new elements to the classpath.
|
static void |
cacheBaseWork(org.apache.hadoop.conf.Configuration conf,
String name,
BaseWork work,
org.apache.hadoop.fs.Path hiveScratchDir) |
static void |
cacheMapWork(org.apache.hadoop.conf.Configuration conf,
MapWork work,
org.apache.hadoop.fs.Path hiveScratchDir) |
static void |
clearWork(org.apache.hadoop.conf.Configuration conf) |
static void |
clearWorkMap() |
static void |
clearWorkMapForConf(org.apache.hadoop.conf.Configuration conf) |
static BaseWork |
cloneBaseWork(BaseWork plan)
Clones using the powers of XML.
|
static List<Operator<?>> |
cloneOperatorTree(org.apache.hadoop.conf.Configuration conf,
List<Operator<?>> roots) |
static MapredWork |
clonePlan(MapredWork plan)
Clones using the powers of XML.
|
static Connection |
connectWithRetry(String connectionString,
long waitWindow,
int maxRetries)
Retry connecting to a database with random backoff (same as the one implemented in HDFS-767).
|
static boolean |
contentsEqual(InputStream is1,
InputStream is2,
boolean ignoreWhitespace) |
static void |
copyTableJobPropertiesToConf(TableDesc tbl,
org.apache.hadoop.conf.Configuration job)
Copies the storage handler properties configured for a table descriptor to a runtime job
configuration.
|
static void |
copyTablePropertiesToConf(TableDesc tbl,
org.apache.hadoop.mapred.JobConf job)
Copies the storage handler properties configured for a table descriptor to a runtime job
configuration.
|
static OutputStream |
createCompressedStream(org.apache.hadoop.mapred.JobConf jc,
OutputStream out)
Convert an output stream to a compressed output stream based on codecs and compression options
specified in the Job Configuration.
|
static OutputStream |
createCompressedStream(org.apache.hadoop.mapred.JobConf jc,
OutputStream out,
boolean isCompressed)
Convert an output stream to a compressed output stream based on codecs in the Job
Configuration.
|
static boolean |
createDirsWithPermission(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path mkdirPath,
org.apache.hadoop.fs.permission.FsPermission fsPermission,
boolean recursive) |
static RCFile.Writer |
createRCFileWriter(org.apache.hadoop.mapred.JobConf jc,
org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path file,
boolean isCompressed,
org.apache.hadoop.util.Progressable progressable)
Create an RCFile output stream based on the job configuration. Uses the user-supplied compression flag
(rather than obtaining it from the Job Configuration).
|
static org.apache.hadoop.io.SequenceFile.Writer |
createSequenceWriter(org.apache.hadoop.mapred.JobConf jc,
org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path file,
Class<?> keyClass,
Class<?> valClass,
boolean isCompressed,
org.apache.hadoop.util.Progressable progressable)
Create a SequenceFile output stream based on the job configuration. Uses the user-supplied compression
flag (rather than obtaining it from the Job Configuration).
|
static org.apache.hadoop.io.SequenceFile.Writer |
createSequenceWriter(org.apache.hadoop.mapred.JobConf jc,
org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path file,
Class<?> keyClass,
Class<?> valClass,
org.apache.hadoop.util.Progressable progressable)
Create a SequenceFile output stream based on the job configuration.
|
static File |
createTempDir(String baseDir)
Create a temp dir in the specified baseDir.
This can go away once Hive moves to support only JDK 7
and can use Files.createTempDirectory;
Guava's Files.createTempDir() does not take a base dir.
|
static void |
createTmpDirs(org.apache.hadoop.conf.Configuration conf,
MapWork mWork)
Hive uses tmp directories to capture the output of each FileSinkOperator.
|
static void |
createTmpDirs(org.apache.hadoop.conf.Configuration conf,
ReduceWork rWork)
Hive uses tmp directories to capture the output of each FileSinkOperator.
|
static ExprNodeGenericFuncDesc |
deserializeExpression(String s) |
static ExprNodeGenericFuncDesc |
deserializeExpressionFromKryo(byte[] bytes)
Deserializes expression from Kryo.
|
static <T extends Serializable> |
deserializeObject(String s,
Class<T> clazz) |
static <T> T |
deserializePlan(InputStream in,
Class<T> planClass,
org.apache.hadoop.conf.Configuration conf)
Deserializes the plan.
|
static String |
escapeSqlLike(String key)
Escape the '_', '%', as well as the escape characters inside the string key.
|
static int |
estimateNumberOfReducers(HiveConf conf,
org.apache.hadoop.fs.ContentSummary inputSummary,
MapWork work,
boolean finalMapRed)
Estimate the number of reducers needed for this job, based on job input,
and configuration parameters.
|
static int |
estimateReducers(long totalInputFileSize,
long bytesPerReducer,
int maxReducers,
boolean powersOfTwo) |
static <T> T |
executeWithRetry(Utilities.SQLCommand<T> cmd,
PreparedStatement stmt,
long baseWindow,
int maxRetries)
Retry SQL execution with random backoff (same as the one implemented in HDFS-767).
|
static String |
formatBinaryString(byte[] array,
int start,
int length) |
static String |
formatMsecToStr(long msec)
Format a number of milliseconds as a string.
|
static String |
generateFileName(Byte tag,
String bigBucketFileName) |
static String |
generatePath(org.apache.hadoop.fs.Path baseURI,
String filename) |
static org.apache.hadoop.fs.Path |
generatePath(org.apache.hadoop.fs.Path basePath,
String dumpFilePrefix,
Byte tag,
String bigBucketFileName) |
static String |
generateTarFileName(String name) |
static org.apache.hadoop.fs.Path |
generateTarPath(org.apache.hadoop.fs.Path basePath,
String filename) |
static org.apache.hadoop.fs.Path |
generateTmpPath(org.apache.hadoop.fs.Path basePath,
String id) |
static String |
getBucketFileNameFromPathSubString(String bucketName) |
static List<String> |
getColumnNames(Properties props) |
static List<String> |
getColumnNamesFromFieldSchema(List<FieldSchema> partCols) |
static List<String> |
getColumnNamesFromSortCols(List<Order> sortCols) |
static List<String> |
getColumnTypes(Properties props) |
static String |
getDatabaseName(String dbTableName)
Accepts qualified name which is in the form of dbname.tablename and returns dbname from it
|
static String[] |
getDbTableName(String dbtable)
Extract db and table name from a dbtable string, where db and table are separated by "."
If there is no db name part, the current session's default db is used.
|
static String[] |
getDbTableName(String defaultDb,
String dbtable) |
static int |
getDefaultNotificationInterval(org.apache.hadoop.conf.Configuration hconf)
Gets the default notification interval to send progress updates to the tracker.
|
static List<String> |
getFieldSchemaString(List<FieldSchema> fl) |
static String |
getFileExtension(org.apache.hadoop.mapred.JobConf jc,
boolean isCompressed)
Deprecated.
|
static String |
getFileExtension(org.apache.hadoop.mapred.JobConf jc,
boolean isCompressed,
HiveOutputFormat<?,?> hiveOutputFormat)
Based on compression option, output format, and configured output codec -
get extension for output file.
|
static String |
getFileNameFromDirName(String dirName) |
static int |
getFooterCount(TableDesc table,
org.apache.hadoop.mapred.JobConf job)
Get footer line count for a table.
|
static List<LinkedHashMap<String,String>> |
getFullDPSpecs(org.apache.hadoop.conf.Configuration conf,
DynamicPartitionCtx dpCtx)
Construct a list of full partition spec from Dynamic Partition Context and the directory names
corresponding to these dynamic partitions.
|
static String |
getHashedStatsPrefix(String statsPrefix,
int maxPrefixLength)
If statsPrefix's length is greater than maxPrefixLength and maxPrefixLength > 0,
then it returns an MD5 hash of statsPrefix followed by a path separator; otherwise
it returns statsPrefix.
|
static int |
getHeaderCount(TableDesc table)
Get header line count for a table.
|
static double |
getHighestSamplePercentage(MapWork work)
Returns the highest sample percentage of any alias in the given MapWork
|
static List<org.apache.hadoop.fs.Path> |
getInputPaths(org.apache.hadoop.mapred.JobConf job,
MapWork work,
org.apache.hadoop.fs.Path hiveScratchDir,
Context ctx,
boolean skipDummy)
Computes a list of all input paths needed to compute the given MapWork.
|
static List<org.apache.hadoop.fs.Path> |
getInputPathsTez(org.apache.hadoop.mapred.JobConf job,
MapWork work)
On Tez we're not creating dummy files when getting/setting input paths.
|
static org.apache.hadoop.fs.ContentSummary |
getInputSummary(Context ctx,
MapWork work,
org.apache.hadoop.fs.PathFilter filter)
Calculate the total size of input files.
|
static List<String> |
getInternalColumnNamesFromSignature(List<ColumnInfo> colInfos) |
static Set<String> |
getJarFilesByPath(String path)
Get the jar files from a specified directory, or get jar files by several jar names separated by commas.
|
static MapredWork |
getMapRedWork(org.apache.hadoop.conf.Configuration conf) |
static MapWork |
getMapWork(org.apache.hadoop.conf.Configuration conf) |
static Map<Integer,String> |
getMapWorkVectorScratchColumnTypeMap(org.apache.hadoop.conf.Configuration hiveConf) |
static BaseWork |
getMergeWork(org.apache.hadoop.mapred.JobConf jconf) |
static BaseWork |
getMergeWork(org.apache.hadoop.mapred.JobConf jconf,
String prefix) |
static List<ExecDriver> |
getMRTasks(List<Task<? extends Serializable>> tasks) |
static String |
getNameMessage(Exception e) |
static String |
getOpTreeSkel(Operator<?> op) |
static PartitionDesc |
getPartitionDesc(Partition part) |
static PartitionDesc |
getPartitionDescFromTableDesc(TableDesc tblDesc,
Partition part) |
static org.apache.hadoop.fs.Path |
getPlanPath(org.apache.hadoop.conf.Configuration conf) |
static String |
getPrefixedTaskIdFromFilename(String filename)
Get the part-spec + task id from the filename.
|
static String |
getQualifiedPath(HiveConf conf,
org.apache.hadoop.fs.Path path)
Convert path to qualified path.
|
static long |
getRandomWaitTime(long baseWindow,
int failures,
Random r)
Introducing a random factor to the wait time before another retry.
|
static ReduceWork |
getReduceWork(org.apache.hadoop.conf.Configuration conf) |
static String |
getResourceFiles(org.apache.hadoop.conf.Configuration conf,
SessionState.ResourceType t) |
static ClassLoader |
getSessionSpecifiedClassLoader()
Get the session-specified class loader, or the current class loader if that fails.
|
static List<SparkTask> |
getSparkTasks(List<Task<? extends Serializable>> tasks) |
static StatsPublisher |
getStatsPublisher(org.apache.hadoop.mapred.JobConf jc) |
static TableDesc |
getTableDesc(String cols,
String colTypes) |
static TableDesc |
getTableDesc(Table tbl) |
static String |
getTableName(String dbTableName)
Accepts qualified name which is in the form of dbname.tablename and returns tablename from it
|
static String |
getTaskId(org.apache.hadoop.conf.Configuration hconf)
Gets the task id if we are running as a Hadoop job.
|
static String |
getTaskIdFromFilename(String filename)
Get the task id from the filename.
|
static List<TezTask> |
getTezTasks(List<Task<? extends Serializable>> tasks) |
static long |
getTotalInputFileSize(org.apache.hadoop.fs.ContentSummary inputSummary,
MapWork work,
double highestSamplePercentage)
Computes the total input file size.
|
static long |
getTotalInputNumFiles(org.apache.hadoop.fs.ContentSummary inputSummary,
MapWork work,
double highestSamplePercentage)
Computes the total number of input files.
|
static boolean |
isCopyFile(String filename) |
static boolean |
isDefaultNameNode(HiveConf conf)
Checks if the current Hive script was executed with a non-default NameNode.
|
static boolean |
isEmptyPath(org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.fs.Path dirPath) |
static boolean |
isEmptyPath(org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.fs.Path dirPath,
Context ctx) |
static boolean |
isPerfOrAboveLogging(HiveConf conf)
Checks if the current HiveServer2 logging operation level is >= PERFORMANCE.
|
static boolean |
isTempPath(org.apache.hadoop.fs.FileStatus file)
Detect if the supplied file is a temporary path.
|
static boolean |
isVectorMode(org.apache.hadoop.conf.Configuration conf)
Returns true if a plan is both configured for vectorized execution
and vectorization is allowed.
|
static String |
join(String... elements) |
static org.apache.hadoop.fs.FileStatus[] |
listStatusIfExists(org.apache.hadoop.fs.Path path,
org.apache.hadoop.fs.FileSystem fs)
Returns null if the path does not exist.
|
static ArrayList |
makeList(Object... olist) |
static HashMap |
makeMap(Object... olist) |
static Properties |
makeProperties(String... olist) |
static List<String> |
mergeUniqElems(List<String> src,
List<String> dest) |
static void |
mvFileToFinalPath(org.apache.hadoop.fs.Path specPath,
org.apache.hadoop.conf.Configuration hconf,
boolean success,
org.apache.commons.logging.Log log,
DynamicPartitionCtx dpCtx,
FileSinkDesc conf,
org.apache.hadoop.mapred.Reporter reporter) |
static String |
now() |
static PreparedStatement |
prepareWithRetry(Connection conn,
String stmt,
long waitWindow,
int maxRetries)
Retry preparing a SQL statement with random backoff (same as the one implemented in HDFS-767).
|
static Utilities.StreamStatus |
readColumn(DataInput in,
OutputStream out) |
static String |
realFile(String newFile,
org.apache.hadoop.conf.Configuration conf)
Shamelessly cloned from GenericOptionsParser.
|
protected static void |
removeField(com.esotericsoftware.kryo.Kryo kryo,
Class type,
String fieldName) |
static void |
removeFromClassPath(String[] pathsToRemove)
Remove elements from the classpath.
|
static HashMap<String,org.apache.hadoop.fs.FileStatus> |
removeTempOrDuplicateFiles(org.apache.hadoop.fs.FileStatus[] items,
org.apache.hadoop.fs.FileSystem fs) |
static void |
removeTempOrDuplicateFiles(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path path)
Remove all temporary files and duplicate (double-committed) files from a given directory.
|
static ArrayList<String> |
removeTempOrDuplicateFiles(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path path,
DynamicPartitionCtx dpCtx)
Remove all temporary files and duplicate (double-committed) files from a given directory.
|
static String |
removeValueTag(String column) |
static void |
rename(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path src,
org.apache.hadoop.fs.Path dst)
Rename src to dst, or in the case dst already exists, move files in src to dst.
|
static void |
renameOrMoveFiles(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path src,
org.apache.hadoop.fs.Path dst)
Rename src to dst, or in the case dst already exists, move files in src to dst.
|
static String |
replaceTaskIdFromFilename(String filename,
int bucketNum)
Replace the task id from the filename.
|
static String |
replaceTaskIdFromFilename(String filename,
String fileId) |
static void |
restoreSessionSpecifiedClassLoader(ClassLoader prev) |
static void |
reworkMapRedWork(Task<? extends Serializable> task,
boolean reworkMapredWork,
HiveConf conf)
The check here is kind of not clean.
|
static String |
serializeExpression(ExprNodeGenericFuncDesc expr) |
static byte[] |
serializeExpressionToKryo(ExprNodeGenericFuncDesc expr)
Serializes expression via Kryo.
|
static String |
serializeObject(Serializable expr) |
static void |
serializePlan(Object plan,
OutputStream out,
org.apache.hadoop.conf.Configuration conf)
Serializes the plan.
|
static void |
setBaseWork(org.apache.hadoop.conf.Configuration conf,
String name,
BaseWork work)
Pushes work into the global work map
|
static void |
setColumnNameList(org.apache.hadoop.mapred.JobConf jobConf,
Operator op) |
static void |
setColumnNameList(org.apache.hadoop.mapred.JobConf jobConf,
Operator op,
boolean excludeVCs) |
static void |
setColumnTypeList(org.apache.hadoop.mapred.JobConf jobConf,
Operator op) |
static void |
setColumnTypeList(org.apache.hadoop.mapred.JobConf jobConf,
Operator op,
boolean excludeVCs) |
static void |
setInputAttributes(org.apache.hadoop.conf.Configuration conf,
MapWork mWork)
Set hive input format, and input format file if necessary.
|
static void |
setInputPaths(org.apache.hadoop.mapred.JobConf job,
List<org.apache.hadoop.fs.Path> pathsToAdd)
setInputPaths adds all the paths in the provided list to the Job conf object
as input paths for the job.
|
static void |
setMapRedWork(org.apache.hadoop.conf.Configuration conf,
MapredWork w,
org.apache.hadoop.fs.Path hiveScratchDir) |
static void |
setMapWork(org.apache.hadoop.conf.Configuration conf,
MapWork work) |
static org.apache.hadoop.fs.Path |
setMapWork(org.apache.hadoop.conf.Configuration conf,
MapWork w,
org.apache.hadoop.fs.Path hiveScratchDir,
boolean useCache) |
static org.apache.hadoop.fs.Path |
setMergeWork(org.apache.hadoop.mapred.JobConf conf,
MergeJoinWork mergeJoinWork,
org.apache.hadoop.fs.Path mrScratchDir,
boolean useCache) |
static void |
setQueryTimeout(Statement stmt,
int timeout) |
static void |
setReduceWork(org.apache.hadoop.conf.Configuration conf,
ReduceWork work) |
static org.apache.hadoop.fs.Path |
setReduceWork(org.apache.hadoop.conf.Configuration conf,
ReduceWork w,
org.apache.hadoop.fs.Path hiveScratchDir,
boolean useCache) |
static void |
setWorkflowAdjacencies(org.apache.hadoop.conf.Configuration conf,
QueryPlan plan) |
static double |
showTime(long time) |
static boolean |
skipHeader(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable> currRecReader,
int headerCount,
org.apache.hadoop.io.WritableComparable key,
org.apache.hadoop.io.Writable value)
Skip header lines in the table file when reading the record.
|
static long |
sumOf(Map<String,Long> aliasToSize,
Set<String> aliases) |
static long |
sumOfExcept(Map<String,Long> aliasToSize,
Set<String> aliases,
Set<String> excepts) |
static org.apache.hadoop.fs.Path |
toTaskTempPath(org.apache.hadoop.fs.Path orig) |
static org.apache.hadoop.fs.Path |
toTempPath(org.apache.hadoop.fs.Path orig) |
static org.apache.hadoop.fs.Path |
toTempPath(String orig)
Given a path, convert to a temporary path.
|
static void |
validateColumnNames(List<String> colNames,
List<String> checkCols) |
public static String HADOOP_LOCAL_FS
public static String MAP_PLAN_NAME
public static String REDUCE_PLAN_NAME
public static String MERGE_PLAN_NAME
public static final String INPUT_NAME
public static final String MAPRED_MAPPER_CLASS
public static final String MAPRED_REDUCER_CLASS
public static final String HIVE_ADDED_JARS
public static String MAPNAME
public static String REDUCENAME
public static ThreadLocal<com.esotericsoftware.kryo.Kryo> runtimeSerializationKryo
public static ThreadLocal<com.esotericsoftware.kryo.Kryo> sparkSerializationKryo
public static TableDesc defaultTd
public static final int carriageReturnCode
public static final int newLineCode
public static final int tabCode
public static final int ctrlaCode
public static final String INDENT
public static String nullStringStorage
public static String nullStringOutput
public static Random randGen
public static final String NSTR
public static String suffix
public static final char sqlEscapeChar
public static void clearWork(org.apache.hadoop.conf.Configuration conf)
public static MapredWork getMapRedWork(org.apache.hadoop.conf.Configuration conf)
public static void cacheMapWork(org.apache.hadoop.conf.Configuration conf, MapWork work, org.apache.hadoop.fs.Path hiveScratchDir)
public static void setMapWork(org.apache.hadoop.conf.Configuration conf, MapWork work)
public static MapWork getMapWork(org.apache.hadoop.conf.Configuration conf)
public static void setReduceWork(org.apache.hadoop.conf.Configuration conf, ReduceWork work)
public static ReduceWork getReduceWork(org.apache.hadoop.conf.Configuration conf)
public static org.apache.hadoop.fs.Path setMergeWork(org.apache.hadoop.mapred.JobConf conf, MergeJoinWork mergeJoinWork, org.apache.hadoop.fs.Path mrScratchDir, boolean useCache)
public static BaseWork getMergeWork(org.apache.hadoop.mapred.JobConf jconf)
public static BaseWork getMergeWork(org.apache.hadoop.mapred.JobConf jconf, String prefix)
public static void cacheBaseWork(org.apache.hadoop.conf.Configuration conf, String name, BaseWork work, org.apache.hadoop.fs.Path hiveScratchDir)
public static void setBaseWork(org.apache.hadoop.conf.Configuration conf, String name, BaseWork work)
public static Map<Integer,String> getMapWorkVectorScratchColumnTypeMap(org.apache.hadoop.conf.Configuration hiveConf)
public static void setWorkflowAdjacencies(org.apache.hadoop.conf.Configuration conf, QueryPlan plan)
public static List<String> getFieldSchemaString(List<FieldSchema> fl)
public static void setMapRedWork(org.apache.hadoop.conf.Configuration conf, MapredWork w, org.apache.hadoop.fs.Path hiveScratchDir)
public static org.apache.hadoop.fs.Path setMapWork(org.apache.hadoop.conf.Configuration conf, MapWork w, org.apache.hadoop.fs.Path hiveScratchDir, boolean useCache)
public static org.apache.hadoop.fs.Path setReduceWork(org.apache.hadoop.conf.Configuration conf, ReduceWork w, org.apache.hadoop.fs.Path hiveScratchDir, boolean useCache)
public static org.apache.hadoop.fs.Path getPlanPath(org.apache.hadoop.conf.Configuration conf)
public static byte[] serializeExpressionToKryo(ExprNodeGenericFuncDesc expr)
expr - Expression.

public static ExprNodeGenericFuncDesc deserializeExpressionFromKryo(byte[] bytes)
bytes - Bytes containing the expression.

public static String serializeExpression(ExprNodeGenericFuncDesc expr)
public static ExprNodeGenericFuncDesc deserializeExpression(String s)
public static String serializeObject(Serializable expr)
public static <T extends Serializable> T deserializeObject(String s, Class<T> clazz)
public static List<Operator<?>> cloneOperatorTree(org.apache.hadoop.conf.Configuration conf, List<Operator<?>> roots)
public static void serializePlan(Object plan, OutputStream out, org.apache.hadoop.conf.Configuration conf)
plan - The plan, such as QueryPlan, MapredWork, etc.
out - The stream to write to.
conf - to pick which serialization format is desired.

public static <T> T deserializePlan(InputStream in, Class<T> planClass, org.apache.hadoop.conf.Configuration conf)
in - The stream to read from.
planClass - class of plan
conf - configuration

public static MapredWork clonePlan(MapredWork plan)
plan - The plan.

public static BaseWork cloneBaseWork(BaseWork plan)
plan - The plan.

protected static void removeField(com.esotericsoftware.kryo.Kryo kryo, Class type, String fieldName)
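As an illustrative sketch (not part of the Hive sources), the expression serialization helpers above can be combined to round-trip a predicate; the `expr` argument is assumed to be produced elsewhere during query compilation, and the Hive jars are assumed to be on the classpath:

```java
import org.apache.hadoop.hive.ql.exec.Utilities;
import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;

public class ExprRoundTrip {
  // Round-trips an expression through the text form and the Kryo byte form.
  public static ExprNodeGenericFuncDesc roundTrip(ExprNodeGenericFuncDesc expr) {
    String text = Utilities.serializeExpression(expr);            // String form
    ExprNodeGenericFuncDesc fromText = Utilities.deserializeExpression(text);

    byte[] bytes = Utilities.serializeExpressionToKryo(fromText); // binary Kryo form
    return Utilities.deserializeExpressionFromKryo(bytes);
  }
}
```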
public static String getTaskId(org.apache.hadoop.conf.Configuration hconf)
public static Properties makeProperties(String... olist)
public static PartitionDesc getPartitionDesc(Partition part) throws HiveException
HiveException
public static PartitionDesc getPartitionDescFromTableDesc(TableDesc tblDesc, Partition part) throws HiveException
HiveException
public static boolean contentsEqual(InputStream is1, InputStream is2, boolean ignoreWhitespace) throws IOException
IOException
public static String abbreviate(String str, int max)
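A small usage sketch for abbreviate (the query string is hypothetical; the exact truncation and ellipsis behavior is whatever the implementation does):

```java
import org.apache.hadoop.hive.ql.exec.Utilities;

public class AbbreviateExample {
  public static void main(String[] args) {
    String sql = "INSERT OVERWRITE TABLE dst SELECT * FROM src WHERE key > 100";
    // Shorten long statements, e.g. for log messages.
    System.out.println(Utilities.abbreviate(sql, 40));
  }
}
```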
public static Utilities.StreamStatus readColumn(DataInput in, OutputStream out) throws IOException
IOException
public static OutputStream createCompressedStream(org.apache.hadoop.mapred.JobConf jc, OutputStream out) throws IOException
jc - Job Configuration
out - Output Stream to be converted into compressed output stream
IOException

public static OutputStream createCompressedStream(org.apache.hadoop.mapred.JobConf jc, OutputStream out, boolean isCompressed) throws IOException
jc - Job Configuration
out - Output Stream to be converted into compressed output stream
isCompressed - whether the output stream needs to be compressed or not
IOException
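A minimal sketch of wrapping an in-memory stream with createCompressedStream; whether and how the stream is compressed depends on the codec settings carried by the JobConf (here a default one), so treat this as illustrative only:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.hive.ql.exec.Utilities;
import org.apache.hadoop.mapred.JobConf;

public class CompressedStreamExample {
  public static void main(String[] args) throws IOException {
    JobConf jc = new JobConf();                       // default configuration
    ByteArrayOutputStream raw = new ByteArrayOutputStream();
    // Ask for compression explicitly; the codec is taken from the JobConf.
    OutputStream out = Utilities.createCompressedStream(jc, raw, true);
    out.write("hello".getBytes(StandardCharsets.UTF_8));
    out.close();
    System.out.println("bytes on the wire: " + raw.size());
  }
}
```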
@Deprecated public static String getFileExtension(org.apache.hadoop.mapred.JobConf jc, boolean isCompressed)
See also: getFileExtension(JobConf, boolean, HiveOutputFormat)
jc - Job Configuration
isCompressed - Whether the output file is compressed or not

public static String getFileExtension(org.apache.hadoop.mapred.JobConf jc, boolean isCompressed, HiveOutputFormat<?,?> hiveOutputFormat)
The property hive.output.file.extension is used to determine the extension - if set, it will override other logic for choosing an extension.
jc - Job Configuration
isCompressed - Whether the output file is compressed or not
hiveOutputFormat - The output format, used to detect if the format is text

public static org.apache.hadoop.io.SequenceFile.Writer createSequenceWriter(org.apache.hadoop.mapred.JobConf jc, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path file, Class<?> keyClass, Class<?> valClass, org.apache.hadoop.util.Progressable progressable) throws IOException
jc - Job configuration
fs - File System to create file in
file - Path to be created
keyClass - Java Class for key
valClass - Java Class for value
IOException
public static org.apache.hadoop.io.SequenceFile.Writer createSequenceWriter(org.apache.hadoop.mapred.JobConf jc, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path file, Class<?> keyClass, Class<?> valClass, boolean isCompressed, org.apache.hadoop.util.Progressable progressable) throws IOException
jc - Job configuration
fs - File System to create file in
file - Path to be created
keyClass - Java Class for key
valClass - Java Class for value
IOException

public static RCFile.Writer createRCFileWriter(org.apache.hadoop.mapred.JobConf jc, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path file, boolean isCompressed, org.apache.hadoop.util.Progressable progressable) throws IOException
jc - Job configuration
fs - File System to create file in
file - Path to be created
IOException
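A hedged sketch of createSequenceWriter against the local file system; the output path, key/value classes, and the no-op Progressable are illustrative choices, not anything mandated by the API:

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.Utilities;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Progressable;

public class SequenceWriterExample {
  public static void main(String[] args) throws Exception {
    JobConf jc = new JobConf();
    FileSystem fs = FileSystem.getLocal(jc);            // local FS for illustration
    Path file = new Path("/tmp/utilities-seq-demo");    // hypothetical output path
    Progressable noop = () -> { };                      // nothing to report progress to

    SequenceFile.Writer writer = Utilities.createSequenceWriter(
        jc, fs, file, BytesWritable.class, Text.class, /* isCompressed */ false, noop);
    writer.append(new BytesWritable(new byte[] {1}), new Text("row"));
    writer.close();
  }
}
```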
public static String realFile(String newFile, org.apache.hadoop.conf.Configuration conf) throws IOException
IOException
public static org.apache.hadoop.fs.Path toTaskTempPath(org.apache.hadoop.fs.Path orig)
public static org.apache.hadoop.fs.Path toTempPath(org.apache.hadoop.fs.Path orig)
public static org.apache.hadoop.fs.Path toTempPath(String orig)
public static boolean isTempPath(org.apache.hadoop.fs.FileStatus file)
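The temp-path helpers only manipulate path names, so they can be sketched without a file system; the warehouse path below is hypothetical, and the exact temporary naming scheme is internal to Hive:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.Utilities;

public class TempPathExample {
  public static void main(String[] args) {
    Path finalPath = new Path("/warehouse/db/tbl/000000_0");  // hypothetical final location
    Path jobTmp = Utilities.toTempPath(finalPath);            // temporary variant of the path
    Path taskTmp = Utilities.toTaskTempPath(finalPath);       // task-level temporary variant
    System.out.println(jobTmp + " | " + taskTmp);
  }
}
```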
public static void rename(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dst) throws IOException, HiveException
fs - the FileSystem where src and dst are on.
src - the src directory
dst - the target directory
IOException
HiveException

public static void renameOrMoveFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dst) throws IOException, HiveException
fs - the FileSystem where src and dst are on.
src - the src directory
dst - the target directory
IOException
HiveException
public static String getTaskIdFromFilename(String filename)
filename - filename to extract taskid from

public static String getPrefixedTaskIdFromFilename(String filename)
filename - filename to extract taskid from

public static String replaceTaskIdFromFilename(String filename, int bucketNum)
filename - filename to replace taskid "0_0" or "0_0.gz" by 33 to "33_0" or "33_0.gz"

public static String replaceTaskIdFromFilename(String filename, String fileId)
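A small sketch of the task-id filename helpers, using the "0_0.gz" style of name mentioned above:

```java
import org.apache.hadoop.hive.ql.exec.Utilities;

public class TaskIdExample {
  public static void main(String[] args) {
    String name = "000001_0.gz";
    System.out.println(Utilities.getTaskIdFromFilename(name));         // task id portion
    System.out.println(Utilities.getPrefixedTaskIdFromFilename(name)); // part-spec + task id
    // Renumber the task id, e.g. "0_0.gz" -> "33_0.gz" style rewrites.
    System.out.println(Utilities.replaceTaskIdFromFilename(name, 33));
  }
}
```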
public static org.apache.hadoop.fs.FileStatus[] listStatusIfExists(org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.FileSystem fs) throws IOException
IOException
public static void mvFileToFinalPath(org.apache.hadoop.fs.Path specPath, org.apache.hadoop.conf.Configuration hconf, boolean success, org.apache.commons.logging.Log log, DynamicPartitionCtx dpCtx, FileSinkDesc conf, org.apache.hadoop.mapred.Reporter reporter) throws IOException, HiveException
IOException
HiveException
public static void removeTempOrDuplicateFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
IOException
public static ArrayList<String> removeTempOrDuplicateFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, DynamicPartitionCtx dpCtx) throws IOException
IOException
public static HashMap<String,org.apache.hadoop.fs.FileStatus> removeTempOrDuplicateFiles(org.apache.hadoop.fs.FileStatus[] items, org.apache.hadoop.fs.FileSystem fs) throws IOException
IOException
public static boolean isCopyFile(String filename)
public static String getBucketFileNameFromPathSubString(String bucketName)
public static String getResourceFiles(org.apache.hadoop.conf.Configuration conf, SessionState.ResourceType t)
public static ClassLoader getSessionSpecifiedClassLoader()
public static void restoreSessionSpecifiedClassLoader(ClassLoader prev)
public static Set<String> getJarFilesByPath(String path)
path -

public static ClassLoader addToClassPath(ClassLoader cloader, String[] newPaths) throws Exception
newPaths - Array of classpath elements
Exception

public static void removeFromClassPath(String[] pathsToRemove) throws Exception
pathsToRemove - Array of classpath elements
Exception
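A hedged sketch of adding and later removing classpath entries; the jar path is hypothetical, and installing the returned loader as the thread context class loader is one common pattern rather than a requirement of the API:

```java
import org.apache.hadoop.hive.ql.exec.Utilities;

public class ClassPathExample {
  public static void main(String[] args) throws Exception {
    String[] jars = { "/tmp/my-udfs.jar" };  // hypothetical jar path
    ClassLoader current = Thread.currentThread().getContextClassLoader();

    ClassLoader extended = Utilities.addToClassPath(current, jars);
    Thread.currentThread().setContextClassLoader(extended);
    // ... load classes from the added jar ...

    Utilities.removeFromClassPath(jars);
  }
}
```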
public static String formatBinaryString(byte[] array, int start, int length)
public static List<String> getColumnNamesFromSortCols(List<Order> sortCols)
public static List<String> getColumnNamesFromFieldSchema(List<FieldSchema> partCols)
public static List<String> getInternalColumnNamesFromSignature(List<ColumnInfo> colInfos)
public static List<String> getColumnNames(Properties props)
public static List<String> getColumnTypes(Properties props)
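A sketch of reading column metadata from table properties. The property keys "columns" and "columns.types" and their comma/colon separated formats are the conventional Hive serde keys, assumed here rather than taken from this page:

```java
import java.util.Properties;
import org.apache.hadoop.hive.ql.exec.Utilities;

public class ColumnListExample {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("columns", "id,name,ts");                  // assumed key for column names
    props.setProperty("columns.types", "int:string:timestamp");  // assumed key for column types

    System.out.println(Utilities.getColumnNames(props));
    System.out.println(Utilities.getColumnTypes(props));
  }
}
```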
public static String[] getDbTableName(String dbtable) throws SemanticException
dbtable -
HiveException
SemanticException

public static String[] getDbTableName(String defaultDb, String dbtable) throws SemanticException
SemanticException

public static String getDatabaseName(String dbTableName) throws SemanticException
dbTableName -
SemanticException - input string is not qualified name

public static String getTableName(String dbTableName) throws SemanticException
dbTableName -
SemanticException - input string is not qualified name

public static void validateColumnNames(List<String> colNames, List<String> checkCols) throws SemanticException
SemanticException
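A small sketch of splitting qualified names; the two-argument overload is used so no active session is needed to supply the default database:

```java
import org.apache.hadoop.hive.ql.exec.Utilities;

public class DbTableNameExample {
  public static void main(String[] args) throws Exception {
    String[] qualified = Utilities.getDbTableName("default", "sales.orders"); // {"sales", "orders"}
    String[] unqualified = Utilities.getDbTableName("default", "orders");     // {"default", "orders"}
    System.out.println(qualified[0] + "." + qualified[1]);
    System.out.println(unqualified[0] + "." + unqualified[1]);
    // getDatabaseName/getTableName and the one-argument getDbTableName resolve the
    // default database from the current session instead.
  }
}
```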
public static int getDefaultNotificationInterval(org.apache.hadoop.conf.Configuration hconf)
hconf -

public static void copyTableJobPropertiesToConf(TableDesc tbl, org.apache.hadoop.conf.Configuration job)
tbl - table descriptor from which to read
job - configuration which receives configured properties

public static void copyTablePropertiesToConf(TableDesc tbl, org.apache.hadoop.mapred.JobConf job)
This differs from copyTableJobPropertiesToConf in that it does not allow parameters already set in the job to override the values from the table. This is important for setting the config up for reading, as the job may already have values in it from another table.
tbl -
job -

public static org.apache.hadoop.fs.ContentSummary getInputSummary(Context ctx, MapWork work, org.apache.hadoop.fs.PathFilter filter) throws IOException
ctx - the hadoop job context
work - map reduce job plan
filter - filter to apply to the input paths before calculating size
IOException
public static long sumOfExcept(Map<String,Long> aliasToSize, Set<String> aliases, Set<String> excepts)
public static boolean isEmptyPath(org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.fs.Path dirPath, Context ctx) throws Exception
Exception
public static boolean isEmptyPath(org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.fs.Path dirPath) throws Exception
Exception
public static List<TezTask> getTezTasks(List<Task<? extends Serializable>> tasks)
public static List<SparkTask> getSparkTasks(List<Task<? extends Serializable>> tasks)
public static List<ExecDriver> getMRTasks(List<Task<? extends Serializable>> tasks)
public static List<LinkedHashMap<String,String>> getFullDPSpecs(org.apache.hadoop.conf.Configuration conf, DynamicPartitionCtx dpCtx) throws HiveException
HiveException
public static StatsPublisher getStatsPublisher(org.apache.hadoop.mapred.JobConf jc)
public static String getHashedStatsPrefix(String statsPrefix, int maxPrefixLength)
statsPrefix - prefix of stats key
maxPrefixLength - max length of stats key

public static void setColumnNameList(org.apache.hadoop.mapred.JobConf jobConf, Operator op)
public static void setColumnNameList(org.apache.hadoop.mapred.JobConf jobConf, Operator op, boolean excludeVCs)
public static void setColumnTypeList(org.apache.hadoop.mapred.JobConf jobConf, Operator op)
public static void setColumnTypeList(org.apache.hadoop.mapred.JobConf jobConf, Operator op, boolean excludeVCs)
public static org.apache.hadoop.fs.Path generatePath(org.apache.hadoop.fs.Path basePath, String dumpFilePrefix, Byte tag, String bigBucketFileName)
public static org.apache.hadoop.fs.Path generateTmpPath(org.apache.hadoop.fs.Path basePath, String id)
public static org.apache.hadoop.fs.Path generateTarPath(org.apache.hadoop.fs.Path basePath, String filename)
public static String now()
public static double showTime(long time)
public static void reworkMapRedWork(Task<? extends Serializable> task, boolean reworkMapredWork, HiveConf conf) throws SemanticException
task -
reworkMapredWork -
conf -
SemanticException
public static <T> T executeWithRetry(Utilities.SQLCommand<T> cmd, PreparedStatement stmt, long baseWindow, int maxRetries) throws SQLException
cmd - the SQL command
stmt - the prepared statement of SQL.
baseWindow - The base time window (in milliseconds) before the next retry. See getRandomWaitTime(long, int, java.util.Random) for details.
maxRetries - the maximum # of retries when getting a SQLTransientException.
SQLException - throws SQLRecoverableException or SQLNonTransientException the first time it is caught, or SQLTransientException when maxRetries has been reached.

public static Connection connectWithRetry(String connectionString, long waitWindow, int maxRetries) throws SQLException
connectionString - the JDBC connection string.
waitWindow - The base time window (in milliseconds) before the next retry. See getRandomWaitTime(long, int, java.util.Random) for details.
maxRetries - the maximum # of retries when getting a SQLTransientException.
SQLException - throws SQLRecoverableException or SQLNonTransientException the first time it is caught, or SQLTransientException when maxRetries has been reached.

public static PreparedStatement prepareWithRetry(Connection conn, String stmt, long waitWindow, int maxRetries) throws SQLException
conn - a JDBC connection.
stmt - the SQL statement to be prepared.
waitWindow - The base time window (in milliseconds) before the next retry. See getRandomWaitTime(long, int, java.util.Random) for details.
maxRetries - the maximum # of retries when getting a SQLTransientException.
SQLException - throws SQLRecoverableException or SQLNonTransientException the first time it is caught, or SQLTransientException when maxRetries has been reached.

public static void setQueryTimeout(Statement stmt, int timeout) throws SQLException
SQLException
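A hedged sketch of the retrying JDBC helpers; the Derby URL and the query are placeholders (the corresponding driver is assumed to be on the classpath), and the base window and retry counts are arbitrary:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.apache.hadoop.hive.ql.exec.Utilities;

public class RetryJdbcExample {
  public static void main(String[] args) throws SQLException {
    String url = "jdbc:derby:;databaseName=demo_db;create=true";  // placeholder JDBC URL

    // Transient failures are retried with a randomized backoff derived from the
    // base window (see getRandomWaitTime below).
    Connection conn = Utilities.connectWithRetry(url, 1000L, 3);
    PreparedStatement ps =
        Utilities.prepareWithRetry(conn, "VALUES 1", 1000L, 3);   // placeholder statement
    ps.close();
    conn.close();
  }
}
```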
public static long getRandomWaitTime(long baseWindow, int failures, Random r)
baseWindow - the base waiting window.
failures - number of failures so far.
r - a random generator.

public static String escapeSqlLike(String key)
key - the string that will be used for the SQL LIKE operator.

public static String formatMsecToStr(long msec)
msec - milliseconds

public static int estimateNumberOfReducers(HiveConf conf, org.apache.hadoop.fs.ContentSummary inputSummary, MapWork work, boolean finalMapRed) throws IOException
IOException
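For the escapeSqlLike and formatMsecToStr helpers documented above, a small usage sketch (the literal values are arbitrary):

```java
import org.apache.hadoop.hive.ql.exec.Utilities;

public class MiscHelpersExample {
  public static void main(String[] args) {
    // Escape '_' and '%' (and the escape character) before embedding in a LIKE pattern.
    String escaped = Utilities.escapeSqlLike("100%_done");
    System.out.println("... LIKE '" + escaped + "%'");

    // Human-readable elapsed time, e.g. for logging.
    System.out.println(Utilities.formatMsecToStr(3723000L));
  }
}
```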
public static int estimateReducers(long totalInputFileSize, long bytesPerReducer, int maxReducers, boolean powersOfTwo)
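A sketch of estimateReducers with hand-picked numbers; the description of the calculation in the comment is an approximation, not a statement of the exact implementation:

```java
import org.apache.hadoop.hive.ql.exec.Utilities;

public class ReducerEstimateExample {
  public static void main(String[] args) {
    long totalInputFileSize = 10L * 1024 * 1024 * 1024; // 10 GB of input
    long bytesPerReducer = 256L * 1024 * 1024;          // aim for ~256 MB per reducer
    int maxReducers = 1009;                             // upper bound from configuration

    // Roughly ceil(totalInputFileSize / bytesPerReducer), capped at maxReducers;
    // with powersOfTwo=true the estimate is additionally adjusted toward a power of two.
    int reducers = Utilities.estimateReducers(totalInputFileSize, bytesPerReducer, maxReducers, false);
    System.out.println(reducers);
  }
}
```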
public static long getTotalInputFileSize(org.apache.hadoop.fs.ContentSummary inputSummary, MapWork work, double highestSamplePercentage)
inputSummary -
work -
highestSamplePercentage -

public static long getTotalInputNumFiles(org.apache.hadoop.fs.ContentSummary inputSummary, MapWork work, double highestSamplePercentage)
inputSummary -
work -
highestSamplePercentage -

public static double getHighestSamplePercentage(MapWork work)
public static List<org.apache.hadoop.fs.Path> getInputPathsTez(org.apache.hadoop.mapred.JobConf job, MapWork work) throws Exception
Exception
public static List<org.apache.hadoop.fs.Path> getInputPaths(org.apache.hadoop.mapred.JobConf job, MapWork work, org.apache.hadoop.fs.Path hiveScratchDir, Context ctx, boolean skipDummy) throws Exception
job - JobConf used to run the job
work - MapWork encapsulating the info about the task
hiveScratchDir - The tmp dir used to create dummy files if needed
ctx - Context object
Exception

public static void setInputPaths(org.apache.hadoop.mapred.JobConf job, List<org.apache.hadoop.fs.Path> pathsToAdd)
job -
pathsToAdd -

public static void setInputAttributes(org.apache.hadoop.conf.Configuration conf, MapWork mWork)
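A minimal sketch of setInputPaths; the partition directories are hypothetical stand-ins for what getInputPaths would normally compute:

```java
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.Utilities;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

public class InputPathsExample {
  public static void main(String[] args) {
    JobConf job = new JobConf();
    List<Path> paths = Arrays.asList(
        new Path("/warehouse/db/tbl/ds=2017-01-01"),   // hypothetical partition dirs
        new Path("/warehouse/db/tbl/ds=2017-01-02"));

    Utilities.setInputPaths(job, paths);
    System.out.println(Arrays.toString(FileInputFormat.getInputPaths(job)));
  }
}
```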
public static void createTmpDirs(org.apache.hadoop.conf.Configuration conf, MapWork mWork) throws IOException
conf - Used to get the right FileSystem
mWork - Used to find FileSinkOperators
IOException

public static void createTmpDirs(org.apache.hadoop.conf.Configuration conf, ReduceWork rWork) throws IOException
conf - Used to get the right FileSystem
rWork - Used to find FileSinkOperators
IOException
public static boolean createDirsWithPermission(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path mkdirPath, org.apache.hadoop.fs.permission.FsPermission fsPermission, boolean recursive) throws IOException
IOException
public static boolean isVectorMode(org.apache.hadoop.conf.Configuration conf)
public static void clearWorkMapForConf(org.apache.hadoop.conf.Configuration conf)
public static void clearWorkMap()
public static File createTempDir(String baseDir)
baseDir - directory under which new temp dir will be created

public static boolean skipHeader(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable> currRecReader, int headerCount, org.apache.hadoop.io.WritableComparable key, org.apache.hadoop.io.Writable value) throws IOException
currRecReader - Record reader.
headerCount - Header line number of the table files.
key - Key of current reading record.
value - Value of current reading record.
IOException

public static int getHeaderCount(TableDesc table) throws IOException
table - Table description for target table.
IOException

public static int getFooterCount(TableDesc table, org.apache.hadoop.mapred.JobConf job) throws IOException
table - Table description for target table.
job - Job configuration for current job.
IOException

public static String getQualifiedPath(HiveConf conf, org.apache.hadoop.fs.Path path) throws HiveException
conf - Hive configuration.
path - Path to convert.
HiveException
public static boolean isDefaultNameNode(HiveConf conf)
public static boolean isPerfOrAboveLogging(HiveConf conf)
conf - Hive configuration.