public class SparkUtilities extends Object
| Constructor and Description |
|---|
| SparkUtilities() |
| Modifier and Type | Method and Description |
|---|---|
| static void | collectOp(Collection&lt;Operator&lt;?&gt;&gt; result, Operator&lt;?&gt; root, Class&lt;?&gt; clazz): Recursively find all operators under root that are of class clazz, and put them in result. |
| static org.apache.hadoop.io.BytesWritable | copyBytesWritable(org.apache.hadoop.io.BytesWritable bw) |
| static HiveKey | copyHiveKey(HiveKey key) |
| static SparkTask | createSparkTask(HiveConf conf) |
| static SparkTask | createSparkTask(SparkWork work, HiveConf conf) |
| static org.apache.hadoop.fs.Path | generateTmpPathForPartitionPruning(org.apache.hadoop.fs.Path basePath, String id): Generate a temporary path for dynamic partition pruning in the Spark branch. TODO: no longer need this if we use accumulator! |
| static SparkSession | getSparkSession(HiveConf conf, SparkSessionManager sparkSessionManager) |
| static URI | getURI(String path) |
| static String | getWorkId(BaseWork work): Return the ID for this BaseWork, in String form. |
| static boolean | isDedicatedCluster(org.apache.hadoop.conf.Configuration conf) |
| static boolean | needUploadToHDFS(URI source, org.apache.spark.SparkConf sparkConf) |
| static String | rddGraphToString(org.apache.spark.api.java.JavaPairRDD rdd) |
| static URI | uploadToHDFS(URI source, HiveConf conf): Uploads a local file to HDFS. |
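The recursive collection that collectOp performs can be illustrated with a small, self-contained sketch. The Node, FilterNode, and JoinNode types below are hypothetical stand-ins for Hive's Operator&lt;?&gt; tree (which is not available outside a Hive classpath); only the traversal pattern matches the description above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for Hive's Operator<?> tree: each node has children.
class Node {
    final List<Node> children = new ArrayList<>();
    Node addChild(Node c) { children.add(c); return c; }
}

class FilterNode extends Node {}
class JoinNode extends Node {}

public class CollectOpSketch {
    // Mirrors the documented behavior of SparkUtilities.collectOp:
    // recursively find all nodes under root that are of class clazz
    // and put them in result.
    static void collectOp(Collection<Node> result, Node root, Class<?> clazz) {
        if (clazz.isInstance(root)) {
            result.add(root);
        }
        for (Node child : root.children) {
            collectOp(result, child, clazz);
        }
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.addChild(new FilterNode()).addChild(new JoinNode());
        root.addChild(new FilterNode());

        List<Node> filters = new ArrayList<>();
        collectOp(filters, root, FilterNode.class);
        System.out.println(filters.size()); // prints 2: both FilterNode instances
    }
}
```

Note that the root itself is tested against clazz before recursing, so a matching root is included in the result.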
public static org.apache.hadoop.io.BytesWritable copyBytesWritable(org.apache.hadoop.io.BytesWritable bw)
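A defensive copy is needed here because Hadoop Writables reuse their backing buffers between records, and the backing array may be larger than the valid length. The Buf class below is a minimal stand-in for BytesWritable (assumed shape, not the Hadoop class); the copy pattern is what a method like copyBytesWritable must do.

```java
import java.util.Arrays;

// Minimal stand-in for Hadoop's BytesWritable: a reusable buffer whose
// backing array may be larger than the valid length.
class Buf {
    byte[] bytes;
    int length;
    Buf(byte[] bytes, int length) { this.bytes = bytes; this.length = length; }
}

public class CopySketch {
    // Deep-copy pattern: copy only the valid prefix of the backing array,
    // so later reuse of the source buffer cannot corrupt the copy.
    static Buf copy(Buf src) {
        return new Buf(Arrays.copyOf(src.bytes, src.length), src.length);
    }

    public static void main(String[] args) {
        Buf src = new Buf(new byte[] {1, 2, 3, 0, 0}, 3); // capacity 5, length 3
        Buf dst = copy(src);
        src.bytes[0] = 9;                 // mutate the source buffer
        System.out.println(dst.bytes[0]); // copy is unaffected: prints 1
    }
}
```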
public static URI getURI(String path) throws URISyntaxException

Throws:
URISyntaxException
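The listing gives no description for getURI, so the sketch below is purely an assumption about what such a path-to-URI helper commonly does: parse the string, and fall back to a local file URI when no scheme is present. The actual behavior lives in SparkUtilities.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class GetUriSketch {
    // Hypothetical sketch (not the Hive implementation): parse the string
    // as a URI; if it has no scheme, treat it as a local file path.
    static URI getURI(String path) throws URISyntaxException {
        URI uri = new URI(path);
        if (uri.getScheme() == null) {
            uri = new File(path).toURI(); // e.g. "/tmp/x" -> "file:/tmp/x"
        }
        return uri;
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(getURI("hdfs://nn:8020/user/hive").getScheme()); // prints hdfs
        System.out.println(getURI("/tmp/x").getScheme());                   // prints file
    }
}
```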
public static URI uploadToHDFS(URI source, HiveConf conf) throws IOException

Uploads a local file to HDFS.

Parameters:
source
conf

Throws:
IOException
public static boolean needUploadToHDFS(URI source, org.apache.spark.SparkConf sparkConf)
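The decision needUploadToHDFS makes can be sketched as follows, under an explicit assumption: a local resource must be shipped to HDFS when the job will run on a remote cluster (for example a yarn master), since executors cannot read the driver's local filesystem. The needUpload method and the plain sparkMaster string below are illustrative stand-ins; the real method consults a SparkConf.

```java
import java.net.URI;

public class NeedUploadSketch {
    // Assumed decision rule (not necessarily Hive's exact logic): upload is
    // needed only for local files destined for a remote (yarn) cluster.
    static boolean needUpload(URI source, String sparkMaster) {
        boolean isLocalFile = source.getScheme() == null
                || "file".equals(source.getScheme());
        boolean isRemoteCluster = sparkMaster != null
                && sparkMaster.startsWith("yarn");
        return isLocalFile && isRemoteCluster;
    }

    public static void main(String[] args) {
        System.out.println(needUpload(URI.create("file:/tmp/udf.jar"), "yarn-cluster")); // true
        System.out.println(needUpload(URI.create("hdfs://nn/udf.jar"), "yarn-cluster")); // false
        System.out.println(needUpload(URI.create("file:/tmp/udf.jar"), "local[2]"));     // false
    }
}
```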
public static boolean isDedicatedCluster(org.apache.hadoop.conf.Configuration conf)
public static SparkSession getSparkSession(HiveConf conf, SparkSessionManager sparkSessionManager) throws HiveException

Throws:
HiveException
public static String rddGraphToString(org.apache.spark.api.java.JavaPairRDD rdd)
public static org.apache.hadoop.fs.Path generateTmpPathForPartitionPruning(org.apache.hadoop.fs.Path basePath, String id)

Generate a temporary path for dynamic partition pruning in the Spark branch.

Parameters:
basePath
id
public static String getWorkId(BaseWork work)

Return the ID for this BaseWork, in String form.

Parameters:
work - the input BaseWork

public static void collectOp(Collection&lt;Operator&lt;?&gt;&gt; result, Operator&lt;?&gt; root, Class&lt;?&gt; clazz)

Recursively find all operators under root that are of class clazz, and put them in result.

Parameters:
result - all operators under root that are of class clazz
root - the root operator under which all operators will be examined
clazz - class to collect. Must NOT be null.

Copyright © 2016 The Apache Software Foundation. All rights reserved.