PREHOOK: query: CREATE TABLE tbl1(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS PREHOOK: type: CREATETABLE PREHOOK: Output: database:default PREHOOK: Output: default@tbl1 POSTHOOK: query: CREATE TABLE tbl1(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS POSTHOOK: type: CREATETABLE POSTHOOK: Output: database:default POSTHOOK: Output: default@tbl1 PREHOOK: query: CREATE TABLE tbl2(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS PREHOOK: type: CREATETABLE PREHOOK: Output: database:default PREHOOK: Output: default@tbl2 POSTHOOK: query: CREATE TABLE tbl2(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS POSTHOOK: type: CREATETABLE POSTHOOK: Output: database:default POSTHOOK: Output: default@tbl2 PREHOOK: query: CREATE TABLE tbl3(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS PREHOOK: type: CREATETABLE PREHOOK: Output: database:default PREHOOK: Output: default@tbl3 POSTHOOK: query: CREATE TABLE tbl3(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS POSTHOOK: type: CREATETABLE POSTHOOK: Output: database:default POSTHOOK: Output: default@tbl3 PREHOOK: query: CREATE TABLE tbl4(key int, value string) CLUSTERED BY (value) SORTED BY (value) INTO 2 BUCKETS PREHOOK: type: CREATETABLE PREHOOK: Output: database:default PREHOOK: Output: default@tbl4 POSTHOOK: query: CREATE TABLE tbl4(key int, value string) CLUSTERED BY (value) SORTED BY (value) INTO 2 BUCKETS POSTHOOK: type: CREATETABLE POSTHOOK: Output: database:default POSTHOOK: Output: default@tbl4 PREHOOK: query: insert overwrite table tbl1 select * from src PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: default@tbl1 POSTHOOK: query: insert overwrite table tbl1 select * from src POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Output: default@tbl1 POSTHOOK: Lineage: tbl1.key EXPRESSION [(src)src.FieldSchema(name:key, type:string, comment:default), ] POSTHOOK: Lineage: tbl1.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] PREHOOK: query: insert overwrite table tbl2 select * from src PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: default@tbl2 POSTHOOK: query: insert overwrite table tbl2 select * from src POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Output: default@tbl2 POSTHOOK: Lineage: tbl2.key EXPRESSION [(src)src.FieldSchema(name:key, type:string, comment:default), ] POSTHOOK: Lineage: tbl2.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] PREHOOK: query: insert overwrite table tbl3 select * from src PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: default@tbl3 POSTHOOK: query: insert overwrite table tbl3 select * from src POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Output: default@tbl3 POSTHOOK: Lineage: tbl3.key EXPRESSION [(src)src.FieldSchema(name:key, type:string, comment:default), ] POSTHOOK: Lineage: tbl3.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] PREHOOK: query: insert overwrite table tbl4 select * from src PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: default@tbl4 POSTHOOK: query: insert overwrite table tbl4 select * from src POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Output: default@tbl4 POSTHOOK: Lineage: tbl4.key EXPRESSION [(src)src.FieldSchema(name:key, type:string, comment:default), ] POSTHOOK: Lineage: tbl4.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] PREHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on a different key -- Three tests below are all the same query with different alias, which changes dispatch order of GenMapRedWalker -- This is dependent to iteration order of HashMap, so can be meaningless in non-sun jdk -- b = TS[0]-OP[13]-MAPJOIN[11]-RS[6]-JOIN[8]-SEL[9]-FS[10] -- c = TS[1]-RS[7]-JOIN[8] -- a = TS[2]-MAPJOIN[11] explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on a different key -- Three tests below are all the same query with different alias, which changes dispatch order of GenMapRedWalker -- This is dependent to iteration order of HashMap, so can be meaningless in non-sun jdk -- b = TS[0]-OP[13]-MAPJOIN[11]-RS[6]-JOIN[8]-SEL[9]-FS[10] -- c = TS[1]-RS[7]-JOIN[8] -- a = TS[2]-MAPJOIN[11] explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key is not null and value is not null) (type: boolean) Statistics: Num rows: 125 Data size: 1328 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: int) 1 key (type: int) outputColumnNames: _col1 Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) TableScan alias: c Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: string) 1 value (type: string) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 #### A masked pattern was here #### 2654 PREHOOK: query: -- d = TS[0]-RS[7]-JOIN[8]-SEL[9]-FS[10] -- b = TS[1]-OP[13]-MAPJOIN[11]-RS[6]-JOIN[8] -- a = TS[2]-MAPJOIN[11] explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src d on d.value = a.value PREHOOK: type: QUERY POSTHOOK: query: -- d = TS[0]-RS[7]-JOIN[8]-SEL[9]-FS[10] -- b = TS[1]-OP[13]-MAPJOIN[11]-RS[6]-JOIN[8] -- a = TS[2]-MAPJOIN[11] explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src d on d.value = a.value POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key is not null and value is not null) (type: boolean) Statistics: Num rows: 125 Data size: 1328 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: int) 1 key (type: int) outputColumnNames: _col1 Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) TableScan alias: d Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: string) 1 value (type: string) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src d on d.value = a.value PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src d on d.value = a.value POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 #### A masked pattern was here #### 2654 PREHOOK: query: -- b = TS[0]-OP[13]-MAPJOIN[11]-RS[6]-JOIN[8]-SEL[9]-FS[10] -- a = TS[1]-MAPJOIN[11] -- h = TS[2]-RS[7]-JOIN[8] explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on h.value = a.value PREHOOK: type: QUERY POSTHOOK: query: -- b = TS[0]-OP[13]-MAPJOIN[11]-RS[6]-JOIN[8]-SEL[9]-FS[10] -- a = TS[1]-MAPJOIN[11] -- h = TS[2]-RS[7]-JOIN[8] explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on h.value = a.value POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key is not null and value is not null) (type: boolean) Statistics: Num rows: 125 Data size: 1328 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: int) 1 key (type: int) outputColumnNames: _col1 Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) TableScan alias: h Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: string) 1 value (type: string) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on h.value = a.value PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on h.value = a.value POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-8 is a root stage , consists of Stage-9, Stage-10, Stage-11, Stage-1 Stage-9 has a backup stage: Stage-1 Stage-5 depends on stages: Stage-9 Stage-2 depends on stages: Stage-1, Stage-5, Stage-6, Stage-7 Stage-10 has a backup stage: Stage-1 Stage-6 depends on stages: Stage-10 Stage-11 has a backup stage: Stage-1 Stage-7 depends on stages: Stage-11 Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-8 Conditional Operator Stage: Stage-9 Map Reduce Local Work Alias -> Map Local Tables: b Fetch Operator limit: -1 c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: b TableScan alias: b Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) c TableScan alias: c Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Stage: Stage-5 Map Reduce Map Operator Tree: TableScan alias: a Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Local Work: Map Reduce Local Work Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-10 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) c TableScan alias: c Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Stage: Stage-6 Map Reduce Map Operator Tree: TableScan alias: b Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Local Work: Map Reduce Local Work Stage: Stage-11 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 b Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) b TableScan alias: b Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Stage: Stage-7 Map Reduce Map Operator Tree: TableScan alias: c Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Local Work: Map Reduce Local Work Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: UDFToDouble(key) (type: double) sort order: + Map-reduce partition columns: UDFToDouble(key) (type: double) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE TableScan alias: b Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: UDFToDouble(key) (type: double) sort order: + Map-reduce partition columns: UDFToDouble(key) (type: double) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE TableScan alias: c Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: UDFToDouble(key) (type: double) sort order: + Map-reduce partition columns: UDFToDouble(key) (type: double) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key PREHOOK: type: QUERY PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 PREHOOK: Input: default@tbl3 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key POSTHOOK: type: QUERY POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 POSTHOOK: Input: default@tbl3 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on a different key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on a different key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key is not null and value is not null) (type: boolean) Statistics: Num rows: 125 Data size: 1328 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: int) 1 key (type: int) outputColumnNames: _col1 Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) TableScan alias: c Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: string) 1 value (type: string) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value PREHOOK: type: QUERY PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 PREHOOK: Input: default@tbl4 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value POSTHOOK: type: QUERY POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 POSTHOOK: Input: default@tbl4 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on a different key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on a different key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key is not null and value is not null) (type: boolean) Statistics: Num rows: 125 Data size: 1328 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: int) 1 key (type: int) outputColumnNames: _col1 Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) TableScan alias: c Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: string) 1 value (type: string) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.value = a.value POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a non-bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-8 is a root stage , consists of Stage-9, Stage-10, Stage-11, Stage-1 Stage-9 has a backup stage: Stage-1 Stage-5 depends on stages: Stage-9 Stage-2 depends on stages: Stage-1, Stage-5, Stage-6, Stage-7 Stage-10 has a backup stage: Stage-1 Stage-6 depends on stages: Stage-10 Stage-11 has a backup stage: Stage-1 Stage-7 depends on stages: Stage-11 Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-8 Conditional Operator Stage: Stage-9 Map Reduce Local Work Alias -> Map Local Tables: b Fetch Operator limit: -1 c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: b TableScan alias: b Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) c TableScan alias: c Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Stage: Stage-5 Map Reduce Map Operator Tree: TableScan alias: a Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Local Work: Map Reduce Local Work Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-10 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) c TableScan alias: c Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Stage: Stage-6 Map Reduce Map Operator Tree: TableScan alias: b Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Local Work: Map Reduce Local Work Stage: Stage-11 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 b Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) b TableScan alias: b Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) HashTable Sink Operator keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Stage: Stage-7 Map Reduce Map Operator Tree: TableScan alias: c Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Local Work: Map Reduce Local Work Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: UDFToDouble(key) (type: double) sort order: + Map-reduce partition columns: UDFToDouble(key) (type: double) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE TableScan alias: b Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: UDFToDouble(key) (type: double) sort order: + Map-reduce partition columns: UDFToDouble(key) (type: double) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE TableScan alias: c Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToDouble(key) is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: UDFToDouble(key) (type: double) sort order: + Map-reduce partition columns: UDFToDouble(key) (type: double) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 UDFToDouble(key) (type: double) 1 UDFToDouble(key) (type: double) 2 UDFToDouble(key) (type: double) Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src c on c.key = a.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on the same key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-7 is a root stage , consists of Stage-8, Stage-9, Stage-10, Stage-1 Stage-8 has a backup stage: Stage-1 Stage-4 depends on stages: Stage-8 Stage-9 has a backup stage: Stage-1 Stage-5 depends on stages: Stage-9 Stage-10 has a backup stage: Stage-1 Stage-6 depends on stages: Stage-10 Stage-1 Stage-0 depends on stages: Stage-4, Stage-5, Stage-6, Stage-1 STAGE PLANS: Stage: Stage-7 Conditional Operator Stage: Stage-8 Map Reduce Local Work Alias -> Map Local Tables: b Fetch Operator limit: -1 c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: b TableScan alias: b Filter Operator predicate: key is not null (type: boolean) HashTable Sink Operator keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) c TableScan alias: c Filter Operator predicate: key is not null (type: boolean) HashTable Sink Operator keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Stage: Stage-4 Map Reduce Map Operator Tree: TableScan alias: a Filter Operator predicate: key is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Local Work: Map Reduce Local Work Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-9 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Filter Operator predicate: key is not null (type: boolean) HashTable Sink Operator keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) c TableScan alias: c Filter Operator predicate: key is not null (type: boolean) HashTable Sink Operator keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Stage: Stage-5 Map Reduce Map Operator Tree: TableScan alias: b Filter Operator predicate: key is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Local Work: Map Reduce Local Work Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-10 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 b Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Filter Operator predicate: key is not null (type: boolean) HashTable Sink Operator keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) b TableScan alias: b Filter Operator predicate: key is not null (type: boolean) HashTable Sink Operator keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Stage: Stage-6 Map Reduce Map Operator Tree: TableScan alias: c Filter Operator predicate: key is not null (type: boolean) Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Local Work: Map Reduce Local Work Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 key (type: int) 1 key (type: int) 2 key (type: int) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key PREHOOK: type: QUERY PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 PREHOOK: Input: default@tbl3 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl3 c on c.key = a.key POSTHOOK: type: QUERY POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 POSTHOOK: Input: default@tbl3 #### A masked pattern was here #### 2654 PREHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on a different key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value PREHOOK: type: QUERY POSTHOOK: query: -- A SMB join is being followed by a regular join on a bucketed table on a different key explain select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key is not null and value is not null) (type: boolean) Statistics: Num rows: 125 Data size: 1328 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: int) 1 key (type: int) outputColumnNames: _col1 Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) TableScan alias: c Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: string) 1 value (type: string) Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: value expressions: _col0 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value PREHOOK: type: QUERY PREHOOK: Input: default@tbl1 PREHOOK: Input: default@tbl2 PREHOOK: Input: default@tbl4 #### A masked pattern was here #### POSTHOOK: query: select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join tbl4 c on c.value = a.value POSTHOOK: type: QUERY POSTHOOK: Input: default@tbl1 POSTHOOK: Input: default@tbl2 POSTHOOK: Input: default@tbl4 #### A masked pattern was here #### 2654