PREHOOK: query: -- When Correlation Optimizer is turned off, 6 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery xx, subquery yy, and xx join yy. EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 6 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery xx, subquery yy, and xx join yy. EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-3 depends on stages: Stage-2, Stage-7 Stage-4 depends on stages: Stage-3 Stage-6 is a root stage Stage-7 depends on stages: Stage-6 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col0 type: string expr: _col1 type: bigint $INTNAME1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col0 type: string expr: _col1 type: bigint Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint sort order: ++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint Reduce Operator Tree: Extract File Output 
Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-6 Map Reduce Alias -> Map Operator Tree: xx:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string xx:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-7 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 9 146 1 146 4 150 1 150 1 213 1 213 4 224 1 224 4 238 1 238 4 255 1 255 4 273 1 273 9 278 1 278 4 311 1 311 9 369 1 369 9 401 1 401 25 406 1 406 16 66 1 66 1 98 1 98 4 PREHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN 
(SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: xx:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string xx:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 3 Reduce Operator Tree: Demux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Join Operator condition 
map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint sort order: ++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 9 146 1 146 4 150 1 150 1 213 1 213 4 224 1 224 4 238 1 238 4 255 1 255 4 273 1 273 9 278 1 278 4 311 1 311 9 369 1 369 9 401 1 401 25 406 1 406 16 66 1 66 1 98 1 98 4 PREHOOK: query: -- Enable hive.auto.convert.join. EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- Enable hive.auto.convert.join. 
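A minimal sketch of the session settings these paired runs presumably toggle between EXPLAIN outputs; the property names are assumptions inferred from the comments (only hive.auto.convert.join is named in this output), since the .q file's set statements are not echoed here:

set hive.optimize.correlation=true;  -- assumed flag for the Correlation Optimizer; the correlated plans in this file show Demux/Mux operators and fewer MR stages
set hive.auto.convert.join=true;     -- named in the comment above; turns the subquery joins into Map Join Operators fed by the local HashTable Sink work (Stage-9)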
EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-9 is a root stage Stage-2 depends on stages: Stage-9 Stage-3 depends on stages: Stage-2 Stage-0 is a root stage STAGE PLANS: Stage: Stage-9 Map Reduce Local Work Alias -> Map Local Tables: xx:y Fetch Operator limit: -1 yy:y Fetch Operator limit: -1 Alias -> Map Local Operator Tree: xx:y TableScan alias: y HashTable Sink Operator condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] Position of Big Table: 0 yy:y TableScan alias: y HashTable Sink Operator condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] Position of Big Table: 0 Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: xx:x TableScan alias: x Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0 Position of Big Table: 0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint yy:x TableScan alias: x Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0 Position of Big Table: 0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: 
_col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Local Work: Map Reduce Local Work Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint sort order: ++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 9 146 1 146 4 150 1 150 1 213 1 213 4 224 1 224 4 238 1 238 4 255 1 255 4 273 1 273 9 278 1 278 4 311 1 311 9 369 1 369 9 401 1 401 25 406 1 406 16 66 1 66 1 98 1 98 4 PREHOOK: 
query: -- When Correlation Optimizer is turned off, 3 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 3 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME src) x)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-3 depends on stages: Stage-2 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Select Operator expressions: expr: key type: string outputColumnNames: key Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: key type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col0 type: string expr: _col1 type: bigint xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output 
Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 3 146 146 2 150 150 1 213 213 2 224 224 2 238 238 2 255 255 2 273 273 3 278 278 2 311 311 3 369 369 3 401 401 5 406 406 4 66 66 1 98 98 2 PREHOOK: query: EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME src) x)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:x TableScan alias: x Select Operator expressions: expr: key type: string outputColumnNames: key Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: key type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Demux Operator Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x GROUP BY x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 3 146 146 2 150 150 1 213 213 2 224 224 2 238 238 2 255 255 2 273 273 3 278 278 2 311 311 3 369 369 3 401 401 5 406 406 4 66 66 1 98 98 2 PREHOOK: query: -- When 
Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-3 depends on stages: Stage-2 Stage-4 depends on stages: Stage-3 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 
0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col0 type: string expr: _col1 type: bigint xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 9 146 146 4 150 150 1 213 213 4 224 224 4 238 238 4 255 255 4 273 273 9 278 278 4 311 311 9 369 369 9 401 401 25 406 406 16 66 66 1 98 98 4 PREHOOK: query: EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. 
(TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 Reduce Operator Tree: Demux Operator Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: 
Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 9 146 146 4 150 150 1 213 213 4 224 224 4 238 238 4 255 255 4 273 273 9 278 278 4 311 311 9 369 369 9 401 401 25 406 406 16 66 66 1 98 98 4 PREHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery xx and xx join yy. EXPLAIN SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery xx and xx join yy. EXPLAIN SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_TABREF (TOK_TABNAME src) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) key))))) STAGE DEPENDENCIES: Stage-3 is a root stage Stage-4 depends on stages: Stage-3 Stage-1 depends on stages: Stage-4 Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: xx:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string xx:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col0 type: string expr: _col1 type: bigint yy TableScan alias: yy Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 value expressions: expr: key type: string Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: 
Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 128 1 128 128 1 128 146 1 146 146 1 146 150 1 150 213 1 213 213 1 213 224 1 224 224 1 224 238 1 238 238 1 238 255 1 255 255 1 255 273 1 273 273 1 273 273 1 273 278 1 278 278 1 278 311 1 311 311 1 311 311 1 311 369 1 369 369 1 369 369 1 369 401 1 401 401 1 401 401 1 401 401 1 401 401 1 401 406 1 406 406 1 406 406 1 406 406 1 406 66 1 66 98 1 98 98 1 98 PREHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_TABREF (TOK_TABNAME src) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) key))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: xx:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string xx:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 yy TableScan alias: yy Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 value expressions: expr: key type: string Reduce Operator Tree: Demux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN src yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 
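The cnt values throughout these results follow directly from key frequencies: a self-join on key grouped by key returns count(1) equal to the squared frequency, so key 401, which occurs 5 times in src (the single-table aggregation earlier shows 401 401 5), yields 5 * 5 = 25 in the src self-join results, while src1 holds each of these keys once, so its self-join cnt stays 1. A hypothetical probe of one such group, not part of this test:

SELECT x.key, count(1) AS cnt      -- expected row: 401 25
FROM src x JOIN src y ON (x.key = y.key)
WHERE x.key = '401'
GROUP BY x.key;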
#### A masked pattern was here #### 128 1 128 128 1 128 128 1 128 146 1 146 146 1 146 150 1 150 213 1 213 213 1 213 224 1 224 224 1 224 238 1 238 238 1 238 255 1 255 255 1 255 273 1 273 273 1 273 273 1 273 278 1 278 278 1 278 311 1 311 311 1 311 311 1 311 369 1 369 369 1 369 369 1 369 401 1 401 401 1 401 401 1 401 401 1 401 401 1 401 406 1 406 406 1 406 406 1 406 406 1 406 66 1 66 98 1 98 98 1 98 PREHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery xx and xx join yy join zz. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery xx and xx join yy join zz. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_TABREF (TOK_TABNAME src) zz) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL zz) key))) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL zz) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-3 is a root stage Stage-4 depends on stages: Stage-3 Stage-1 depends on stages: Stage-4 Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 2 value expressions: expr: _col0 type: string expr: _col1 type: bigint xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string zz TableScan alias: zz Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} 1 2 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col8, _col9 Select Operator expressions: expr: _col0 type: string expr: _col8 type: string expr: _col9 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator 
compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 9 128 128 9 128 128 9 146 146 4 146 146 4 150 150 1 213 213 4 213 213 4 224 224 4 224 224 4 238 238 4 238 238 4 255 255 4 255 255 4 273 273 9 273 273 9 273 273 9 278 278 4 278 278 4 311 311 9 311 311 9 311 311 9 369 369 9 369 369 9 369 369 9 401 401 25 401 401 25 401 401 25 401 401 25 401 401 25 406 406 16 406 406 16 406 406 16 406 406 16 66 66 1 98 98 4 98 98 4 PREHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy join zz. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy join zz. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_TABREF (TOK_TABNAME src) zz) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL zz) key))) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL zz) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 3 zz TableScan alias: zz Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Demux Operator Mux Operator Join Operator condition map: Inner Join 0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} 1 2 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col8, _col9 Select Operator expressions: expr: _col0 type: string expr: _col8 type: string expr: _col9 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} 1 2 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col8, _col9 Select Operator expressions: expr: _col0 type: string expr: _col8 type: string expr: _col9 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN 
src zz ON xx.key=zz.key JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON zz.key=yy.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 9 128 128 9 128 128 9 146 146 4 146 146 4 150 150 1 213 213 4 213 213 4 224 224 4 224 224 4 238 238 4 238 238 4 255 255 4 255 255 4 273 273 9 273 273 9 273 273 9 278 278 4 278 278 4 311 311 9 311 311 9 311 311 9 369 369 9 369 369 9 369 369 9 401 401 25 401 401 25 401 401 25 401 401 25 401 401 25 406 406 16 406 406 16 406 406 16 406 406 16 66 66 1 98 98 4 98 98 4 PREHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy join zz. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 4 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery yy and xx join yy join zz. EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key))) (TOK_TABREF (TOK_TABNAME src) zz) (= (. (TOK_TABLE_OR_COL yy) key) (. (TOK_TABLE_OR_COL zz) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-3 is a root stage Stage-4 depends on stages: Stage-3 Stage-1 depends on stages: Stage-4 Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col0 type: string expr: _col1 type: bigint xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string zz TableScan alias: zz Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} 2 handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator 
compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 9 128 128 9 128 128 9 146 146 4 146 146 4 150 150 1 213 213 4 213 213 4 224 224 4 224 224 4 238 238 4 238 238 4 255 255 4 255 255 4 273 273 9 273 273 9 273 273 9 278 278 4 278 278 4 311 311 9 311 311 9 311 311 9 369 369 9 369 369 9 369 369 9 401 401 25 401 401 25 401 401 25 401 401 25 401 401 25 406 406 16 406 406 16 406 406 16 406 406 16 66 66 1 98 98 4 98 98 4 PREHOOK: query: EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key))) (TOK_TABREF (TOK_TABNAME src) zz) (= (. (TOK_TABLE_OR_COL yy) key) (. (TOK_TABLE_OR_COL zz) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: xx TableScan alias: xx Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 value expressions: expr: key type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 zz TableScan alias: zz Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 3 Reduce Operator Tree: Demux Operator Mux Operator Join Operator condition map: Inner Join 0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} 2 handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} 1 {VALUE._col0} {VALUE._col1} 2 handleSkewJoin: false outputColumnNames: _col0, _col4, _col5 Select Operator expressions: expr: _col0 type: string expr: _col4 type: string expr: _col5 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN (SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, yy.key, yy.cnt FROM src1 xx JOIN 
(SELECT x.key as key, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key) yy ON xx.key=yy.key JOIN src zz ON yy.key=zz.key ORDER BY xx.key, yy.key, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 128 9 128 128 9 128 128 9 146 146 4 146 146 4 150 150 1 213 213 4 213 213 4 224 224 4 224 224 4 238 238 4 238 238 4 255 255 4 255 255 4 273 273 9 273 273 9 273 273 9 278 278 4 278 278 4 311 311 9 311 311 9 311 311 9 369 369 9 369 369 9 369 369 9 401 401 25 401 401 25 401 401 25 401 401 25 401 401 25 406 406 16 406 406 16 406 406 16 406 406 16 66 66 1 98 98 4 98 98 4 PREHOOK: query: -- When Correlation Optimizer is turned off, 6 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery tmp and tmp join z. EXPLAIN SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 6 MR jobs are needed. -- When Correlation Optimizer is turned on, 2 MR jobs are needed. -- The first job will evaluate subquery tmp and tmp join z. EXPLAIN SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME src) x)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTIONSTAR count) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME src1) y)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL y) key) key) (TOK_SELEXPR (TOK_FUNCTIONSTAR count) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL y) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key) key) (TOK_SELEXPR (TOK_FUNCTION sum (. (TOK_TABLE_OR_COL xx) cnt)) sum1) (TOK_SELEXPR (TOK_FUNCTION sum (. (TOK_TABLE_OR_COL yy) cnt)) sum2)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL xx) key)))) tmp) (TOK_TABREF (TOK_TABNAME src) z) (= (. (TOK_TABLE_OR_COL tmp) key) (. (TOK_TABLE_OR_COL z) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL tmp) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL tmp) sum1)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL tmp) sum2)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL z) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL z) value))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL tmp) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL tmp) sum1)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL tmp) sum2)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL z) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL z) value))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1, Stage-7 Stage-3 depends on stages: Stage-2 Stage-4 depends on stages: Stage-3 Stage-5 depends on stages: Stage-4 Stage-7 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: tmp:yy:y TableScan alias: y Select Operator expressions: expr: key type: string outputColumnNames: key Group By Operator aggregations: expr: count() bucketGroup: false keys: expr: key type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint $INTNAME1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col0 type: string expr: _col1 type: bigint Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col3 Group By Operator aggregations: expr: sum(_col1) expr: sum(_col3) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint expr: _col2 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: sum(VALUE._col0) expr: sum(VALUE._col1) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value 
expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint z TableScan alias: z Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 value expressions: expr: key type: string expr: value type: string Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} {VALUE._col2} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-5 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string sort order: +++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-7 Map Reduce Alias -> Map Operator Tree: tmp:xx:x TableScan alias: x Select Operator expressions: expr: key type: string outputColumnNames: key Group By Operator aggregations: expr: count() bucketGroup: false keys: expr: key type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value POSTHOOK: 
type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 3 1 128 val_128 128 3 1 128 val_128 128 3 1 128 val_128 146 2 1 146 val_146 146 2 1 146 val_146 150 1 1 150 val_150 213 2 1 213 val_213 213 2 1 213 val_213 224 2 1 224 val_224 224 2 1 224 val_224 238 2 1 238 val_238 238 2 1 238 val_238 255 2 1 255 val_255 255 2 1 255 val_255 273 3 1 273 val_273 273 3 1 273 val_273 273 3 1 273 val_273 278 2 1 278 val_278 278 2 1 278 val_278 311 3 1 311 val_311 311 3 1 311 val_311 311 3 1 311 val_311 369 3 1 369 val_369 369 3 1 369 val_369 369 3 1 369 val_369 401 5 1 401 val_401 401 5 1 401 val_401 401 5 1 401 val_401 401 5 1 401 val_401 401 5 1 401 val_401 406 4 1 406 val_406 406 4 1 406 val_406 406 4 1 406 val_406 406 4 1 406 val_406 66 1 1 66 val_66 98 2 1 98 val_98 98 2 1 98 val_98 PREHOOK: query: EXPLAIN SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME src) x)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTIONSTAR count) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME src1) y)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL y) key) key) (TOK_SELEXPR (TOK_FUNCTIONSTAR count) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL y) key)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key) key) (TOK_SELEXPR (TOK_FUNCTION sum (. (TOK_TABLE_OR_COL xx) cnt)) sum1) (TOK_SELEXPR (TOK_FUNCTION sum (. (TOK_TABLE_OR_COL yy) cnt)) sum2)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL xx) key)))) tmp) (TOK_TABREF (TOK_TABNAME src) z) (= (. (TOK_TABLE_OR_COL tmp) key) (. (TOK_TABLE_OR_COL z) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL tmp) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL tmp) sum1)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL tmp) sum2)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL z) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL z) value))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL tmp) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL tmp) sum1)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL tmp) sum2)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL z) key)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL z) value))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: tmp:xx:x TableScan alias: x Select Operator expressions: expr: key type: string outputColumnNames: key Group By Operator aggregations: expr: count() bucketGroup: false keys: expr: key type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint tmp:yy:y TableScan alias: y Select Operator expressions: expr: key type: string outputColumnNames: key Group By Operator aggregations: expr: count() bucketGroup: false keys: expr: key type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint z TableScan alias: z Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 value expressions: expr: key type: string expr: value type: string Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col3 Mux Operator Group By Operator aggregations: expr: sum(_col1) expr: sum(_col3) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint outputColumnNames: _col0, _col1, _col2 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} {VALUE._col2} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col3 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col3 type: bigint outputColumnNames: _col0, _col1, _col3 Mux Operator Group By Operator aggregations: expr: sum(_col1) 
expr: sum(_col3) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint outputColumnNames: _col0, _col1, _col2 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} {VALUE._col2} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} {VALUE._col2} 1 {VALUE._col0} {VALUE._col1} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string sort order: +++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: bigint expr: _col3 type: string expr: _col4 type: string Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT tmp.key, tmp.sum1, tmp.sum2, z.key, z.value FROM (SELECT xx.key as key, sum(xx.cnt) as sum1, sum(yy.cnt) as sum2 FROM (SELECT x.key as key, count(*) AS cnt FROM src x group by x.key) xx JOIN (SELECT y.key as key, count(*) AS cnt FROM src1 y group by y.key) yy ON (xx.key=yy.key) GROUP BY xx.key) tmp JOIN src z ON tmp.key=z.key ORDER BY tmp.key, tmp.sum1, tmp.sum2, z.key, z.value POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 3 1 128 val_128 128 3 1 128 val_128 128 3 1 128 val_128 146 2 1 146 val_146 146 2 1 146 val_146 150 1 1 150 val_150 213 2 1 213 val_213 213 2 1 213 val_213 224 2 1 224 val_224 224 2 1 224 val_224 238 2 1 238 val_238 238 2 1 238 val_238 255 2 1 255 val_255 255 2 1 255 val_255 273 3 1 
273 val_273 273 3 1 273 val_273 273 3 1 273 val_273 278 2 1 278 val_278 278 2 1 278 val_278 311 3 1 311 val_311 311 3 1 311 val_311 311 3 1 311 val_311 369 3 1 369 val_369 369 3 1 369 val_369 369 3 1 369 val_369 401 5 1 401 val_401 401 5 1 401 val_401 401 5 1 401 val_401 401 5 1 401 val_401 401 5 1 401 val_401 406 4 1 406 val_406 406 4 1 406 val_406 406 4 1 406 val_406 406 4 1 406 val_406 66 1 1 66 val_66 98 2 1 98 val_98 98 2 1 98 val_98 PREHOOK: query: -- When Correlation Optimizer is turned off, 6 MR jobs are needed. -- When Correlation Optimizer is turned on, 4 MR jobs are needed. -- 2 MR jobs are used to evaluate yy, 1 MR is used to evaluate xx and xx join yy. -- The last MR is used for ordering. EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: -- When Correlation Optimizer is turned off, 6 MR jobs are needed. -- When Correlation Optimizer is turned on, 4 MR jobs are needed. -- 2 MR jobs are used to evaluate yy, 1 MR is used to evaluate xx and xx join yy. -- The last MR is used for ordering. EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) value) value) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL x) value)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) value)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) value)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-3 depends on stages: Stage-2, Stage-7 Stage-4 depends on stages: Stage-3 Stage-6 is a root stage Stage-7 depends on stages: Stage-6 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string expr: value type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 handleSkewJoin: false outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string outputColumnNames: _col0, _col1 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string expr: _col1 type: string mode: hash outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string sort order: ++ Map-reduce partition columns: expr: _col0 type: string expr: _col1 type: string tag: -1 value expressions: expr: _col2 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string expr: KEY._col1 type: string mode: mergepartial outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint $INTNAME1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col0 type: string expr: _col1 type: bigint Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} {VALUE._col2} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator 
key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint sort order: +++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-6 Map Reduce Alias -> Map Operator Tree: xx:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string xx:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-7 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: -1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 val_128 9 146 1 146 val_146 4 150 1 150 val_150 1 213 1 213 val_213 4 224 1 224 val_224 4 238 1 238 val_238 4 255 1 255 val_255 4 273 1 273 val_273 9 278 1 278 val_278 4 311 1 311 val_311 9 369 1 369 val_369 9 401 1 401 val_401 25 406 1 406 
val_406 16 66 1 66 val_66 1 98 1 98 val_98 4 PREHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) value) value) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL x) value)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) value)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) value)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-3 depends on stages: Stage-2 Stage-4 depends on stages: Stage-3 Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string expr: value type: string yy:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 handleSkewJoin: false outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string outputColumnNames: _col0, _col1 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string expr: _col1 type: string mode: hash outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string sort order: ++ Map-reduce partition columns: expr: _col0 type: string expr: _col1 type: string tag: -1 value expressions: expr: _col2 type: bigint Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string expr: KEY._col1 type: string mode: mergepartial outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 2 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint xx:x TableScan alias: x Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string xx:y TableScan alias: y Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 Reduce Operator Tree: Demux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} 1 handleSkewJoin: false outputColumnNames: _col0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Mux Operator Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} {VALUE._col2} 
handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} {VALUE._col2} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint sort order: +++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 val_128 9 146 1 146 val_146 4 150 1 150 val_150 1 213 1 213 val_213 4 224 1 224 val_224 4 238 1 238 val_238 4 255 1 255 val_255 4 273 1 273 val_273 9 278 1 278 val_278 4 311 1 311 val_311 9 369 1 369 val_369 9 401 1 401 val_401 25 406 1 406 val_406 16 66 1 66 val_66 1 98 1 98 val_98 4 PREHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key 
as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src1) x) (TOK_TABREF (TOK_TABNAME src1) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key)))) xx) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) x) (TOK_TABREF (TOK_TABNAME src) y) (= (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL y) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) key) key) (TOK_SELEXPR (. (TOK_TABLE_OR_COL x) value) value) (TOK_SELEXPR (TOK_FUNCTION count 1) cnt)) (TOK_GROUPBY (. (TOK_TABLE_OR_COL x) key) (. (TOK_TABLE_OR_COL x) value)))) yy) (= (. (TOK_TABLE_OR_COL xx) key) (. (TOK_TABLE_OR_COL yy) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) key)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) value)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL yy) cnt))) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL xx) cnt)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL yy) value)) (TOK_TABSORTCOLNAMEASC (. 
(TOK_TABLE_OR_COL yy) cnt))))) STAGE DEPENDENCIES: Stage-11 is a root stage Stage-2 depends on stages: Stage-11 Stage-10 depends on stages: Stage-2 Stage-3 depends on stages: Stage-10 Stage-4 depends on stages: Stage-3 Stage-0 is a root stage STAGE PLANS: Stage: Stage-11 Map Reduce Local Work Alias -> Map Local Tables: yy:y Fetch Operator limit: -1 Alias -> Map Local Operator Tree: yy:y TableScan alias: y HashTable Sink Operator condition expressions: 0 {key} {value} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] Position of Big Table: 0 Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: yy:x TableScan alias: x Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} {value} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0, _col1 Position of Big Table: 0 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string outputColumnNames: _col0, _col1 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string expr: _col1 type: string mode: hash outputColumnNames: _col0, _col1, _col2 Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: string sort order: ++ Map-reduce partition columns: expr: _col0 type: string expr: _col1 type: string tag: -1 value expressions: expr: _col2 type: bigint Local Work: Map Reduce Local Work Reduce Operator Tree: Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string expr: KEY._col1 type: string mode: mergepartial outputColumnNames: _col0, _col1, _col2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint outputColumnNames: _col0, _col1, _col2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-10 Map Reduce Local Work Alias -> Map Local Tables: xx:y Fetch Operator limit: -1 Alias -> Map Local Operator Tree: xx:y TableScan alias: y HashTable Sink Operator condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] Position of Big Table: 0 Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: $INTNAME Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: bigint xx:x TableScan alias: x Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0 Position of Big Table: 0 Select Operator expressions: expr: _col0 type: string outputColumnNames: _col0 Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: _col0 type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint Local Work: Map Reduce Local Work Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Join Operator 
condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} {VALUE._col2} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Mux Operator Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} {VALUE._col2} handleSkewJoin: false outputColumnNames: _col0, _col1, _col2, _col3, _col4 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint outputColumnNames: _col0, _col1, _col2, _col3, _col4 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-4 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint sort order: +++++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: bigint expr: _col2 type: string expr: _col3 type: string expr: _col4 type: bigint Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT xx.key, xx.cnt, yy.key, yy.value, yy.cnt FROM (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx JOIN (SELECT x.key as key, x.value as value, count(1) as cnt FROM src x JOIN src y ON (x.key = y.key) group by x.key, x.value) yy ON xx.key=yy.key ORDER BY xx.key, xx.cnt, yy.key, yy.value, yy.cnt POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### 128 1 128 val_128 9 146 1 146 val_146 4 150 1 150 val_150 1 213 1 213 val_213 4 224 1 224 val_224 4 238 1 238 val_238 4 255 1 255 val_255 4 273 1 273 val_273 9 278 1 278 val_278 4 311 1 311 val_311 9 369 1 369 val_369 9 401 1 401 val_401 25 406 1 406 val_406 16 66 1 66 val_66 1 98 1 98 val_98 4
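Editorial note (a sketch added for readability; the session settings below are assumptions and are not part of the captured output): the two EXPLAIN outputs above are for the same query under different configurations. The first plan evaluates the correlated joins and group-bys in a single shared reduce phase (the Demux and Mux operators in Stage-3), while the second additionally converts the joins against the small inputs into map joins (the Map Reduce Local Work stages with HashTable Sink operators, Stage-11 and Stage-10), keeping the Demux-based reduce side. In a Hive CLI session this would typically be driven by settings along these lines:

SET hive.optimize.correlation=true;    -- Correlation Optimizer: yields the Demux/Mux plan shown first
EXPLAIN SELECT ...;                    -- the query above; first plan
SET hive.auto.convert.join=true;       -- additionally auto-converts eligible joins to map joins
EXPLAIN SELECT ...;                    -- the same query; second plan (HashTable Sink / Map Join stages)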
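A hypothetical sanity check on the result rows (not part of the test; aliases are illustrative): yy.cnt comes from self-joining src on key and grouping by (key, value), and in the standard src test table each key maps to a single value, so yy.cnt equals the square of a key's row count in src. The values 9, 16 and 25 above therefore imply that keys 128, 406 and 401 occur 3, 4 and 5 times in src, respectively; xx.cnt is the analogous count over src1, where each listed key occurs exactly once, hence the constant 1. A quick verification query could be:

-- hypothetical check: expected_yy_cnt should match the yy.cnt column above
SELECT key,
       count(1)            AS n_rows_in_src,
       count(1) * count(1) AS expected_yy_cnt
FROM src
WHERE key IN ('128', '401', '406')
GROUP BY key;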