PREHOOK: query: explain select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 PREHOOK: type: QUERY POSTHOOK: query: explain select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string), value (type: string) outputColumnNames: key, value Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(value) keys: key (type: string) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1 Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (_col1 > 1) (type: boolean) Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator key expressions: _col0 (type: string), _col1 (type: bigint) sort order: +- Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: bigint) outputColumnNames: _col0, _col1 Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 PREHOOK: type: QUERY PREHOOK: Input: default@srcpart PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=11 #### A masked pattern was here #### POSTHOOK: query: select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 POSTHOOK: type: QUERY POSTHOOK: Input: default@srcpart POSTHOOK: Input: default@srcpart@ds=2008-04-08/hr=11 #### A masked pattern was here #### 0 3 100 2 103 2 104 2 113 2 118 2 119 3 12 2 120 2 125 2 128 3 129 2 134 2 137 2 138 4 146 2 149 2 15 2 152 2 164 2 165 2 167 3 169 4 172 2 174 2 175 2 176 2 179 2 18 2 187 3 191 2 193 3 195 2 197 2 199 3 200 2 203 2 205 2 207 2 208 3 209 2 213 2 216 2 217 2 219 2 221 2 223 2 224 2 229 2 230 5 233 2 237 2 238 2 239 2 24 2 242 2 255 2 256 2 26 2 265 2 272 2 273 3 277 4 278 2 280 2 281 2 282 2 288 2 298 3 307 2 309 2 311 3 316 3 317 2 318 3 321 2 322 2 325 2 327 3 331 2 333 2 342 2 344 2 348 5 35 3 353 2 367 2 369 3 37 2 382 2 384 3 395 2 396 3 397 2 399 2 401 5 403 3 404 2 406 4 409 3 413 2 414 2 417 3 42 2 424 2 429 2 430 3 431 3 438 3 439 2 454 3 458 2 459 2 462 2 463 2 466 3 468 4 469 5 478 2 480 3 489 4 492 2 498 3 5 3 51 2 58 2 67 2 70 3 72 2 76 2 83 2 84 2 90 3 95 2 97 2 98 2 PREHOOK: query: EXPLAIN SELECT user_id FROM ( SELECT CAST(key AS INT) AS user_id ,CASE WHEN (value LIKE 'aaa%' OR value LIKE 'vvv%') THEN 1 ELSE 0 END AS tag_student FROM srcpart ) sub WHERE sub.tag_student > 0 PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT user_id FROM ( SELECT CAST(key AS INT) AS user_id ,CASE WHEN (value LIKE 'aaa%' OR value LIKE 'vvv%') THEN 1 ELSE 0 END AS tag_student FROM srcpart ) sub WHERE sub.tag_student > 0 POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: srcpart Statistics: Num rows: 2000 Data size: 21248 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (CASE WHEN (((value like 'aaa%') or (value like 'vvv%'))) THEN (1) ELSE (0) END > 0) (type: boolean) Statistics: Num rows: 666 Data size: 7075 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: UDFToInteger(key) (type: int) outputColumnNames: _col0 Statistics: Num rows: 666 Data size: 7075 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 666 Data size: 7075 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: EXPLAIN SELECT x.key, x.value as v1, y.key FROM SRC x JOIN SRC y ON (x.key = y.key) where x.key = 20 CLUSTER BY v1 PREHOOK: type: QUERY POSTHOOK: query: EXPLAIN SELECT x.key, x.value as v1, y.key FROM SRC x JOIN SRC y ON (x.key = y.key) where x.key = 20 CLUSTER BY v1 POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: x Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key = 20) (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: '20' (type: string) sort order: + Map-reduce partition columns: '20' (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE value expressions: value (type: string) TableScan alias: y Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (key = 20) (type: boolean) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: '20' (type: string) sort order: + Map-reduce partition columns: '20' (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 key (type: string) 1 key (type: string) outputColumnNames: _col1 Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator key expressions: _col1 (type: string) sort order: + Map-reduce partition columns: _col1 (type: string) Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Select Operator expressions: '20' (type: string), KEY.reducesinkkey0 (type: string), '20' (type: string) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: explain select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 PREHOOK: type: QUERY POSTHOOK: query: explain select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string), value (type: string) outputColumnNames: key, value Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(value) keys: key (type: string) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint) Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1 Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator key expressions: _col0 (type: string), _col1 (type: bigint) sort order: +- Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: bigint) outputColumnNames: _col0, _col1 Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (_col1 > 1) (type: boolean) Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string), _col1 (type: bigint) outputColumnNames: _col0, _col1 Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink PREHOOK: query: select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 PREHOOK: type: QUERY PREHOOK: Input: default@srcpart PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=11 #### A masked pattern was here #### POSTHOOK: query: select b.key,b.cc from ( select a.* from ( select key, count(value) as cc from srcpart a where a.ds = '2008-04-08' and a.hr = '11' group by key )a distribute by a.key sort by a.key,a.cc desc) b where b.cc>1 POSTHOOK: type: QUERY POSTHOOK: Input: default@srcpart POSTHOOK: Input: default@srcpart@ds=2008-04-08/hr=11 #### A masked pattern was here #### 0 3 100 2 103 2 104 2 113 2 118 2 119 3 12 2 120 2 125 2 128 3 129 2 134 2 137 2 138 4 146 2 149 2 15 2 152 2 164 2 165 2 167 3 169 4 172 2 174 2 175 2 176 2 179 2 18 2 187 3 191 2 193 3 195 2 197 2 199 3 200 2 203 2 205 2 207 2 208 3 209 2 213 2 216 2 217 2 219 2 221 2 223 2 224 2 229 2 230 5 233 2 237 2 238 2 239 2 24 2 242 2 255 2 256 2 26 2 265 2 272 2 273 3 277 4 278 2 280 2 281 2 282 2 288 2 298 3 307 2 309 2 311 3 316 3 317 2 318 3 321 2 322 2 325 2 327 3 331 2 333 2 342 2 344 2 348 5 35 3 353 2 367 2 369 3 37 2 382 2 384 3 395 2 396 3 397 2 399 2 401 5 403 3 404 2 406 4 409 3 413 2 414 2 417 3 42 2 424 2 429 2 430 3 431 3 438 3 439 2 454 3 458 2 459 2 462 2 463 2 466 3 468 4 469 5 478 2 480 3 489 4 492 2 498 3 5 3 51 2 58 2 67 2 70 3 72 2 76 2 83 2 84 2 90 3 95 2 97 2 98 2