PREHOOK: query: --HIVE-2101 mapjoin sometimes gives wrong results if there is a filter in the on condition SELECT * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: --HIVE-2101 mapjoin sometimes gives wrong results if there is a filter in the on condition SELECT * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### NULL NULL 128 128 val_128 NULL NULL 128 128 val_128 NULL NULL 128 128 val_128 NULL NULL 146 val_146 146 val_146 NULL NULL 146 val_146 146 val_146 NULL NULL 150 val_150 150 val_150 NULL NULL 213 val_213 213 val_213 NULL NULL 213 val_213 213 val_213 NULL NULL 224 224 val_224 NULL NULL 224 224 val_224 NULL NULL 238 val_238 238 val_238 NULL NULL 238 val_238 238 val_238 NULL NULL 255 val_255 255 val_255 NULL NULL 255 val_255 255 val_255 NULL NULL 273 val_273 273 val_273 NULL NULL 273 val_273 273 val_273 NULL NULL 273 val_273 273 val_273 NULL NULL 278 val_278 278 val_278 NULL NULL 278 val_278 278 val_278 NULL NULL 66 val_66 66 val_66 NULL NULL 98 val_98 98 val_98 NULL NULL 98 val_98 98 val_98 PREHOOK: query: explain SELECT /*+ mapjoin(src1, src2) */ * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key PREHOOK: type: QUERY POSTHOOK: query: explain SELECT /*+ mapjoin(src1, src2) */ * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_RIGHTOUTERJOIN (TOK_TABREF (TOK_TABNAME src1)) (TOK_TABREF (TOK_TABNAME src1) src2) (AND (AND (= (. (TOK_TABLE_OR_COL src1) key) (. (TOK_TABLE_OR_COL src2) key)) (< (. (TOK_TABLE_OR_COL src1) key) 10)) (> (. (TOK_TABLE_OR_COL src2) key) 10))) (TOK_TABREF (TOK_TABNAME src) src3) (AND (= (. (TOK_TABLE_OR_COL src2) key) (. (TOK_TABLE_OR_COL src3) key)) (< (. (TOK_TABLE_OR_COL src3) key) 300)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_HINTLIST (TOK_HINT TOK_MAPJOIN (TOK_HINTARGLIST src1 src2))) (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_SORTBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL src1) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL src2) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL src3) key))))) STAGE DEPENDENCIES: Stage-5 is a root stage Stage-1 depends on stages: Stage-5 Stage-2 depends on stages: Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-5 Map Reduce Local Work Alias -> Map Local Tables: src1 Fetch Operator limit: -1 src2 Fetch Operator limit: -1 Alias -> Map Local Operator Tree: src1 TableScan alias: src1 Filter Operator predicate: expr: ((key < 10.0) and (key < 300.0)) type: boolean HashTable Sink Operator condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] Position of Big Table: 2 src2 TableScan alias: src2 Filter Operator predicate: expr: (key < 300.0) type: boolean HashTable Sink Operator condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] Position of Big Table: 2 Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: src3 TableScan alias: src3 Filter Operator predicate: expr: (key < 300.0) type: boolean Map Join Operator condition map: Right Outer Join0 to 1 Inner Join 1 to 2 condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] outputColumnNames: _col0, _col1, _col4, _col5, _col8, _col9 Position of Big Table: 2 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Local Work: Map Reduce Local Work Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col4 type: string expr: _col5 type: string expr: _col8 type: string expr: _col9 type: string outputColumnNames: _col0, _col1, _col4, _col5, _col8, _col9 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col4 type: string expr: _col5 type: string expr: _col8 type: string expr: _col9 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Reduce Output Operator key expressions: expr: _col0 type: string expr: _col2 type: string expr: _col4 type: string sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: string expr: _col3 type: string expr: _col4 type: string expr: _col5 type: string Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT /*+ mapjoin(src1, src2) */ * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT /*+ mapjoin(src1, src2) */ * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### NULL NULL 128 128 val_128 NULL NULL 128 128 val_128 NULL NULL 128 128 val_128 NULL NULL 146 val_146 146 val_146 NULL NULL 146 val_146 146 val_146 NULL NULL 150 val_150 150 val_150 NULL NULL 213 val_213 213 val_213 NULL NULL 213 val_213 213 val_213 NULL NULL 224 224 val_224 NULL NULL 224 224 val_224 NULL NULL 238 val_238 238 val_238 NULL NULL 238 val_238 238 val_238 NULL NULL 255 val_255 255 val_255 NULL NULL 255 val_255 255 val_255 NULL NULL 273 val_273 273 val_273 NULL NULL 273 val_273 273 val_273 NULL NULL 273 val_273 273 val_273 NULL NULL 278 val_278 278 val_278 NULL NULL 278 val_278 278 val_278 NULL NULL 66 val_66 66 val_66 NULL NULL 98 val_98 98 val_98 NULL NULL 98 val_98 98 val_98 PREHOOK: query: explain SELECT * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key PREHOOK: type: QUERY POSTHOOK: query: explain SELECT * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key POSTHOOK: type: QUERY ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_RIGHTOUTERJOIN (TOK_TABREF (TOK_TABNAME src1)) (TOK_TABREF (TOK_TABNAME src1) src2) (AND (AND (= (. (TOK_TABLE_OR_COL src1) key) (. (TOK_TABLE_OR_COL src2) key)) (< (. (TOK_TABLE_OR_COL src1) key) 10)) (> (. (TOK_TABLE_OR_COL src2) key) 10))) (TOK_TABREF (TOK_TABNAME src) src3) (AND (= (. (TOK_TABLE_OR_COL src2) key) (. (TOK_TABLE_OR_COL src3) key)) (< (. (TOK_TABLE_OR_COL src3) key) 300)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_SORTBY (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL src1) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL src2) key)) (TOK_TABSORTCOLNAMEASC (. (TOK_TABLE_OR_COL src3) key))))) STAGE DEPENDENCIES: Stage-7 is a root stage , consists of Stage-8, Stage-9, Stage-1 Stage-8 has a backup stage: Stage-1 Stage-5 depends on stages: Stage-8 Stage-2 depends on stages: Stage-1, Stage-5, Stage-6 Stage-9 has a backup stage: Stage-1 Stage-6 depends on stages: Stage-9 Stage-1 Stage-0 is a root stage STAGE PLANS: Stage: Stage-7 Conditional Operator Stage: Stage-8 Map Reduce Local Work Alias -> Map Local Tables: src1 Fetch Operator limit: -1 src3 Fetch Operator limit: -1 Alias -> Map Local Operator Tree: src1 TableScan alias: src1 Filter Operator predicate: expr: ((key < 10.0) and (key < 300.0)) type: boolean HashTable Sink Operator condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] Position of Big Table: 1 src3 TableScan alias: src3 Filter Operator predicate: expr: (key < 300.0) type: boolean HashTable Sink Operator condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] Position of Big Table: 1 Stage: Stage-5 Map Reduce Alias -> Map Operator Tree: src2 TableScan alias: src2 Filter Operator predicate: expr: (key < 300.0) type: boolean Map Join Operator condition map: Right Outer Join0 to 1 Inner Join 1 to 2 condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] outputColumnNames: _col0, _col1, _col4, _col5, _col8, _col9 Position of Big Table: 1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col4 type: string expr: _col5 type: string expr: _col8 type: string expr: _col9 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Local Work: Map Reduce Local Work Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: #### A masked pattern was here #### Reduce Output Operator key expressions: expr: _col0 type: string expr: _col2 type: string expr: _col4 type: string sort order: +++ tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string expr: _col2 type: string expr: _col3 type: string expr: _col4 type: string expr: _col5 type: string Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-9 Map Reduce Local Work Alias -> Map Local Tables: src1 Fetch Operator limit: -1 src2 Fetch Operator limit: -1 Alias -> Map Local Operator Tree: src1 TableScan alias: src1 Filter Operator predicate: expr: ((key < 10.0) and (key < 300.0)) type: boolean HashTable Sink Operator condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] Position of Big Table: 2 src2 TableScan alias: src2 Filter Operator predicate: expr: (key < 300.0) type: boolean HashTable Sink Operator condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] Position of Big Table: 2 Stage: Stage-6 Map Reduce Alias -> Map Operator Tree: src3 TableScan alias: src3 Filter Operator predicate: expr: (key < 300.0) type: boolean Map Join Operator condition map: Right Outer Join0 to 1 Inner Join 1 to 2 condition expressions: 0 {key} {value} 1 {key} {value} 2 {key} {value} filter predicates: 0 1 {(key > 10.0)} 2 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] 2 [Column[key]] outputColumnNames: _col0, _col1, _col4, _col5, _col8, _col9 Position of Big Table: 2 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col4 type: string expr: _col5 type: string expr: _col8 type: string expr: _col9 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Local Work: Map Reduce Local Work Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: src1 TableScan alias: src1 Filter Operator predicate: expr: ((key < 10.0) and (key < 300.0)) type: boolean Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 0 value expressions: expr: key type: string expr: value type: string src2 TableScan alias: src2 Filter Operator predicate: expr: (key < 300.0) type: boolean Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 1 value expressions: expr: key type: string expr: value type: string src3 TableScan alias: src3 Filter Operator predicate: expr: (key < 300.0) type: boolean Reduce Output Operator key expressions: expr: key type: string sort order: + Map-reduce partition columns: expr: key type: string tag: 2 value expressions: expr: key type: string expr: value type: string Reduce Operator Tree: Join Operator condition map: Right Outer Join0 to 1 Inner Join 1 to 2 condition expressions: 0 {VALUE._col0} {VALUE._col1} 1 {VALUE._col0} {VALUE._col1} 2 {VALUE._col0} {VALUE._col1} filter predicates: 0 1 {(VALUE._col0 > 10.0)} 2 handleSkewJoin: false outputColumnNames: _col0, _col1, _col4, _col5, _col8, _col9 Select Operator expressions: expr: _col0 type: string expr: _col1 type: string expr: _col4 type: string expr: _col5 type: string expr: _col8 type: string expr: _col9 type: string outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat Stage: Stage-0 Fetch Operator limit: -1 PREHOOK: query: SELECT * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Input: default@src1 #### A masked pattern was here #### POSTHOOK: query: SELECT * FROM src1 RIGHT OUTER JOIN src1 src2 ON (src1.key = src2.key AND src1.key < 10 AND src2.key > 10) JOIN src src3 ON (src2.key = src3.key AND src3.key < 300) SORT BY src1.key, src2.key, src3.key POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Input: default@src1 #### A masked pattern was here #### NULL NULL 128 128 val_128 NULL NULL 128 128 val_128 NULL NULL 128 128 val_128 NULL NULL 146 val_146 146 val_146 NULL NULL 146 val_146 146 val_146 NULL NULL 150 val_150 150 val_150 NULL NULL 213 val_213 213 val_213 NULL NULL 213 val_213 213 val_213 NULL NULL 224 224 val_224 NULL NULL 224 224 val_224 NULL NULL 238 val_238 238 val_238 NULL NULL 238 val_238 238 val_238 NULL NULL 255 val_255 255 val_255 NULL NULL 255 val_255 255 val_255 NULL NULL 273 val_273 273 val_273 NULL NULL 273 val_273 273 val_273 NULL NULL 273 val_273 273 val_273 NULL NULL 278 val_278 278 val_278 NULL NULL 278 val_278 278 val_278 NULL NULL 66 val_66 66 val_66 NULL NULL 98 val_98 98 val_98 NULL NULL 98 val_98 98 val_98