For example, given two tables, t1 and t2:
c1
--
1
2
3
c1
--
2
2
1
and the following query:
SELECT c1 FROM t1 WHERE c1 IN (SELECT c1 FROM t2)
the results would be
1
2
Simply selecting t1.c1 when simply joining those tables has
different results:
SELECT t1.c1 FROM t1, t2 WHERE t1.c1 = t2.c1
1
2
2
Statements that include such subqueries can be flattened into joins only if
the subquery does not introduce any duplicates into the result set (in our
example, the subquery introduced a duplicate and so cannot simply be flattened
into a join). If this requirement and other requirements (listed below) are met,
however, the statement is flattened such that the tables in the subquery's FROM
list are treated as if they were inner to the tables in the outer FROM list.
For example, the query could have been flattened into a join if
c1 in t2 had a unique index on it. It would
not have introduced any duplicate values into the result set.
The requirements for flattening into a normal join are as follows:
- The subquery is not under an OR.
- The subquery type is EXISTS, IN, or ANY, or it is an expression subquery on
the right side of a comparison operator.
- The subquery is not in the SELECT list of the outer query block.
- There are no aggregates in the SELECT list of the subquery.
- The subquery does not have a GROUP BY clause.
- The subquery does not have an ORDER BY, result offset, or fetch first
clause.
- There is a uniqueness condition that ensures that the subquery does not
introduce any duplicates if it is flattened into the outer query block.
- Each table in the subquery's FROM list (after any view, derived table, or
subquery flattening) must be a
base
table.
- If there is a WHERE clause in the subquery, there is at least one table in
the subquery whose columns are in
equality
predicates with expressions that do not include any column references
from the subquery block. These columns must be a superset of the key columns for
any unique index on the table. For all other tables in the subquery, the columns
in equality predicates with expressions that do not include columns from the
same table are a superset of the unique columns for any unique index on the
table.
Flattening into a normal join gives the optimizer more options for choosing
the best query plan. For example, if the following statement:
SELECT huge.* FROM huge
WHERE c1 IN (SELECT c1 FROM tiny)
can be flattened into
SELECT huge.* FROM huge, tiny
WHERE huge.c1 = tiny.c1
the optimizer can choose a query plan that will scan tiny
and do a few probes into the huge table instead of scanning the huge table and
doing a large number of probes into the tiny table.
Here is an expansion of the example used earlier in this section. Given
CREATE TABLE t1 (c1 INT)
CREATE TABLE t2 (c1 INT NOT NULL PRIMARY KEY)
CREATE TABLE t3 (c1 INT NOT NULL PRIMARY KEY)
INSERT INTO t1 VALUES (1), (2), (3)
INSERT INTO t2 VALUES (1), (2), (3)
INSERT INTO t3 VALUES (2), (3), (4)
this query
SELECT t1.* FROM t1 WHERE t1.c1 IN
(SELECT t2.c1 FROM t2, t3 WHERE t2.c1 = t3.c1)
should return the following results:
2
3
The query satisfies all the requirements for flattening into a join, and the
statement can be transformed into the following one:
SELECT t1.*
FROM t1, t2, t3
WHERE t1.c1 = t2.c1
AND t2.c1 = t3.c1
AND t1.c1 = t3.c1
The following query:
SELECT t1.*
FROM t1 WHERE EXISTS
(SELECT * FROM t2, t3 WHERE t2.c1 = t3.c1 AND t2.c1 = t1.c1)
can be transformed into
SELECT t1.*
FROM t1, t2, t3
WHERE t1.c1 = t2.c1
AND t2.c1 = t3.c1
AND t1.c1 = t3.c1