Probe predicates
transforms each IN
list predicate into an equality predicate whose right operand is a parameter
marker that is created internally. This internal equality predicate is called a
probe predicate. Each of the above examples of simple IN predicates
is transformed into the following probe predicate:
orig_airport = ?
Probe predicates are treated differently than normal equality predicates.
Probe predicates are processed in a special way during query optimization and
execution.
During optimization,
analyzes the probe predicate to determine if the probe predicate is useful for
limiting the number of rows retrieved from disk. For a probe predicate to be
useful, both of the following requirements must be true:
- There must be an index defined on the table that the column reference
belongs to, and the column reference must be the first column in the index. In
the example above, orig_airport is the column reference.
- The estimated cost of an access path that uses the probe predicate and one
of the corresponding indexes must be less than the estimated cost of any other
access paths calculated by the optimizer. Typically, this means that the number
of values in the IN list is significantly fewer than the number of rows in the
table that the column reference belongs to.
If both of these requirements are met,
will use the probe
predicate at query execution to probe the underlying index for values in
the IN list. In other words, the right operand of the probe predicate (the
parameter) becomes a place-holder into which
plugs the different
values from the IN list. Then for each value,
reads the matching rows
from the index.
If either of the two requirements is not satisfied,
discards the internal
probe predicate and executes the query using the original IN list predicate.
Examples
The following query is submitted to
:
SELECT flights.orig_airport, cities.city_name
FROM flights, cities
WHERE flights.orig_airport IN ('ABQ', 'DSM', 'YYZ')
AND flights.orig_airport = cities.airport
The optimizer
transforms this query internally into:
SELECT flights.orig_airport, cities.city_name
FROM flights, cities
WHERE flights.orig_airport = ?
AND flights.orig_airport = cities.airport
In this transformed query, flights.orig_airport = ? is an
internal probe predicate.
There is an index on the org_airport column in the
flights table. If the estimated cost of probing that index for
the three values (ABQ, DSM, YYZ) is less than the cost of accessing the
flights table in some other way,
will perform probing on
the index at query execution. This approach ensures that
reads only the
necessary rows from the
table.
At a higher level, the approach by
to use index probing
for IN lists is an internal way of evaluating the transformed predicate multiple
times. The predicate is evaluated one time for each value in the IN list.
From a JDBC perspective,
is logically (but not
actually) performing the following statements and then combining the three
result sets (rs1, rs2, and
rs3):
PreparedStatement ps = conn.prepareStatement(
"select flights.orig_airport, cities.city_name " +
"from flights, cities " +
"where flights.orig_airport = ? " +
"and flights.orig_airport = cities.airport ");
ps.setString(1, "ABQ");
rs1 = ps.executeQuery();
ps.setString(1, "DSM");
rs2 = ps.executeQuery();
ps.setString(1, "YYZ");
rs3 = ps.executeQuery();
From an SQL perspective,
is logically (but not
actually) performing the following statement:
SELECT flights.orig_airport, cities.city_name
FROM flights, cities
WHERE flights.orig_airport = 'ABQ'
AND flights.orig_airport = cities.airport
UNION ALL
SELECT flights.orig_airport, cities.city_name
FROM flights, cities
WHERE flights.orig_airport = 'DSM'
AND flights.orig_airport = cities.airport
UNION ALL
SELECT flights.orig_airport, cities.city_name
FROM flights, cities
WHERE flights.orig_airport = 'YYZ'
AND flights.orig_airport = cities.airport
In the above SQL example, for each subquery the equality predicate limits the
number of rows read from the flights table so that the process
avoids having to read unnecessary rows from disk.
The larger the flights table, the more time
will save by probing
the index for the relatively few IN list values.
By using probe predicates, regardless of how large the base table is,
only has to probe the
index a maximum of N times, where N is the size of the IN list. If N is
significantly less than the number of rows in the table, or is significantly
less than the number of rows between the minimum value and the maximum value in
the IN list, selective probing ensures that
does not spend time
reading unnecessary rows from disk.