In most cases, needs to perform two separate steps for statements
that use DISTINCT or GROUP BY: first sorting the selected columns, then either
discarding duplicate rows or aggregating grouped rows. Sometimes it is able
to avoid sorting for these statements with tuple filtering. Tuple filtering means that the rows are already in
a useful order. For DISTINCT, can simply filter out duplicate
values when they are found and return results to the user sooner. For GROUP
BY, can aggregate a group of rows until a new set of rows is detected
and return results to the user sooner.
These are non-cost-based optimizations; the optimizer does not yet consider
the cost of these optimizations.
The examples in this section refer to the following tables:
CREATE TABLE t1(c1 INT, c2 INT, c3 INT, c4 INT)
CREATE INDEX i1 ON t1(c1)
CREATE INDEX i1_2_3_4 ON t1(c1, c2, c3, c4)