Package org.apache.calcite.rel.rules
Class LoptSemiJoinOptimizer
- java.lang.Object
-
- org.apache.calcite.rel.rules.LoptSemiJoinOptimizer
-
public class LoptSemiJoinOptimizer extends java.lang.Object
Implements the logic for determining the optimal semi-joins to be used in processing joins in a query.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private class | LoptSemiJoinOptimizer.FactorCostComparator | Compares factors.
private static class | LoptSemiJoinOptimizer.FemLocalIndex | Dummy class to allow code to compile.
private static class | LoptSemiJoinOptimizer.LcsIndexOptimizer | Dummy class to allow code to compile.
private static class | LoptSemiJoinOptimizer.LcsTable | Dummy class to allow code to compile.
private static class | LoptSemiJoinOptimizer.LcsTableScan | Dummy class to allow code to compile.
private static class | LoptSemiJoinOptimizer.LucidDbSpecialOperators | Dummy class to allow code to compile.
-
Field Summary
Fields
Modifier and Type | Field | Description
private RelNode[] | chosenSemiJoins | Semijoins corresponding to each join factor, if they are going to be filtered by semijoins.
private com.google.common.collect.Ordering<java.lang.Integer> | factorCostOrdering
private RelMetadataQuery | mq
private java.util.Map<java.lang.Integer,java.util.Map<java.lang.Integer,SemiJoin>> | possibleSemiJoins | Associates potential semijoins with each fact table factor.
private RexBuilder | rexBuilder
private static int | THRESHOLD_SCORE
-
Constructor Summary
Constructors
Constructor | Description
LoptSemiJoinOptimizer(RelMetadataQuery mq, LoptMultiJoin multiJoin, RexBuilder rexBuilder)
-
Method Summary
Modifier and Type | Method | Description
private RexNode | adjustSemiJoinCondition(LoptMultiJoin multiJoin, int leftAdjustment, RexNode semiJoinCondition, int leftIdx, int rightIdx) | Modifies the semijoin condition to reflect the fact that the RHS is now the second factor into a join and the LHS is the first.
boolean | chooseBestSemiJoin(LoptMultiJoin multiJoin) | Finds the optimal semijoin for filtering the least costly fact table from among the remaining possible semijoins to choose from.
private double | computeScore(RelNode factRel, RelNode dimRel, SemiJoin semiJoin) | Computes a score relevant to applying a set of semijoins on a fact table.
private SemiJoin | findSemiJoinIndexByCost(LoptMultiJoin multiJoin, java.util.List<RexNode> joinFilters, int factIdx, int dimIdx) | Given a list of possible filters on a fact table, determine if there is an index that can be used, provided all the fact table keys originate from the same underlying table.
RelNode | getChosenSemiJoin(int factIdx)
private int | isSuitableFilter(LoptMultiJoin multiJoin, RexNode joinFilter, int factIdx) | Determines if a join filter can be used with a semijoin against a specified fact table.
void | makePossibleSemiJoins(LoptMultiJoin multiJoin) | Determines all possible semijoins that can be used by dimension tables to filter fact tables.
private RexNode | removeExtraFilters(java.util.List<java.lang.Integer> keys, int nFields, RexNode condition) | Removes from an expression any sub-expressions that reference key values that aren't contained in a key list passed in.
private void | removeJoin(LoptMultiJoin multiJoin, SemiJoin semiJoin, int factIdx, int dimIdx) | Determines whether a join of the dimension table in a semijoin can be removed.
private void | removePossibleSemiJoin(java.util.Map<java.lang.Integer,SemiJoin> possibleDimensions, java.lang.Integer factIdx, java.lang.Integer dimIdx) | Removes a dimension table from a fact table's list of possible semijoins.
private LoptSemiJoinOptimizer.LcsTable | validateKeys(RelNode factRel, java.util.List<java.lang.Integer> leftKeys, java.util.List<java.lang.Integer> rightKeys, java.util.List<java.lang.Integer> actualLeftKeys) | Validates the candidate semijoin keys corresponding to the fact table.
-
-
-
Field Detail
-
THRESHOLD_SCORE
private static final int THRESHOLD_SCORE
- See Also:
- Constant Field Values
-
rexBuilder
private final RexBuilder rexBuilder
-
mq
private final RelMetadataQuery mq
-
chosenSemiJoins
private RelNode[] chosenSemiJoins
Semijoins corresponding to each join factor, if they are going to be filtered by semijoins. Otherwise, the entry is the original join factor.
-
possibleSemiJoins
private java.util.Map<java.lang.Integer,java.util.Map<java.lang.Integer,SemiJoin>> possibleSemiJoins
Associates potential semijoins with each fact table factor. The key of the outer map is the index of the fact table. The key of the inner map is the index of a dimension table, and its value is a SemiJoin that captures all the necessary semijoin data between that fact and dimension table.
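The two-level lookup described above can be pictured with a minimal sketch. It is not Calcite code: the class PossibleSemiJoinsSketch and its methods are hypothetical, and a String label stands in for the SemiJoin value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the possibleSemiJoins field; a String replaces SemiJoin.
final class PossibleSemiJoinsSketch {
    // Outer key: fact factor index. Inner key: dimension factor index.
    private final Map<Integer, Map<Integer, String>> possible = new HashMap<>();

    // Records a candidate semijoin for the given fact/dimension pair.
    void add(int factIdx, int dimIdx, String semiJoin) {
        possible.computeIfAbsent(factIdx, k -> new HashMap<>()).put(dimIdx, semiJoin);
    }

    // Returns the candidate for the pair, or null if none was recorded.
    String lookup(int factIdx, int dimIdx) {
        Map<Integer, String> dims = possible.get(factIdx);
        return dims == null ? null : dims.get(dimIdx);
    }
}
```

Keying the outer map by fact table keeps all candidates for one fact table together, which suits a search that works through fact tables one at a time.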
-
factorCostOrdering
private final com.google.common.collect.Ordering<java.lang.Integer> factorCostOrdering
-
-
Constructor Detail
-
LoptSemiJoinOptimizer
public LoptSemiJoinOptimizer(RelMetadataQuery mq, LoptMultiJoin multiJoin, RexBuilder rexBuilder)
-
-
Method Detail
-
makePossibleSemiJoins
public void makePossibleSemiJoins(LoptMultiJoin multiJoin)
Determines all possible semijoins that can be used by dimension tables to filter fact tables. Constructs SemiJoin objects corresponding to potential dimension table filters and stores them in the member field "possibleSemiJoins".
Parameters:
multiJoin - join factors being optimized
-
isSuitableFilter
private int isSuitableFilter(LoptMultiJoin multiJoin, RexNode joinFilter, int factIdx)
Determines if a join filter can be used with a semijoin against a specified fact table. A suitable filter is of the form "factTable.col1 = dimTable.col2".
Parameters:
multiJoin - join factors being optimized
joinFilter - filter to be analyzed
factIdx - index corresponding to the fact table
Returns:
index of the corresponding dimension table if the filter is appropriate; otherwise -1
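As a rough illustration of the return contract, the sketch below classifies an equality filter by the factors that own its two columns. It assumes a simplified model that is not taken from Calcite: columns are plain integers numbered across all join factors, and the names SuitableFilterSketch, factorOf and fieldCounts are hypothetical.

```java
// Hypothetical model: an equality filter is a pair of global field indexes.
final class SuitableFilterSketch {
    // Maps a global field index to the join factor that owns it.
    // fieldCounts[i] is the number of fields contributed by factor i.
    static int factorOf(int field, int[] fieldCounts) {
        if (field < 0) {
            throw new IllegalArgumentException("negative field index: " + field);
        }
        int start = 0;
        for (int i = 0; i < fieldCounts.length; i++) {
            if (field < start + fieldCounts[i]) {
                return i;
            }
            start += fieldCounts[i];
        }
        throw new IllegalArgumentException("field out of range: " + field);
    }

    // Returns the dimension factor index if exactly one side of the
    // equality belongs to the fact factor; otherwise returns -1.
    static int isSuitableFilter(int leftField, int rightField,
                                int[] fieldCounts, int factIdx) {
        int l = factorOf(leftField, fieldCounts);
        int r = factorOf(rightField, fieldCounts);
        if (l == factIdx && r != factIdx) {
            return r;
        }
        if (r == factIdx && l != factIdx) {
            return l;
        }
        return -1;
    }
}
```

With field counts {3, 2, 4}, fields 0 to 2 belong to factor 0, fields 3 to 4 to factor 1, and fields 5 to 8 to factor 2, so a filter between fields 1 and 6 checked against fact index 0 yields dimension index 2.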
-
findSemiJoinIndexByCost
private SemiJoin findSemiJoinIndexByCost(LoptMultiJoin multiJoin, java.util.List<RexNode> joinFilters, int factIdx, int dimIdx)
Given a list of possible filters on a fact table, determine if there is an index that can be used, provided all the fact table keys originate from the same underlying table.
Parameters:
multiJoin - join factors being optimized
joinFilters - filters to be used on the fact table
factIdx - index in join factors corresponding to the fact table
dimIdx - index in join factors corresponding to the dimension table
Returns:
SemiJoin containing information regarding the semijoin that can be used to filter the fact table
-
adjustSemiJoinCondition
private RexNode adjustSemiJoinCondition(LoptMultiJoin multiJoin, int leftAdjustment, RexNode semiJoinCondition, int leftIdx, int rightIdx)
Modifies the semijoin condition to reflect the fact that the RHS is now the second factor into a join and the LHS is the first.
Parameters:
multiJoin - join factors being optimized
leftAdjustment - amount the left RexInputRefs need to be adjusted by
semiJoinCondition - condition to be adjusted
leftIdx - index of the join factor corresponding to the LHS of the semijoin
rightIdx - index of the join factor corresponding to the RHS of the semijoin
Returns:
modified semijoin condition
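The renumbering can be sketched on plain integers. This is an illustration under assumptions rather than the actual implementation: it supposes that the RHS fields are placed immediately after the LHS fields in the semijoin's row, and the names AdjustConditionSketch, leftStart, nLeft, rightStart and nRight are hypothetical.

```java
// Hypothetical sketch: field references are plain ints, not RexInputRefs.
final class AdjustConditionSketch {
    // Renumbers one field reference of a semijoin condition.
    // LHS fields are shifted by leftAdjustment; RHS fields are moved so
    // that they follow the nLeft fields of the LHS (an assumption).
    static int adjust(int field, int leftStart, int nLeft,
                      int rightStart, int nRight, int leftAdjustment) {
        if (field >= leftStart && field < leftStart + nLeft) {
            return field + leftAdjustment;
        }
        if (field >= rightStart && field < rightStart + nRight) {
            return field - rightStart + nLeft;
        }
        throw new IllegalArgumentException(
            "field " + field + " belongs to neither side of the semijoin");
    }
}
```

For an LHS occupying fields 3 to 4 and an RHS occupying fields 7 to 9, a left adjustment of -3 maps field 4 to 1, and field 8 on the RHS becomes 3.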
-
validateKeys
private LoptSemiJoinOptimizer.LcsTable validateKeys(RelNode factRel, java.util.List<java.lang.Integer> leftKeys, java.util.List<java.lang.Integer> rightKeys, java.util.List<java.lang.Integer> actualLeftKeys)
Validates the candidate semijoin keys corresponding to the fact table. Ensures that the keys all originate from the same underlying table and that they all correspond to simple column references. If unsuitable keys are found, they are removed from the key list, and a new list corresponding to the remaining valid keys is returned.
Parameters:
factRel - fact table RelNode
leftKeys - fact table semijoin keys
rightKeys - dimension table semijoin keys
actualLeftKeys - the remaining valid fact table semijoin keys
Returns:
the underlying fact table if the semijoin keys are valid; otherwise null
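The filtering described above might look like the following sketch. It rests on assumptions not confirmed by this page: column origins are given as a String array (null meaning "not a simple column reference"), the first valid key fixes the table, and rightKeys is kept in step with the surviving left keys. The class ValidateKeysSketch and the origins parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ValidateKeysSketch {
    // origins[k] names the table that fact column k comes from, or is
    // null when column k is not a simple column reference (assumption).
    static String validateKeys(String[] origins, List<Integer> leftKeys,
                               List<Integer> rightKeys,
                               List<Integer> actualLeftKeys) {
        String table = null;
        List<Integer> keptRight = new ArrayList<>();
        for (int i = 0; i < leftKeys.size(); i++) {
            String origin = origins[leftKeys.get(i)];
            if (origin == null) {
                continue; // not a simple column reference: drop the key
            }
            if (table == null) {
                table = origin; // first valid key decides the table
            }
            if (!table.equals(origin)) {
                continue; // key comes from a different table: drop it
            }
            actualLeftKeys.add(leftKeys.get(i));
            keptRight.add(rightKeys.get(i));
        }
        // Keep the dimension keys aligned with the surviving fact keys.
        rightKeys.clear();
        rightKeys.addAll(keptRight);
        return actualLeftKeys.isEmpty() ? null : table;
    }
}
```

Because rightKeys is modified in place, callers of this sketch must pass a mutable list.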
-
removeExtraFilters
private RexNode removeExtraFilters(java.util.List<java.lang.Integer> keys, int nFields, RexNode condition)
Removes from an expression any sub-expressions that reference key values that aren't contained in a key list passed in. The keys represent join keys on one side of a join. The subexpressions are all assumed to be of the form "tab1.col1 = tab2.col2".
Parameters:
keys - join keys from one side of the join
nFields - number of fields in the side of the join to which the keys correspond
condition - original expression
Returns:
modified expression with filters that don't reference specified keys removed
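A minimal sketch of this pruning, under a simplified model: the condition is a list of equality conjuncts, each an int pair of field indexes, and the side of interest occupies indexes 0 to nFields - 1. That layout, the class RemoveExtraFiltersSketch, and the int[] representation are assumptions for illustration, not the RexNode-based implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveExtraFiltersSketch {
    // Each conjunct is {a, b}, meaning "field a = field b".
    // Keeps only conjuncts whose operand on the side of interest
    // (index below nFields) appears in the key list.
    static List<int[]> removeExtraFilters(List<Integer> keys, int nFields,
                                          List<int[]> conjuncts) {
        List<int[]> kept = new ArrayList<>();
        for (int[] eq : conjuncts) {
            // Pick whichever operand lies on the side the keys refer to.
            int key = eq[0] < nFields ? eq[0] : eq[1];
            if (keys.contains(key)) {
                kept.add(eq);
            }
        }
        return kept;
    }
}
```

With keys [0, 2] and nFields 3, the conjuncts {0, 5} and {7, 2} survive while {1, 6} is dropped.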
-
chooseBestSemiJoin
public boolean chooseBestSemiJoin(LoptMultiJoin multiJoin)
Finds the optimal semijoin for filtering the least costly fact table from among the remaining possible semijoins to choose from. The chosen semijoin is stored in the chosenSemiJoins array.
Parameters:
multiJoin - join factors being optimized
Returns:
true if a suitable semijoin is found; false otherwise
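One way to picture the selection is the sketch below. It makes several assumptions that this page does not confirm: fact tables are visited in ascending cost order, a candidate qualifies only if its score strictly exceeds a threshold, and the search moves on to the next fact table when the current one has no qualifying candidate. The class ChooseBestSketch, the precomputed score map, and the threshold parameter are hypothetical.

```java
import java.util.List;
import java.util.Map;

final class ChooseBestSketch {
    // factsByCost: fact factor indexes, cheapest first (assumption).
    // scores: fact index -> (dimension index -> precomputed score).
    // Returns {factIdx, dimIdx} for the first fact table that has a
    // candidate scoring above the threshold, or null if none does.
    static int[] chooseBest(List<Integer> factsByCost,
                            Map<Integer, Map<Integer, Double>> scores,
                            double threshold) {
        for (int factIdx : factsByCost) {
            Map<Integer, Double> dims = scores.get(factIdx);
            if (dims == null) {
                continue; // no candidate semijoins for this fact table
            }
            int bestDim = -1;
            double bestScore = threshold;
            for (Map.Entry<Integer, Double> e : dims.entrySet()) {
                if (e.getValue() > bestScore) {
                    bestDim = e.getKey();
                    bestScore = e.getValue();
                }
            }
            if (bestDim != -1) {
                return new int[] {factIdx, bestDim};
            }
        }
        return null;
    }
}
```

A caller following the documented contract would store the winning semijoin in the chosenSemiJoins array and report true, or report false when the sketch returns null.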
-
computeScore
private double computeScore(RelNode factRel, RelNode dimRel, SemiJoin semiJoin)
Computes a score relevant to applying a set of semijoins on a fact table. The higher the score, the better.
Parameters:
factRel - fact table being filtered
dimRel - dimension table that participates in semijoin
semiJoin - semijoin between fact and dimension tables
Returns:
computed score of applying the dimension table filters on the fact table
-
removeJoin
private void removeJoin(LoptMultiJoin multiJoin, SemiJoin semiJoin, int factIdx, int dimIdx)
Determines whether a join of the dimension table in a semijoin can be removed. It can be if the dimension keys are unique and the only fields referenced from the dimension table are its semijoin keys. Because of the equality condition associated with the semijoin keys, those keys can be mapped to the corresponding keys from the fact table. The dimension table can therefore be removed even though those fields are referenced elsewhere in the query tree.
Parameters:
multiJoin - join factors being optimized
semiJoin - semijoin under consideration
factIdx - id of the fact table in the semijoin
dimIdx - id of the dimension table in the semijoin
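The two removal conditions can be expressed as a small predicate. This is a sketch only: the class RemoveJoinSketch and its parameters are hypothetical, and how uniqueness and field references are actually derived is outside what this page documents.

```java
import java.util.Set;

final class RemoveJoinSketch {
    // dimKeysUnique: whether the dimension-side semijoin keys are unique.
    // referencedDimFields: dimension fields used anywhere in the query.
    // semiJoinDimKeys: dimension fields that serve as semijoin keys.
    static boolean canRemove(boolean dimKeysUnique,
                             Set<Integer> referencedDimFields,
                             Set<Integer> semiJoinDimKeys) {
        // Removable only if every referenced field is a semijoin key,
        // since each such key has an equal counterpart in the fact table.
        return dimKeysUnique && semiJoinDimKeys.containsAll(referencedDimFields);
    }
}
```

If the query references even one dimension field that is not a semijoin key, the fact table cannot supply that value, so the join has to stay.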
-
removePossibleSemiJoin
private void removePossibleSemiJoin(java.util.Map<java.lang.Integer,SemiJoin> possibleDimensions, java.lang.Integer factIdx, java.lang.Integer dimIdx)
Removes a dimension table from a fact table's list of possible semijoins.
Parameters:
possibleDimensions - possible dimension tables associated with the fact table
factIdx - index corresponding to fact table
dimIdx - index corresponding to dimension table
-
getChosenSemiJoin
public RelNode getChosenSemiJoin(int factIdx)
Parameters:
factIdx - index corresponding to the desired factor
Returns:
optimal semijoin for the specified factor; may be the factor itself if semijoins are not chosen for the factor
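The fallback behaviour ("may be the factor itself") can be sketched with an array that starts out holding the original factors. The class ChosenSemiJoinsSketch is hypothetical, and Strings stand in for RelNode instances.

```java
// Hypothetical stand-in for the chosenSemiJoins array; Strings replace RelNodes.
final class ChosenSemiJoinsSketch {
    private final String[] chosen;

    // Every entry starts as the original join factor.
    ChosenSemiJoinsSketch(String[] factors) {
        this.chosen = factors.clone();
    }

    // Replaces the entry for a factor once a semijoin is chosen for it.
    void choose(int factIdx, String semiJoin) {
        chosen[factIdx] = semiJoin;
    }

    // Returns the chosen semijoin, or the original factor if none was chosen.
    String getChosenSemiJoin(int factIdx) {
        return chosen[factIdx];
    }
}
```

Seeding the array with the original factors means callers never need a null check: an unfiltered factor simply comes back unchanged.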
-
-