public class LimitPushdownOptimizer extends Transform
See Operator.acceptLimitPushdown().
If RS is only for limiting rows, RSHash counts rows with the same key separately.
But if RS is for GBY, RSHash should forward all the rows with the same key.
Legend : A(a) --> key A, value a, row A(a)
Suppose each RS in the mapper tasks is forwarded rows like this:
MAP1(RS) : 40(a)-10(b)-30(c)-10(d)-70(e)-80(f)
MAP2(RS) : 90(g)-80(h)-60(i)-40(j)-30(k)-20(l)
MAP3(RS) : 40(m)-50(n)-30(o)-30(p)-60(q)-70(r)
OBY or GBY then produces a result like this:
REDUCER : 10(b,d)-20(l)-30(c,k,o,p)-40(a,j,m)-50(n)-60(i,q)-70(e,r)-80(f,h)-90(g)
LIMIT 3 for GBY: 10(b,d)-20(l)-30(c,k,o,p)
LIMIT 3 for OBY: 10(b,d)-20(l)
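
The difference between the two LIMIT results above can be reproduced with a small stand-alone sketch. Everything in it (the class ReducerLimitSketch, the Row type, and the helper names) is hypothetical and written for illustration only; none of it is part of Hive. It assumes the shuffle behaves like a stable sort by key over the concatenated mapper outputs, which is enough to recreate the REDUCER line.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration class, not part of Hive.
final class ReducerLimitSketch {

    /** One shuffled row: A(a) in the legend is new Row(A, "a"). */
    static final class Row {
        final int key;
        final String value;
        Row(int key, String value) { this.key = key; this.value = value; }
    }

    /** Parses a trace such as "40(a)-10(b)" into rows, keeping arrival order. */
    static List<Row> parse(String trace) {
        List<Row> rows = new ArrayList<>();
        for (String cell : trace.split("-")) {
            int open = cell.indexOf('(');
            rows.add(new Row(Integer.parseInt(cell.substring(0, open)),
                             cell.substring(open + 1, cell.length() - 1)));
        }
        return rows;
    }

    /** Stable sort by key, standing in for the shuffle and merge (an assumption). */
    static List<Row> shuffle(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Row r) -> r.key));
        return sorted;
    }

    /** LIMIT n after OBY: the first n rows, regardless of key. */
    static List<Row> limitRows(List<Row> sorted, int n) {
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    /** LIMIT n after GBY: every row belonging to the first n distinct keys. */
    static List<Row> limitGroups(List<Row> sorted, int n) {
        List<Row> out = new ArrayList<>();
        int groups = 0;
        Integer last = null;
        for (Row r : sorted) {
            if (last == null || last != r.key) {
                if (++groups > n) break;   // a new key beyond the limit ends the scan
                last = r.key;
            }
            out.add(r);
        }
        return out;
    }

    /** Renders key-sorted rows in the grouped notation, e.g. "10(b,d)-20(l)". */
    static String render(List<Row> sortedRows) {
        Map<Integer, StringJoiner> groups = new LinkedHashMap<>();
        for (Row r : sortedRows) {
            groups.computeIfAbsent(r.key, k -> new StringJoiner(",")).add(r.value);
        }
        StringJoiner out = new StringJoiner("-");
        groups.forEach((k, v) -> out.add(k + "(" + v + ")"));
        return out.toString();
    }

    public static void main(String[] args) {
        List<Row> all = parse("40(a)-10(b)-30(c)-10(d)-70(e)-80(f)");
        all.addAll(parse("90(g)-80(h)-60(i)-40(j)-30(k)-20(l)"));
        all.addAll(parse("40(m)-50(n)-30(o)-30(p)-60(q)-70(r)"));
        List<Row> sorted = shuffle(all);
        System.out.println(render(limitGroups(sorted, 3))); // 10(b,d)-20(l)-30(c,k,o,p)
        System.out.println(render(limitRows(sorted, 3)));   // 10(b,d)-20(l)
    }
}
```

The GBY limit counts distinct keys, so it returns seven rows here, while the OBY limit counts rows and stops after three.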
With the optimization, the amount of shuffling is reduced while the result stays identical.
For GBY,
MAP1 : 40(a)-10(b)-30(c)-10(d)
MAP2 : 40(j)-30(k)-20(l)
MAP3 : 40(m)-50(n)-30(o)-30(p)
REDUCER : 10(b,d)-20(l)-30(c,k,o,p)-40(a,j,m)-50(n)
LIMIT 3 : 10(b,d)-20(l)-30(c,k,o,p)
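
The map-side pruning in the GBY trace above can be sketched as follows. This is a simplified, buffered two-pass illustration under stated assumptions, not Hive's actual RSHash logic: the class GbyTopKeysSketch and its methods are hypothetical names, rows are handled as plain strings in the A(a) notation, and the whole mapper input is assumed to fit in memory. The first pass tracks the smallest `limit` distinct keys; the second pass forwards every row whose key survived, in arrival order.

```java
import java.util.StringJoiner;
import java.util.TreeSet;

// Hypothetical illustration class, not part of Hive.
final class GbyTopKeysSketch {

    /** Extracts the integer key from a cell such as "40(a)". */
    static int keyOf(String cell) {
        return Integer.parseInt(cell.substring(0, cell.indexOf('(')));
    }

    /**
     * Returns the rows of one mapper that must still be shuffled for
     * GBY ... LIMIT limit: all rows whose key is among the `limit`
     * smallest distinct keys seen by this mapper.
     */
    static String prune(String trace, int limit) {
        String[] cells = trace.split("-");

        // Pass 1: keep only the `limit` smallest distinct keys.
        TreeSet<Integer> topKeys = new TreeSet<>();
        for (String cell : cells) {
            topKeys.add(keyOf(cell));
            if (topKeys.size() > limit) {
                topKeys.pollLast();   // evict the current largest key
            }
        }

        // Pass 2: forward every row with a surviving key, in arrival order.
        StringJoiner kept = new StringJoiner("-");
        for (String cell : cells) {
            if (topKeys.contains(keyOf(cell))) {
                kept.add(cell);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(prune("40(a)-10(b)-30(c)-10(d)-70(e)-80(f)", 3)); // 40(a)-10(b)-30(c)-10(d)
        System.out.println(prune("90(g)-80(h)-60(i)-40(j)-30(k)-20(l)", 3)); // 40(j)-30(k)-20(l)
        System.out.println(prune("40(m)-50(n)-30(o)-30(p)-60(q)-70(r)", 3)); // 40(m)-50(n)-30(o)-30(p)
    }
}
```

Because duplicates of a surviving key are all forwarded, a mapper may send more than `limit` rows, as MAP1 and MAP3 do here; only rows with keys outside the local top three are dropped before the shuffle.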
For OBY,
MAP1 : 10(b)-30(c)-10(d)
MAP2 : 40(j)-30(k)-20(l)
MAP3 : 40(m)-50(n)-30(o)
REDUCER : 10(b,d)-20(l)-30(c,k,o)-40(j,m)-50(n)
LIMIT 3 : 10(b,d)-20(l)

Constructor and Description |
---|
LimitPushdownOptimizer() |
Modifier and Type | Method and Description |
---|---|
ParseContext | transform(ParseContext pctx) All transformation steps implement this interface. |
Methods inherited from class Transform: beginPerfLogging, endPerfLogging, endPerfLogging
public ParseContext transform(ParseContext pctx) throws SemanticException
Specified by: transform in class Transform
Parameters: pctx - input parse context
Throws: SemanticException
Copyright © 2022 The Apache Software Foundation. All rights reserved.