org.apache.crunch.lib
Class Cogroup

java.lang.Object
  extended by org.apache.crunch.lib.Cogroup

public class Cogroup
extends Object


Constructor Summary
Cogroup()
           
 
Method Summary
static
<K,U,V> PTable<K,TupleN>
cogroup(int numReducers, PTable<K,?> first, PTable<K,?>... rest)
          Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.
static
<K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>>
cogroup(int numReducers, PTable<K,U> left, PTable<K,V> right)
          Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
static
<K,V1,V2,V3>
PTable<K,Tuple3.Collect<V1,V2,V3>>
cogroup(int numReducers, PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third)
          Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
static
<K,V1,V2,V3,V4>
PTable<K,Tuple4.Collect<V1,V2,V3,V4>>
cogroup(int numReducers, PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third, PTable<K,V4> fourth)
          Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
static
<K> PTable<K,TupleN>
cogroup(PTable<K,?> first, PTable<K,?>... rest)
          Co-groups an arbitrary number of PTable arguments.
static
<K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>>
cogroup(PTable<K,U> left, PTable<K,V> right)
          Co-groups the two PTable arguments.
static
<K,V1,V2,V3>
PTable<K,Tuple3.Collect<V1,V2,V3>>
cogroup(PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third)
          Co-groups the three PTable arguments.
static
<K,V1,V2,V3,V4>
PTable<K,Tuple4.Collect<V1,V2,V3,V4>>
cogroup(PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third, PTable<K,V4> fourth)
          Co-groups the four PTable arguments.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Cogroup

public Cogroup()
Method Detail

cogroup

public static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> cogroup(PTable<K,U> left,
                                                                          PTable<K,V> right)
Co-groups the two PTable arguments.

Parameters:
left - The left (smaller) PTable
right - The right (larger) PTable
Returns:
a PTable representing the co-grouped tables
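
The shape of the co-grouped result can be illustrated with a minimal plain-Java sketch. It does not use Crunch itself: CogroupSketch and Grouped are hypothetical names, java.util lists stand in for PTable, and Grouped stands in for Pair<Collection<U>,Collection<V>>. The sketch assumes that a key present in only one input still appears in the output, with an empty collection on the other side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Apache Crunch.
final class CogroupSketch {

    // Stand-in for Pair<Collection<U>, Collection<V>>.
    static final class Grouped<U, V> {
        final List<U> left = new ArrayList<>();
        final List<V> right = new ArrayList<>();
    }

    // Collects, per key, all left-side values and all right-side values.
    static <K extends Comparable<K>, U, V> Map<K, Grouped<U, V>> cogroup(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        Map<K, Grouped<U, V>> out = new TreeMap<>();
        for (Map.Entry<K, U> e : left) {
            out.computeIfAbsent(e.getKey(), k -> new Grouped<U, V>()).left.add(e.getValue());
        }
        for (Map.Entry<K, V> e : right) {
            out.computeIfAbsent(e.getKey(), k -> new Grouped<U, V>()).right.add(e.getValue());
        }
        return out;
    }
}
```

Each output key maps to exactly one pair of collections, so downstream code can implement inner, outer, or anti-join logic by inspecting which side is empty.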

cogroup

public static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> cogroup(int numReducers,
                                                                          PTable<K,U> left,
                                                                          PTable<K,V> right)
Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).

Parameters:
numReducers - The number of reducers to use
left - The left (smaller) PTable
right - The right (larger) PTable
Returns:
A new PTable representing the co-grouped tables
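
To give a feel for what numReducers controls, the following minimal sketch shows how keys could be spread across reducers. It assumes Hadoop's default hash partitioning arithmetic, and it uses a plain Java hashCode as a stand-in for the hash of the serialized key. ReducerPartitionSketch and partitionFor are hypothetical names, not part of Crunch.

```java
// Hypothetical illustration class; not part of Apache Crunch.
final class ReducerPartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder modulo the reducer count.
    static int partitionFor(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }
}
```

All values for a given key, from every input table, land on the same reducer, which is what lets the co-group collect them together. A larger numReducers spreads the keys over more partitions.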

cogroup

public static <K,V1,V2,V3> PTable<K,Tuple3.Collect<V1,V2,V3>> cogroup(PTable<K,V1> first,
                                                                      PTable<K,V2> second,
                                                                      PTable<K,V3> third)
Co-groups the three PTable arguments.

Parameters:
first - The smallest PTable
second - The second-smallest PTable
third - The largest PTable
Returns:
a PTable representing the co-grouped tables

cogroup

public static <K,V1,V2,V3> PTable<K,Tuple3.Collect<V1,V2,V3>> cogroup(int numReducers,
                                                                      PTable<K,V1> first,
                                                                      PTable<K,V2> second,
                                                                      PTable<K,V3> third)
Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).

Parameters:
numReducers - The number of reducers to use
first - The smallest PTable
second - The second-smallest PTable
third - The largest PTable
Returns:
A new PTable representing the co-grouped tables

cogroup

public static <K,V1,V2,V3,V4> PTable<K,Tuple4.Collect<V1,V2,V3,V4>> cogroup(PTable<K,V1> first,
                                                                            PTable<K,V2> second,
                                                                            PTable<K,V3> third,
                                                                            PTable<K,V4> fourth)
Co-groups the four PTable arguments.

Parameters:
first - The smallest PTable
second - The second-smallest PTable
third - The third-smallest PTable
fourth - The largest PTable
Returns:
a PTable representing the co-grouped tables

cogroup

public static <K,V1,V2,V3,V4> PTable<K,Tuple4.Collect<V1,V2,V3,V4>> cogroup(int numReducers,
                                                                            PTable<K,V1> first,
                                                                            PTable<K,V2> second,
                                                                            PTable<K,V3> third,
                                                                            PTable<K,V4> fourth)
Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).

Parameters:
numReducers - The number of reducers to use
first - The smallest PTable
second - The second-smallest PTable
third - The third-smallest PTable
fourth - The largest PTable
Returns:
A new PTable representing the co-grouped tables

cogroup

public static <K> PTable<K,TupleN> cogroup(PTable<K,?> first,
                                           PTable<K,?>... rest)
Co-groups an arbitrary number of PTable arguments. The largest table should come last in the ordering.

Parameters:
first - The first (smallest) PTable to co-group
rest - The other (larger) PTables to co-group
Returns:
a PTable representing the co-grouped tables
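
For the variable-arity form, the result can be pictured as one collection per input table, indexed by argument position. The plain-Java sketch below illustrates that shape under stated assumptions: NaryCogroupSketch is a hypothetical name, keys are fixed to String for brevity, and a List<List<Object>> stands in for TupleN. It is not the Crunch implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Apache Crunch.
final class NaryCogroupSketch {

    // Each inner list is one input table of (key, value) entries.
    // The result maps each key to one value list per input table,
    // in argument order (a stand-in for the TupleN value type).
    static Map<String, List<List<Object>>> cogroup(List<List<Map.Entry<String, Object>>> tables) {
        Map<String, List<List<Object>>> out = new TreeMap<>();
        for (int i = 0; i < tables.size(); i++) {
            for (Map.Entry<String, Object> e : tables.get(i)) {
                List<List<Object>> slots =
                        out.computeIfAbsent(e.getKey(), k -> emptySlots(tables.size()));
                slots.get(i).add(e.getValue());
            }
        }
        return out;
    }

    // Creates one empty value list per input table.
    private static List<List<Object>> emptySlots(int n) {
        List<List<Object>> slots = new ArrayList<>();
        for (int j = 0; j < n; j++) {
            slots.add(new ArrayList<>());
        }
        return slots;
    }
}
```

Because the value types of the inputs differ, the per-position collections are untyped here, much as PTable<K,?> inputs yield an untyped TupleN; callers would cast each position back to the element type of the corresponding table.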

cogroup

public static <K,U,V> PTable<K,TupleN> cogroup(int numReducers,
                                               PTable<K,?> first,
                                               PTable<K,?>... rest)
Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.

Parameters:
numReducers - The number of reducers to use
first - The first (smallest) PTable to co-group
rest - The other (larger) PTables to co-group
Returns:
A new PTable representing the co-grouped tables


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.