org.apache.crunch.lib.join
Class OneToManyJoin
java.lang.Object
org.apache.crunch.lib.join.OneToManyJoin
public class OneToManyJoin
extends Object
Optimized join for situations where exactly one left-side value is joined
with any number of right-side values based on a common key.
OneToManyJoin
public OneToManyJoin()
oneToManyJoin
public static <K,U,V,T> PCollection<T> oneToManyJoin(PTable<K,U> left,
PTable<K,V> right,
DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
PType<T> ptype)
- Performs a join on two tables, where the left table contains only a single
value per key.
This method accepts a DoFn, which is responsible for converting the single
left-side value and the iterable of right-side values into output values.
This method of joining is useful when a single context value is associated
with a large number of related values, all of the related values must be
brought together, and the right-side values are too numerous to fit in
memory.
If there are multiple values for the same key in the left-side table, only
a single one will be used.
- Parameters:
left - left-side table to join
right - right-side table to join
postProcessFn - DoFn to process the results of the join
ptype - type of the output of the postProcessFn
- Returns:
- the post-processed output of the join
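The contract described above can be modeled in plain Java, without Crunch on the classpath. This is only a sketch of the semantics, not the library's implementation: `OneToManyJoinSketch`, `Entry`, `PostProcessFn` and `demo` are hypothetical names standing in for `PTable` entries and a `DoFn<Pair<U,Iterable<V>>,T>`. The sketch groups the right side in memory for simplicity, whereas the real join exists precisely so the right-side values need not fit in memory; the post-processing function should therefore read the `Iterable` in a single streaming pass. Two further assumptions are marked in the comments: when a left key is duplicated the sketch keeps the first value (the documentation only says a single one is used), and keys with no right-side values produce no output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class OneToManyJoinSketch {

    /** Hypothetical stand-in for one key/value entry of a PTable. */
    static final class Entry<K, V> {
        final K key;
        final V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    /** Hypothetical stand-in for a DoFn over Pair<U, Iterable<V>>. */
    interface PostProcessFn<U, V, T> {
        void process(U leftValue, Iterable<V> rightValues, Consumer<T> emitter);
    }

    static <K, U, V, T> List<T> oneToManyJoin(List<Entry<K, U>> left,
                                              List<Entry<K, V>> right,
                                              PostProcessFn<U, V, T> postProcessFn) {
        // Keep exactly one left value per key.
        // Assumption: the first value seen wins; the docs do not say which one is used.
        Map<K, U> singleLeft = new LinkedHashMap<>();
        for (Entry<K, U> e : left) {
            singleLeft.putIfAbsent(e.key, e.value);
        }

        // Group right-side values by key (in memory here, for illustration only).
        Map<K, List<V>> groupedRight = new LinkedHashMap<>();
        for (Entry<K, V> e : right) {
            groupedRight.computeIfAbsent(e.key, k -> new ArrayList<>()).add(e.value);
        }

        List<T> output = new ArrayList<>();
        for (Map.Entry<K, U> e : singleLeft.entrySet()) {
            List<V> rightValues = groupedRight.get(e.getKey());
            if (rightValues == null) {
                continue; // Assumption: keys without right-side values emit nothing.
            }
            // Hand the single left value and all right values to the function.
            postProcessFn.process(e.getValue(), rightValues, output::add);
        }
        return output;
    }

    /** Joins customer names (left) with their orders (right). */
    static List<String> demo() {
        List<Entry<Integer, String>> customers = Arrays.asList(
            new Entry<>(1, "alice"),
            new Entry<>(2, "bob"),
            new Entry<>(1, "alice-duplicate")); // ignored: key 1 already has a value

        List<Entry<Integer, String>> orders = Arrays.asList(
            new Entry<>(1, "book"),
            new Entry<>(1, "pen"),
            new Entry<>(2, "lamp"),
            new Entry<>(3, "orphan")); // no matching left key

        return oneToManyJoin(customers, orders, (name, items, emitter) -> {
            // Single pass over the right side, emitting one output per value.
            for (String item : items) {
                emitter.accept(name + ":" + item);
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [alice:book, alice:pen, bob:lamp]
    }
}
```

With the real API, the same shape would be expressed by passing two `PTable`s, a `DoFn` whose input is a `Pair` of the left value and the right-side `Iterable`, and the output `PType` to `oneToManyJoin`, as in the signature above.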
Copyright © 2013 The Apache Software Foundation. All Rights Reserved.