org.apache.crunch.contrib.text
Class Parse

java.lang.Object
  extended by org.apache.crunch.contrib.text.Parse

public final class Parse
extends Object

Methods for parsing instances of PCollection<String> into PCollections of strongly typed tuples.


Method Summary
static <T> PCollection<T> parse(String groupName, PCollection<String> input, Extractor<T> extractor)
          Parses each line of the input PCollection<String> with the given Extractor<T> and returns the results as a PCollection<T>.
static <T> PCollection<T> parse(String groupName, PCollection<String> input, PTypeFamily ptf, Extractor<T> extractor)
          Parses each line of the input PCollection<String> with the given Extractor<T> and returns the results as a PCollection<T> that belongs to the given PTypeFamily.
static <K,V> PTable<K,V> parseTable(String groupName, PCollection<String> input, Extractor<Pair<K,V>> extractor)
          Parses each line of the input PCollection<String> with the given Extractor<Pair<K, V>> and returns the results as a PTable<K, V>.
static <K,V> PTable<K,V> parseTable(String groupName, PCollection<String> input, PTypeFamily ptf, Extractor<Pair<K,V>> extractor)
          Parses each line of the input PCollection<String> with the given Extractor<Pair<K, V>> and returns the results as a PTable<K, V> that belongs to the given PTypeFamily.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

parse

public static <T> PCollection<T> parse(String groupName,
                                       PCollection<String> input,
                                       Extractor<T> extractor)
Parses each line of the input PCollection<String> with the given Extractor<T> and returns the results as a PCollection<T>.

Parameters:
groupName - A label to use for tracking errors related to the parsing process
input - The input PCollection<String> to convert
extractor - The Extractor<T> that converts each line
Returns:
A PCollection<T> of the values extracted from the input lines
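
To make the roles of the three parameters concrete, here is a minimal plain-Java sketch of the same line-to-typed-value pattern. It is a model, not the Crunch implementation: java.util.List stands in for PCollection, java.util.function.Function stands in for Extractor<T>, and the ParseSketch class, its ERRORS map, and the choice to drop lines that fail to parse are all assumptions made for illustration. This page does not describe how Crunch itself records or handles bad lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-Java model of Parse.parse; not the Crunch implementation.
final class ParseSketch {
    // Error counts keyed by group name (assumed bookkeeping, for illustration only).
    static final Map<String, Integer> ERRORS = new HashMap<>();

    // List<String> stands in for PCollection<String>,
    // Function<String, T> stands in for Extractor<T>.
    static <T> List<T> parse(String groupName, List<String> input, Function<String, T> extractor) {
        List<T> out = new ArrayList<>();
        for (String line : input) {
            try {
                out.add(extractor.apply(line));
            } catch (RuntimeException e) {
                // Count the failure under the caller's label and skip the line
                // (skipping is a choice made for this sketch).
                ERRORS.merge(groupName, 1, Integer::sum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> ages = parse("ages", Arrays.asList("31", "x", "47"),
                s -> Integer.valueOf(s.trim()));
        System.out.println(ages + " errors=" + ERRORS); // prints "[31, 47] errors={ages=1}"
    }
}
```

The group name never affects the parsed values; in this sketch it only decides which counter a failure is charged to, which matches its documented purpose as a label for tracking parsing errors.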

parse

public static <T> PCollection<T> parse(String groupName,
                                       PCollection<String> input,
                                       PTypeFamily ptf,
                                       Extractor<T> extractor)
Parses each line of the input PCollection<String> with the given Extractor<T> and returns the results as a PCollection<T> that belongs to the given PTypeFamily.

Parameters:
groupName - A label to use for tracking errors related to the parsing process
input - The input PCollection<String> to convert
ptf - The PTypeFamily of the returned PCollection<T>
extractor - The Extractor<T> that converts each line
Returns:
A PCollection<T> of the values extracted from the input lines

parseTable

public static <K,V> PTable<K,V> parseTable(String groupName,
                                           PCollection<String> input,
                                           Extractor<Pair<K,V>> extractor)
Parses each line of the input PCollection<String> with the given Extractor<Pair<K, V>> and returns the results as a PTable<K, V>.

Parameters:
groupName - A label to use for tracking errors related to the parsing process
input - The input PCollection<String> to convert
extractor - The Extractor<Pair<K, V>> that converts each line
Returns:
A PTable<K, V> of the key-value pairs extracted from the input lines
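
The table variant differs only in that each line yields a key-value pair. The plain-Java sketch below models that shape under stated assumptions: Map.Entry stands in for Pair<K, V>, a List of entries stands in for PTable<K, V>, and the ParseTableSketch class and its pairOf helper are hypothetical names that are not part of Crunch. It assumes every line contains the delimiter and does no error tracking.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical plain-Java model of Parse.parseTable; not the Crunch implementation.
final class ParseTableSketch {
    // Stand-in for an Extractor<Pair<K, V>>: splits a line at the first delimiter
    // and converts the two fields separately. Assumes the delimiter is present.
    static <K, V> Function<String, Map.Entry<K, V>> pairOf(
            String delimiter, Function<String, K> key, Function<String, V> value) {
        return line -> {
            String[] fields = line.split(Pattern.quote(delimiter), 2);
            return new SimpleEntry<K, V>(key.apply(fields[0]), value.apply(fields[1]));
        };
    }

    // Stand-in for PTable<K, V>: a list of entries, so repeated keys are preserved.
    // groupName is kept only to mirror the documented signature.
    static <K, V> List<Map.Entry<K, V>> parseTable(
            String groupName, List<String> input, Function<String, Map.Entry<K, V>> extractor) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (String line : input) {
            out.add(extractor.apply(line));
        }
        return out;
    }

    public static void main(String[] args) {
        Function<String, Map.Entry<String, Integer>> extractor =
                pairOf(",", s -> s.trim(), s -> Integer.valueOf(s.trim()));
        List<Map.Entry<String, Integer>> table =
                parseTable("users", Arrays.asList("alice,31", "bob,47", "alice,32"), extractor);
        System.out.println(table); // prints "[alice=31, bob=47, alice=32]"
    }
}
```

A list of entries is used rather than a Map so that the repeated key "alice" keeps both of its values.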

parseTable

public static <K,V> PTable<K,V> parseTable(String groupName,
                                           PCollection<String> input,
                                           PTypeFamily ptf,
                                           Extractor<Pair<K,V>> extractor)
Parses each line of the input PCollection<String> with the given Extractor<Pair<K, V>> and returns the results as a PTable<K, V> that belongs to the given PTypeFamily.

Parameters:
groupName - A label to use for tracking errors related to the parsing process
input - The input PCollection<String> to convert
ptf - The PTypeFamily of the returned PTable<K, V>
extractor - The Extractor<Pair<K, V>> that converts each line
Returns:
A PTable<K, V> of the key-value pairs extracted from the input lines


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.