org.apache.nutch.parse
Class ParseResult

java.lang.Object
  extended by org.apache.nutch.parse.ParseResult
All Implemented Interfaces:
Iterable<Map.Entry<Text,Parse>>

public class ParseResult
extends Object
implements Iterable<Map.Entry<Text,Parse>>

A utility class that stores the result of a parse. Internally, a ParseResult stores <Text, Parse> pairs.

Parsers may return multiple results, which correspond to parts or other associated documents related to the original URL.

There is usually one parse result that corresponds directly to the original URL, and possibly many (or none) that correspond to derived URLs (or sub-URLs).
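
The container model described above can be sketched without Nutch on the classpath. The following is a minimal, hypothetical stand-in, not the Nutch implementation: String replaces Text, the made-up ParseStub class replaces Parse, ParseResultModel replaces ParseResult, and all URLs are invented for illustration. It shows one entry keyed by the original URL alongside an entry keyed by a derived sub-URL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a parse output; NOT org.apache.nutch.parse.Parse.
class ParseStub {
    final String text;
    final boolean success;

    ParseStub(String text, boolean success) {
        this.text = text;
        this.success = success;
    }
}

// Hypothetical stand-in for the <Text, Parse> container; String replaces Text.
class ParseResultModel {
    final String originalUrl;
    final Map<String, ParseStub> parses = new LinkedHashMap<>();

    ParseResultModel(String originalUrl) {
        this.originalUrl = originalUrl;
    }

    // Stores a parse output under the given URL or sub-URL.
    void put(String key, ParseStub parse) {
        parses.put(key, parse);
    }

    // Returns the stored output, or null when the key is unknown.
    ParseStub get(String key) {
        return parses.get(key);
    }

    // Counts all stored outputs, successful or not.
    int size() {
        return parses.size();
    }

    public static void main(String[] args) {
        ParseResultModel result = new ParseResultModel("http://example.com/feed.xml");
        result.put("http://example.com/feed.xml", new ParseStub("feed", true));
        result.put("http://example.com/item/1", new ParseStub("item 1", true));
        for (Map.Entry<String, ParseStub> e : result.parses.entrySet()) {
            // Only the first entry matches the original URL; the other is derived.
            boolean isOriginal = e.getKey().equals(result.originalUrl);
            System.out.println(e.getKey() + " original=" + isOriginal);
        }
    }
}
```

In this sketch a lookup for a key that was never stored returns null, mirroring the "or null" wording of the get methods documented below.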


Field Summary
static org.slf4j.Logger LOG
           
 
Constructor Summary
ParseResult(String originalUrl)
          Create a container for parse results.
 
Method Summary
static ParseResult createParseResult(String url, Parse parse)
          Convenience method for obtaining ParseResult from a single Parse output.
 void filter()
          Remove all results where status is not successful (as determined by ParseStatus#isSuccess()).
 Parse get(String key)
          Retrieve a single parse output.
 Parse get(Text key)
          Retrieve a single parse output.
 boolean isEmpty()
          Checks whether the result is empty.
 boolean isSuccess()
          A convenience method which returns true only if all parses are successful.
 Iterator<Map.Entry<Text,Parse>> iterator()
          Iterate over all entries in the <url, Parse> map.
 void put(String key, ParseText text, ParseData data)
          Store a result of parsing.
 void put(Text key, ParseText text, ParseData data)
          Store a result of parsing.
 int size()
          Return the number of parse outputs (both successful and failed).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.slf4j.Logger LOG

Constructor Detail

ParseResult

public ParseResult(String originalUrl)
Create a container for parse results.

Parameters:
originalUrl - the original url from which all parse results have been obtained.

Method Detail

createParseResult

public static ParseResult createParseResult(String url,
                                            Parse parse)
Convenience method for obtaining ParseResult from a single Parse output.

Parameters:
url - canonical url.
parse - single parse output.
Returns:
result containing the single parse output.

isEmpty

public boolean isEmpty()
Checks whether the result is empty.

Returns:
true if the result contains no parse outputs, false otherwise.

size

public int size()
Return the number of parse outputs (both successful and failed).


get

public Parse get(String key)
Retrieve a single parse output.

Parameters:
key - sub-url under which the parse output is stored.
Returns:
parse output corresponding to this sub-url, or null.

get

public Parse get(Text key)
Retrieve a single parse output.

Parameters:
key - sub-url under which the parse output is stored.
Returns:
parse output corresponding to this sub-url, or null.

put

public void put(Text key,
                ParseText text,
                ParseData data)
Store a result of parsing.

Parameters:
key - URL or sub-url of this parse result
text - plain text result
data - corresponding parse metadata of this result

put

public void put(String key,
                ParseText text,
                ParseData data)
Store a result of parsing.

Parameters:
key - URL or sub-url of this parse result
text - plain text result
data - corresponding parse metadata of this result

iterator

public Iterator<Map.Entry<Text,Parse>> iterator()
Iterate over all entries in the <url, Parse> map.

Specified by:
iterator in interface Iterable<Map.Entry<Text,Parse>>

filter

public void filter()
Remove all results where status is not successful (as determined by ParseStatus#isSuccess()). Note that the effects of this operation cannot be reversed.
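
The in-place, irreversible nature of this operation can be illustrated with a small sketch. This is an assumption-laden model, not the Nutch source: a plain Map from URL to a boolean success flag stands in for the <Text, Parse> map, the flag stands in for ParseStatus#isSuccess(), and the class name FilterSketch and the URLs are hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of filter(): the Boolean value stands in for ParseStatus#isSuccess().
class FilterSketch {
    // Removes every entry whose success flag is false, mutating the map in place.
    static void filter(Map<String, Boolean> parses) {
        Iterator<Map.Entry<String, Boolean>> it = parses.entrySet().iterator();
        while (it.hasNext()) {
            if (!it.next().getValue()) {
                it.remove(); // Iterator.remove() is the safe way to delete while iterating
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Boolean> parses = new LinkedHashMap<>();
        parses.put("http://example.com/", true);
        parses.put("http://example.com/broken", false);
        filter(parses);
        // The failed entry is gone; nothing in the map remembers it.
        System.out.println(parses.keySet());
    }
}
```

Because the removal happens on the container itself, a caller that still needs the failed outputs (for logging, say) would have to read them before filtering.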


isSuccess

public boolean isSuccess()
A convenience method which returns true only if all parses are successful. Parse success is determined by ParseStatus#isSuccess().
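
The "all parses are successful" rule can be sketched as a simple conjunction over per-entry status flags. The code below is a hypothetical model rather than the Nutch implementation: each Boolean stands in for one entry's ParseStatus#isSuccess(), and IsSuccessSketch is a made-up name. Note that this sketch returns true for an empty collection; the documentation above does not say what ParseResult returns in that case, so treat that edge case as an assumption.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical model of isSuccess(): each Boolean stands in for one entry's parse status.
class IsSuccessSketch {
    // Returns false as soon as one failed status is seen, true otherwise.
    static boolean isSuccess(Collection<Boolean> statuses) {
        for (boolean ok : statuses) {
            if (!ok) {
                return false;
            }
        }
        return true; // also reached for an empty collection (an assumption of this sketch)
    }

    public static void main(String[] args) {
        System.out.println(isSuccess(Arrays.asList(true, true)));
        System.out.println(isSuccess(Arrays.asList(true, false)));
    }
}
```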



Copyright © 2012 The Apache Software Foundation