org.apache.any23.extractor
Class ExtractionResultImpl

java.lang.Object
  extended by org.apache.any23.extractor.ExtractionResultImpl
All Implemented Interfaces:
ErrorReporter, ExtractionResult, TagSoupExtractionResult

public class ExtractionResultImpl
extends Object
implements TagSoupExtractionResult

A default implementation of ExtractionResult; it receives extraction output from one Extractor working on one document, and passes the output on to a TripleHandler. It deals with details such as creation of ExtractionContext objects and closing any open contexts at the end of extraction.

The close() method must be invoked after the extractor has finished processing.

There is usually no need to provide additional implementations of the ExtractionResult interface.

Author:
Richard Cyganiak (richard@cyganiak.de), Michele Mostarda (michele.mostarda@gmail.com)
See Also:
TripleHandler, ExtractionContext

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.any23.extractor.TagSoupExtractionResult
TagSoupExtractionResult.PropertyPath, TagSoupExtractionResult.ResourceRoot
 
Nested classes/interfaces inherited from interface org.apache.any23.extractor.ErrorReporter
ErrorReporter.Error, ErrorReporter.ErrorLevel
 
Constructor Summary
ExtractionResultImpl(ExtractionContext context, Extractor<?> extractor, TripleHandler tripleHandler)
           
 
Method Summary
 void addPropertyPath(Class<? extends MicroformatExtractor> extractor, org.openrdf.model.Resource propertySubject, org.openrdf.model.Resource property, org.openrdf.model.BNode object, String[] path)
          Adds a property path to the list of the extracted data.
 void addResourceRoot(String[] path, org.openrdf.model.Resource root, Class<? extends MicroformatExtractor> extractor)
          Adds a root property to the extraction result, together with the path to the root of the data that generated the property and the extractor responsible for the addition.
 void close()
          Closes the result.
 Collection<ErrorReporter.Error> getErrors()
          Returns all the collected errors.
 int getErrorsCount()
           
 ExtractionContext getExtractionContext()
           
 List<TagSoupExtractionResult.PropertyPath> getPropertyPaths()
          Returns all the collected property paths.
 List<TagSoupExtractionResult.ResourceRoot> getResourceRoots()
          Returns all the collected property roots.
 boolean hasErrors()
           
 void notifyError(ErrorReporter.ErrorLevel level, String msg, int row, int col)
          Notifies that an error occurred while performing an extraction on an input stream.
 ExtractionResult openSubResult(ExtractionContext context)
          Opens a result nested in the current one.
 void printErrorsReport(PrintStream ps)
          Prints out an errors report.
 String toString()
           
 void writeNamespace(String prefix, String uri)
          Writes a namespace.
 void writeTriple(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o)
          Writes a triple.
 void writeTriple(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o, org.openrdf.model.URI g)
          Writes a triple.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ExtractionResultImpl

public ExtractionResultImpl(ExtractionContext context,
                            Extractor<?> extractor,
                            TripleHandler tripleHandler)
Method Detail

hasErrors

public boolean hasErrors()

getErrorsCount

public int getErrorsCount()

printErrorsReport

public void printErrorsReport(PrintStream ps)
Description copied from interface: ErrorReporter
Prints out an errors report.

Specified by:
printErrorsReport in interface ErrorReporter

getErrors

public Collection<ErrorReporter.Error> getErrors()
Description copied from interface: ErrorReporter
Returns all the collected errors.

Specified by:
getErrors in interface ErrorReporter
Returns:
a collection of ErrorReporter.Errors.

openSubResult

public ExtractionResult openSubResult(ExtractionContext context)
Description copied from interface: ExtractionResult
Opens a result nested in the current one.

Specified by:
openSubResult in interface ExtractionResult
Parameters:
context - the context to be used to open the sub result.
Returns:
the instance of the nested extraction result.
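
As an illustration of how a nested result can relate to its parent, here is a minimal sketch. It is not the Any23 implementation: NestedResultSketch is a hypothetical stand-in, a plain String replaces ExtractionContext, and the rule that closing the parent also closes every sub result is an assumption drawn from the class description ("closing any open contexts at the end of extraction").

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an extraction result; not part of Any23.
final class NestedResultSketch {
    private final String contextName; // stands in for ExtractionContext
    private final List<NestedResultSketch> subResults = new ArrayList<>();
    private boolean closed = false;

    NestedResultSketch(String contextName) {
        this.contextName = contextName;
    }

    // Opens a child result and remembers it so it can be closed later.
    NestedResultSketch openSubResult(String subContextName) {
        if (closed) {
            throw new IllegalStateException("result already closed: " + contextName);
        }
        NestedResultSketch sub = new NestedResultSketch(subContextName);
        subResults.add(sub);
        return sub;
    }

    // Assumption: closing the parent closes any sub results still open.
    void close() {
        if (closed) {
            return;
        }
        closed = true;
        for (NestedResultSketch sub : subResults) {
            sub.close();
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```

In this sketch a caller may close a sub result early, or leave it to the parent's close().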

getExtractionContext

public ExtractionContext getExtractionContext()

writeTriple

public void writeTriple(org.openrdf.model.Resource s,
                        org.openrdf.model.URI p,
                        org.openrdf.model.Value o,
                        org.openrdf.model.URI g)
Description copied from interface: ExtractionResult
Writes a triple. Parameters may be null, in which case the triple is silently ignored.

Specified by:
writeTriple in interface ExtractionResult
Parameters:
s - subject
p - predicate
o - object
g - graph

writeTriple

public void writeTriple(org.openrdf.model.Resource s,
                        org.openrdf.model.URI p,
                        org.openrdf.model.Value o)
Description copied from interface: ExtractionResult
Writes a triple. Parameters may be null, in which case the triple is silently ignored.

Specified by:
writeTriple in interface ExtractionResult
Parameters:
s - subject
p - predicate
o - object
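
The null handling described above can be sketched as follows. This is a hedged illustration, not the Any23 code: NullSafeTripleWriterSketch is a hypothetical class, plain strings replace the org.openrdf.model types, and treating a null in any of the three positions as a reason to skip the triple is an assumption about what "silently ignored" covers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; plain strings replace Resource, URI and Value.
final class NullSafeTripleWriterSketch {
    private final List<String[]> written = new ArrayList<>();

    // Drops the triple without raising an error when any term is null.
    void writeTriple(String s, String p, String o) {
        if (s == null || p == null || o == null) {
            return; // silently ignored, nothing is recorded
        }
        written.add(new String[] {s, p, o});
    }

    // Number of triples that were actually accepted.
    int size() {
        return written.size();
    }
}
```

With this shape, an extractor can pass along values it failed to resolve without checking each one first.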

writeNamespace

public void writeNamespace(String prefix,
                           String uri)
Description copied from interface: ExtractionResult
Writes a namespace.

Specified by:
writeNamespace in interface ExtractionResult
Parameters:
prefix - the prefix of the namespace
uri - the long URI identifying the namespace

notifyError

public void notifyError(ErrorReporter.ErrorLevel level,
                        String msg,
                        int row,
                        int col)
Description copied from interface: ErrorReporter
Notifies that an error occurred while performing an extraction on an input stream.

Specified by:
notifyError in interface ErrorReporter
Parameters:
level - error level.
msg - error message.
row - error row.
col - error column.
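
To show how notifyError, hasErrors, getErrorsCount and printErrorsReport can fit together, here is a minimal sketch. ErrorCollectorSketch, its Level enum, its Entry class and the report line format are all hypothetical and chosen only for illustration; the real ErrorReporter.ErrorLevel values and report layout may differ.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical error collector; not the Any23 ErrorReporter.
final class ErrorCollectorSketch {
    enum Level { WARN, ERROR, FATAL } // assumed levels, for illustration only

    // One reported problem with its position in the input.
    static final class Entry {
        final Level level;
        final String msg;
        final int row;
        final int col;

        Entry(Level level, String msg, int row, int col) {
            this.level = level;
            this.msg = msg;
            this.row = row;
            this.col = col;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Records the error; nothing is thrown, extraction can continue.
    void notifyError(Level level, String msg, int row, int col) {
        entries.add(new Entry(level, msg, row, col));
    }

    boolean hasErrors() {
        return !entries.isEmpty();
    }

    int getErrorsCount() {
        return entries.size();
    }

    // Writes one line per collected error to the given stream.
    void printErrorsReport(PrintStream ps) {
        for (Entry e : entries) {
            ps.println(e.level + ": '" + e.msg + "' (" + e.row + "," + e.col + ")");
        }
        ps.flush();
    }
}
```

A caller would typically check hasErrors() after extraction and pass System.err to printErrorsReport.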

close

public void close()
Description copied from interface: ExtractionResult
Closes the result.

Extractors should close their results as soon as possible but are not required to; the environment will close any that remain open. Implementations should be robust against multiple close() invocations.

Specified by:
close in interface ExtractionResult
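
One common way to make close() tolerate repeated invocations is to guard it with a flag. The sketch below assumes that approach; IdempotentCloseSketch and its releaseCount field are hypothetical names, and nothing here is taken from the Any23 source.

```java
// Hypothetical stand-in for a closeable result; not part of Any23.
final class IdempotentCloseSketch {
    private boolean closed = false;
    private int releaseCount = 0; // how many times cleanup actually ran

    // Safe to call any number of times; only the first call does work.
    void close() {
        if (closed) {
            return;
        }
        closed = true;
        releaseCount++; // placeholder for the real cleanup work
    }

    boolean isClosed() {
        return closed;
    }

    int getReleaseCount() {
        return releaseCount;
    }
}
```

Because later calls return immediately, both the extractor and the surrounding environment can call close() without coordinating.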

addResourceRoot

public void addResourceRoot(String[] path,
                            org.openrdf.model.Resource root,
                            Class<? extends MicroformatExtractor> extractor)
Description copied from interface: TagSoupExtractionResult
Adds a root property to the extraction result, together with the path to the root of the data that generated the property and the extractor responsible for the addition.

Specified by:
addResourceRoot in interface TagSoupExtractionResult
Parameters:
path - the path from the document root to the local root of the data generating the property.
root - the property root node.
extractor - the extractor responsible for the extraction.

getResourceRoots

public List<TagSoupExtractionResult.ResourceRoot> getResourceRoots()
Description copied from interface: TagSoupExtractionResult
Returns all the collected property roots.

Specified by:
getResourceRoots in interface TagSoupExtractionResult
Returns:
an unmodifiable list of TagSoupExtractionResult.ResourceRoots.
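
For readers unfamiliar with unmodifiable lists, the sketch below shows what callers can expect from such a return value. ResourceRootsSketch is hypothetical and stores plain strings instead of TagSoupExtractionResult.ResourceRoot objects; whether Any23 returns a live view or a copy is not stated here, so the use of Collections.unmodifiableList is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder of resource roots; strings stand in for ResourceRoot.
final class ResourceRootsSketch {
    private final List<String> roots = new ArrayList<>();

    void addResourceRoot(String root) {
        roots.add(root);
    }

    // Returns a read-only view; callers cannot add or remove entries.
    List<String> getResourceRoots() {
        return Collections.unmodifiableList(roots);
    }
}
```

Calling add or remove on the returned list throws UnsupportedOperationException, so collected roots can only change through addResourceRoot.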

addPropertyPath

public void addPropertyPath(Class<? extends MicroformatExtractor> extractor,
                            org.openrdf.model.Resource propertySubject,
                            org.openrdf.model.Resource property,
                            org.openrdf.model.BNode object,
                            String[] path)
Description copied from interface: TagSoupExtractionResult
Adds a property path to the list of the extracted data.

Specified by:
addPropertyPath in interface TagSoupExtractionResult
Parameters:
extractor - the identifier of the extractor responsible for retrieving such property.
propertySubject - the subject of the property.
property - the property URI.
object - the property object if any, null otherwise.
path - the path of the HTML node from which the property literal has been extracted.

getPropertyPaths

public List<TagSoupExtractionResult.PropertyPath> getPropertyPaths()
Description copied from interface: TagSoupExtractionResult
Returns all the collected property paths.

Specified by:
getPropertyPaths in interface TagSoupExtractionResult
Returns:
a valid list of property paths.

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.