org.apache.any23.extractor.rdf
Class BaseRDFExtractor

java.lang.Object
  extended by org.apache.any23.extractor.rdf.BaseRDFExtractor
All Implemented Interfaces:
Extractor<InputStream>, Extractor.ContentExtractor
Direct Known Subclasses:
NQuadsExtractor, NTriplesExtractor, RDFXMLExtractor, TriXExtractor, TurtleExtractor

public abstract class BaseRDFExtractor
extends Object
implements Extractor.ContentExtractor

Base class for a generic RDF Extractor.ContentExtractor.

Author:
Michele Mostarda (mostarda@fbk.eu)

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
 
Constructor Summary
BaseRDFExtractor()
           
BaseRDFExtractor(boolean verifyDataType, boolean stopAtFirstError)
          Constructor that allows specifying the validation and error-handling policies.
 
Method Summary
abstract  ExtractorDescription getDescription()
          Returns an ExtractorDescription of this extractor.
protected abstract  org.openrdf.rio.helpers.RDFParserBase getParser(ExtractionContext extractionContext, ExtractionResult extractionResult)
           
 boolean isStopAtFirstError()
           
 boolean isVerifyDataType()
           
 void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, InputStream in, ExtractionResult extractionResult)
          Executes the extractor.
 void setStopAtFirstError(boolean b)
          If true, the extractor will stop at the first parsing error; if false, the extractor will attempt to ignore all parsing errors.
 void setVerifyDataType(boolean verifyDataType)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BaseRDFExtractor

public BaseRDFExtractor(boolean verifyDataType,
                        boolean stopAtFirstError)
Constructor that allows specifying the validation and error-handling policies.

Parameters:
verifyDataType - if true, data types are verified; if false, they are ignored.
stopAtFirstError - if true, the parser stops at the first parsing error; if false, it ignores non-blocking errors.
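
The effect of the two flags above can be illustrated with a small, self-contained sketch. It is not Any23 or Sesame code: TolerantLineParser, its "value^^type" line format, and the ignoredErrors list are hypothetical stand-ins invented for this example. It also assumes that every error is non-blocking and that a failed data type check is handled like any other parsing error, which the documentation above does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in used only to illustrate the two policies;
// it is NOT part of Any23 or of the underlying RDF parser.
class TolerantLineParser {
    private final boolean verifyDataType;
    private final boolean stopAtFirstError;

    // Messages of the errors that were skipped in tolerant mode.
    final List<String> ignoredErrors = new ArrayList<>();

    TolerantLineParser(boolean verifyDataType, boolean stopAtFirstError) {
        this.verifyDataType = verifyDataType;
        this.stopAtFirstError = stopAtFirstError;
    }

    // Each line is expected to look like "value^^type", e.g. "42^^integer".
    List<String> parse(List<String> lines) {
        List<String> accepted = new ArrayList<>();
        for (String line : lines) {
            try {
                accepted.add(parseLine(line));
            } catch (IllegalArgumentException e) {
                if (stopAtFirstError) {
                    throw e; // strict policy: abort on the first error
                }
                ignoredErrors.add(e.getMessage()); // tolerant policy: record and continue
            }
        }
        return accepted;
    }

    private String parseLine(String line) {
        int sep = line.indexOf("^^");
        if (sep < 0) {
            throw new IllegalArgumentException("malformed line: " + line);
        }
        String value = line.substring(0, sep);
        String type = line.substring(sep + 2);
        // The lexical form is checked against the declared type only when requested.
        if (verifyDataType && type.equals("integer")) {
            try {
                Integer.parseInt(value);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("bad integer literal: " + value);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "42^^integer", "abc^^integer", "broken", "hello^^string");

        // No data type verification, tolerant: only the malformed line is skipped.
        System.out.println(new TolerantLineParser(false, false).parse(input)); // [42, abc, hello]

        // Data type verification, tolerant: "abc^^integer" is skipped as well.
        System.out.println(new TolerantLineParser(true, false).parse(input)); // [42, hello]

        // Data type verification, strict: parsing aborts at "abc^^integer".
        try {
            new TolerantLineParser(true, true).parse(input);
        } catch (IllegalArgumentException e) {
            System.out.println("stopped: " + e.getMessage()); // stopped: bad integer literal: abc
        }
    }
}
```

In the real class the same choices are made through the two-argument constructor or through setVerifyDataType and setStopAtFirstError.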

BaseRDFExtractor

public BaseRDFExtractor()

Method Detail

getDescription

public abstract ExtractorDescription getDescription()
Description copied from interface: Extractor
Returns an ExtractorDescription of this extractor.

Specified by:
getDescription in interface Extractor<InputStream>
Returns:
the object representing the extractor description.

getParser

protected abstract org.openrdf.rio.helpers.RDFParserBase getParser(ExtractionContext extractionContext,
                                                                   ExtractionResult extractionResult)

isVerifyDataType

public boolean isVerifyDataType()

setVerifyDataType

public void setVerifyDataType(boolean verifyDataType)

isStopAtFirstError

public boolean isStopAtFirstError()

setStopAtFirstError

public void setStopAtFirstError(boolean b)
Description copied from interface: Extractor.ContentExtractor
If true, the extractor will stop at the first parsing error; if false, the extractor will attempt to ignore all parsing errors.

Specified by:
setStopAtFirstError in interface Extractor.ContentExtractor
Parameters:
b - tolerance flag.

run

public void run(ExtractionParameters extractionParameters,
                ExtractionContext extractionContext,
                InputStream in,
                ExtractionResult extractionResult)
         throws IOException,
                ExtractionException
Description copied from interface: Extractor
Executes the extractor. Will be invoked only once; extractors are not reusable.

Specified by:
run in interface Extractor<InputStream>
Parameters:
extractionParameters - the parameters to be applied during the extraction.
extractionContext - The document context.
in - The extractor input data.
extractionResult - the collector for the extracted data.
Throws:
IOException - On error while reading from the input stream.
ExtractionException - On other errors, such as parse errors.
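
The contract of run (single invocation, IOException for read failures, ExtractionException for parse failures) can be sketched with a simplified model. Everything here is an assumption made for illustration: SingleUseExtractorSketch and SketchExtractionException are hypothetical stand-ins, the used guard is one possible way to enforce single use rather than something the documentation says BaseRDFExtractor does, and the sketch returns a statement count whereas the real method returns void and writes to an ExtractionResult.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a checked extraction exception.
class SketchExtractionException extends Exception {
    SketchExtractionException(String message) {
        super(message);
    }
}

// Hypothetical model of the run contract; NOT the Any23 implementation.
class SingleUseExtractorSketch {
    private boolean used = false;

    // Counts lines that end with '.', loosely mimicking N-Triples statements.
    int run(InputStream in) throws IOException, SketchExtractionException {
        if (used) {
            // Assumed guard: a second call on the same instance is rejected.
            throw new IllegalStateException("already used; create a new instance");
        }
        used = true;

        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int statements = 0;
        int lineNo = 0;
        String line;
        // readLine() may throw IOException, which propagates to the caller unchanged.
        while ((line = reader.readLine()) != null) {
            lineNo++;
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (!trimmed.endsWith(".")) {
                // Parse problems are reported through the extraction exception.
                throw new SketchExtractionException(
                        "line " + lineNo + " does not end with '.'");
            }
            statements++;
        }
        return statements;
    }

    static InputStream stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        SingleUseExtractorSketch extractor = new SingleUseExtractorSketch();
        System.out.println(extractor.run(stream("<a> <b> <c> .\n<d> <e> <f> .\n"))); // 2

        try {
            extractor.run(stream("<a> <b> <c> .\n"));
        } catch (IllegalStateException e) {
            System.out.println("second call rejected");
        }

        try {
            new SingleUseExtractorSketch().run(stream("<a> <b> <c>\n"));
        } catch (SketchExtractionException e) {
            System.out.println("parse error: " + e.getMessage());
        }
    }
}
```

The practical consequence for callers is to create a fresh extractor instance for each document instead of calling run twice on the same one.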


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.