org.apache.any23.extractor.rdfa
Class RDFa11Extractor

java.lang.Object
  extended by org.apache.any23.extractor.rdfa.RDFa11Extractor
All Implemented Interfaces:
Extractor<Document>, Extractor.TagSoupDOMExtractor

public class RDFa11Extractor
extends Object
implements Extractor.TagSoupDOMExtractor

Extractor implementation for the RDFa 1.1 specification.

Author:
Michele Mostarda (mostarda@fbk.eu)

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
 
Field Summary
static ExtractorFactory<RDFa11Extractor> factory
           
static String NAME
           
 
Constructor Summary
RDFa11Extractor()
          Default constructor: data types are not verified and extraction does not stop at the first error.
RDFa11Extractor(boolean verifyDataType, boolean stopAtFirstError)
          Constructor that sets the validation and error-handling policies.
 
Method Summary
 ExtractorDescription getDescription()
          Returns an ExtractorDescription of this extractor.
 boolean isStopAtFirstError()
           
 boolean isVerifyDataType()
           
 void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out)
          Executes the extractor.
 void setStopAtFirstError(boolean stopAtFirstError)
           
 void setVerifyDataType(boolean verifyDataType)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NAME

public static final String NAME
See Also:
Constant Field Values

factory

public static final ExtractorFactory<RDFa11Extractor> factory

Constructor Detail

RDFa11Extractor

public RDFa11Extractor(boolean verifyDataType,
                       boolean stopAtFirstError)
Constructor that sets the validation and error-handling policies.

Parameters:
verifyDataType - if true, data types are verified; if false, they are ignored.
stopAtFirstError - if true, the parser stops at the first parsing error; if false, non-blocking errors are ignored.
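
The difference between the two error-handling policies can be pictured with a small, self-contained sketch. This is not Any23 code: the class ErrorPolicySketch, its parseAll method, and the use of integer parsing as a stand-in for RDFa parsing are all hypothetical and chosen only to show "stop at the first error" versus "record the error and continue".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the two error-handling policies; not Any23 code.
final class ErrorPolicySketch {

    // Parses each token as an integer. A token that fails to parse plays the
    // role of a non-blocking parsing error.
    static List<Integer> parseAll(List<String> tokens,
                                  boolean stopAtFirstError,
                                  List<String> errors) {
        List<Integer> parsed = new ArrayList<>();
        for (String token : tokens) {
            try {
                parsed.add(Integer.parseInt(token));
            } catch (NumberFormatException e) {
                if (stopAtFirstError) {
                    // Strict policy: abort on the first problem encountered.
                    throw new IllegalStateException("stopped at first error: " + token, e);
                }
                // Lenient policy: remember the problem and keep going.
                errors.add(token);
            }
        }
        return parsed;
    }
}
```

With stopAtFirstError set to false the sketch returns every value it could parse and reports the rest through the errors list; with true it throws as soon as one token fails. How the real extractor reports such errors is not described on this page.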

RDFa11Extractor

public RDFa11Extractor()
Default constructor: data types are not verified and extraction does not stop at the first error.

Method Detail

isVerifyDataType

public boolean isVerifyDataType()

setVerifyDataType

public void setVerifyDataType(boolean verifyDataType)

isStopAtFirstError

public boolean isStopAtFirstError()

setStopAtFirstError

public void setStopAtFirstError(boolean stopAtFirstError)
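
As a rough picture of how the constructors and accessors relate, the following stand-in mirrors only the signatures documented on this page. RDFa11PolicySketch is a hypothetical class, not the Any23 implementation; the only behaviour taken from the documentation is that the default constructor leaves both policies disabled.

```java
// Hypothetical stand-in mirroring the documented constructors and accessors.
// It is NOT the Any23 class and performs no extraction.
final class RDFa11PolicySketch {
    private boolean verifyDataType;
    private boolean stopAtFirstError;

    // Mirrors RDFa11Extractor(): no data type verification, no stop at first error.
    RDFa11PolicySketch() {
        this(false, false);
    }

    // Mirrors RDFa11Extractor(boolean verifyDataType, boolean stopAtFirstError).
    RDFa11PolicySketch(boolean verifyDataType, boolean stopAtFirstError) {
        this.verifyDataType = verifyDataType;
        this.stopAtFirstError = stopAtFirstError;
    }

    boolean isVerifyDataType() {
        return verifyDataType;
    }

    void setVerifyDataType(boolean verifyDataType) {
        this.verifyDataType = verifyDataType;
    }

    boolean isStopAtFirstError() {
        return stopAtFirstError;
    }

    void setStopAtFirstError(boolean stopAtFirstError) {
        this.stopAtFirstError = stopAtFirstError;
    }
}
```

Under this reading, building an instance with the default constructor and then calling setVerifyDataType(true) leaves it in the same state as constructing it with (true, false). Whether the real class behaves identically in every respect is an assumption.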

run

public void run(ExtractionParameters extractionParameters,
                ExtractionContext extractionContext,
                Document in,
                ExtractionResult out)
         throws IOException,
                ExtractionException
Description copied from interface: Extractor
Executes the extractor. This method is invoked only once per instance; extractors are not reusable.

Specified by:
run in interface Extractor<Document>
Parameters:
extractionParameters - the parameters to be applied during the extraction.
extractionContext - the document context.
in - the extractor input data.
out - the collector for the extracted data.
Throws:
IOException - on error while reading from the input stream.
ExtractionException - on any other error, such as a parse error.
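
Because extractors are not reusable, a caller that processes several documents needs a fresh instance for each one. The sketch below shows that calling pattern with hypothetical stand-ins: OneShotExtractorSketch replaces the real extractor, strings replace Document, and a List replaces ExtractionResult. The guard that throws on a second call is an assumption added for illustration; this page does not say whether the real class enforces the rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for an extractor whose run method may be called once.
final class OneShotExtractorSketch {
    private boolean used = false;

    // Stands in for run(...): writes one entry per input to the collector.
    void run(String in, List<String> out) {
        if (used) {
            // Assumed guard; the documented contract only says "not reusable".
            throw new IllegalStateException("extractor instances are not reusable");
        }
        used = true;
        out.add("extracted:" + in);
    }
}

final class FreshInstancePerDocument {

    // Obtains a new extractor from the supplier for every document.
    static List<String> extractAll(List<String> documents,
                                   Supplier<OneShotExtractorSketch> newExtractor) {
        List<String> out = new ArrayList<>();
        for (String document : documents) {
            newExtractor.get().run(document, out);
        }
        return out;
    }
}
```

Passing a constructor reference such as OneShotExtractorSketch::new as the supplier keeps instance creation in one place, which is roughly the role an extractor factory plays.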

getDescription

public ExtractorDescription getDescription()
Description copied from interface: Extractor
Returns an ExtractorDescription of this extractor.

Specified by:
getDescription in interface Extractor<Document>
Returns:
the ExtractorDescription of this extractor


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.