org.apache.any23.extractor.html
Class HReviewExtractor

java.lang.Object
  extended by org.apache.any23.extractor.html.MicroformatExtractor
      extended by org.apache.any23.extractor.html.EntityBasedMicroformatExtractor
          extended by org.apache.any23.extractor.html.HReviewExtractor
All Implemented Interfaces:
Extractor<Document>, Extractor.TagSoupDOMExtractor

public class HReviewExtractor
extends EntityBasedMicroformatExtractor

Extractor for the hReview microformat.

Author:
Gabriele Renzi

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
 
Field Summary
static ExtractorFactory<HReviewExtractor> factory
           
 
Fields inherited from class org.apache.any23.extractor.html.MicroformatExtractor
BEGIN_SCRIPT, END_SCRIPT, valueFactory
 
Constructor Summary
HReviewExtractor()
           
 
Method Summary
protected  boolean extractEntity(Node node, ExtractionResult out)
          Extracts an entity from a DOM node.
protected  String getBaseClassName()
          Returns the base class name for the extractor.
 ExtractorDescription getDescription()
          Returns the description of this extractor.
protected  void resetExtractor()
          Resets the internal state of the extractor to prepare it for a new extraction session.
 
Methods inherited from class org.apache.any23.extractor.html.EntityBasedMicroformatExtractor
extract, getBlankNodeFor
 
Methods inherited from class org.apache.any23.extractor.html.MicroformatExtractor
addBNodeProperty, addBNodeProperty, addURIProperty, conditionallyAddLiteralProperty, conditionallyAddResourceProperty, conditionallyAddStringProperty, fixLink, fixLink, getCurrentExtractionResult, getDocumentURI, getExtractionContext, getHTMLDocument, includes, openSubResult, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

factory

public static final ExtractorFactory<HReviewExtractor> factory
Constructor Detail

HReviewExtractor

public HReviewExtractor()
Method Detail

getDescription

public ExtractorDescription getDescription()
Description copied from class: MicroformatExtractor
Returns the description of this extractor.

Specified by:
getDescription in interface Extractor<Document>
Specified by:
getDescription in class MicroformatExtractor
Returns:
a human readable description.

getBaseClassName

protected String getBaseClassName()
Description copied from class: EntityBasedMicroformatExtractor
Returns the base class name for the extractor.

Specified by:
getBaseClassName in class EntityBasedMicroformatExtractor
Returns:
a string containing the base class name of the extractor.

resetExtractor

protected void resetExtractor()
Description copied from class: EntityBasedMicroformatExtractor
Resets the internal state of the extractor to prepare it for a new extraction session.

Specified by:
resetExtractor in class EntityBasedMicroformatExtractor

extractEntity

protected boolean extractEntity(Node node,
                                ExtractionResult out)
                         throws ExtractionException
Description copied from class: EntityBasedMicroformatExtractor
Extracts an entity from a DOM node.

Specified by:
extractEntity in class EntityBasedMicroformatExtractor
Parameters:
node - the DOM node.
out - the extraction result collector.
Returns:
true if the extraction has produced something, false otherwise.
Throws:
ExtractionException
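
Example:
The three protected methods above (getBaseClassName, resetExtractor, extractEntity) read like hooks that the inherited extract method calls for each matching DOM node. The following is a minimal, self-contained sketch of that pattern using only JDK classes; it is not the Any23 implementation. SketchEntityExtractor, SketchHReviewExtractor and HReviewSketch are hypothetical names, a plain List<String> stands in for ExtractionResult, the base class name "hreview" and the "summary"/"rating" properties are taken from the hReview microformat rather than from this page, and the point at which resetExtractor is invoked is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical stand-in for an entity-based extractor; NOT the Any23 class. */
abstract class SketchEntityExtractor {
    protected abstract String getBaseClassName();
    protected abstract void resetExtractor();
    protected abstract boolean extractEntity(Node node, List<String> out);

    /** Assumed driver: visit every element that carries the base class name. */
    public boolean extract(Document doc, List<String> out) {
        boolean foundAny = false;
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (hasClass(e, getBaseClassName())) {
                resetExtractor(); // assumption: state is reset before each entity
                foundAny |= extractEntity(e, out);
            }
        }
        return foundAny;
    }

    /** True if the element's class attribute contains the given token. */
    static boolean hasClass(Element e, String name) {
        return Arrays.asList(e.getAttribute("class").trim().split("\\s+")).contains(name);
    }
}

/** Hypothetical hReview-like subclass: collects "summary" and "rating" values. */
class SketchHReviewExtractor extends SketchEntityExtractor {
    @Override protected String getBaseClassName() { return "hreview"; }

    @Override protected void resetExtractor() { /* this sketch keeps no state */ }

    @Override protected boolean extractEntity(Node node, List<String> out) {
        NodeList descendants = ((Element) node).getElementsByTagName("*");
        boolean produced = false;
        for (int i = 0; i < descendants.getLength(); i++) {
            Element c = (Element) descendants.item(i);
            for (String property : new String[] {"summary", "rating"}) {
                if (hasClass(c, property)) {
                    out.add(property + "=" + c.getTextContent().trim());
                    produced = true;
                }
            }
        }
        return produced; // mirrors "true if the extraction has produced something"
    }
}

class HReviewSketch {
    /** Parses a well-formed XHTML fragment and returns the collected values. */
    static List<String> run(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        new SketchHReviewExtractor().extract(doc, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String page = "<div><div class='hreview'>"
                + "<span class='summary'>Great pizza</span>"
                + "<span class='rating'>5</span></div>"
                + "<p class='other'>ignored</p></div>";
        System.out.println(run(page));
    }
}
```

In this sketch the subclass only names the root class and says how to read one entity; locating the entities stays in the base class, which is the division of labour the method summary above suggests.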


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.