org.apache.any23.extractor.html
Class SpeciesExtractor
java.lang.Object
org.apache.any23.extractor.html.MicroformatExtractor
org.apache.any23.extractor.html.EntityBasedMicroformatExtractor
org.apache.any23.extractor.html.SpeciesExtractor
- All Implemented Interfaces:
- Extractor<Document>, Extractor.TagSoupDOMExtractor
public class SpeciesExtractor
- extends EntityBasedMicroformatExtractor
Extractor able to extract the Species Microformat.
The data are represented using the
BBC Wildlife Ontology.
- Author:
- Davide Palmisano (dpalmisano@gmail.com)
- See Also:
WO
Methods inherited from class org.apache.any23.extractor.html.MicroformatExtractor
addBNodeProperty, addBNodeProperty, addURIProperty, conditionallyAddLiteralProperty, conditionallyAddResourceProperty, conditionallyAddStringProperty, fixLink, fixLink, getCurrentExtractionResult, getDocumentURI, getExtractionContext, getHTMLDocument, includes, openSubResult, run
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
factory
public static final ExtractorFactory<SpeciesExtractor> factory
SpeciesExtractor
public SpeciesExtractor()
getDescription
public ExtractorDescription getDescription()
- Returns the description of this extractor.
- Specified by:
getDescription
in interface Extractor<Document>
- Specified by:
getDescription
in class MicroformatExtractor
- Returns:
- a human-readable description.
getBaseClassName
protected String getBaseClassName()
- Returns the base class name for the extractor.
- Specified by:
getBaseClassName
in class EntityBasedMicroformatExtractor
- Returns:
- a string containing the base of the extractor.
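For illustration, assuming the base class name is the HTML class token that marks the root element of a microformat entity, matching it against a class attribute could look like the sketch below. This is not Any23's implementation: the class BaseClassMatch, the helper hasClass, and the example value "biota" are assumptions introduced here.

```java
import java.util.Arrays;

public class BaseClassMatch {

    /**
     * Returns true if the whitespace-separated class attribute contains
     * the given token as a whole word (hypothetical helper).
     */
    static boolean hasClass(String classAttr, String baseClassName) {
        if (classAttr == null || classAttr.trim().isEmpty()) {
            return false;
        }
        return Arrays.asList(classAttr.trim().split("\\s+")).contains(baseClassName);
    }

    public static void main(String[] args) {
        // "biota" is an assumed base class name, used only for this example.
        System.out.println(hasClass("vcard biota", "biota"));
        System.out.println(hasClass("biotaX", "biota"));
    }
}
```

Matching whole tokens rather than substrings avoids treating a class such as "biotaX" as an entity root.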
resetExtractor
protected void resetExtractor()
- Resets the internal state of the extractor to prepare it for a new extraction session.
- Specified by:
resetExtractor
in class EntityBasedMicroformatExtractor
extractEntity
protected boolean extractEntity(Node node,
ExtractionResult out)
throws ExtractionException
- Extracts an entity from a DOM node.
- Specified by:
extractEntity
in class EntityBasedMicroformatExtractor
- Parameters:
node
- the DOM node.
out
- the extraction result collector.
- Returns:
true
if the extraction has produced something, false
otherwise.
- Throws:
ExtractionException
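The contract of extractEntity (one call per candidate DOM node, returning true only when something was produced) can be sketched with the JDK's own DOM API. This is a minimal, self-contained illustration and not the Any23 source: the class EntityExtractionSketch, the helpers hasClass and run, the List<String> collector standing in for ExtractionResult, and the class names "biota" and "binomial" are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class EntityExtractionSketch {

    /** Whole-token match against the element's class attribute. */
    static boolean hasClass(Element e, String token) {
        String attr = e.getAttribute("class"); // empty string when absent
        return Arrays.asList(attr.trim().split("\\s+")).contains(token);
    }

    /**
     * Mirrors the documented contract: returns true if the node produced
     * at least one value, false otherwise. The collector is a plain list
     * here (hypothetical stand-in for the real result collector).
     */
    static boolean extractEntity(Node node, List<String> out) {
        if (!(node instanceof Element)) {
            return false;
        }
        boolean produced = false;
        NodeList descendants = ((Element) node).getElementsByTagName("*");
        for (int i = 0; i < descendants.getLength(); i++) {
            Element child = (Element) descendants.item(i);
            if (hasClass(child, "binomial")) { // assumed property class
                out.add(child.getTextContent().trim());
                produced = true;
            }
        }
        return produced;
    }

    /** Finds every element carrying the assumed base class and extracts it. */
    static List<String> run(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (hasClass(e, "biota")) { // assumed base class name
                extractEntity(e, out);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<div><span class=\"biota\">"
                + "<i class=\"binomial\">Panthera leo</i></span>"
                + "<p>no entity here</p></div>";
        System.out.println(run(xhtml));
    }
}
```

Returning a boolean lets the caller distinguish a root element that matched but yielded nothing from one that actually contributed data.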
Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.