org.apache.any23.extractor.xpath
Class XPathExtractor
java.lang.Object
org.apache.any23.extractor.xpath.XPathExtractor
- All Implemented Interfaces:
- Extractor<Document>, Extractor.TagSoupDOMExtractor
public class XPathExtractor
- extends Object
- implements Extractor.TagSoupDOMExtractor
Implementation of an Extractor.TagSoupDOMExtractor able to apply XPathExtractionRules and generate quads.
- Author:
- Michele Mostarda (mostarda@fbk.eu)
- See Also:
XPathExtractionRule
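The idea of applying XPath-based rules to a DOM document and collecting quads can be sketched with the JDK's own javax.xml.xpath API. This is a minimal illustration under stated assumptions, not the Any23 implementation: the Rule interface, the Quad class, the textRule and extract helpers, and the example.org IRIs are all hypothetical stand-ins, and no Any23 classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathRuleSketch {

    /** Hypothetical stand-in for a quad: subject, predicate, object, graph. */
    static final class Quad {
        final String subject;
        final String predicate;
        final String object;
        final String graph;

        Quad(String subject, String predicate, String object, String graph) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.graph = graph;
        }
    }

    /** Hypothetical rule type; NOT the Any23 XPathExtractionRule interface. */
    interface Rule {
        void process(Document doc, String documentIri, List<Quad> out) throws Exception;
    }

    /** Builds a rule that emits one quad per node matched by the XPath expression. */
    static Rule textRule(String xpathExpr, String predicate) {
        return (doc, documentIri, out) -> {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                String text = nodes.item(i).getTextContent().trim();
                // The document IRI doubles as subject and graph name in this sketch.
                out.add(new Quad(documentIri, predicate, text, documentIri));
            }
        };
    }

    /** Parses the markup into a DOM and applies every rule in order. */
    static List<Quad> extract(String xml, String documentIri, List<Rule> rules) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<Quad> out = new ArrayList<>();
        for (Rule rule : rules) {
            rule.process(doc, documentIri, out);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html><body><h1>Hello</h1><p class=\"author\">Ann</p></body></html>";
        List<Rule> rules = List.of(
                textRule("//h1", "http://example.org/vocab#heading"),
                textRule("//p[@class='author']", "http://example.org/vocab#author"));
        for (Quad q : extract(xml, "http://example.org/doc", rules)) {
            System.out.println(q.subject + " " + q.predicate + " \"" + q.object + "\" " + q.graph);
        }
    }
}
```

In this sketch each rule owns its XPath expression and the predicate it emits, so the driver only iterates over rules and never inspects them.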
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
NAME
public static final String NAME
- See Also:
- Constant Field Values
factory
public static final ExtractorFactory<XPathExtractor> factory
XPathExtractor
public XPathExtractor(List<XPathExtractionRule> rules)
add
public void add(XPathExtractionRule rule)
remove
public void remove(XPathExtractionRule rule)
contains
public boolean contains(XPathExtractionRule rule)
run
public void run(ExtractionParameters extractionParameters,
ExtractionContext extractionContext,
Document in,
ExtractionResult out)
throws IOException,
ExtractionException
- Description copied from interface:
Extractor
- Executes the extractor. It is invoked only once; extractors are not reusable.
- Specified by:
- run in interface Extractor<Document>
- Parameters:
extractionParameters
- the parameters to be applied during the extraction.
extractionContext
- the document context.
in
- the extractor input data.
out
- the collector for the extracted data.
- Throws:
IOException
- On error while reading from the input stream.
ExtractionException
- On other error, such as parse errors.
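The "invoked only once" contract implies that callers obtain a fresh extractor for every document rather than reusing one instance. The following is a minimal sketch of that calling pattern, assuming hypothetical stand-in types: OneShotExtractor and extractAll are not Any23 classes, and the guard that throws on a second call is only an illustration, not a claim about how Any23 enforces the contract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

final class SingleUseExtractorSketch {

    /** Hypothetical single-use extractor; not an Any23 type. */
    static final class OneShotExtractor {
        private final AtomicBoolean used = new AtomicBoolean(false);

        /** Splits the input on whitespace; fails if this instance has already run. */
        List<String> run(String input) {
            if (!used.compareAndSet(false, true)) {
                throw new IllegalStateException("extractor already used; create a new instance");
            }
            List<String> out = new ArrayList<>();
            for (String token : input.trim().split("\\s+")) {
                out.add(token);
            }
            return out;
        }
    }

    /** Runs one fresh extractor per input, obtained from the supplied factory. */
    static List<List<String>> extractAll(List<String> inputs, Supplier<OneShotExtractor> factory) {
        List<List<String>> results = new ArrayList<>();
        for (String input : inputs) {
            results.add(factory.get().run(input));
        }
        return results;
    }

    public static void main(String[] args) {
        List<List<String>> results =
                extractAll(List.of("first document", "second"), OneShotExtractor::new);
        System.out.println(results);
    }
}
```

Passing a factory instead of an instance keeps the single-use rule out of the caller's loop logic.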
getDescription
public ExtractorDescription getDescription()
- Description copied from interface:
Extractor
- Returns an ExtractorDescription of this extractor.
- Specified by:
- getDescription in interface Extractor<Document>
- Returns:
- the object representing the extractor description.
Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.