Class POIXMLTextExtractorDecorator
java.lang.Object
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
All Implemented Interfaces:
OOXMLExtractor
Field Summary
Fields inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
config, EMBEDDED_RELATIONSHIPS, extractor
Constructor Summary
POIXMLTextExtractorDecorator(ParseContext context, org.apache.poi.ooxml.extractor.POIXMLTextExtractor extractor)
Method Summary
protected void buildXHTML(XHTMLContentHandler xhtml)
    Populates the XHTMLContentHandler object received as parameter.
protected List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
    Return a list of the main parts of the document, used when searching for embedded resources.
Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getDocument, getEmbeddedPartMetadataMap, getJustFileName, getMetadataExtractor, getXHTML, handleEmbeddedFile, loadLinkedRelationships
Constructor Details
POIXMLTextExtractorDecorator
public POIXMLTextExtractorDecorator(ParseContext context, org.apache.poi.ooxml.extractor.POIXMLTextExtractor extractor)
Method Details
buildXHTML
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
Specified by:
buildXHTML in class AbstractOOXMLExtractor
Throws:
SAXException
getMainDocumentParts
Description copied from class: AbstractOOXMLExtractor
Return a list of the main parts of the document, used when searching for embedded resources. This should be all the parts of the document that end up with things embedded into them.
Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
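The way the two overridden hooks fit into the inherited extraction flow can be sketched with stand-in types. Everything below is hypothetical and uses only the JDK: DecoratorSketch, TextSource, BaseExtractor, TextDecorator and render are illustrative names, not Tika or POI classes, and a StringBuilder stands in for XHTMLContentHandler. The sketch assumes that the decorator emits the wrapped extractor's text as a single paragraph and reports no main document parts; this page does not confirm either behaviour.

```java
import java.util.ArrayList;
import java.util.List;

public class DecoratorSketch {

    /** Hypothetical stand-in for a text extractor such as POIXMLTextExtractor. */
    interface TextSource {
        String getText();
    }

    /** Hypothetical stand-in for AbstractOOXMLExtractor: drives output and calls the hooks. */
    abstract static class BaseExtractor {
        // Template method, loosely analogous to the inherited getXHTML.
        final String render() {
            StringBuilder out = new StringBuilder();
            buildXHTML(out);
            // A real base class would search these parts for embedded resources;
            // here each part is only recorded as a comment.
            for (String part : getMainDocumentParts()) {
                out.append("<!-- would search ").append(part).append(" -->");
            }
            return out.toString();
        }

        protected abstract void buildXHTML(StringBuilder xhtml);

        protected abstract List<String> getMainDocumentParts();
    }

    /** Hypothetical analogue of the decorator: wraps a text source. */
    static class TextDecorator extends BaseExtractor {
        private final TextSource source;

        TextDecorator(TextSource source) {
            this.source = source;
        }

        @Override
        protected void buildXHTML(StringBuilder xhtml) {
            // Assumption: the whole extracted text becomes one paragraph (no escaping here).
            xhtml.append("<p>").append(source.getText()).append("</p>");
        }

        @Override
        protected List<String> getMainDocumentParts() {
            // Assumption: a plain-text decorator has no parts to search.
            return new ArrayList<>();
        }
    }

    /** Convenience entry point for the sketch. */
    public static String render(String text) {
        return new TextDecorator(() -> text).render();
    }

    public static void main(String[] args) {
        System.out.println(render("Hello from the wrapped extractor"));
    }
}
```

The design point is that a subclass supplies only content generation and the list of parts, while the base class owns the surrounding flow.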