Class RecursiveParserWrapperHandler

All Implemented Interfaces:
Serializable, ContentHandler, DTDHandler, EntityResolver, ErrorHandler

public class RecursiveParserWrapperHandler extends AbstractRecursiveParserWrapperHandler
This is the default implementation of AbstractRecursiveParserWrapperHandler. See its documentation for more details.

This handler caches a metadata object for each embedded file and for the container file. It places the extracted content in the metadata object under the key TikaCoreProperties.TIKA_CONTENT. If memory is a concern, subclass AbstractRecursiveParserWrapperHandler to handle each embedded document.

NOTE: This handler must only be used with the RecursiveParserWrapper.
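The caching behavior described above can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not Tika's implementation: the class MetadataCacheSketch, its methods, the "name" and "content" keys, the convention that a negative limit means "no limit", and the placement of the container entry at the front of the list are all hypothetical stand-ins chosen for illustration. Plain maps take the place of Metadata objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of "one cached metadata object per document".
// NOT Tika code: every name here is a hypothetical stand-in.
class MetadataCacheSketch {
    // Stand-in for the key that TikaCoreProperties.TIKA_CONTENT represents.
    static final String CONTENT_KEY = "content";

    // Assumption for this sketch: a negative value means "no limit".
    private final int maxEmbeddedResources;
    private final List<Map<String, String>> metadataList = new ArrayList<>();
    private int embeddedCount = 0;

    MetadataCacheSketch(int maxEmbeddedResources) {
        this.maxEmbeddedResources = maxEmbeddedResources;
    }

    // Cache one embedded file; returns false if the limit was already reached.
    boolean addEmbedded(String name, String content) {
        if (maxEmbeddedResources >= 0 && embeddedCount >= maxEmbeddedResources) {
            return false;
        }
        embeddedCount++;
        metadataList.add(toMetadata(name, content));
        return true;
    }

    // Cache the container file; this sketch puts it first in the list.
    void addContainer(String name, String content) {
        metadataList.add(0, toMetadata(name, content));
    }

    List<Map<String, String>> getMetadataList() {
        return metadataList;
    }

    // The extracted content is stored inside the metadata map itself.
    private static Map<String, String> toMetadata(String name, String content) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("name", name);
        metadata.put(CONTENT_KEY, content);
        return metadata;
    }

    public static void main(String[] args) {
        MetadataCacheSketch handler = new MetadataCacheSketch(1);
        handler.addEmbedded("a.txt", "alpha");
        handler.addEmbedded("b.txt", "beta"); // skipped: limit of 1 reached
        handler.addContainer("archive.zip", "");
        for (Map<String, String> metadata : handler.getMetadataList()) {
            System.out.println(metadata);
        }
    }
}
```

Because every entry stays in the list until the caller reads it, memory use in this model grows with the number and size of the cached documents, which is the concern the note about subclassing AbstractRecursiveParserWrapperHandler addresses.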

  • Field Details

    • metadataList

      protected final List<Metadata> metadataList
  • Constructor Details

    • RecursiveParserWrapperHandler

      public RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
      Creates a handler with no limit on the number of embedded resources.
    • RecursiveParserWrapperHandler

      public RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources)
      Creates a handler that limits the number of embedded resources that will be parsed.
      Parameters:
      maxEmbeddedResources - maximum number of embedded resources that will be parsed
    • RecursiveParserWrapperHandler

      public RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources, MetadataFilter metadataFilter)
  • Method Details