Class FSDocumentSelector

java.lang.Object
org.apache.tika.batch.fs.FSDocumentSelector
All Implemented Interfaces:
DocumentSelector

public class FSDocumentSelector extends Object implements DocumentSelector
Selector that chooses files based on their file name and their size, as determined by TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.

The excludeFileName pattern is applied first (if it is not null), then the includeFileName pattern (if it is not null), and finally the size limits (minFileSizeBytes and maxFileSizeBytes) are applied if they are above 0.
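
The filter order described above can be sketched without any Tika dependency. This is an illustrative stand-in, not the library's implementation: the class name SelectorSketch and its select(String, long) signature are hypothetical, and two details are assumptions rather than documented behaviour: that patterns are tested with Matcher.find() (a partial match), and that both size bounds follow the same "above 0" rule. The real class reads the file name and size from a Metadata object instead of taking them as arguments.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in that mirrors the documented filter order; not Tika code. */
class SelectorSketch {
    private final Pattern includeFileName; // may be null: no include filter
    private final Pattern excludeFileName; // may be null: no exclude filter
    private final long minFileSizeBytes;   // ignored unless above 0 (assumption)
    private final long maxFileSizeBytes;   // ignored unless above 0

    SelectorSketch(Pattern includeFileName, Pattern excludeFileName,
                   long minFileSizeBytes, long maxFileSizeBytes) {
        this.includeFileName = includeFileName;
        this.excludeFileName = excludeFileName;
        this.minFileSizeBytes = minFileSizeBytes;
        this.maxFileSizeBytes = maxFileSizeBytes;
    }

    boolean select(String fileName, long sizeBytes) {
        // 1. Exclusion comes first: a matching name is rejected outright.
        //    find() is an assumption; the real class may match differently.
        if (excludeFileName != null && excludeFileName.matcher(fileName).find()) {
            return false;
        }
        // 2. Inclusion: when a pattern is set, the name must match it.
        if (includeFileName != null && !includeFileName.matcher(fileName).find()) {
            return false;
        }
        // 3. Size limits are checked last, and only when they are above 0.
        if (minFileSizeBytes > 0 && sizeBytes < minFileSizeBytes) {
            return false;
        }
        if (maxFileSizeBytes > 0 && sizeBytes > maxFileSizeBytes) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Accept PDFs up to 1,000,000 bytes, skipping names that start with '~'.
        SelectorSketch selector = new SelectorSketch(
                Pattern.compile("\\.pdf$"), Pattern.compile("^~"), 0, 1_000_000);
        System.out.println(selector.select("report.pdf", 2048));     // true
        System.out.println(selector.select("~report.pdf", 2048));    // false: excluded
        System.out.println(selector.select("notes.txt", 2048));      // false: not included
        System.out.println(selector.select("huge.pdf", 5_000_000));  // false: too large
    }
}
```

Because exclusion is evaluated before inclusion, a file name that matches both patterns is rejected.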

  • Constructor Details

    • FSDocumentSelector

      public FSDocumentSelector(Pattern includeFileName, Pattern excludeFileName, long minFileSizeBytes, long maxFileSizeBytes)
  • Method Details

    • select

      public boolean select(Metadata metadata)
      Description copied from interface: DocumentSelector
      Checks if a document with the given metadata matches the specified selection criteria.
      Specified by:
      select in interface DocumentSelector
      Parameters:
      metadata - document metadata
      Returns:
      true if the document matches the selection criteria, false otherwise