Package org.apache.tika.parser.pdf
Class PDFParserConfig.OCRStrategyAuto
java.lang.Object
org.apache.tika.parser.pdf.PDFParserConfig.OCRStrategyAuto
- All Implemented Interfaces:
Serializable
- Enclosing class:
- PDFParserConfig
Encapsulates the thresholds that control the OCR strategy when it is set to auto.
OCR is performed on a page if the total number of characters on the page is less than totalCharsPerPage, or if the number of unmapped Unicode characters on the page is greater than unmappedUnicodeCharsPerPage.
If unmappedUnicodeCharsPerPage is an integer greater than 0, it is compared against the absolute number of unmapped Unicode characters on the page. If it is a float less than 1, it is treated as a percentage expressed as a fraction, and it is compared against the ratio of unmapped Unicode characters to total characters on the page.
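The decision rule described above can be sketched in a few lines of Java. This is a minimal, self-contained illustration and not the Tika implementation: the class name OcrAutoSketch, the method shouldOcr, and its parameter names are hypothetical, and the handling of an empty page and of a threshold of exactly 1 are assumptions that the description does not spell out.

```java
public class OcrAutoSketch {

    /**
     * Hypothetical helper that mirrors the documented rule; not part of Tika.
     *
     * @param totalChars        characters extracted from the page
     * @param unmappedChars     unmapped Unicode characters on the page
     * @param unmappedThreshold plays the role of unmappedUnicodeCharsPerPage
     * @param totalThreshold    plays the role of totalCharsPerPage
     */
    static boolean shouldOcr(int totalChars, int unmappedChars,
                             float unmappedThreshold, int totalThreshold) {
        // Too little extractable text on the page: run OCR.
        if (totalChars < totalThreshold) {
            return true;
        }
        if (unmappedThreshold < 1.0f) {
            // Threshold below 1 is read as a fraction of the page's characters.
            // Assumption: an empty page has an unmapped ratio of 0.
            float ratio = totalChars == 0 ? 0.0f : (float) unmappedChars / totalChars;
            return ratio > unmappedThreshold;
        }
        // Otherwise the threshold is read as an absolute character count.
        return unmappedChars > unmappedThreshold;
    }

    public static void main(String[] args) {
        // Absolute threshold: more than 10 unmapped characters triggers OCR.
        System.out.println(shouldOcr(100, 15, 10f, 10));
        // Fractional threshold: 5 of 100 unmapped is below 10 percent.
        System.out.println(shouldOcr(100, 5, 0.1f, 10));
    }
}
```

Reading the same float either as a count or as a fraction keeps the constructor to two arguments, at the cost that values between 0 and 1 can never mean an absolute count.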
Constructor Details
OCRStrategyAuto
public OCRStrategyAuto(float unmappedUnicodeCharsPerPage, int totalCharsPerPage)
Method Details