A C D E F G H I K L M N O P R S T U W X

A

add(String, String) - Method in class org.apache.tika.metadata.Metadata
Add a metadata name/value mapping.
add(String, String) - Method in class org.apache.tika.metadata.SpellCheckedMetadata
 
addAlias(String) - Method in class org.apache.tika.mime.MimeType
Adds an alias name for this media type.
addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
append(String) - Method in class org.apache.tika.parser.microsoft.WordTextBuffer
 
append(char) - Method in class org.apache.tika.sax.AppendableAdaptor
Write a single character to the underlying content handler.
append(CharSequence) - Method in class org.apache.tika.sax.AppendableAdaptor
Write a character sequence to the underlying content handler.
append(CharSequence, int, int) - Method in class org.apache.tika.sax.AppendableAdaptor
Write the specified characters to the underlying content handler.
AppendableAdaptor - Class in org.apache.tika.sax
Adaptor which turns a ContentHandler into an Appendable.
AppendableAdaptor(ContentHandler) - Constructor for class org.apache.tika.sax.AppendableAdaptor
Creates an adaptor for the given SAX event handler.
APPLICATION_NAME - Static variable in interface org.apache.tika.metadata.MSOffice
 
AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
 
AutoDetectParser - Class in org.apache.tika.parser
 
AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
Creates an auto-detecting parser instance using the default Tika configuration.
AutoDetectParser(TikaConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
 

C

CauseIOException - Exception in org.apache.tika.exception
IOException subclass with the Throwable constructor that's missing before Java 6.
CauseIOException(String) - Constructor for exception org.apache.tika.exception.CauseIOException
Creates an IOException with the given message.
CauseIOException(String, Throwable) - Constructor for exception org.apache.tika.exception.CauseIOException
Creates an IOException with the given message and root cause.
CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
 
characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
close() - Method in class org.apache.tika.utils.RereadableInputStream
Closes the input stream and removes the temporary file if one was created.
COMMENTS - Static variable in interface org.apache.tika.metadata.MSOffice
 
compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
 
concatOccurrence(Object, String, String, Appendable) - Method in class org.apache.tika.parser.xml.XMLParser
 
CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
ContentHandlerDecorator - Class in org.apache.tika.sax
Decorator base class for the ContentHandler interface.
ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
Creates a decorator for the given SAX event handler.
CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity responsible for making contributions to the content of the resource.
copyInputStream(InputStream, OutputStream) - Method in class org.apache.tika.parser.opendocument.OpenOfficeParser
 
COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
The extent or scope of the content of the resource.
create() - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates an empty instance; same as calling new MimeTypes().
create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified document.
create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified input stream.
create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the resource at the location specified by the URL.
create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified file path, as interpreted by the class loader in getResource().
CreativeCommons - Interface in org.apache.tika.metadata
A collection of Creative Commons property names.
CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity primarily responsible for making the content of the resource.
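
The CauseIOException entries in this section describe an IOException subclass that accepts a root cause, which IOException itself did not before Java 6. A minimal sketch of that idea, assuming the cause is attached through Throwable.initCause; the class name CauseIOExceptionSketch is hypothetical and this is not the Tika source:

```java
import java.io.IOException;

// Hypothetical illustration; not org.apache.tika.exception.CauseIOException.
final class CauseIOExceptionSketch extends IOException {
    private static final long serialVersionUID = 1L;

    // Message-only variant, delegating straight to IOException.
    CauseIOExceptionSketch(String message) {
        super(message);
    }

    // Message plus root cause. IOException(String, Throwable) only exists
    // from Java 6 on, so the cause is attached with initCause instead.
    CauseIOExceptionSketch(String message, Throwable cause) {
        super(message);
        initCause(cause);
    }

    public static void main(String[] args) {
        IOException e = new CauseIOExceptionSketch(
                "read failed", new IllegalStateException("stream closed"));
        System.out.println(e.getMessage() + " <- " + e.getCause().getMessage());
    }
}
```

Callers can then treat the exception as an ordinary IOException and still reach the original failure through getCause().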

D

DATE - Static variable in interface org.apache.tika.metadata.DublinCore
A date associated with an event in the life cycle of the resource.
decode(String) - Static method in class org.apache.tika.mime.HexCoDec
Decode a hex string
decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars
decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars.
DEFAULT - Static variable in class org.apache.tika.mime.MimeTypes
The default application/octet-stream MimeType
DEFAULT_CONFIG_LOCATION - Static variable in class org.apache.tika.config.TikaConfig
 
DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
An account of the content of the resource.
DublinCore - Interface in org.apache.tika.metadata
A collection of Dublin Core metadata names.

E

element(String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
EmptyParser - Class in org.apache.tika.parser
Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream.
EmptyParser() - Constructor for class org.apache.tika.parser.EmptyParser
 
encode(byte[]) - Static method in class org.apache.tika.mime.HexCoDec
Hex encode an array of bytes
encode(byte[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Hex encode an array of bytes
endDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endDocument() - Method in class org.apache.tika.sax.TeeContentHandler
 
endDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the XHTML document by writing the following footer and clearing the namespace mappings:
endElement(String, String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endElement(String, String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
endElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
endPrefixMapping(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endPrefixMapping(String) - Method in class org.apache.tika.sax.TeeContentHandler
 
equals(Object) - Method in class org.apache.tika.metadata.Metadata
 
ErrorParser - Class in org.apache.tika.parser
Dummy parser that always throws a TikaException without even attempting to parse the given document stream.
ErrorParser() - Constructor for class org.apache.tika.parser.ErrorParser
 
ExcelEventParser - Class in org.apache.tika.parser.microsoft
Excel parser implementation which uses POI's Event API to handle the contents of a Workbook.
ExcelEventParser() - Constructor for class org.apache.tika.parser.microsoft.ExcelEventParser
Create an instance which only listens for the specified records (i.e.
ExcelEventParser(boolean) - Constructor for class org.apache.tika.parser.microsoft.ExcelEventParser
Create an instance specifying whether to listen for all records or just for the specified few.
ExcelParser - Class in org.apache.tika.parser.microsoft
Excel parser
ExcelParser() - Constructor for class org.apache.tika.parser.microsoft.ExcelParser
 
extractContent(Document, String, String, Metadata) - Method in class org.apache.tika.parser.xml.XMLParser
 
extractLinks(String) - Static method in class org.apache.tika.utils.RegexUtils
Extract URLs from plain text.
extractText(POIFSFileSystem, Appendable) - Method in class org.apache.tika.parser.microsoft.ExcelEventParser
Extracts text from an Excel Workbook writing the extracted content to the specified Appendable.
extractText(POIFSFileSystem, Appendable) - Method in class org.apache.tika.parser.microsoft.ExcelParser
 
extractText(POIFSFileSystem, Appendable) - Method in class org.apache.tika.parser.microsoft.OfficeParser
Extracts the text content from a Microsoft document input stream.
extractText(POIFSFileSystem, Appendable) - Method in class org.apache.tika.parser.microsoft.PowerPointParser
 
extractText(POIFSFileSystem, Appendable) - Method in class org.apache.tika.parser.microsoft.WordParser
Gets the text from a Word document.

F

FilteredStringWriter - Class in org.apache.tika.parser.microsoft
Writes to optimize ASCII output.
FilteredStringWriter() - Constructor for class org.apache.tika.parser.microsoft.FilteredStringWriter
 
FilteredStringWriter(int) - Constructor for class org.apache.tika.parser.microsoft.FilteredStringWriter
 
FORMAT - Static variable in interface org.apache.tika.metadata.DublinCore
Typically, Format may include the media-type or dimensions of the resource.
forName(String) - Method in class org.apache.tika.mime.MimeTypes
Returns the registered media type with the given name (or alias).
fromHexString(String) - Static method in class org.apache.tika.utils.StringUtil
Convert a String containing consecutive (no inside whitespace) hexadecimal digits into a corresponding byte array.
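
The hex helpers listed in this index (HexCoDec.encode, HexCoDec.decode, StringUtil.fromHexString, StringUtil.toHexString) all convert between byte arrays and strings of hexadecimal digits. The following is a minimal sketch of that round trip using only the JDK; HexSketch, toHex and fromHex are hypothetical names, and the error handling shown is an assumption rather than the behaviour of the Tika classes:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not Tika's HexCoDec or StringUtil.
final class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Encodes each byte as two lowercase hexadecimal digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(DIGITS[(b >> 4) & 0x0f]).append(DIGITS[b & 0x0f]);
        }
        return sb.toString();
    }

    // Decodes consecutive hex digits (no whitespace) into bytes.
    // Assumption: malformed input is rejected with IllegalArgumentException.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "Tika".getBytes(StandardCharsets.US_ASCII);
        String hex = toHex(data);
        System.out.println(hex); // prints 54696b61
        System.out.println(Arrays.equals(data, fromHex(hex))); // prints true
    }
}
```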

G

get(String) - Method in class org.apache.tika.metadata.Metadata
Get the value associated with a metadata name.
get(String) - Method in class org.apache.tika.metadata.SpellCheckedMetadata
 
getAliases() - Method in class org.apache.tika.mime.MimeType
Returns the aliases of this media type.
getAllDocumentNs(Document) - Method in class org.apache.tika.parser.xml.XMLParser
 
getConfig() - Method in class org.apache.tika.parser.AutoDetectParser
 
getContentType() - Method in class org.apache.tika.parser.microsoft.ExcelEventParser
Return the content type handled by this parser.
getContentType() - Method in class org.apache.tika.parser.microsoft.ExcelParser
 
getContentType() - Method in class org.apache.tika.parser.microsoft.OfficeParser
The content type of the document being parsed.
getContentType() - Method in class org.apache.tika.parser.microsoft.PowerPointParser
 
getContentType() - Method in class org.apache.tika.parser.microsoft.WordParser
 
getDefaultConfig() - Static method in class org.apache.tika.config.TikaConfig
Provides a default configuration (TikaConfig).
getDescription() - Method in class org.apache.tika.mime.MimeType
Returns the description of this media type.
getHandler() - Method in class org.apache.tika.sax.AppendableAdaptor
Return the content handler.
getMimeRepository() - Method in class org.apache.tika.config.TikaConfig
 
getMimeType(File) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a file.
getMimeType(URL) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a document from its URL.
getMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a document from its name.
getMimeType(byte[]) - Method in class org.apache.tika.mime.MimeTypes
Returns the MIME type that best matches the given first few bytes of a document stream.
getMimeType(InputStream) - Method in class org.apache.tika.mime.MimeTypes
Returns the MIME type that best matches the first few bytes of the given document stream.
getMimeType(String, byte[]) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a document from its name and its content.
getMimeType(String, InputStream) - Method in class org.apache.tika.mime.MimeTypes
Returns the MIME type that best matches the given document name and the first few bytes of the given document stream.
getMinLength() - Method in class org.apache.tika.mime.MimeTypes
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
getName() - Method in class org.apache.tika.mime.MimeType
Returns the name of this media type.
getNormalizedName(String) - Static method in class org.apache.tika.metadata.SpellCheckedMetadata
Get the normalized name of a metadata attribute.
getParser(String) - Method in class org.apache.tika.config.TikaConfig
Returns the parser instance configured for the given MIME type.
getParser(String, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a parser that can handle the specified MIME type, and is set to receive input from a stream opened from the specified URL.
getParser(URL, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a parser that can handle the specified MIME type, and is set to receive input from a stream opened from the specified URL.
getParser(File, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a parser that can handle the specified MIME type, and is set to receive input from a stream opened from the specified URL.
getParsersFromZip(File, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a list of parsers from zip File
getParsersFromZip(URL, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a list of parsers from URL
getSize() - Method in class org.apache.tika.utils.RereadableInputStream
Returns the number of bytes read from the original stream.
getStringContent(InputStream, TikaConfig, String) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(URL, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(URL, TikaConfig, String) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(File, TikaConfig, String) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(File, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getSubTypes() - Method in class org.apache.tika.mime.MimeType
 
getSuperType() - Method in class org.apache.tika.mime.MimeType
Returns the parent of this media type.
getTextRuns() - Method in class org.apache.tika.parser.microsoft.Word6CHPBinTable
 
getType(String, String, byte[]) - Method in class org.apache.tika.mime.MimeTypes
 
getType(URL) - Method in class org.apache.tika.mime.MimeTypes
Determines the MIME type of the resource pointed to by the specified URL.
getUTF8Reader(InputStream, Metadata) - Static method in class org.apache.tika.utils.Utils
Try to detect encoding from inputstream and return a UTF-8 Reader.
getValues(String) - Method in class org.apache.tika.metadata.Metadata
Get the values associated with a metadata name.
getValues(String) - Method in class org.apache.tika.metadata.SpellCheckedMetadata
 

H

hasMagic() - Method in class org.apache.tika.mime.MimeType
 
HexCoDec - Class in org.apache.tika.mime
A set of Hex encoding and decoding utility methods.
HexCoDec() - Constructor for class org.apache.tika.mime.HexCoDec
 
HtmlParser - Class in org.apache.tika.parser.html
Simple HTML parser that extracts title.
HtmlParser() - Constructor for class org.apache.tika.parser.html.HtmlParser
 
HttpHeaders - Interface in org.apache.tika.metadata
A collection of HTTP header names.

I

IDENTIFIER - Static variable in interface org.apache.tika.metadata.DublinCore
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
 
isDescendantOf(MimeType) - Method in class org.apache.tika.mime.MimeType
 
isEmpty(String) - Static method in class org.apache.tika.utils.StringUtil
Checks if a string is empty (i.e. is null or empty).
isMultiValued(String) - Method in class org.apache.tika.metadata.Metadata
Returns true if named value is multivalued.
isValid(String) - Static method in class org.apache.tika.mime.MimeType
Checks that the given string is a valid Internet media type name based on rules from RFC 2054 section 5.3.

K

KEYWORDS - Static variable in interface org.apache.tika.metadata.MSOffice
 

L

LANGUAGE - Static variable in interface org.apache.tika.metadata.DublinCore
A language of the intellectual content of the resource.
LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
 
LAST_MODIFIED - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
LAST_PRINTED - Static variable in interface org.apache.tika.metadata.MSOffice
 
LAST_SAVED - Static variable in interface org.apache.tika.metadata.MSOffice
 
leftPad(String, int) - Static method in class org.apache.tika.utils.StringUtil
Returns a copy of s padded with leading spaces so that its length is length.
LICENSE_LOCATION - Static variable in interface org.apache.tika.metadata.CreativeCommons
 
LICENSE_URL - Static variable in interface org.apache.tika.metadata.CreativeCommons
 
LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 

M

main(String[]) - Static method in class org.apache.tika.utils.StringUtil
 
matches(byte[]) - Method in class org.apache.tika.mime.MimeType
 
matchesMagic(byte[]) - Method in class org.apache.tika.mime.MimeType
 
Metadata - Class in org.apache.tika.metadata
A multi-valued metadata container.
Metadata() - Constructor for class org.apache.tika.metadata.Metadata
Constructs a new, empty metadata.
MIME_TYPE_MAGIC - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
 
MimeType - Class in org.apache.tika.mime
Internet media type.
MimeTypeException - Exception in org.apache.tika.mime
A class to encapsulate MimeType related exceptions.
MimeTypeException() - Constructor for exception org.apache.tika.mime.MimeTypeException
Constructs a MimeTypeException with no specified detail message.
MimeTypeException(String) - Constructor for exception org.apache.tika.mime.MimeTypeException
Constructs a MimeTypeException with the specified detail message.
MimeTypeException(Throwable) - Constructor for exception org.apache.tika.mime.MimeTypeException
Constructs a MimeTypeException with the specified cause.
MimeTypes - Class in org.apache.tika.mime
This class is a MimeType repository.
MimeTypes() - Constructor for class org.apache.tika.mime.MimeTypes
 
MimeTypesFactory - Class in org.apache.tika.mime
Creates instances of MimeTypes.
MimeTypesFactory() - Constructor for class org.apache.tika.mime.MimeTypesFactory
 
MODIFIED - Static variable in interface org.apache.tika.metadata.DublinCore
Date on which the resource was changed.
MSOffice - Interface in org.apache.tika.metadata
A collection of "Office" document property names.

N

names() - Method in class org.apache.tika.metadata.Metadata
Returns an array of the names contained in the metadata.
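
The Metadata entries in this index (add, set, get, getValues, isMultiValued, remove, names, size) describe a container in which one name can map to several values. A minimal sketch of such a container follows. MultiValuedMetadataSketch is a hypothetical class, and the choices that get returns the first value and that set replaces all earlier values are assumptions, not statements about the Tika implementation:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not org.apache.tika.metadata.Metadata.
final class MultiValuedMetadataSketch {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Appends a value under the name, keeping any earlier values.
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Assumption: set discards earlier values and stores exactly one.
    void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    // Assumption: the single-value getter returns the first value, or null.
    String get(String name) {
        List<String> list = values.get(name);
        return list == null ? null : list.get(0);
    }

    // Returns all values for the name, or an empty array if there are none.
    String[] getValues(String name) {
        List<String> list = values.get(name);
        return list == null ? new String[0] : list.toArray(new String[0]);
    }

    boolean isMultiValued(String name) {
        List<String> list = values.get(name);
        return list != null && list.size() > 1;
    }

    void remove(String name) {
        values.remove(name);
    }

    String[] names() {
        return values.keySet().toArray(new String[0]);
    }

    // Counts distinct names, not individual values.
    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        MultiValuedMetadataSketch m = new MultiValuedMetadataSketch();
        m.add("creator", "Alice");
        m.add("creator", "Bob");
        m.set("title", "Report");
        System.out.println(m.size() + " names, creator multi-valued: "
                + m.isMultiValued("creator"));
    }
}
```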

O

OfficeParser - Class in org.apache.tika.parser.microsoft
Defines a Microsoft document content extractor.
OfficeParser() - Constructor for class org.apache.tika.parser.microsoft.OfficeParser
 
OpenOfficeEntityResolver - Class in org.apache.tika.parser.opendocument
OpenOffice entity resolver
OpenOfficeEntityResolver() - Constructor for class org.apache.tika.parser.opendocument.OpenOfficeEntityResolver
 
OpenOfficeParser - Class in org.apache.tika.parser.opendocument
OpenOffice parser
OpenOfficeParser() - Constructor for class org.apache.tika.parser.opendocument.OpenOfficeParser
 
org.apache.tika.config - package org.apache.tika.config
 
org.apache.tika.exception - package org.apache.tika.exception
 
org.apache.tika.metadata - package org.apache.tika.metadata
A multi-valued metadata container and a set of constant fields for Tika metadata.
org.apache.tika.mime - package org.apache.tika.mime
 
org.apache.tika.parser - package org.apache.tika.parser
 
org.apache.tika.parser.html - package org.apache.tika.parser.html
 
org.apache.tika.parser.microsoft - package org.apache.tika.parser.microsoft
 
org.apache.tika.parser.opendocument - package org.apache.tika.parser.opendocument
 
org.apache.tika.parser.pdf - package org.apache.tika.parser.pdf
 
org.apache.tika.parser.rtf - package org.apache.tika.parser.rtf
 
org.apache.tika.parser.txt - package org.apache.tika.parser.txt
 
org.apache.tika.parser.xml - package org.apache.tika.parser.xml
 
org.apache.tika.sax - package org.apache.tika.sax
 
org.apache.tika.utils - package org.apache.tika.utils
 

P

PAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AutoDetectParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.EmptyParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.ErrorParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.html.HtmlParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.microsoft.OfficeParser
Extracts properties and text from an MS Document input stream
parse(InputStream) - Method in class org.apache.tika.parser.opendocument.OpenOfficeParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.opendocument.OpenOfficeParser
 
parse(InputStream, ContentHandler, Metadata) - Method in interface org.apache.tika.parser.Parser
Parses a document stream into a sequence of XHTML SAX events.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.ParserDecorator
Delegates the method call to the decorated parser.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.ParserPostProcessor
Forwards the call to the delegated parser and post-processes the results as described above.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.pdf.PDFParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.rtf.RTFParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.txt.TXTParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.xml.XMLParser
 
parse(InputStream) - Static method in class org.apache.tika.utils.Utils
 
parseCharacterEncoding(String) - Static method in class org.apache.tika.utils.StringUtil
Parse the character encoding from the specified content type header.
Parser - Interface in org.apache.tika.parser
Tika parser interface
ParserDecorator - Class in org.apache.tika.parser
Decorator base class for the Parser interface.
ParserDecorator(Parser) - Constructor for class org.apache.tika.parser.ParserDecorator
Creates a decorator for the given parser.
ParserPostProcessor - Class in org.apache.tika.parser
Parser decorator that post-processes the results from a decorated parser.
ParserPostProcessor(Parser) - Constructor for class org.apache.tika.parser.ParserPostProcessor
Creates a post-processing decorator for the given parser.
ParseUtils - Class in org.apache.tika.utils
Contains utility methods for parsing documents.
ParseUtils() - Constructor for class org.apache.tika.utils.ParseUtils
 
PDFParser - Class in org.apache.tika.parser.pdf
PDF parser
PDFParser() - Constructor for class org.apache.tika.parser.pdf.PDFParser
 
PowerPointParser - Class in org.apache.tika.parser.microsoft
PowerPoint parser
PowerPointParser() - Constructor for class org.apache.tika.parser.microsoft.PowerPointParser
 
processingInstruction(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
processingInstruction(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
PUBLISHER - Static variable in interface org.apache.tika.metadata.DublinCore
An entity responsible for making the resource available.
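
The parseCharacterEncoding(String) entry in this section describes extracting the character encoding from a content type header. A rough sketch of that kind of parsing, using only the JDK, is shown here. CharsetParamSketch and parseCharset are hypothetical names, and the handling of quotes, case and missing parameters is an assumption rather than the documented behaviour of StringUtil:

```java
import java.util.Locale;

// Hypothetical illustration; not org.apache.tika.utils.StringUtil.
final class CharsetParamSketch {
    private static final String KEY = "charset=";

    // Returns the charset parameter of a content type value such as
    // "text/html; charset=UTF-8", or null when no charset is present.
    static String parseCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Parameter names are matched case-insensitively.
            if (p.toLowerCase(Locale.ROOT).startsWith(KEY)) {
                String value = p.substring(KEY.length()).trim();
                // Strip one pair of surrounding double quotes, if any.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseCharset("text/html; charset=UTF-8")); // prints UTF-8
        System.out.println(parseCharset("text/plain"));               // prints null
    }
}
```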

R

read() - Method in class org.apache.tika.utils.RereadableInputStream
Reads a byte from the stream, saving it in the store if it is being read from the original stream.
RegexUtils - Class in org.apache.tika.utils
Inspired by the Nutch class OutlinkExtractor.
RegexUtils() - Constructor for class org.apache.tika.utils.RegexUtils
 
RELATION - Static variable in interface org.apache.tika.metadata.DublinCore
A reference to a related resource.
remove(String) - Method in class org.apache.tika.metadata.Metadata
Remove a metadata name and all its associated values.
remove(String) - Method in class org.apache.tika.metadata.SpellCheckedMetadata
 
RereadableInputStream - Class in org.apache.tika.utils
Wraps an input stream, reading it only once, but making it available for rereading an arbitrary number of times.
RereadableInputStream(InputStream, int, boolean, boolean) - Constructor for class org.apache.tika.utils.RereadableInputStream
Creates a rereadable input stream.
resolveEncodingAlias(String) - Static method in class org.apache.tika.utils.StringUtil
 
resolveEntity(String, String) - Method in class org.apache.tika.parser.opendocument.OpenOfficeEntityResolver
 
RESOURCE_NAME_KEY - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
 
REVISION_NUMBER - Static variable in interface org.apache.tika.metadata.MSOffice
 
rewind() - Method in class org.apache.tika.utils.RereadableInputStream
"Rewinds" the stream to the beginning for rereading.
rightPad(String, int) - Static method in class org.apache.tika.utils.StringUtil
Returns a copy of s padded with trailing spaces so that its length is length.
RIGHTS - Static variable in interface org.apache.tika.metadata.DublinCore
Information about rights held in and over the resource.
RTFParser - Class in org.apache.tika.parser.rtf
RTF parser
RTFParser() - Constructor for class org.apache.tika.parser.rtf.RTFParser
 

S

saveInXmlFile(Document, String) - Static method in class org.apache.tika.utils.Utils
 
set(String, String) - Method in class org.apache.tika.metadata.Metadata
Set metadata name/value.
set(String, String) - Method in class org.apache.tika.metadata.SpellCheckedMetadata
 
setAll(Properties) - Method in class org.apache.tika.metadata.Metadata
Copy all key-value pairs from properties.
setConfig(TikaConfig) - Method in class org.apache.tika.parser.AutoDetectParser
 
setDescription(String) - Method in class org.apache.tika.mime.MimeType
Set the description of this media type.
setDocumentLocator(Locator) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TeeContentHandler
 
setSuperType(MimeType) - Method in class org.apache.tika.mime.MimeType
 
size() - Method in class org.apache.tika.metadata.Metadata
Returns the number of metadata names in this metadata.
skippedEntity(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
skippedEntity(String) - Method in class org.apache.tika.sax.TeeContentHandler
 
SOURCE - Static variable in interface org.apache.tika.metadata.DublinCore
A reference to a resource from which the present resource is derived.
SpellCheckedMetadata - Class in org.apache.tika.metadata
A decorator to Metadata that adds spellchecking capabilities to property names.
SpellCheckedMetadata() - Constructor for class org.apache.tika.metadata.SpellCheckedMetadata
 
startDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
startDocument() - Method in class org.apache.tika.sax.TeeContentHandler
 
startDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
Starts an XHTML document by setting up the namespace mappings and writing the following header:
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TeeContentHandler
 
startElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
startPrefixMapping(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
StringUtil - Class in org.apache.tika.utils
A collection of String processing utility methods.
StringUtil() - Constructor for class org.apache.tika.utils.StringUtil
 
SUBJECT - Static variable in interface org.apache.tika.metadata.DublinCore
The topic of the content of the resource.

T

TeeContentHandler - Class in org.apache.tika.sax
Content handler decorator that forwards the received SAX events to two underlying content handlers.
TeeContentHandler(ContentHandler, ContentHandler) - Constructor for class org.apache.tika.sax.TeeContentHandler
 
TEMPLATE - Static variable in interface org.apache.tika.metadata.MSOffice
 
TIKA_MIME_FILE - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
 
TikaConfig - Class in org.apache.tika.config
Parse XML config file.
TikaConfig(String) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(File) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(URL) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(InputStream) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(Document) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(Element) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaException - Exception in org.apache.tika.exception
Tika exception
TikaException(String) - Constructor for exception org.apache.tika.exception.TikaException
 
TikaException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaException
 
TikaMetadataKeys - Interface in org.apache.tika.metadata
Contains keys to properties in Metadata instances.
TikaMimeKeys - Interface in org.apache.tika.metadata
A collection of Tika metadata keys used in Mime Type resolution
TITLE - Static variable in interface org.apache.tika.metadata.DublinCore
A name given to the resource.
toHexString(byte[]) - Static method in class org.apache.tika.utils.StringUtil
Convenience call for StringUtil.toHexString(byte[], String, int), where sep = null; lineLen = Integer.MAX_VALUE.
toHexString(byte[], String, int) - Static method in class org.apache.tika.utils.StringUtil
Get a text representation of a byte[] as hexadecimal String, where each pair of hexadecimal digits corresponds to consecutive bytes in the array.
toString() - Method in class org.apache.tika.metadata.Metadata
 
toString() - Method in class org.apache.tika.mime.MimeType
Returns the name of this media type.
TXTParser - Class in org.apache.tika.parser.txt
Text parser
TXTParser() - Constructor for class org.apache.tika.parser.txt.TXTParser
 
TYPE - Static variable in interface org.apache.tika.metadata.DublinCore
The nature or genre of the content of the resource.
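
The TeeContentHandler entry in this section describes a handler that forwards received SAX events to two underlying content handlers. A minimal sketch of that forwarding pattern, built on the JDK's org.xml.sax classes, might look like this. TeeSketch, TextCollector and demo are hypothetical names, and only a subset of the ContentHandler events is forwarded here to keep the example short:

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; not org.apache.tika.sax.TeeContentHandler.
final class TeeSketch extends DefaultHandler {
    private final ContentHandler first;
    private final ContentHandler second;

    TeeSketch(ContentHandler first, ContentHandler second) {
        this.first = first;
        this.second = second;
    }

    // Each overridden event is passed to both handlers, in order.
    @Override
    public void startDocument() throws SAXException {
        first.startDocument();
        second.startDocument();
    }

    @Override
    public void endDocument() throws SAXException {
        first.endDocument();
        second.endDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        first.startElement(uri, localName, qName, atts);
        second.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        first.endElement(uri, localName, qName);
        second.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        first.characters(ch, start, length);
        second.characters(ch, start, length);
    }

    // Simple downstream handler that records character content.
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Sends one run of characters through the tee and returns what each side saw.
    static String[] demo(String input) {
        TextCollector a = new TextCollector();
        TextCollector b = new TextCollector();
        TeeSketch tee = new TeeSketch(a, b);
        char[] chars = input.toCharArray();
        try {
            tee.startDocument();
            tee.characters(chars, 0, chars.length);
            tee.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return new String[] { a.text.toString(), b.text.toString() };
    }

    public static void main(String[] args) {
        String[] seen = demo("hello");
        System.out.println(seen[0] + " / " + seen[1]); // prints hello / hello
    }
}
```

The same shape lets one parse feed, for example, a text-collecting handler and a second handler at once without reading the document twice.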

U

unzip(InputStream) - Method in class org.apache.tika.parser.opendocument.OpenOfficeParser
 
unzip(InputStream) - Static method in class org.apache.tika.utils.Utils
 
Utils - Class in org.apache.tika.utils
Class util
Utils() - Constructor for class org.apache.tika.utils.Utils
 

W

Word6CHPBinTable - Class in org.apache.tika.parser.microsoft
This class holds all of the character formatting properties from a Word 6.0/95 document.
Word6CHPBinTable(byte[], int, int, int) - Constructor for class org.apache.tika.parser.microsoft.Word6CHPBinTable
Constructor used to read a binTable in from a Word document.
WORD_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
WordParser - Class in org.apache.tika.parser.microsoft
Word parser
WordParser() - Constructor for class org.apache.tika.parser.microsoft.WordParser
 
WordTextBuffer - Class in org.apache.tika.parser.microsoft
This class acts as a StringBuffer for text from a Word document.
WordTextBuffer(Appendable) - Constructor for class org.apache.tika.parser.microsoft.WordTextBuffer
 
WORK_TYPE - Static variable in interface org.apache.tika.metadata.CreativeCommons
 
write(int) - Method in class org.apache.tika.parser.microsoft.FilteredStringWriter
Chars which are not useful for Nutch indexing are filtered (ignored) on writing to the writer.
WriteOutContentHandler - Class in org.apache.tika.sax
SAX event handler that writes all character content out to a Writer character stream.
WriteOutContentHandler(Writer) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
 

X

XHTML - Static variable in class org.apache.tika.sax.XHTMLContentHandler
The XHTML namespace URI
XHTMLContentHandler - Class in org.apache.tika.sax
Content handler decorator that simplifies the task of producing XHTML events for Tika content parsers.
XHTMLContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.XHTMLContentHandler
 
XMLParser - Class in org.apache.tika.parser.xml
XML parser
XMLParser() - Constructor for class org.apache.tika.parser.xml.XMLParser
 


Copyright © 2008 The Apache Software Foundation. All Rights Reserved.