java.lang.Object
  org.apache.jackrabbit.core.query.lucene.JackrabbitTextExtractor
public class JackrabbitTextExtractor
Backwards-compatible Jackrabbit text extractor component. This class implements the following functionality:
- Parses the configured TextExtractor and TextFilter class names and instantiates the configured classes.
- Acts as the delegate extractor for any configured DelegatingTextExtractor instances.
- Maintains a CompositeTextExtractor instance that contains all the configured extractors and to which all text extraction calls are delegated.
- Creates a TextFilterExtractor adapter for a configured TextFilter instance when it is first used and adds that adapter to the composite extractor for use in text extraction.
- Logs a warning and creates a dummy EmptyTextExtractor instance for any unsupported content types when first detected. The dummy extractor is added to the composite extractor to prevent future warnings about the same content type.
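Parsing a space- or comma-separated list of class names and instantiating each class could look roughly like the sketch below. It uses only the Java standard library. `ClassListParser`, `parseClassNames`, and `instantiate` are hypothetical names chosen for illustration and are not part of Jackrabbit, and skipping a class that cannot be loaded is an assumption about error handling, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Jackrabbit.
class ClassListParser {

    /** Splits a space- or comma-separated list of class names into tokens. */
    static List<String> parseClassNames(String classes) {
        List<String> names = new ArrayList<>();
        for (String token : classes.split("[\\s,]+")) {
            // A leading separator yields one empty token; ignore it.
            if (!token.isEmpty()) {
                names.add(token);
            }
        }
        return names;
    }

    /** Instantiates each named class through its no-argument constructor. */
    static List<Object> instantiate(List<String> names) {
        List<Object> instances = new ArrayList<>();
        for (String name : names) {
            try {
                instances.add(Class.forName(name).getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                // Assumption: a class that cannot be loaded or constructed is skipped.
                System.err.println("Skipping " + name + ": " + e);
            }
        }
        return instances;
    }
}
```

In a real component, each instance would then be checked against the expected extractor or filter interface before being registered.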
Constructor Summary |
---|
JackrabbitTextExtractor(String classes): Creates a Jackrabbit text extractor containing the configured component classes. |
Method Summary | |
---|---|
Reader | extractText(InputStream stream, String type, String encoding): Extracts the text content from the given binary stream. |
String[] | getContentTypes(): Returns the content types that the component extractors are known to support. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public JackrabbitTextExtractor(String classes)
Creates a Jackrabbit text extractor containing the configured component classes.
Parameters:
classes - configured TextExtractor (and TextFilter) class names (space- or comma-separated)
Method Detail |
---|
public String[] getContentTypes()
Returns the content types that the component extractors are known to support.
Specified by:
getContentTypes in interface TextExtractor
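One plausible way to compute such a list is to take the union of the content types reported by each component. The sketch below illustrates only that idea; `TypedExtractor` and `ContentTypeUnion` are hypothetical stand-ins, not Jackrabbit types, and the actual implementation may derive the list differently.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a component extractor; not the Jackrabbit interface.
interface TypedExtractor {
    String[] getContentTypes();
}

class ContentTypeUnion {

    /** Collects the distinct content types of all components, in first-seen order. */
    static String[] contentTypes(List<TypedExtractor> components) {
        Set<String> types = new LinkedHashSet<>();
        for (TypedExtractor component : components) {
            types.addAll(Arrays.asList(component.getContentTypes()));
        }
        return types.toArray(new String[0]);
    }
}
```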
public Reader extractText(InputStream stream, String type, String encoding) throws IOException
Extracts the text content from the given binary stream.
If no extractor matching the given content type is found, the configured text filters are searched for an instance that claims to support that content type. A text extractor adapter is created for that filter and saved in the extractor map for future use before delegating the request to the adapter.
If not even a text filter is found for the given content type, a warning is logged and an empty text extractor is created for that content type and saved in the extractor map for future use before delegating the request to the empty extractor.
Specified by:
extractText in interface TextExtractor
Parameters:
stream - binary stream
type - content type
encoding - character encoding, or null
Throws:
IOException - if the binary stream cannot be read
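The lookup order described for extractText (registered extractor first, then a filter wrapped in an adapter, then an empty extractor after a warning) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Jackrabbit source: `SketchExtractor`, `SketchFilter`, `canFilter`, `filter`, and `FallbackExtractor` are hypothetical stand-ins, and the real TextExtractor and TextFilter interfaces differ in detail.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; the real Jackrabbit interfaces differ in detail.
interface SketchExtractor {
    Reader extractText(InputStream stream, String type, String encoding) throws IOException;
}

interface SketchFilter {
    boolean canFilter(String type);
    Reader filter(InputStream stream, String encoding) throws IOException;
}

class FallbackExtractor implements SketchExtractor {
    private final Map<String, SketchExtractor> extractors = new HashMap<>();
    private final List<SketchFilter> filters = new ArrayList<>();
    int warnings = 0; // exposed only so the sketch can be checked

    void addExtractor(String type, SketchExtractor extractor) {
        extractors.put(type, extractor);
    }

    void addFilter(SketchFilter filter) {
        filters.add(filter);
    }

    @Override
    public Reader extractText(InputStream stream, String type, String encoding)
            throws IOException {
        SketchExtractor extractor = extractors.get(type);
        if (extractor == null) {
            // No registered extractor: look for a filter that claims the type.
            extractor = adaptFilter(type);
            if (extractor == null) {
                // No filter either: warn once and fall back to empty text.
                warnings++;
                System.err.println("Warning: no extractor for " + type);
                extractor = (s, t, e) -> new StringReader("");
            }
            // Save the adapter or empty extractor so later calls skip the search.
            extractors.put(type, extractor);
        }
        return extractor.extractText(stream, type, encoding);
    }

    /** Wraps the first filter that supports the type, or returns null. */
    private SketchExtractor adaptFilter(String type) {
        for (SketchFilter f : filters) {
            if (f.canFilter(type)) {
                return (s, t, e) -> f.filter(s, e);
            }
        }
        return null;
    }
}
```

Caching the fallback in the same map is what keeps the warning from repeating: the second request for an unsupported type finds the empty extractor directly.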