org.apache.jackrabbit.core.query
Class MsWordTextFilter
java.lang.Object
org.apache.jackrabbit.core.query.MsWordTextFilter
- All Implemented Interfaces:
- org.apache.jackrabbit.core.query.TextFilter
- public class MsWordTextFilter
- extends Object
- implements org.apache.jackrabbit.core.query.TextFilter
Extracts text from MS Word document binary data.
Taken from the Jakarta Slide class org.apache.slide.extractor.MSPowerPointExtractor.
Method Summary
boolean canFilter(String mimeType)
Map doFilter(org.apache.jackrabbit.core.state.PropertyState data, String encoding)
- Returns a map with a single entry for field FieldNames.FULLTEXT.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MsWordTextFilter
public MsWordTextFilter()
canFilter
public boolean canFilter(String mimeType)
- Specified by:
canFilter
in interface org.apache.jackrabbit.core.query.TextFilter
- Returns:
true for application/vnd.ms-word or application/msword, false otherwise.
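The MIME type check described above can be illustrated with a minimal, self-contained sketch. The class name MimeCheckSketch is hypothetical and this is not the Jackrabbit source; in particular, whether the real canFilter compares case-insensitively is not stated on this page, so the exact-match comparison here is an assumption.

```java
public class MimeCheckSketch {

    // Accepts only the two MIME types listed in the canFilter documentation.
    // Exact, case-sensitive matching is an assumption of this sketch.
    public static boolean canFilter(String mimeType) {
        // Literal-first equals() keeps the check safe for a null argument.
        return "application/vnd.ms-word".equals(mimeType)
                || "application/msword".equals(mimeType);
    }

    public static void main(String[] args) {
        System.out.println(canFilter("application/msword"));
        System.out.println(canFilter("application/pdf"));
    }
}
```

A caller would typically run this check before handing a property to doFilter, so that non-Word content is never passed to the Word extractor.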
doFilter
public Map doFilter(org.apache.jackrabbit.core.state.PropertyState data,
String encoding)
throws RepositoryException
- Returns a map with a single entry for field FieldNames.FULLTEXT.
- Specified by:
doFilter
in interface org.apache.jackrabbit.core.query.TextFilter
- Parameters:
data - object containing MS Word document data.
encoding - text encoding; not used, since the encoding is specified in the data.
- Returns:
- a map with a single Reader value for field FieldNames.FULLTEXT.
- Throws:
RepositoryException - if data is a multi-value property or does not contain a valid MS Word document.
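The return contract of doFilter (a map holding exactly one Reader) can be sketched without any Jackrabbit dependency. Everything named here is hypothetical: FulltextMapSketch, singleEntry, readAll, and the FULLTEXT_KEY placeholder, which stands in for FieldNames.FULLTEXT because its actual value is not given on this page. The sketch also uses generics for readability, whereas the documented method returns a raw Map.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Map;

public class FulltextMapSketch {

    // Placeholder key; the real value of FieldNames.FULLTEXT is not shown here.
    public static final String FULLTEXT_KEY = "fulltext-placeholder";

    // Stand-in for the result of doFilter: wraps already-extracted text
    // in a single-entry map whose value is a Reader.
    public static Map<String, Reader> singleEntry(String extractedText) {
        return Collections.<String, Reader>singletonMap(
                FULLTEXT_KEY, new StringReader(extractedText));
    }

    // Drains a Reader into a String, roughly as an indexer might consume it.
    public static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        try {
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Reader> fields = singleEntry("Text extracted from a Word document");
        System.out.println(fields.size());
        System.out.println(readAll(fields.get(FULLTEXT_KEY)));
    }
}
```

Returning a Reader rather than a String lets the consumer stream the extracted text instead of holding the whole document in memory, which is presumably why the interface is shaped this way.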
Copyright © -2006 The Apache Software Foundation. All Rights Reserved.