org.apache.jackrabbit.core.query
Interface TextFilter

All Known Implementing Classes:
TextPlainTextFilter

public interface TextFilter

Defines an interface for extracting text out of binary properties according to their mime-type.

TextFilter implementations are asked whether they can handle a given mime type (canFilter(String)). If one of them returns true, the text representation is created with doFilter(PropertyState, String).
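
The ask-then-extract protocol described above can be sketched roughly as follows. This is a minimal illustration, not Jackrabbit's actual indexing code. SimpleTextFilter, PlainTextSketchFilter, extract, readAll and the "fulltext" field name are all hypothetical. A byte[] stands in for PropertyState so the example compiles without Jackrabbit on the classpath, the checked RepositoryException is left out for brevity, and falling back to UTF-8 when encoding is null is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class FilterDispatchSketch {

    /** Hypothetical stand-in for TextFilter; byte[] replaces PropertyState. */
    interface SimpleTextFilter {
        boolean canFilter(String mimeType);
        Map<String, Reader> doFilter(byte[] data, String encoding);
    }

    /** Hypothetical plain-text filter: exposes the decoded bytes under one field. */
    static class PlainTextSketchFilter implements SimpleTextFilter {
        public boolean canFilter(String mimeType) {
            return "text/plain".equalsIgnoreCase(mimeType);
        }

        public Map<String, Reader> doFilter(byte[] data, String encoding) {
            // Assumption: fall back to UTF-8 when no encoding is given.
            Charset cs = encoding == null ? StandardCharsets.UTF_8 : Charset.forName(encoding);
            return Collections.singletonMap("fulltext", new StringReader(new String(data, cs)));
        }
    }

    /** Asks each filter in turn; the first one that accepts the mime type extracts the text. */
    static Map<String, Reader> extract(List<SimpleTextFilter> filters, String mimeType,
                                       byte[] data, String encoding) {
        for (SimpleTextFilter filter : filters) {
            if (filter.canFilter(mimeType)) {
                return filter.doFilter(data, encoding);
            }
        }
        // No filter claimed the mime type: nothing to index.
        return Collections.emptyMap();
    }

    /** Drains a Reader into a String, for demonstration purposes only. */
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<SimpleTextFilter> filters =
                Collections.<SimpleTextFilter>singletonList(new PlainTextSketchFilter());
        byte[] data = "hello index".getBytes(StandardCharsets.UTF_8);
        Map<String, Reader> fields = extract(filters, "text/plain", data, null);
        System.out.println(readAll(fields.get("fulltext")));
    }
}
```

Returning an empty map when no filter matches is one possible design choice; a caller could equally skip the property entirely.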


Method Summary
 boolean canFilter(String mimeType)
          Returns true if this TextFilter can index content of mimeType; false otherwise.
 Map doFilter(PropertyState data, String encoding)
          Creates a text representation of the binary property data.
 

Method Detail

canFilter

public boolean canFilter(String mimeType)
Returns true if this TextFilter can index content of mimeType; false otherwise.

Parameters:
mimeType - the mime type of the content to index.
Returns:
whether this TextFilter can index content of mimeType.
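
As an illustration of what a canFilter check might look like, the sketch below normalizes the mime type before comparing it. Whether Jackrabbit ever passes a mime type with parameters (such as "; charset=...") or in mixed case is not stated in this documentation, so the normalization step is a defensive assumption. MimeMatchSketch and normalize are hypothetical names.

```java
import java.util.Locale;

public class MimeMatchSketch {

    /** Strips any parameters (e.g. "; charset=UTF-8") and lower-cases the base type. */
    static String normalize(String mimeType) {
        if (mimeType == null) {
            return "";
        }
        int semi = mimeType.indexOf(';');
        String base = semi >= 0 ? mimeType.substring(0, semi) : mimeType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    /** Hypothetical canFilter for a filter that only understands plain text. */
    static boolean canFilter(String mimeType) {
        return "text/plain".equals(normalize(mimeType));
    }

    public static void main(String[] args) {
        System.out.println(canFilter("text/plain"));
        System.out.println(canFilter("Text/Plain; charset=UTF-8"));
        System.out.println(canFilter("application/pdf"));
    }
}
```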

doFilter

public Map doFilter(PropertyState data,
                    String encoding)
             throws RepositoryException
Creates a text representation of the binary property data. The returned map contains Reader values, keyed by Strings that serve as field names.

E.g. a TextFilter for an HTML document may extract multiple fields: one for the title and one for the whole content.

Parameters:
data - the data property that contains the binary content.
encoding - the character encoding of the content, or null if data does not use an encoding.
Returns:
a map from field names to Readers over the extracted text.
Throws:
RepositoryException - if an error occurs while reading from the node or if the data is malformed.
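
The multi-field HTML case mentioned above might look roughly like the sketch below. It is deliberately simplified and not taken from Jackrabbit. A byte[] stands in for PropertyState, ExtractionException stands in for RepositoryException, the field names "title" and "fulltext" are hypothetical, and the regex-based tag stripping is a naive placeholder for a real HTML parser. Treating a null encoding as UTF-8 is also an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlFieldsSketch {

    /** Hypothetical stand-in for RepositoryException. */
    static class ExtractionException extends Exception {
        ExtractionException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Returns one Reader per field: the document title and the tag-stripped content. */
    static Map<String, Reader> doFilter(byte[] data, String encoding) throws ExtractionException {
        Charset cs;
        try {
            // Assumption: fall back to UTF-8 when no encoding is given.
            cs = encoding == null ? StandardCharsets.UTF_8 : Charset.forName(encoding);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            throw new ExtractionException("unusable encoding: " + encoding, e);
        }
        String html = new String(data, cs);

        Matcher m = TITLE.matcher(html);
        String title = m.find() ? m.group(1).trim() : "";
        // Naive tag stripping, then whitespace collapsing.
        String text = TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();

        Map<String, Reader> fields = new LinkedHashMap<>();
        fields.put("title", new StringReader(title));
        fields.put("fulltext", new StringReader(text));
        return fields;
    }

    /** Drains a Reader into a String, for demonstration purposes only. */
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws ExtractionException {
        byte[] page = "<html><head><title>Hello</title></head><body><p>World</p></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        for (Map.Entry<String, Reader> field : doFilter(page, null).entrySet()) {
            System.out.println(field.getKey() + " = " + readAll(field.getValue()));
        }
    }
}
```

Because the values are Readers, each one can be consumed only once; a caller that needs the text twice would have to buffer it.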


Copyright © 2004-2006 The Apache Software Foundation. All Rights Reserved.