XMLTextExtractor (Apache Jackrabbit 1.3 API)

org.apache.jackrabbit.extractor
Class XMLTextExtractor

java.lang.Object
  org.apache.jackrabbit.extractor.AbstractTextExtractor
      org.apache.jackrabbit.extractor.XMLTextExtractor

All Implemented Interfaces: TextExtractor

public class XMLTextExtractor
extends AbstractTextExtractor

Text extractor for XML documents. This class extracts the text content and attribute values from XML documents.

This class can handle any XML-based format (application/something+xml), not just the base XML content types reported by AbstractTextExtractor.getContentTypes(). However, a more specialized extractor that understands the specific content type is often the better choice.
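
The extraction described above (text content plus attribute values) can be sketched with the JDK's own SAX parser. This is a minimal illustration, not Jackrabbit's implementation: the class name `XmlTextSketch`, the method `extract`, and the choice to separate fragments with single spaces are all assumptions made for the example.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Jackrabbit.
final class XmlTextSketch {

    // Collects attribute values and character data from the XML stream.
    // Fragments are separated by spaces (an assumption of this sketch).
    static String extract(InputStream stream) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Keep text of adjacent elements from running together.
                text.append(' ');
                for (int i = 0; i < attributes.getLength(); i++) {
                    text.append(attributes.getValue(i)).append(' ');
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                text.append(' ');
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(stream, handler);
        // Collapse runs of whitespace into single spaces.
        return text.toString().trim().replaceAll("\\s+", " ");
    }
}
```

In this sketch, a document such as `<doc title="Hello"><p>World</p></doc>` yields the attribute value followed by the element text. Because a SAX handler never inspects the vocabulary, the same code applies to any well-formed XML-based format.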

Constructor Summary
`XMLTextExtractor()` Creates a new `XMLTextExtractor` instance.

Method Summary
`Reader`	`extractText(InputStream stream, String type, String encoding)` Returns a reader for the text content of the given XML document.

Methods inherited from class org.apache.jackrabbit.extractor.AbstractTextExtractor
`getContentTypes`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

XMLTextExtractor

public XMLTextExtractor()

Creates a new XMLTextExtractor instance.

Method Detail

extractText

public Reader extractText(InputStream stream,
                          String type,
                          String encoding)
                   throws IOException

Returns a reader for the text content of the given XML document. Returns an empty reader if the given encoding is not supported or if the XML document could not be parsed.

Parameters:
    stream - XML document
    type - XML content type
    encoding - character encoding, or null
Returns:
    reader for the text content of the given XML document, or an empty reader if the encoding is not supported or the document could not be parsed
Throws:
    IOException - if the XML document stream cannot be closed
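
The contract above (an empty reader on failure, an IOException only when the stream cannot be closed) can be sketched with JDK classes alone. This is an illustration under assumptions, not the Jackrabbit source: `EmptyReaderContractSketch`, `extractOrEmpty`, and `readAll` are hypothetical names, the handler collects character data only, and closing the stream in a `finally` block is inferred from the Throws clause.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Jackrabbit.
final class EmptyReaderContractSketch {

    // Returns the character data of the document, or an empty reader
    // when parsing fails. Only a failure to close the stream propagates.
    static Reader extractOrEmpty(InputStream stream, String encoding)
            throws IOException {
        try {
            final StringBuilder text = new StringBuilder();
            InputSource source = new InputSource(stream);
            if (encoding != null) {
                // A null encoding leaves detection to the parser.
                source.setEncoding(encoding);
            }
            SAXParserFactory.newInstance().newSAXParser().parse(
                    source, new DefaultHandler() {
                        @Override
                        public void characters(char[] ch, int start, int length) {
                            text.append(ch, start, length);
                        }
                    });
            return new StringReader(text.toString());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Parse and read failures are swallowed; the caller sees no text.
            return new StringReader("");
        } finally {
            // An IOException thrown here is the only one the caller sees.
            stream.close();
        }
    }

    // Drains a reader into a string, for callers that want the whole text.
    static String readAll(Reader reader) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            out.append(buffer, 0, n);
        }
        return out.toString();
    }
}
```

A caller following this pattern cannot distinguish a document with no text from one that failed to parse, since both produce an empty reader. Callers that need to tell the two apart would have to validate the document separately.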