
Get Text

TextExtractor provides a method to get the display text of a single element. EditableTextExtractor is a subclass of TextExtractor. It provides a method to return all the text that the user can typically edit in a document, including the text in content.xml, the headers and footers in styles.xml, and the metadata in meta.xml.
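To make the three sources named above more concrete, here is a minimal sketch that uses only the Java standard library. It is not how EditableTextExtractor is implemented. It only assumes that an ODF file is a ZIP package whose parts include content.xml, styles.xml and meta.xml. The class name OdfPackageTextSketch, its methods, and the simplified element names in the demo package are all hypothetical; real ODF parts use namespaced elements, and the real extractor is selective about which text it returns.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class OdfPackageTextSketch {

    // The three package parts that the text above names as sources of editable text.
    private static final String[] PARTS = {"content.xml", "styles.xml", "meta.xml"};

    /** Returns the concatenated text nodes of one XML part, or "" if the part is missing. */
    public static String textOfPart(byte[] odfPackage, String partName) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(odfPackage))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().equals(partName)) {
                    // Read the entry fully first so the XML parser cannot close the ZIP stream.
                    byte[] xml = zip.readAllBytes();
                    Document dom = DocumentBuilderFactory.newInstance()
                            .newDocumentBuilder()
                            .parse(new ByteArrayInputStream(xml));
                    return dom.getDocumentElement().getTextContent();
                }
            }
            return "";
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (Exception ex) {
            // Parser configuration or XML syntax problems.
            throw new IllegalStateException(ex);
        }
    }

    /** Joins the text of all three parts, one line per part. */
    public static String editableText(byte[] odfPackage) {
        StringBuilder out = new StringBuilder();
        for (String part : PARTS) {
            out.append(textOfPart(odfPackage, part)).append('\n');
        }
        return out.toString();
    }

    /** Builds a tiny in-memory package with simplified (non-ODF) element names for the demo. */
    public static byte[] demoPackage() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            put(zip, "content.xml", "<doc><p>Body text</p></doc>");
            put(zip, "styles.xml", "<doc><header>Header text</header></doc>");
            put(zip, "meta.xml", "<doc><title>Document title</title></doc>");
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    private static void put(ZipOutputStream zip, String name, String xml) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(xml.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) {
        System.out.print(editableText(demoPackage()));
    }
}
```

The sketch collects every text node in each part, which is cruder than the toolkit's extractor. In practice, use EditableTextExtractor rather than reading the package by hand.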

The following code uses EditableTextExtractor to extract the text of the document "textExtractor.odt". TextExtractor itself cannot extract text from a TextDocument.

        // Load the document and create an extractor for the whole document.
        TextDocument textdoc = (TextDocument) TextDocument.loadDocument("textExtractor.odt");
        EditableTextExtractor extractorD = EditableTextExtractor.newOdfEditableTextExtractor(textdoc);
        // Collect the editable text and print it.
        String output = extractorD.getText();
        System.out.println(output);

The following code extracts the text of the whole document content, starting from the content root element. TextExtractor supports the same operation.

        // Start from the root element of the document content.
        OdfElement elem = textdoc.getContentRoot();
        EditableTextExtractor extractorE = EditableTextExtractor.newOdfEditableTextExtractor(elem);
        System.out.println(extractorE.getText());


Copyright © 2011 The Apache Software Foundation. Licensed under the Apache License, Version 2.0.
Apache and the Apache feather logos are trademarks of The Apache Software Foundation.
Other names appearing on the site may be trademarks of their respective owners.
