Apache OpenNLP ${pom.version} Release Notes

1. What is Apache OpenNLP?

This component assesses text relevance. It takes two portions of text (phrases, sentences, or paragraphs) and returns a similarity score. The Similarity component can be used on top of search to improve relevance by computing a similarity score between a question and each search result (snippet). It is also useful for web mining of images, videos, forums, blogs, and other media with textual descriptions. The sample applications of this component include content generation and filtering of meaningless speech recognition results.

Relevance assessment is based on machine learning of syntactic parse trees (constituency trees, http://en.wikipedia.org/wiki/Parse_tree). The similarity score is calculated as the size of all maximal common sub-trees for sentences from a pair of texts (www.aaai.org/ocs/index.php/WS/AAAIW11/paper/download/3971/4187, www.aaai.org/ocs/index.php/FLAIRS/FLAIRS11/paper/download/2573/3018, www.aaai.org/ocs/index.php/SSS/SSS10/paper/download/1146/1448).

The objective of the Similarity component is to give an application engineer a tool for text relevance that can be used as a black box, with no need to understand computational linguistics or machine learning.
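
To make the idea of scoring by common sub-trees more concrete, here is a minimal sketch in Python. It is not the component's implementation: the nested-tuple tree encoding and the names `subtrees`, `common_size`, and `similarity` are assumptions made for illustration. It also simplifies the score to the size of the single largest common sub-tree and compares node labels only, ignoring the words themselves.

```python
from functools import lru_cache

# Assumed encoding: a parse tree is a nested tuple (label, child, child, ...);
# a leaf is a one-element tuple such as ("NN",).


def subtrees(tree):
    """Yield the tree itself and every sub-tree below it."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)


@lru_cache(maxsize=None)
def common_size(a, b):
    """Node count of the largest common sub-tree rooted at both a and b."""
    if a[0] != b[0]:
        return 0
    kids_a, kids_b = a[1:], b[1:]
    # Order-preserving alignment of the children (a weighted LCS table).
    table = [[0] * (len(kids_b) + 1) for _ in range(len(kids_a) + 1)]
    for i, kid_a in enumerate(kids_a, 1):
        for j, kid_b in enumerate(kids_b, 1):
            table[i][j] = max(
                table[i - 1][j],
                table[i][j - 1],
                table[i - 1][j - 1] + common_size(kid_a, kid_b),
            )
    return 1 + table[-1][-1]


def similarity(tree_a, tree_b):
    """Size of the largest common sub-tree over all pairs of nodes."""
    return max(
        common_size(x, y)
        for x in subtrees(tree_a)
        for y in subtrees(tree_b)
    )


# Two hypothetical constituency trees that differ in where the determiner sits.
question = ("S", ("NP", ("DT",), ("NN",)), ("VP", ("VB",), ("NP", ("NN",))))
snippet = ("S", ("NP", ("NN",)), ("VP", ("VB",), ("NP", ("DT",), ("NN",))))

print(similarity(question, snippet))
```

In a search setting, a score like this would be computed between the question and each snippet, and the snippets re-ranked by it. Real parse trees would come from a constituency parser rather than being written by hand.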

Contents

1. What is Apache OpenNLP?

This Release

How to Get Involved

How to Report Issues

List of JIRA Issues Fixed in this Release