Revision 1643377


Author: sarowe
Date: Fri Dec 5 18:22:30 2014 UTC
Changed paths: 6
Log Message:
SOLR-3881: Avoid OOMs in LanguageIdentifierUpdateProcessor:
- Added langid.maxFieldValueChars and langid.maxTotalChars params to limit
  input, by default 10k and 20k chars, respectively.
- Moved input concatenation to Tika implementation; the langdetect
  implementation instead appends each input piece via the langdetect API.
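
A minimal sketch of how the two limits could interact when input is concatenated, assuming each field value is truncated first and the running total is capped second. The class `LangIdInputLimiter` and the method `concatWithLimits` are hypothetical names for illustration and are not taken from the committed code. Only the parameter names and their defaults (10k and 20k chars) come from the log message above. No separator is inserted between values here, which is a simplification.

```java
import java.util.List;

/** Hypothetical helper, not the Solr implementation: shows the two caps from the log message. */
final class LangIdInputLimiter {
    // Defaults stated in the log message for langid.maxFieldValueChars and langid.maxTotalChars.
    static final int DEFAULT_MAX_FIELD_VALUE_CHARS = 10_000;
    static final int DEFAULT_MAX_TOTAL_CHARS = 20_000;

    private LangIdInputLimiter() {}

    /**
     * Concatenates field values, taking at most maxFieldValueChars from each value
     * and stopping once maxTotalChars have been collected overall.
     */
    static String concatWithLimits(List<String> fieldValues,
                                   int maxFieldValueChars,
                                   int maxTotalChars) {
        StringBuilder sb = new StringBuilder();
        for (String value : fieldValues) {
            int remaining = maxTotalChars - sb.length();
            if (remaining <= 0) {
                break; // total budget exhausted, ignore the rest of the input
            }
            // Per-value cap first, then whatever is left of the total budget.
            int take = Math.min(value.length(), Math.min(maxFieldValueChars, remaining));
            sb.append(value, 0, take); // copies only the prefix, never the full value
        }
        return sb.toString();
    }
}
```

With a per-value cap of 3 and a total cap of 7, the inputs "aaaaa", "bbbbb", "ccccc" yield "aaabbbc": three chars from each of the first two values, then one char before the total budget runs out. Bounding the buffer this way keeps memory use independent of document size, which is the kind of OOM the commit is meant to avoid.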

Changed paths

- lucene/dev/trunk/solr/CHANGES.txt (modified, text changed)
- lucene/dev/trunk/solr/contrib/langid/src/java/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessor.java (modified, text changed)
- lucene/dev/trunk/solr/contrib/langid/src/java/org/apache/solr/update/processor/LangIdParams.java (modified, text changed)
- lucene/dev/trunk/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java (modified, text changed)
- lucene/dev/trunk/solr/contrib/langid/src/java/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessor.java (modified, text changed)
- lucene/dev/trunk/solr/contrib/langid/src/test/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactoryTest.java (modified, text changed)
