public class TikaLanguageIdentifierUpdateProcessor extends LanguageIdentifierUpdateProcessor
Fields inherited from class LanguageIdentifierUpdateProcessor
allMapFieldsSet, docIdField, enabled, enableMapping, enforceSchema, fallbackFields, fallbackValue, inputFields, langField, langPattern, langsField, langWhitelist, lcMap, mapFields, mapIndividual, mapIndividualFieldsSet, mapKeepOrig, mapLcMap, mapOverwrite, mapPattern, mapReplaceStr, maxFieldValueChars, maxTotalChars, overwrite, schema, threshold, tikaSimilarityPattern
Fields inherited from class UpdateRequestProcessor
next
Fields inherited from interface LangIdParams
DOCID_FIELD_DEFAULT, DOCID_LANGFIELD_DEFAULT, DOCID_LANGSFIELD_DEFAULT, DOCID_PARAM, DOCID_THRESHOLD_DEFAULT, ENFORCE_SCHEMA, FALLBACK, FALLBACK_FIELDS, FIELDS_PARAM, LANG_FIELD, LANG_WHITELIST, LANGS_FIELD, LANGUAGE_ID, LCMAP, MAP_ENABLE, MAP_FL, MAP_INDIVIDUAL, MAP_INDIVIDUAL_FL, MAP_KEEP_ORIG, MAP_LCMAP, MAP_OVERWRITE, MAP_PATTERN, MAP_PATTERN_DEFAULT, MAP_REPLACE, MAP_REPLACE_DEFAULT, MAX_FIELD_VALUE_CHARS, MAX_FIELD_VALUE_CHARS_DEFAULT, MAX_TOTAL_CHARS, MAX_TOTAL_CHARS_DEFAULT, OVERWRITE, THRESHOLD
Constructor and Description |
---|
TikaLanguageIdentifierUpdateProcessor(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) |
Modifier and Type | Method and Description |
---|---|
protected String | concatFields(SolrInputDocument doc): Concatenates content from multiple fields. |
protected List<DetectedLanguage> | detectLanguage(SolrInputDocument doc): Detects language(s) from a string. |
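Methods inherited from class LanguageIdentifierUpdateProcessor
getMappedField, isEnabled, normalizeLangCode, process, processAdd, resolveLanguage, resolveLanguage, setEnabled
Methods inherited from class UpdateRequestProcessor
finish, processCommit, processDelete, processMergeIndexes, processRollback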
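The inherited names `threshold`, `langWhitelist`, `fallbackValue` and `resolveLanguage` suggest how a list of detected languages might be reduced to a single language code. The following is only a minimal sketch of that idea, not Solr's implementation: the `Detected` class is a hypothetical stand-in for `DetectedLanguage`, and the rules (take the first entry, require it to meet the threshold and to be in a non-empty whitelist, otherwise fall back) are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; names and rules here are assumptions, not the Solr API.
final class ResolveLanguageSketch {

    /** Hypothetical stand-in for DetectedLanguage: a language code plus a certainty in [0, 1]. */
    static final class Detected {
        final String langCode;
        final double certainty;

        Detected(String langCode, double certainty) {
            this.langCode = langCode;
            this.certainty = certainty;
        }
    }

    /**
     * Picks a single language code from the detection results.
     * Assumes the list is ordered with the most likely language first.
     */
    static String resolve(List<Detected> detected,
                          double threshold,
                          Set<String> whitelist,
                          String fallback) {
        if (detected.isEmpty()) {
            return fallback; // nothing detected at all
        }
        Detected best = detected.get(0);
        // An empty whitelist is treated as "all languages allowed" (assumption).
        boolean allowed = whitelist.isEmpty() || whitelist.contains(best.langCode);
        if (allowed && best.certainty >= threshold) {
            return best.langCode;
        }
        return fallback; // below threshold or not whitelisted
    }

    public static void main(String[] args) {
        List<Detected> detected = Arrays.asList(
                new Detected("fr", 0.9),
                new Detected("en", 0.4));
        Set<String> noWhitelist = Collections.emptySet();
        Set<String> germanOnly = new HashSet<>(Arrays.asList("de"));

        System.out.println(resolve(detected, 0.5, noWhitelist, "en"));  // certainty meets threshold
        System.out.println(resolve(detected, 0.95, noWhitelist, "en")); // certainty too low
        System.out.println(resolve(detected, 0.5, germanOnly, "en"));   // not in whitelist
    }
}
```

Keeping the fallback as an explicit argument makes the "no confident answer" case visible to the caller instead of returning null.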
public TikaLanguageIdentifierUpdateProcessor(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next)
protected List<DetectedLanguage> detectLanguage(SolrInputDocument doc)
Detects language(s) from a string (see detectLanguage in class LanguageIdentifierUpdateProcessor).
Parameters: doc - The content to identify
protected String concatFields(SolrInputDocument doc)
Concatenates content from multiple fields
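To illustrate what concatenating content from multiple fields can look like, here is a minimal sketch under stated assumptions. It does not use Solr types: a plain `Map` stands in for `SolrInputDocument`, and the parameters `inputFields`, `maxFieldValueChars` and `maxTotalChars` only borrow the names of the inherited fields. The single-space separator, the skipping of non-string values and the truncation rules are assumptions, not confirmed behavior of this class.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; a Map stands in for SolrInputDocument (assumption).
final class ConcatFieldsSketch {

    /**
     * Joins the string values of the listed fields with single spaces,
     * truncating each value to maxFieldValueChars and the result to maxTotalChars.
     */
    static String concatFields(Map<String, Collection<Object>> doc,
                               List<String> inputFields,
                               int maxFieldValueChars,
                               int maxTotalChars) {
        StringBuilder sb = new StringBuilder();
        for (String field : inputFields) {
            Collection<Object> values = doc.get(field);
            if (values == null) {
                continue; // field not present in this document
            }
            for (Object value : values) {
                if (!(value instanceof String)) {
                    continue; // only textual values are useful for detection
                }
                String text = (String) value;
                if (text.length() > maxFieldValueChars) {
                    text = text.substring(0, maxFieldValueChars);
                }
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(text);
                if (sb.length() >= maxTotalChars) {
                    sb.setLength(maxTotalChars); // enforce the overall cap and stop early
                    return sb.toString();
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Collection<Object>> doc = new LinkedHashMap<>();
        doc.put("title", Arrays.<Object>asList("Hello world"));
        doc.put("body", Arrays.<Object>asList("Bonjour le monde", 42));

        // Each value is cut to 7 characters; the integer and the missing field are skipped.
        System.out.println(concatFields(doc, Arrays.asList("title", "body", "missing"), 7, 100));
        // prints "Hello w Bonjour"
    }
}
```

Capping both the per-value and the total length bounds the amount of text handed to the language detector, which is usually enough for identification and avoids processing very large fields in full.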
Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.