public class SolrContentHandler extends DefaultHandler implements ExtractingParams
The class responsible for handling Tika events and translating them into SolrInputDocuments.
This class is not thread-safe.
This class cannot be reused; you have to create a new instance per document.
Users may wish to override this class to provide their own functionality.
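The one-instance-per-document rule follows from how a SAX handler accumulates state while it parses. The sketch below illustrates the pattern with JDK classes only; `TextCollector` and `PerDocumentHandlerDemo` are hypothetical stand-ins for a stateful handler such as this one and are not part of Solr.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class PerDocumentHandlerDemo {
    // Parses one XML string with a brand-new handler, so no text leaks between documents.
    static String extract(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.text();
    }

    public static void main(String[] args) {
        System.out.println(extract("<doc>first</doc>"));  // first
        System.out.println(extract("<doc>second</doc>")); // second
    }
}

// Hypothetical stand-in for a stateful SAX handler; not a Solr class.
final class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] chars, int offset, int length) {
        text.append(chars, offset, length);
    }

    String text() {
        return text.toString();
    }
}
```

Sharing one `TextCollector` across both calls would append the second document's text to the first, which is the kind of problem the per-document rule avoids.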
Modifier and Type | Field and Description |
---|---|
protected boolean | captureAttribs |
protected StringBuilder | catchAllBuilder |
static String | contentFieldName |
protected String | defaultField |
protected SolrInputDocument | document |
protected Map<String,StringBuilder> | fieldBuilders |
protected boolean | lowerNames |
protected org.apache.tika.metadata.Metadata | metadata |
protected SolrParams | params |
protected IndexSchema | schema |
protected String | unknownFieldPrefix |
Fields inherited from interface ExtractingParams: CAPTURE_ATTRIBUTES, CAPTURE_ELEMENTS, DEFAULT_FIELD, EXTRACT_FORMAT, EXTRACT_ONLY, IGNORE_TIKA_EXCEPTION, LITERALS_OVERRIDE, LITERALS_PREFIX, LOWERNAMES, MAP_PREFIX, PASSWORD_MAP_FILE, RESOURCE_NAME, RESOURCE_PASSWORD, STREAM_TYPE, UNKNOWN_FIELD_PREFIX, XPATH_EXPRESSION
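Some of the ExtractingParams constants listed above name request-parameter prefixes. As a rough sketch of how a prefix such as LITERALS_PREFIX could be applied, the JDK-only example below picks prefixed parameters out of a plain map and strips the prefix to obtain field names. The value `"literal."` and the class `LiteralParams` are assumptions for illustration; the real handler reads from SolrParams.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LiteralParams {
    // Assumed value of ExtractingParams.LITERALS_PREFIX; check the constant itself.
    static final String LITERALS_PREFIX = "literal.";

    // Returns field name -> values for every parameter whose name starts with the prefix.
    static Map<String, List<String>> literals(Map<String, List<String>> params) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : params.entrySet()) {
            String name = e.getKey();
            if (!name.startsWith(LITERALS_PREFIX)) {
                continue; // not a literal parameter
            }
            out.put(name.substring(LITERALS_PREFIX.length()), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("literal.id", List.of("doc1"));
        params.put("literal.tag", List.of("a", "b"));
        params.put("lowernames", List.of("true"));
        System.out.println(literals(params)); // {id=[doc1], tag=[a, b]}
    }
}
```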
Constructor and Description |
---|
SolrContentHandler(org.apache.tika.metadata.Metadata metadata, SolrParams params, IndexSchema schema) |
Modifier and Type | Method and Description |
---|---|
protected void | addCapturedContent(): Add the per-field captured content to the Solr document. |
protected void | addContent(): Add in the catch-all content to the field. |
protected void | addField(String fname, String fval, String[] vals) |
protected void | addLiterals(): Add in the literals to the document using the params and the ExtractingParams.LITERALS_PREFIX. |
protected void | addMetadata(): Add in any metadata using metadata as the source. |
void | characters(char[] chars, int offset, int length) |
void | endElement(String uri, String localName, String qName) |
protected String | findMappedName(String name): Get the name mapping. |
void | ignorableWhitespace(char[] chars, int offset, int length): Treat the same as any other characters. |
SolrInputDocument | newDocument(): This is called by a consumer when it is ready to deal with a new SolrInputDocument. |
void | startElement(String uri, String localName, String qName, Attributes attributes) |
Methods inherited from class DefaultHandler: endDocument, endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, unparsedEntityDecl, warning
public static final String contentFieldName
protected final SolrInputDocument document
protected final org.apache.tika.metadata.Metadata metadata
protected final SolrParams params
protected final StringBuilder catchAllBuilder
protected final IndexSchema schema
protected final Map<String,StringBuilder> fieldBuilders
protected final boolean captureAttribs
protected final boolean lowerNames
protected final String unknownFieldPrefix
protected final String defaultField
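The fields lowerNames, unknownFieldPrefix and defaultField carry no description on this page. Going by the similarly named ExtractingParams constants (LOWERNAMES, UNKNOWN_FIELD_PREFIX, DEFAULT_FIELD), one plausible reading is that they steer how an extracted name is turned into a schema field name. The sketch below shows that reading with JDK classes only; `FieldNameSketch`, its resolution order, and the set-based schema lookup are assumptions, not the handler's confirmed logic.

```java
import java.util.Set;

final class FieldNameSketch {
    // Lowercases letters and digits and replaces every other character with '_'.
    static String lowerName(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            sb.append(Character.isLetterOrDigit(ch) ? Character.toLowerCase(ch) : '_');
        }
        return sb.toString();
    }

    // Assumed order: known schema field, else prefixed name, else default field, else the name as-is.
    static String resolve(String name, Set<String> schemaFields, boolean lowerNames,
                          String unknownFieldPrefix, String defaultField) {
        String candidate = lowerNames ? lowerName(name) : name;
        if (schemaFields.contains(candidate)) {
            return candidate;
        }
        if (!unknownFieldPrefix.isEmpty()) {
            return unknownFieldPrefix + candidate;
        }
        if (!defaultField.isEmpty()) {
            return defaultField;
        }
        return candidate;
    }

    public static void main(String[] args) {
        Set<String> schema = Set.of("content_type");
        System.out.println(resolve("Content-Type", schema, true, "attr_", "text")); // content_type
        System.out.println(resolve("X-Parsed-By", schema, true, "attr_", "text"));  // attr_x_parsed_by
        System.out.println(resolve("X-Parsed-By", schema, true, "", "text"));       // text
    }
}
```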
public SolrContentHandler(org.apache.tika.metadata.Metadata metadata, SolrParams params, IndexSchema schema)
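public SolrInputDocument newDocument()
This is called by a consumer when it is ready to deal with a new SolrInputDocument.
Returns: The SolrInputDocument.
See Also: addMetadata(), addCapturedContent(), addContent(), addLiterals()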
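newDocument() and the four protected add* methods it references resemble a template-method hook: a subclass can override one step and keep the rest. The sketch below mimics that shape with JDK-only stand-ins. `DocBuilder`, `TaggingDocBuilder`, the map that plays the role of SolrInputDocument, and the call order are all assumptions for illustration, not the actual Solr implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NewDocumentHookDemo {
    public static void main(String[] args) {
        System.out.println(new DocBuilder().newDocument());
        System.out.println(new TaggingDocBuilder().newDocument());
    }
}

// Hypothetical stand-in for the handler; a Map plays the role of SolrInputDocument.
class DocBuilder {
    protected final Map<String, List<String>> document = new LinkedHashMap<>();

    // Assembles the document by running each protected step once (order assumed).
    public Map<String, List<String>> newDocument() {
        addLiterals();
        addMetadata();
        addContent();
        addCapturedContent();
        return document;
    }

    protected void addLiterals() { add("id", "doc1"); }
    protected void addMetadata() { add("author", "unknown"); }
    protected void addContent() { add("content", "body text"); }
    protected void addCapturedContent() { /* nothing captured in this sketch */ }

    // Appends a value to a possibly multi-valued field.
    protected void add(String field, String value) {
        document.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }
}

// Overrides a single step and reuses the rest of the pipeline.
class TaggingDocBuilder extends DocBuilder {
    @Override
    protected void addMetadata() {
        super.addMetadata();
        add("source", "custom-handler");
    }
}
```

As with the real class, each builder instance here holds one document, so a new instance is created for every call to `newDocument()`.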
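protected void addCapturedContent()
Add the per-field captured content to the Solr document. The default implementation uses the fieldBuilders info.
protected void addContent()
Add in the catch-all content to the field. The default implementation uses the contentFieldName and the catchAllBuilder.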
protected void addLiterals()
Add in the literals to the document using the params and the ExtractingParams.LITERALS_PREFIX.
protected void addMetadata()
Add in any metadata using metadata as the source.
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
Specified by: startElement in interface ContentHandler
Overrides: startElement in class DefaultHandler
Throws: SAXException
public void endElement(String uri, String localName, String qName) throws SAXException
Specified by: endElement in interface ContentHandler
Overrides: endElement in class DefaultHandler
Throws: SAXException
public void characters(char[] chars, int offset, int length) throws SAXException
Specified by: characters in interface ContentHandler
Overrides: characters in class DefaultHandler
Throws: SAXException
public void ignorableWhitespace(char[] chars, int offset, int length) throws SAXException
Specified by: ignorableWhitespace in interface ContentHandler
Overrides: ignorableWhitespace in class DefaultHandler
Throws: SAXException
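The SAX callbacks above (startElement, characters, endElement, ignorableWhitespace) are where extracted text gets accumulated. As an illustration, the sketch below keeps one catch-all builder plus per-element builders, loosely modelled on the catchAllBuilder and fieldBuilders fields. `CaptureHandler`, `CaptureDemo`, and the builder-stack rule are hypothetical simplifications, and matching on qName rather than localName is an assumption made because the default JDK parser is not namespace-aware.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class CaptureDemo {
    // Parses the XML with a fresh handler that captures the named elements separately.
    static CaptureHandler parse(String xml, Set<String> capturedElements) {
        CaptureHandler handler = new CaptureHandler(capturedElements);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        CaptureHandler h = parse("<html><h1>Title</h1><p>Body</p></html>", Set.of("h1"));
        System.out.println(h.catchAll()); // Body
        System.out.println(h.captured()); // {h1=Title}
    }
}

// Hypothetical handler: text goes to the innermost captured element, else to the catch-all.
final class CaptureHandler extends DefaultHandler {
    private final StringBuilder catchAllBuilder = new StringBuilder();
    private final Map<String, StringBuilder> fieldBuilders = new LinkedHashMap<>();
    private final Deque<StringBuilder> stack = new ArrayDeque<>();

    CaptureHandler(Set<String> capturedElements) {
        for (String name : capturedElements) {
            fieldBuilders.put(name, new StringBuilder());
        }
        stack.push(catchAllBuilder); // the catch-all builder stays at the bottom
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        StringBuilder builder = fieldBuilders.get(qName);
        if (builder != null) {
            stack.push(builder); // switch the current builder
        }
    }

    @Override
    public void characters(char[] chars, int offset, int length) {
        stack.peek().append(chars, offset, length);
    }

    @Override
    public void ignorableWhitespace(char[] chars, int offset, int length) {
        characters(chars, offset, length); // treat the same as any other characters
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (fieldBuilders.containsKey(qName)) {
            stack.pop(); // return to the enclosing builder
        }
    }

    String catchAll() {
        return catchAllBuilder.toString();
    }

    Map<String, String> captured() {
        Map<String, String> out = new LinkedHashMap<>();
        fieldBuilders.forEach((name, builder) -> out.put(name, builder.toString()));
        return out;
    }
}
```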
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.