public class XPathEntityProcessor extends EntityProcessorBase
An implementation of EntityProcessor which uses a streaming xpath parser to extract values out of XML documents. It is typically used in conjunction with URLDataSource or FileDataSource.
Refer to http://wiki.apache.org/solr/DataImportHandler for more details.
This API is experimental and may change in the future.
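To illustrate what extracting values with a streaming parser can look like, here is a minimal, self-contained sketch built on the JDK's StAX API. It is not the XPathRecordReader implementation and it supports only plain absolute element paths, not real XPath syntax. The class `StreamingXPathSketch`, the method `extractRows`, and the parameters `recordPath` and `fieldPaths` are hypothetical names chosen for this example; they are loosely analogous to the per-record and per-field paths that the FOR_EACH and XPATH constants refer to.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative only: a hypothetical stand-in, NOT Solr's XPathRecordReader.
final class StreamingXPathSketch {

    /**
     * Emits one row per element found at recordPath. Each entry of fieldPaths
     * maps a column name to the absolute path of an element whose text is kept.
     */
    static List<Map<String, Object>> extractRows(
            Reader xml, String recordPath, Map<String, String> fieldPaths) {
        List<Map<String, Object>> rows = new ArrayList<>();
        List<String> open = new ArrayList<>();     // names of currently open elements
        Map<String, Object> row = null;            // non-null while inside a record
        StringBuilder text = new StringBuilder();  // text of the innermost open element
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(xml);
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT: {
                        open.add(reader.getLocalName());
                        text.setLength(0);
                        if (pathOf(open).equals(recordPath)) {
                            row = new LinkedHashMap<>();
                        }
                        break;
                    }
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA: {
                        text.append(reader.getText());
                        break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                        String current = pathOf(open);
                        if (row != null) {
                            for (Map.Entry<String, String> field : fieldPaths.entrySet()) {
                                if (field.getValue().equals(current)) {
                                    row.put(field.getKey(), text.toString().trim());
                                }
                            }
                            if (current.equals(recordPath)) {
                                rows.add(row);     // record closed: the row is complete
                                row = null;
                            }
                        }
                        open.remove(open.size() - 1);
                        text.setLength(0);
                        break;
                    }
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return rows;
    }

    // Joins the open element names into an absolute path such as /rss/item/title.
    private static String pathOf(List<String> open) {
        return "/" + String.join("/", open);
    }

    public static void main(String[] args) {
        String xml = "<rss><item><title>A</title><link>u1</link></item>"
                   + "<item><title>B</title><link>u2</link></item></rss>";
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "/rss/item/title");
        fields.put("link", "/rss/item/link");
        for (Map<String, Object> r : extractRows(new StringReader(xml), "/rss/item", fields)) {
            System.out.println(r);
        }
    }
}
```

Because the parser is pulled event by event, only the current record's values are held while it is being read; this sketch collects the finished rows in a list for simplicity, whereas a processor could hand each row on as soon as its record closes.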
See Also: XPathRecordReader
Modifier and Type | Field and Description |
---|---|
protected int | blockingQueueSize |
protected int | blockingQueueTimeOut |
protected TimeUnit | blockingQueueTimeOutUnits |
static String | COMMON_FIELD |
protected List<String> | commonFields |
protected DataSource<Reader> | dataSource |
static String | FOR_EACH |
static String | HAS_MORE |
static String | NEXT_URL |
protected List<String> | placeHolderVariables |
protected Thread | publisherThread |
protected boolean | reinitXPathReader |
static String | STREAM |
protected boolean | streamRows |
static String | URL |
static String | USE_SOLR_ADD_SCHEMA |
protected boolean | useSolrAddXml |
static String | XPATH |
static String | XPATH_FIELD_NAME |
static String | XSL |
protected Transformer | xslTransformer |
Fields inherited from class EntityProcessorBase: ABORT, cacheSupport, context, CONTINUE, entityName, isFirstInit, ON_ERROR, onError, query, rowIterator, SKIP, TRANSFORM_ROW, TRANSFORMER
Constructor and Description |
---|
XPathEntityProcessor() |
Modifier and Type | Method and Description |
---|---|
void | init(Context context): This method is called when it starts processing an entity. |
Map<String,Object> | nextRow(): For a simple implementation, this is the only method that the sub-class should implement. |
void | postTransform(Map<String,Object> r): Invoked after the transformers are invoked. |
protected Map<String,Object> | readRow(Map<String,Object> record, String xpath) |
Methods inherited from class EntityProcessorBase: destroy, firstInit, getNext, initCache, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey
Methods inherited from class EntityProcessor: close
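Since nextRow() is described as the only method a simple implementation needs to supply, a small sketch of the pull-style contract may help. It assumes the convention from the DataImportHandler EntityProcessor documentation that nextRow() returns one row per call and null once no rows remain. `RowSource`, `ListRowSource`, and `drain` are hypothetical names invented for this example; they are not Solr types, and a real processor would extend EntityProcessorBase instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical stand-ins for the row-pulling contract, not Solr classes.
final class NextRowSketch {

    /** Hypothetical interface mirroring only the shape of nextRow(). */
    interface RowSource {
        /** Returns the next row keyed by column name, or null when exhausted. */
        Map<String, Object> nextRow();
    }

    /** Serves rows from an in-memory list, one per call. */
    static final class ListRowSource implements RowSource {
        private final Iterator<Map<String, Object>> rows;

        ListRowSource(List<Map<String, Object>> rows) {
            this.rows = rows.iterator();
        }

        @Override
        public Map<String, Object> nextRow() {
            return rows.hasNext() ? rows.next() : null;
        }
    }

    /** Pulls rows until the source signals the end by returning null. */
    static List<Map<String, Object>> drain(RowSource source) {
        List<Map<String, Object>> out = new ArrayList<>();
        Map<String, Object> row;
        while ((row = source.nextRow()) != null) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> input = new ArrayList<>();
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", "1");
        input.add(first);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", "2");
        input.add(second);
        System.out.println(drain(new ListRowSource(input)).size() + " rows pulled");
    }
}
```

The caller never asks how many rows exist; it simply keeps calling until it receives null, which is what lets an implementation stream rows from a large document one at a time.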
protected DataSource<Reader> dataSource
protected Transformer xslTransformer
protected boolean useSolrAddXml
protected boolean streamRows
protected int blockingQueueTimeOut
protected TimeUnit blockingQueueTimeOutUnits
protected int blockingQueueSize
protected Thread publisherThread
protected boolean reinitXPathReader
public static final String URL
public static final String HAS_MORE
public static final String NEXT_URL
public static final String XPATH_FIELD_NAME
public static final String FOR_EACH
public static final String XPATH
public static final String COMMON_FIELD
public static final String USE_SOLR_ADD_SCHEMA
public static final String XSL
public static final String STREAM
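public void init(Context context)
Description copied from class EntityProcessor: This method is called when it starts processing an entity.
Overrides: init in class EntityProcessorBase
Parameters: context - The current context

public Map<String,Object> nextRow()
Description copied from class EntityProcessorBase: For a simple implementation, this is the only method that the sub-class should implement.
Overrides: nextRow in class EntityProcessorBase

public void postTransform(Map<String,Object> r)
Description copied from class EntityProcessor: Invoked after the transformers are invoked.
Overrides: postTransform in class EntityProcessor
Parameters: r - The transformed row

Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.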