Class FileListEntityProcessor
- java.lang.Object
-
- org.apache.solr.handler.dataimport.EntityProcessor
-
- org.apache.solr.handler.dataimport.EntityProcessorBase
-
- org.apache.solr.handler.dataimport.FileListEntityProcessor
-
public class FileListEntityProcessor extends EntityProcessorBase
An EntityProcessor instance that streams the names of files found in a given base directory that match the configured patterns, returning rows containing file information.
It supports querying a given base directory by matching:
- regular expressions to file names
- excluding certain files based on regular expression
- last modification date (newer or older than a given date or time)
- size (bigger or smaller than size given in bytes)
- recursively iterating through sub-directories
Its output can be used along with FileDataSource to read from files in file systems.
Refer to http://wiki.apache.org/solr/DataImportHandler for more details.
This API is experimental and may change in the future.
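The matching criteria listed above can be illustrated with a small stand-alone sketch. It does not use any Solr classes and is not the actual implementation: the class name `FileCriteriaSketch`, the use of `-1` for "no size bound", the strict inequalities, and the choice of `Matcher.find()` over `matches()` are all assumptions made for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-alone sketch of the filter criteria; not the Solr implementation. */
class FileCriteriaSketch {
    Pattern fileName;        // required include pattern
    Pattern excludes;        // optional exclude pattern, may be null
    Date newerThan;          // optional lower bound on last modification, may be null
    Date olderThan;          // optional upper bound on last modification, may be null
    long biggerThan = -1;    // assumed convention: -1 means "no lower size bound"
    long smallerThan = -1;   // assumed convention: -1 means "no upper size bound"
    boolean recursive = false;

    /** Returns true when a file with these properties passes every configured criterion. */
    boolean accepts(String name, long size, Date lastModified) {
        if (!fileName.matcher(name).find()) return false;
        if (excludes != null && excludes.matcher(name).find()) return false;
        if (newerThan != null && !lastModified.after(newerThan)) return false;
        if (olderThan != null && !lastModified.before(olderThan)) return false;
        if (biggerThan != -1 && size <= biggerThan) return false;
        if (smallerThan != -1 && size >= smallerThan) return false;
        return true;
    }

    /** Lists accepted files under dir, descending into sub-directories only if recursive is set. */
    List<File> list(File dir) {
        List<File> out = new ArrayList<>();
        collect(dir, out);
        return out;
    }

    private void collect(File dir, List<File> out) {
        File[] children = dir.listFiles();
        if (children == null) return; // not a directory, or an I/O error occurred
        for (File f : children) {
            if (f.isDirectory()) {
                if (recursive) collect(f, out);
            } else if (accepts(f.getName(), f.length(), new Date(f.lastModified()))) {
                out.add(f);
            }
        }
    }
}
```

Keeping `accepts` separate from the directory walk makes the criteria easy to check without touching the file system.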
- Since:
- solr 1.3
- See Also:
Pattern
-
-
Field Summary
Fields
Modifier and Type     Field                   Description
static String         ABSOLUTE_FILE
static String         BASE_DIR
protected String      baseDir                 The baseDir given in data-config.xml after resolving any variables
static String         BIGGER_THAN
protected long        biggerThan              The biggerThan given in data-config as a long value
static String         DIR
protected String      excludes                A regex pattern of excluded file names as given in data-config.xml after resolving any variables
static String         EXCLUDES
static String         FILE
static String         FILE_NAME
protected String      fileName                A regex pattern to identify files given in data-config.xml after resolving any variables
static String         LAST_MODIFIED
static String         NEWER_THAN
protected Date        newerThan               The newerThan given in data-config as a Date
static String         OLDER_THAN
protected Date        olderThan               The olderThan given in data-config as a Date
static Pattern        PLACE_HOLDER_PATTERN
protected boolean     recursive               The recursive given in data-config.
static String         RECURSIVE
static String         SIZE
static String         SMALLER_THAN
protected long        smallerThan             The smallerThan given in data-config as a long value
-
Fields inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
ABORT, cacheSupport, context, CONTINUE, entityName, isFirstInit, ON_ERROR, onError, query, rowIterator, SKIP, TRANSFORM_ROW, TRANSFORMER
-
-
Constructor Summary
Constructors
Constructor                   Description
FileListEntityProcessor()
-
Method Summary
Modifier and Type     Method                  Description
void                  init(Context context)   This method is called when it starts processing an entity.
Map<String,Object>    nextRow()               For a simple implementation, this is the only method that the sub-class should implement.
-
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
destroy, firstInit, getNext, initCache, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey
-
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessor
close, postTransform
-
-
-
-
Field Detail
-
fileName
protected String fileName
A regex pattern to identify files given in data-config.xml after resolving any variables
-
baseDir
protected String baseDir
The baseDir given in data-config.xml after resolving any variables
-
excludes
protected String excludes
A Regex pattern of excluded file names as given in data-config.xml after resolving any variables
-
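newerThan
protected Date newerThan
The newerThan given in data-config as a Date
-
olderThan
protected Date olderThan
The olderThan given in data-config as a Date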
-
biggerThan
protected long biggerThan
The biggerThan given in data-config as a long value.
Note: This variable is resolved just-in-time in the nextRow() method.
-
smallerThan
protected long smallerThan
The smallerThan given in data-config as a long value.
Note: This variable is resolved just-in-time in the nextRow() method.
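To illustrate what resolving a size "just-in-time" can mean, the following stand-alone sketch defers turning a raw attribute string into a long until the value is needed. The `${name}` placeholder syntax, the class `LazySizeSketch`, the method `resolveSize`, the plain variable map, and the `-1` "unset" convention are all assumptions for illustration; the real processor resolves variables through the DataImportHandler context.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of deferred (just-in-time) resolution of a size attribute. */
class LazySizeSketch {
    // Assumed placeholder syntax: ${variable.name}
    static final Pattern PLACE_HOLDER = Pattern.compile("\\$\\{(.*?)\\}");

    /**
     * Resolves a raw attribute value to a long at the moment it is needed.
     * Returns -1 (assumed "unset" convention) when no value was configured.
     */
    static long resolveSize(String raw, Map<String, Object> variables) {
        if (raw == null) return -1L;
        Matcher m = PLACE_HOLDER.matcher(raw);
        if (m.find()) {
            // The attribute refers to a variable, so look it up now rather than at init time.
            Object value = variables.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unresolved variable: " + m.group(1));
            }
            if (value instanceof Number) return ((Number) value).longValue();
            return Long.parseLong(value.toString().trim());
        }
        // A literal value needs no lookup.
        return Long.parseLong(raw.trim());
    }
}
```

Deferring the lookup matters when the variable is only known per row, for example when it comes from a parent entity.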
-
recursive
protected boolean recursive
The recursive given in data-config. Default value is false.
-
PLACE_HOLDER_PATTERN
public static final Pattern PLACE_HOLDER_PATTERN
-
DIR
public static final String DIR
- See Also:
- Constant Field Values
-
FILE
public static final String FILE
- See Also:
- Constant Field Values
-
ABSOLUTE_FILE
public static final String ABSOLUTE_FILE
- See Also:
- Constant Field Values
-
SIZE
public static final String SIZE
- See Also:
- Constant Field Values
-
LAST_MODIFIED
public static final String LAST_MODIFIED
- See Also:
- Constant Field Values
-
FILE_NAME
public static final String FILE_NAME
- See Also:
- Constant Field Values
-
BASE_DIR
public static final String BASE_DIR
- See Also:
- Constant Field Values
-
EXCLUDES
public static final String EXCLUDES
- See Also:
- Constant Field Values
-
NEWER_THAN
public static final String NEWER_THAN
- See Also:
- Constant Field Values
-
OLDER_THAN
public static final String OLDER_THAN
- See Also:
- Constant Field Values
-
BIGGER_THAN
public static final String BIGGER_THAN
- See Also:
- Constant Field Values
-
SMALLER_THAN
public static final String SMALLER_THAN
- See Also:
- Constant Field Values
-
RECURSIVE
public static final String RECURSIVE
- See Also:
- Constant Field Values
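The constants DIR, FILE, ABSOLUTE_FILE, SIZE and LAST_MODIFIED appear to name the columns of each emitted row, while the remaining constants name configuration attributes. The sketch below shows how such a row might be assembled for one file. The string values assigned to the constants here are assumptions (consult the Constant Field Values page for the real ones), and `FileRowSketch` and `toRow` are hypothetical names.

```java
import java.io.File;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of building one row of file information. */
class FileRowSketch {
    // Assumed values; check "Constant Field Values" for the actual strings.
    static final String DIR = "fileDir";
    static final String FILE = "file";
    static final String ABSOLUTE_FILE = "fileAbsolutePath";
    static final String SIZE = "fileSize";
    static final String LAST_MODIFIED = "fileLastModified";

    /** Maps a file to a row keyed by the column-name constants. */
    static Map<String, Object> toRow(File f) {
        File abs = f.getAbsoluteFile();
        Map<String, Object> row = new HashMap<>();
        row.put(DIR, abs.getParent());
        row.put(FILE, abs.getName());
        row.put(ABSOLUTE_FILE, abs.getPath());
        row.put(SIZE, abs.length());                       // 0 if the file does not exist
        row.put(LAST_MODIFIED, new Date(abs.lastModified()));
        return row;
    }
}
```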
-
-
Method Detail
-
init
public void init(Context context)
Description copied from class: EntityProcessor
This method is called when the processor starts processing an entity. When processing returns to the entity, the method is called again, so any state can be reset at that point. For the root-most entity it is called only once per ingestion. For sub-entities, it is called once for each row from the parent entity.
- Overrides:
init in class EntityProcessorBase
- Parameters:
context
- The current context
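The calling pattern described for init can be sketched without any Solr classes. In this hypothetical model, `ChildProcessor` stands in for a sub-entity processor and `run` stands in for the import driver; neither name exists in DataImportHandler, and the real `init` receives a Context rather than a string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model of the init lifecycle for a sub-entity. */
class InitLifecycleSketch {
    /** Stand-in for a sub-entity processor; resets its state on every init call. */
    static class ChildProcessor {
        int initCalls = 0;
        private Iterator<String> rows;

        void init(String parentRow) {
            initCalls++;
            // Reset per-parent state: start a fresh row iterator for this parent row.
            rows = Arrays.asList(parentRow + "/a", parentRow + "/b").iterator();
        }

        String nextRow() {
            return rows.hasNext() ? rows.next() : null; // null signals end of rows
        }
    }

    /** Stand-in for the driver: init is invoked once per parent row, then rows are drained. */
    static List<String> run(List<String> parentRows, ChildProcessor child) {
        List<String> out = new ArrayList<>();
        for (String parent : parentRows) {
            child.init(parent);
            for (String row = child.nextRow(); row != null; row = child.nextRow()) {
                out.add(row);
            }
        }
        return out;
    }
}
```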
-
nextRow
public Map<String,Object> nextRow()
Description copied from class: EntityProcessorBase
For a simple implementation, this is the only method that the sub-class should implement. It is intended to stream rows one by one. Return null to signal the end of rows.
- Overrides:
nextRow in class EntityProcessorBase
- Returns:
- a row where the key is the name of the field and value can be any Object or a Collection of objects. Return null to signal end of rows
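The row-streaming contract (one row per call, null at the end, values that may be single objects or collections) can be sketched as follows. `RowStreamSketch` and `drain` are hypothetical names used only for this illustration and do not depend on Solr.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical iterator-backed sketch of the nextRow contract. */
class RowStreamSketch {
    private final Iterator<Map<String, Object>> rowIterator;

    RowStreamSketch(List<Map<String, Object>> rows) {
        this.rowIterator = rows.iterator();
    }

    /** Returns the next row, or null once all rows have been returned. */
    Map<String, Object> nextRow() {
        return rowIterator.hasNext() ? rowIterator.next() : null;
    }

    /** Consumes rows until null is returned and reports how many were seen. */
    static int drain(RowStreamSketch source) {
        int count = 0;
        while (source.nextRow() != null) {
            count++;
        }
        return count;
    }
}
```

A caller would typically loop exactly as `drain` does, treating the first null as the end of the entity's rows.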
-
-