Package org.apache.lucene.codecs
Class BlockTreeTermsReader
- java.lang.Object
  - org.apache.lucene.index.Fields
    - org.apache.lucene.codecs.FieldsProducer
      - org.apache.lucene.codecs.BlockTreeTermsReader
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, java.lang.Iterable<java.lang.String>
public class BlockTreeTermsReader extends FieldsProducer
A block-based terms index and dictionary that assigns terms to variable-length blocks according to how they share prefixes. The terms index is a prefix trie whose leaves are term blocks. The advantage of this approach is that seekExact can often determine that a term cannot exist without doing any IO, and intersection with Automata is very fast. Note that this terms dictionary has its own fixed terms index (i.e., it does not support a pluggable terms index implementation).
NOTE: this terms dictionary does not support index divisor when opening an IndexReader. Instead, you can change the min/maxItemsPerBlock during indexing.
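The following is a minimal, self-contained sketch of why a prefix index can let a seekExact-style lookup reject a term without reading any block. It is not Lucene code: the class PrefixIndexSketch, its methods, and the use of a HashMap in place of the real trie are all hypothetical simplifications, and the "IO" is only simulated by a counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not Lucene code: an in-memory prefix index over term blocks.
class PrefixIndexSketch {
    // Maps a block's shared prefix to the terms in that block (stand-in for a file pointer).
    private final Map<String, List<String>> blocksByPrefix = new HashMap<>();
    // Counts simulated block reads, i.e. the IO a lookup would like to avoid.
    private int blockLoads = 0;

    void addBlock(String prefix, String... terms) {
        blocksByPrefix.put(prefix, new ArrayList<>(Arrays.asList(terms)));
    }

    int blockLoads() {
        return blockLoads;
    }

    // Finds the longest indexed prefix of the term; only then "loads" that one block.
    boolean seekExact(String term) {
        for (int len = term.length(); len >= 0; len--) {
            List<String> block = blocksByPrefix.get(term.substring(0, len));
            if (block != null) {
                blockLoads++; // simulated IO: scan the single candidate block
                return block.contains(term);
            }
        }
        return false; // no indexed prefix matches: rejected from the index alone
    }

    public static void main(String[] args) {
        PrefixIndexSketch index = new PrefixIndexSketch();
        index.addBlock("app", "apple", "apply");
        index.addBlock("ban", "banana", "band");
        System.out.println(index.seekExact("apple") + " loads=" + index.blockLoads());
        System.out.println(index.seekExact("cherry") + " loads=" + index.blockLoads());
    }
}
```

In this sketch the lookup for "cherry" returns false without incrementing the load counter, because no indexed prefix matches it. A lookup for a term such as "appendix" still has to scan the "app" block before it can be rejected.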
The data structure used by this implementation is very similar to a burst trie (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499), but with added logic to break up too-large blocks of all terms sharing a given prefix into smaller ones.
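As a rough illustration of the burst-trie idea described above, the sketch below groups distinct terms by shared prefix and splits any group that exceeds a size limit into smaller groups keyed by a longer prefix. It is an assumption-laden toy, not the logic of BlockTreeTermsWriter: the class BlockSplitSketch and the parameter maxItems are hypothetical, and details such as a minimum block size are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model, not Lucene code: burst-style assignment of terms to blocks.
class BlockSplitSketch {
    // Returns blocks keyed by the prefix at which each group became small enough.
    // Assumes the input terms are distinct.
    static Map<String, List<String>> assign(List<String> terms, int maxItems) {
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        split(terms, 0, maxItems, blocks);
        return blocks;
    }

    // Precondition: every term in 'terms' shares the same first prefixLen characters.
    private static void split(List<String> terms, int prefixLen, int maxItems,
                              Map<String, List<String>> out) {
        if (terms.isEmpty()) {
            return;
        }
        String prefix = terms.get(0).substring(0, prefixLen);
        if (terms.size() <= maxItems) {
            out.put(prefix, new ArrayList<>(terms)); // small enough: one block
            return;
        }
        // Too large: "burst" the group by the next character after the prefix.
        List<String> exact = new ArrayList<>(); // a term equal to the prefix itself
        TreeMap<Character, List<String>> byNext = new TreeMap<>();
        for (String term : terms) {
            if (term.length() == prefixLen) {
                exact.add(term);
            } else {
                byNext.computeIfAbsent(term.charAt(prefixLen), c -> new ArrayList<>()).add(term);
            }
        }
        if (!exact.isEmpty()) {
            out.put(prefix, exact);
        }
        for (List<String> group : byNext.values()) {
            split(group, prefixLen + 1, maxItems, out);
        }
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
            "apple", "apply", "apt", "banana", "band", "bandit", "bank");
        System.out.println(assign(terms, 3));
    }
}
```

With a limit of 3, the seven sample terms end up in four blocks keyed "a", "bana", "band", and "bank": the three terms starting with "a" fit in one block, while the four terms starting with "ban" are too many and are split by their fourth character.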
Use CheckIndex with the -verbose option to see summary statistics on the blocks in the dictionary. See BlockTreeTermsWriter.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
class | BlockTreeTermsReader.FieldReader | BlockTree's implementation of Terms.
static class | BlockTreeTermsReader.Stats | BlockTree statistics for a single field returned by BlockTreeTermsReader.FieldReader.computeStats().
-
Field Summary
-
Fields inherited from class org.apache.lucene.index.Fields
EMPTY_ARRAY
-
-
Constructor Summary
Constructors
Constructor | Description
BlockTreeTermsReader(Directory dir, FieldInfos fieldInfos, SegmentInfo info, PostingsReaderBase postingsReader, IOContext ioContext, java.lang.String segmentSuffix, int indexDivisor) | Sole constructor.
-
Method Summary
Modifier and Type | Method | Description
void | close() |
java.util.Iterator<java.lang.String> | iterator() | Returns an iterator that will step through all field names.
long | ramBytesUsed() | Returns approximate RAM bytes used.
protected int | readHeader(IndexInput input) | Reads terms file header.
protected int | readIndexHeader(IndexInput input) | Reads index file header.
protected void | seekDir(IndexInput input, long dirOffset) | Seek input to the directory offset.
int | size() | Returns the number of fields or -1 if the number of distinct field names is unknown.
Terms | terms(java.lang.String field) | Get the Terms for this field.
-
Methods inherited from class org.apache.lucene.index.Fields
getUniqueTermCount
-
Constructor Detail
-
BlockTreeTermsReader
public BlockTreeTermsReader(Directory dir, FieldInfos fieldInfos, SegmentInfo info, PostingsReaderBase postingsReader, IOContext ioContext, java.lang.String segmentSuffix, int indexDivisor) throws java.io.IOException
Sole constructor.
- Throws:
java.io.IOException
-
-
Method Detail
-
readHeader
protected int readHeader(IndexInput input) throws java.io.IOException
Reads terms file header.
- Throws:
java.io.IOException
-
readIndexHeader
protected int readIndexHeader(IndexInput input) throws java.io.IOException
Reads index file header.
- Throws:
java.io.IOException
-
seekDir
protected void seekDir(IndexInput input, long dirOffset) throws java.io.IOException
Seek input to the directory offset.
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Specified by:
close in interface java.lang.AutoCloseable
- Specified by:
close in interface java.io.Closeable
- Specified by:
close in class FieldsProducer
- Throws:
java.io.IOException
-
iterator
public java.util.Iterator<java.lang.String> iterator()
Description copied from class: Fields
Returns an iterator that will step through all field names. This will not return null.
-
terms
public Terms terms(java.lang.String field) throws java.io.IOException
Description copied from class: Fields
Get the Terms for this field. This will return null if the field does not exist.
-
size
public int size()
Description copied from class: Fields
Returns the number of fields or -1 if the number of distinct field names is unknown. If >= 0, Fields.iterator() will return as many field names.
-
ramBytesUsed
public long ramBytesUsed()
Description copied from class: FieldsProducer
Returns approximate RAM bytes used.
- Specified by:
ramBytesUsed in class FieldsProducer
-
-