Package org.apache.lucene.codecs.uniformsplit
Pluggable term index / block terms dictionary implementations.
Structure similar to VariableGapTermsIndexWriter with additional optimizations.
- Designed to be extensible.
- Reduced on-heap memory usage.
- Efficient to seek terms (TermQuery, PhraseQuery).
- Quite efficient for PrefixQuery.
- Not efficient for spell-check and FuzzyQuery; in this case, prefer Lucene99PostingsFormat.
Class: Description
- BlockDecoder: Decodes the raw bytes of a block when the index is read, according to the BlockEncoder used during the writing of the index.
- BlockEncoder: Encodes the raw bytes of a block when the index is written.
- BlockEncoder.WritableBytes: Writable byte buffer.
- BlockHeader: Block header containing block metadata.
- BlockHeader.Serializer: Reads/writes block header.
- BlockLine: One term block line.
- BlockLine.Serializer: Reads/writes block lines with terms encoded incrementally inside a block.
- BlockReader: Seeks the block corresponding to a given term, reads the block bytes, and scans the block terms.
- BlockWriter: Writes blocks in the block file.
- DeltaBaseTermStateSerializer: TermState serializer which encodes each file pointer as a delta relative to a base file pointer.
- FieldMetadata: Metadata and stats for one field in the index.
- FieldMetadata.Serializer: Reads/writes field metadata.
- FSTDictionary: Immutable stateless FST-based index dictionary kept in memory.
- FSTDictionary.BrowserSupplier: Provides stateful FSTDictionary.Browser to seek in the FSTDictionary.
- FSTDictionary.Builder: Builds an immutable FSTDictionary.
- IndexDictionary: Immutable stateless index dictionary kept in RAM.
- IndexDictionary.Browser: Stateful IndexDictionary.Browser to seek a term in this IndexDictionary and get its corresponding block file pointer in the block file.
- IndexDictionary.BrowserSupplier: Supplier for a new stateful IndexDictionary.Browser created on the immutable IndexDictionary.
- IndexDictionary.Builder: Builds an immutable IndexDictionary.
- IntersectBlockReader: The "intersect" TermsEnum response to UniformSplitTerms.intersect(CompiledAutomaton, BytesRef), intersecting the terms with an automaton.
- IntersectBlockReader.BlockIteration: Block iteration order.
- RamUsageUtil: Utility methods to estimate the RAM usage of objects.
- TermBytes: Term of a block line.
- UniformSplitPostingsFormat: PostingsFormat based on the Uniform Split technique.
- UniformSplitTerms: Terms based on the Uniform Split technique.
- UniformSplitTermsReader: A block-based terms index and dictionary based on the Uniform Split technique.
- UniformSplitTermsWriter: A block-based terms index and dictionary that assigns terms to nearly uniform length blocks.
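
To illustrate what "nearly uniform length blocks" can mean in practice, the following is a simplified, standalone sketch and not Lucene's implementation. It assumes the split rule is: around a target block size, within an allowed delta, start the next block at the term whose minimal distinguishing prefix (relative to the previous term) is shortest. That rule, the class name UniformSplitSketch, the method names, and the use of String terms instead of byte sequences are all assumptions made for illustration; none of them are part of the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this class does not exist in Lucene.
final class UniformSplitSketch {

    // Length of the shortest prefix of `term` that distinguishes it from
    // `previous`. Assumes previous < term in sorted order.
    static int mdpLength(String previous, String term) {
        int limit = Math.min(previous.length(), term.length());
        for (int i = 0; i < limit; i++) {
            if (previous.charAt(i) != term.charAt(i)) {
                return i + 1;
            }
        }
        // `previous` is a prefix of `term`: one more character is needed.
        return Math.min(limit + 1, term.length());
    }

    // Splits sorted, unique terms into blocks of target +/- delta terms
    // (the last block may be smaller). Requires target - delta >= 1.
    static List<List<String>> split(List<String> sortedTerms, int target, int delta) {
        if (delta < 0 || target - delta < 1) {
            throw new IllegalArgumentException("need target - delta >= 1 and delta >= 0");
        }
        List<List<String>> blocks = new ArrayList<>();
        int start = 0;
        while (sortedTerms.size() - start > target + delta) {
            int best = start + target - delta;
            int bestMdp = Integer.MAX_VALUE;
            // Scan the allowed window for the cheapest split point.
            for (int i = start + target - delta; i <= start + target + delta; i++) {
                int mdp = mdpLength(sortedTerms.get(i - 1), sortedTerms.get(i));
                if (mdp < bestMdp) {
                    bestMdp = mdp;
                    best = i;
                }
            }
            blocks.add(new ArrayList<>(sortedTerms.subList(start, best)));
            start = best; // the chosen term becomes the first term of the next block
        }
        if (start < sortedTerms.size()) {
            blocks.add(new ArrayList<>(sortedTerms.subList(start, sortedTerms.size())));
        }
        return blocks;
    }
}
```

Calling `UniformSplitSketch.split(terms, 3, 1)` on a sorted list yields blocks of two to four terms each. Picking the split point with the shortest distinguishing prefix keeps the keys that an in-memory index dictionary would need to store short, which is consistent with the reduced on-heap memory usage listed in the package description.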