Lucene.Net
3.0.3
Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users.
Contrib | |||
Regex | |||
CSharpRegexCapabilities | C# Regex based implementation of IRegexCapabilities. | ||
IRegexCapabilities | Defines basic operations needed by RegexQuery for a regular expression implementation. | ||
IRegexQueryCapable | Defines methods for regular expression supporting queries to use. | ||
RegexQuery | Regular expression based query. | ||
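For orientation, a RegexQuery is built from a Term whose text is treated as a regular expression. The sketch below is illustrative only: the Contrib.Regex namespace is inferred from the Contrib > Regex grouping above, and the field name and pattern are placeholders.

```csharp
using Contrib.Regex;        // namespace inferred from the Contrib > Regex grouping (assumption)
using Lucene.Net.Index;
using Lucene.Net.Search;

// Matches any term in the "url" field accepted by the configured
// IRegexCapabilities implementation (e.g. CSharpRegexCapabilities).
Query query = new RegexQuery(new Term("url", "http.*\\.org"));
```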
RegexTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified regular expression term using the specified regular expression implementation | ||
SpanRegexQuery | A SpanQuery version of RegexQuery allowing regular expression queries to be nested within other SpanQuery subclasses. | ||
Lucene | |||
Net | |||
Analysis | |||
AR | |||
ArabicAnalyzer | |||
ArabicLetterTokenizer | |||
ArabicNormalizationFilter | |||
ArabicNormalizer | |||
ArabicStemFilter | |||
ArabicStemmer | |||
BR | |||
BrazilianAnalyzer | |||
BrazilianStemFilter | |||
BrazilianStemmer | |||
CJK | |||
CJKAnalyzer | Filters CJKTokenizer with StopFilter | ||
CJKTokenizer | |||
Cn | |||
ChineseAnalyzer | An Analyzer that tokenizes text with ChineseTokenizer and filters with ChineseFilter | ||
ChineseFilter | A TokenFilter with a stop word table | ||
ChineseTokenizer | Tokenizes Chinese text into individual Chinese characters | ||
Compound | |||
CompoundWordTokenFilterBase | |||
DictionaryCompoundWordTokenFilter | |||
Cz | |||
CzechAnalyzer | |||
De | |||
GermanAnalyzer | Analyzer for German language. Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will not be stemmed, but indexed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default. | ||
GermanDIN2Stemmer | A stemmer for the German language that uses the DIN 5007-2 "Phone Book" rules for handling umlaut characters. | ||
GermanStemFilter | A filter that stems German words. It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a GermanStemmer). | ||
GermanStemmer | A stemmer for German words. The algorithm is based on the report "A Fast and Simple Stemming Algorithm for German Words" by Jörg Caumanns (joerg.caumanns@isst.fhg.de). | ||
El | |||
GreekAnalyzer | |||
GreekLowerCaseFilter | |||
Ext | |||
SingleCharTokenAnalyzer | This analyzer targets short fields where word like searches are required. [SomeUser@GMAIL.com 1234567890] will be tokenized as [s.o.m.e.u.s.e.r..g.m.a.i.l..com..1.2.3.4.5.6.7.8.9.0] (read .'s as blank) | ||
UnaccentedWordAnalyzer | Another Analyzer. Every char which is not a letter or digit is treated as a word separator. [Name.Surname@gmail.com 123.456 ğüşıöçĞÜŞİÖÇ$ΑΒΓΔΕΖ::АБВГДЕ SSß] will be tokenized as [name surname gmail com 123 456 gusioc gusioc αβγδεζ абвгде ssss] | ||
LetterOrDigitTokenizer | if a char is not a letter or digit, it is a word separator | ||
Fa | |||
PersianAnalyzer | |||
PersianNormalizationFilter | |||
PersianNormalizer | |||
Fr | |||
ElisionFilter | |||
FrenchAnalyzer | |||
FrenchStemFilter | |||
FrenchStemmer | |||
Hunspell | |||
HunspellAffix | Wrapper class representing a hunspell affix. | ||
HunspellDictionary | |||
HunspellStem | |||
HunspellStemFilter | TokenFilter that uses hunspell affix rules and words to stem tokens. Since hunspell supports a word having multiple stems, this filter can emit multiple tokens for each consumed token. | ||
HunspellStemmer | HunspellStemmer uses the affix rules declared in the HunspellDictionary to generate one or more stems for a word. It conforms to the algorithm in the original hunspell algorithm, including recursive suffix stripping. | ||
HunspellWord | |||
Miscellaneous | |||
EmptyTokenStream | An always exhausted token stream | ||
InjectablePrefixAwareTokenFilter | |||
PatternAnalyzer | |||
PrefixAndSuffixAwareTokenFilter | Links two PrefixAwareTokenFilter. NOTE: This filter might not behave correctly if used with custom Attributes, i.e. Attributes other than the ones located in Lucene.Net.Analysis.Tokenattributes. | ||
PrefixAwareTokenFilter | Joins two token streams and leaves the last token of the first stream available to be used when updating the token values in the second stream based on that token | ||
SingleTokenTokenStream | A TokenStream containing a single token. | ||
NGram | |||
EdgeNGramTokenFilter | |||
EdgeNGramTokenizer | |||
NGramTokenFilter | |||
NGramTokenizer | |||
Nl | |||
DutchAnalyzer | |||
DutchStemFilter | |||
DutchStemmer | |||
Payloads | |||
AbstractEncoder | Base class for payload encoders. | ||
DelimitedPayloadTokenFilter | Characters before the delimiter are the "token", those after are the payload. For example, if the delimiter is '|', then for the string "foo|bar", foo is the token and "bar" is a payload. Note, you can also include an org.apache.lucene.analysis.payloads.PayloadEncoder to convert the payload in an appropriate way (from characters to bytes). Note: make sure your Tokenizer doesn't split on the delimiter, or this won't work | ||
FloatEncoder | Encode a character array Float as a org.apache.lucene.index.Payload. | ||
IdentityEncoder | Does nothing other than convert the char array to a byte array using the specified encoding. | ||
IntegerEncoder | Encode a character array Integer as a org.apache.lucene.index.Payload. | ||
NumericPayloadTokenFilter | Assigns a payload to a token based on the Token.Type() | ||
PayloadEncoder | Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to Payload NOTE: this interface is subject to change | ||
TokenOffsetPayloadTokenFilter | Adds the Token.StartOffset and Token.EndOffset as the token's payload; the first 4 bytes are the start offset and the last 4 bytes are the end offset | ||
TypeAsPayloadTokenFilter | Makes the Token.Type() a payload. Encodes the type using System.Text.Encoding.UTF8 as the encoding | ||
Position | |||
PositionFilter | |||
Query | |||
QueryAutoStopWordAnalyzer | |||
Reverse | |||
ReverseStringFilter | |||
Ru | |||
RussianAnalyzer | Analyzer for Russian language. Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified. | ||
RussianLetterTokenizer | A RussianLetterTokenizer is a Tokenizer that extends LetterTokenizer by also allowing the basic latin digits 0-9. | ||
RussianLowerCaseFilter | Normalizes token text to lower case. | ||
RussianStemFilter | |||
RussianStemmer | |||
Shingle | |||
Codec | |||
OneDimensionalNonWeightedTokenSettingsCodec | Using this codec makes a ShingleMatrixFilter act like ShingleFilter. It produces the simplest sort of shingles, ignoring token position increments, etc. | ||
SimpleThreeDimensionalTokenSettingsCodec | A full featured codec not to be used for something serious | ||
TokenSettingsCodec | Strategy used to encode and decode metadata of the tokens from the input stream regarding how to position the tokens in the matrix, set and retrieve weight, etc. | ||
TwoDimensionalNonWeightedSynonymTokenSettingsCodec | A codec that creates a two dimensional matrix by treating tokens from the input stream with 0 position increment as new rows to the current column. | ||
Matrix | |||
Column | |||
Matrix | A column-focused matrix in three dimensions: columns, rows within a column, and token parts within a row. | ||
MatrixPermutationIterator | |||
Row | |||
ShingleAnalyzerWrapper | |||
ShingleFilter | |||
ShingleMatrixFilter | |||
TokenPositioner | |||
Sinks | |||
DateRecognizerSinkFilter | |||
TokenRangeSinkFilter | |||
TokenTypeSinkFilter | |||
Snowball | |||
SnowballAnalyzer | Filters StandardTokenizer with StandardFilter, LowerCaseFilter, StopFilter and SnowballFilter | ||
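For context, SnowballAnalyzer is typically constructed with a match version and the name of a Snowball stemmer; a minimal sketch, assuming the English stemmer:

```csharp
using Lucene.Net.Analysis.Snowball;
using Version = Lucene.Net.Util.Version;

// "English" names the Snowball-generated stemmer to apply after the standard filters.
var analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
```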
SnowballFilter | A filter that stems words using a Snowball-generated stemmer | ||
Standard | |||
StandardAnalyzer | Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words | ||
StandardFilter | Normalizes tokens extracted with StandardTokenizer. | ||
StandardTokenizer | A grammar-based tokenizer constructed with JFlex | ||
StandardTokenizerImpl | This class is a scanner generated by JFlex 1.4.1 on 9/4/08 6:49 PM from the specification file /tango/mike/src/lucene.standarddigit/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex | ||
Th | |||
ThaiAnalyzer | |||
ThaiWordFilter | |||
Tokenattributes | |||
FlagsAttribute | This attribute can be used to pass different flags down the tokenizer chain, eg from one TokenFilter to another one. | ||
IFlagsAttribute | This attribute can be used to pass different flags down the Tokenizer chain, eg from one TokenFilter to another one. | ||
IOffsetAttribute | The start and end character offset of a Token. | ||
IPayloadAttribute | The payload of a Token. See also Payload. | ||
IPositionIncrementAttribute | The positionIncrement determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching | ||
ITermAttribute | The term text of a Token. | ||
ITypeAttribute | A Token's lexical type. The Default value is "word". | ||
OffsetAttribute | The start and end character offset of a Token. | ||
PayloadAttribute | The payload of a Token. See also Payload. | ||
PositionIncrementAttribute | The positionIncrement determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching | ||
TermAttribute | The term text of a Token. | ||
TypeAttribute | A Token's lexical type. The Default value is "word". | ||
ChainedFilter | |||
Analyzer | An Analyzer builds TokenStreams, which analyze text. It thus represents a policy for extracting index terms from text. Typical implementations first build a Tokenizer, which breaks the stream of characters from the Reader into raw Tokens. One or more TokenFilters may then be applied to the output of the Tokenizer. | ||
ASCIIFoldingFilter | This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists | ||
BaseCharFilter | |||
CachingTokenFilter | This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. It caches all token attribute states locally in a List | ||
CharArraySet | A simple class that stores Strings as char[]'s in a hash table. Note that this is not a general purpose class. For example, it cannot remove items from the set, nor does it resize its hash table to be smaller, etc. It is designed to be quick to test if a char[] is in the set without the necessity of converting it to a String first. Please note: This class implements System.Collections.Generic.ISet{T} but does not behave like it should in all cases. The generic type is System.Collections.Generic.ICollection{T}, because you can add any object that has a string representation. The add methods will use object.ToString() and store the result using a char buffer. The Contains(object) methods behave the same way. The GetEnumerator method returns a string IEnumerable. For type safety, stringIterator() is also provided. | ||
CharArraySetEnumerator | The IEnumerator<String> for this set. Strings are constructed on the fly, so use nextCharArray for more efficient access | ||
CharFilter | Subclasses of CharFilter can be chained to filter CharStream. They can be used as System.IO.TextReader with additional offset correction. Tokenizers will automatically use CorrectOffset if a CharFilter/CharStream subclass is used | ||
CharReader | CharReader is a Reader wrapper. It reads chars from Reader and outputs CharStream, defining an identity CorrectOffset method that simply returns the provided offset. | ||
CharStream | CharStream adds CorrectOffset functionality over System.IO.TextReader. All Tokenizers accept a CharStream instead of System.IO.TextReader as input, which enables arbitrary character based filtering before tokenization. The CorrectOffset method fixes offsets to account for removal or insertion of characters, so that the offsets reported in the tokens match the character offsets of the original Reader. | ||
CharTokenizer | An abstract base class for simple, character-oriented tokenizers. | ||
ISOLatin1AccentFilter | A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) by their unaccented equivalent. The case will not be altered. For instance, 'à' will be replaced by 'a'. | ||
KeywordAnalyzer | "Tokenizes" the entire stream as a single token. This is useful for data like zip codes, ids, and some product names. | ||
KeywordTokenizer | Emits the entire input as a single token. | ||
LengthFilter | Removes words that are too long or too short from the stream. | ||
LetterTokenizer | A LetterTokenizer is a tokenizer that divides text at non-letters. That's to say, it defines tokens as maximal strings of adjacent letters, as defined by java.lang.Character.isLetter() predicate. Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces. | ||
LowerCaseFilter | Normalizes token text to lower case. | ||
LowerCaseTokenizer | LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. It divides text at non-letters and converts them to lower case. While it is functionally equivalent to the combination of LetterTokenizer and LowerCaseFilter, there is a performance advantage to doing the two tasks at once, hence this (redundant) implementation. Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces. | ||
MappingCharFilter | Simplistic CharFilter that applies the mappings contained in a NormalizeCharMap to the character stream and corrects the resulting changes to the offsets. | ||
NormalizeCharMap | Holds a map of String input to String output, to be used with MappingCharFilter. | ||
NumericTokenStream | Expert: This class provides a TokenStream for indexing numeric values that can be used by NumericRangeQuery{T} or NumericRangeFilter{T} | ||
PerFieldAnalyzerWrapper | This analyzer is used to facilitate scenarios where different fields require different analysis techniques. Use AddAnalyzer to add a non-default analyzer on a field name basis | ||
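As a sketch of the AddAnalyzer usage mentioned above (the field names are placeholders):

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Version = Lucene.Net.Util.Version;

// StandardAnalyzer handles every field unless an override is registered for it.
var wrapper = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
wrapper.AddAnalyzer("partnum", new KeywordAnalyzer());      // keep part numbers as single tokens
wrapper.AddAnalyzer("comments", new WhitespaceAnalyzer());  // simple whitespace splitting
```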
PorterStemFilter | Transforms the token stream as per the Porter stemming algorithm. Note: the input to the stemming filter must already be in lower case, so you will need to use LowerCaseFilter or LowerCaseTokenizer farther down the Tokenizer chain in order for this to work properly! To use this filter with other analyzers, you'll want to write an Analyzer class that sets up the TokenStream chain as you want it. To use this with LowerCaseTokenizer, for example, you'd write an analyzer like this: | ||
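A minimal sketch of the analyzer alluded to above (the class name is illustrative):

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Illustrative analyzer: lower-case on tokenization, then apply Porter stemming.
class MyPorterStemAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        return new PorterStemFilter(new LowerCaseTokenizer(reader));
    }
}
```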
PorterStemmer | Stemmer, implementing the Porter Stemming Algorithm | ||
SimpleAnalyzer | An Analyzer that filters LetterTokenizer with LowerCaseFilter | ||
StopAnalyzer | Filters LetterTokenizer with LowerCaseFilter and StopFilter | ||
StopFilter | Removes stop words from a token stream. | ||
TeeSinkTokenFilter | This TokenFilter provides the ability to set aside attribute states that have already been analyzed. This is useful in situations where multiple fields share many common analysis steps and then go their separate ways. It is also useful for doing things like entity extraction or proper noun analysis as part of the analysis workflow and saving off those tokens for use in another field | ||
AnonymousClassSinkFilter | |||
SinkFilter | A filter that decides which AttributeSource states to store in the sink. | ||
SinkTokenStream | |||
Token | A Token is an occurrence of a term from the text of a field. It consists of a term's text, the start and end offset of the term in the text of the field, and a type string. The start and end offsets permit applications to re-associate a token with its source text, e.g., to display highlighted query terms in a document browser, or to show matching text fragments in a <abbr title="KeyWord In Context">KWIC</abbr> display, etc. The type is a string, assigned by a lexical analyzer (a.k.a. tokenizer), naming the lexical or syntactic class that the token belongs to. For example an end of sentence marker token might be implemented with type "eos". The default token type is "word". A Token can optionally have metadata (a.k.a. Payload) in the form of a variable length byte array. Use TermPositions.PayloadLength and TermPositions.GetPayload(byte[], int) to retrieve the payloads from the index. | ||
TokenAttributeFactory | Expert: Creates an AttributeFactory returning Token as instance for the basic attributes and for all other attributes calls the given delegate factory. | ||
TokenFilter | A TokenFilter is a TokenStream whose input is another TokenStream. This is an abstract class; subclasses must override TokenStream.IncrementToken() | ||
Tokenizer | A Tokenizer is a TokenStream whose input is a Reader. This is an abstract class; subclasses must override TokenStream.IncrementToken() NOTE: Subclasses overriding TokenStream.IncrementToken() must call AttributeSource.ClearAttributes() before setting attributes. | ||
TokenStream | A TokenStream enumerates the sequence of tokens, either from Fields of a Document or from query text. This is an abstract class; concrete subclasses are Tokenizer, a TokenStream whose input is a Reader, and TokenFilter, a TokenStream whose input is another TokenStream. A new Attribute-based TokenStream API was introduced with Lucene 2.9. To make sure that filters and consumers know which attributes are available, the attributes must be added during instantiation. Filters and consumers are not required to check for availability of attributes in IncrementToken(). You can find some example code for the new API in the analysis package level Javadoc. Sometimes it is desirable to capture the current state of a TokenStream, e.g. for buffering purposes (see CachingTokenFilter and TeeSinkTokenFilter). | ||
WhitespaceAnalyzer | An Analyzer that uses WhitespaceTokenizer. | ||
WhitespaceTokenizer | A WhitespaceTokenizer is a tokenizer that divides text at whitespace. Adjacent sequences of non-Whitespace characters form tokens. | ||
WordlistLoader | Loader for text files that represent a list of stopwords. | ||
Demo | |||
Html | |||
Entities | |||
HTMLParser | |||
HTMLParserConstants_Fields | |||
HTMLParserTokenManager | |||
ParseException | This exception is thrown when parse errors are encountered. You can explicitly create objects of this exception type by calling the method generateParseException in the generated parser | ||
ParserThread | |||
SimpleCharStream | An implementation of interface CharStream, where the stream is assumed to contain only ASCII characters (without unicode processing). | ||
Tags | |||
Test | |||
Token | Describes the input token stream. | ||
TokenMgrError | |||
Distributed | |||
Configuration | |||
CurrentIndex | Definition of current index information managed by the LuceneUpdater windows service. The <copy> node within the <indexset> node represents the information needed to load a CurrentIndex object for a given IndexSet | ||
DistributedSearcher | Definition of a configurable set of search indexes made accessible by the LuceneServer windows service for a consuming application. These search indexes are defined in the configuration file of an application. The locations defined in a DistributedSearcher match the exposed object URIs as defined in the LuceneServer service | ||
DistributedSearcherConfigurationHandler | Implementation of custom configuration handler for the definition of search indexes made accessible by the LuceneServer windows service. This configuration resides in the configuration file of an application consuming the search indexes made accessible by the LuceneServer windows service. | ||
DistributedSearchers | Definition of a configurable set of search indexes made accessible by the LuceneServer windows service for a consuming application. These search indexes are defined in the configuration file of an application. The locations defined in a DistributedSearcher match the exposed object URIs as defined in the LuceneServer service | ||
LuceneServerIndex | Definition of a configurable search index made accessible by the LuceneServer windows service | ||
LuceneServerIndexConfigurationHandler | Implementation of custom configuration handler for the definition of search indexes made accessible by the LuceneServer windows service. | ||
LuceneServerIndexes | Definition of configurable search indexes made accessible by the LuceneServer windows service | ||
Indexing | |||
DeleteIndexDocument | |||
FileNameComparer | Summary description for FileNameComparer. | ||
IndexDocument | Base class representing a record to be added to a Lucene index | ||
IndexSet | Definition of configurable search indexes managed by the LuceneUpdater windows service | ||
IndexSetConfigurationHandler | Implementation of custom configuration handler for the definition of master indexes as managed by the LuceneUpdater windows service. | ||
IndexSets | Definition of configurable search indexes managed by the LuceneUpdater windows service | ||
Operations | |||
LuceneMonitor | A Windows service that provides system ping checking against LuceneServer. | ||
Search | |||
DistributedSearchable | A derived implementation of RemoteSearchable, DistributedSearchable provides additional support for integration with .Net remoting objects and constructs. | ||
Documents | |||
AbstractField | |||
CompressionTools | Simple utility class providing static methods to compress and decompress binary data for stored fields. This class uses java.util.zip.Deflater and Inflater classes to compress and decompress. | ||
DateField | Provides support for converting dates to strings and vice-versa. The strings are structured so that lexicographic sorting orders by date, which makes them suitable for use as field values and search terms | ||
DateTools | Provides support for converting dates to strings and vice-versa. The strings are structured so that lexicographic sorting orders them by date, which makes them suitable for use as field values and search terms | ||
Resolution | Specifies the time granularity. | ||
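A small sketch of the DateTools round trip described above, assuming day resolution:

```csharp
using System;
using Lucene.Net.Documents;

// Produces a lexicographically sortable string such as "20240115" (day resolution).
string indexed = DateTools.DateToString(DateTime.UtcNow, DateTools.Resolution.DAY);
DateTime roundTripped = DateTools.StringToDate(indexed);
```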
Document | Documents are the unit of indexing and search | ||
Field | A field is a section of a Document. Each field has two parts, a name and a value. Values may be free text, provided as a String or as a Reader, or they may be atomic keywords, which are not further processed. Such keywords may be used to represent dates, urls, etc. Fields are optionally stored in the index, so that they may be returned with hits on the document. | ||
IFieldable | Synonymous with Field | ||
FieldSelector | Similar to a java.io.FileFilter, the FieldSelector allows one to make decisions about what Fields get loaded on a Document by Lucene.Net.Index.IndexReader.Document(int,Lucene.Net.Documents.FieldSelector) | ||
LoadFirstFieldSelector | Load the First field and break. See FieldSelectorResult.LOAD_AND_BREAK | ||
MapFieldSelector | A FieldSelector based on a Map of field names to FieldSelectorResults | ||
NumberTools | Provides support for converting longs to Strings, and back again. The strings are structured so that lexicographic sorting order is preserved | ||
NumericField | This class provides a Field that enables indexing of numeric values for efficient range filtering and sorting. Here's an example usage, adding an int value: | ||
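A minimal sketch of the int-value usage mentioned above ("weight" is a placeholder field name):

```csharp
using Lucene.Net.Documents;

// Adds a trie-encoded int value suitable for NumericRangeQuery/NumericRangeFilter and sorting.
var doc = new Document();
doc.Add(new NumericField("weight").SetIntValue(5));
```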
SetBasedFieldSelector | Declare what fields to load normally and what fields to load lazily | ||
Index | |||
Memory | |||
MemoryIndex | High-performance single-document main memory Apache Lucene fulltext search index | ||
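A hedged sketch of typical MemoryIndex usage, assuming the AddField/Search members carried over from the Java original; the field name, text and query are illustrative:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index.Memory;
using Lucene.Net.QueryParsers;
using Version = Lucene.Net.Util.Version;

var analyzer = new StandardAnalyzer(Version.LUCENE_30);
var index = new MemoryIndex();
index.AddField("content", "readings about salmon and other fish", analyzer);

var parser = new QueryParser(Version.LUCENE_30, "content", analyzer);
float score = index.Search(parser.Parse("salmon"));   // > 0 means the in-memory document matches
```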
TermComparer | |||
TermComparer< T > | |||
FieldEnumerator< T > | |||
TermEnumerator | The enumerator over the terms in an index. | ||
TermDocEnumerator | Class to handle creating a TermDocs and allowing for seeking and enumeration. Used when you have a set of one or more terms for which you want to enumerate over the documents that contain those terms. | ||
TermDocUsingTermsEnumerator | Class to handle enumeration over the TermDocs that does NOT close them on a call to Dispose! | ||
StringFieldEnumerator | Implementation for enumerating over terms with a string value. | ||
NumericFieldEnum< T > | Base for enumerating over numeric fields. | ||
IntFieldEnumerator | Implementation for enumerating over all of the terms in an int numeric field. | ||
FloatFieldEnumerator | Implementation for enumerating over all of the terms in a float numeric field. | ||
DoubleFieldEnumerator | Implementation for enumerating over all of the terms in a double numeric field. | ||
LongFieldEnumerator | Implementation for enumerating over all of the terms in a long numeric field. | ||
SegmentsGenCommit | Class that will force an index writer to open an index based on the generation in the segments.gen file as opposed to the highest generation found in a directory listing | ||
TermVectorEnumerator | Class to allow for enumerating over the documents in the index to retrieve the term vector for each one. | ||
EmptyVector | A simple TermFreqVector implementation for an empty vector for use with a deleted document or a document that does not have the field that is being enumerated. | ||
AbstractAllTermDocs | Base class for enumerating all but deleted docs | ||
AllTermDocs | |||
BufferedDeletes | Holds buffered deletes, by docID, term or query. We hold two instances of this class: one for the deletes prior to the last flush, the other for deletes after the last flush. This is so if we need to abort (discard all buffered docs) we can also discard the buffered deletes yet keep the deletes done during previously flushed segments. | ||
ByteBlockPool | |||
Allocator | |||
ByteSliceReader | |||
ByteSliceWriter | Class to write byte streams into slices of shared byte[]. This is used by DocumentsWriter to hold the posting list for many terms in RAM. | ||
CharBlockPool | |||
CheckIndex | Basic tool and API to check the health of an index and write a new segments file that removes reference to problematic segments | ||
Status | Returned from CheckIndex_Renamed_Method() detailing the health and status of the index | ||
FieldNormStatus | Status from testing field norms. | ||
SegmentInfoStatus | Holds the status of each segment in the index. See SegmentInfos | ||
StoredFieldStatus | Status from testing stored fields. | ||
TermIndexStatus | Status from testing term index. | ||
TermVectorStatus | Status from testing term vectors. | ||
CompoundFileReader | Class for accessing a compound stream. This class implements a directory, but is limited to only read operations. Directory methods that would normally modify data throw an exception. | ||
CSIndexInput | Implementation of an IndexInput that reads from a portion of the compound file. The visibility is left as "package" only because this helps with testing since JUnit test cases in a different class can then access package fields of this class. | ||
CompoundFileWriter | Combines multiple files into a single compound file. The file format is a count of entries, followed by a directory of per-file data offsets and file names, followed by the raw contents of each file. | ||
ConcurrentMergeScheduler | A MergeScheduler that runs each merge using a separate thread, up until a maximum number of threads (MaxThreadCount) at which when a merge is needed, the thread(s) that are updating the index will pause until one or more merges completes. This is a simple way to use concurrency in the indexing process without having to create and manage application level threads. | ||
MergeThread | |||
CorruptIndexException | This exception is thrown when Lucene detects an inconsistency in the index. | ||
DefaultSkipListReader | Implements the skip list reader for the default posting list format that stores positions and payloads | ||
DefaultSkipListWriter | Implements the skip list writer for the default posting list format that stores positions and payloads | ||
DirectoryReader | An IndexReader which reads indexes with multiple segments. | ||
DocConsumer | |||
DocConsumerPerThread | |||
DocFieldConsumer | |||
DocFieldConsumerPerField | |||
DocFieldConsumerPerThread | |||
DocFieldConsumers | This is just a "splitter" class: it lets you wrap two DocFieldConsumer instances as a single consumer. | ||
DocFieldConsumersPerField | |||
DocFieldConsumersPerThread | |||
DocFieldProcessor | This is a DocConsumer that gathers all fields under the same name, and calls per-field consumers to process field by field. This class doesn't do any "real" work of its own: it just forwards the fields to a DocFieldConsumer. | ||
DocFieldProcessorPerField | Holds all per thread, per field state. | ||
DocFieldProcessorPerThread | Gathers all Fieldables for a document under the same name, updates FieldInfos, and calls per-field consumers to process field by field | ||
DocInverter | This is a DocFieldConsumer that inverts each field, separately, from a Document, and accepts a InvertedTermsConsumer to process those terms. | ||
DocInverterPerField | Holds state for inverting all occurrences of a single field in the document. This class doesn't do anything itself; instead, it forwards the tokens produced by analysis to its own consumer (InvertedDocConsumerPerField). It also interacts with an endConsumer (InvertedDocEndConsumerPerField). | ||
DocInverterPerThread | This is a DocFieldConsumer that inverts each field, separately, from a Document, and accepts a InvertedTermsConsumer to process those terms. | ||
DocumentsWriter | This class accepts multiple added documents and directly writes a single segment file. It does this more efficiently than creating a single segment per document (with DocumentWriter) and doing standard merges on those segments | ||
DocumentsWriterThreadState | Used by DocumentsWriter to maintain per-thread state. We keep a separate Posting hash and other state for each thread and then merge postings hashes from all threads when writing the segment. | ||
FieldInfo | |||
FieldInfos | Access to the Fieldable Info file that describes document fields and whether or not they are indexed. Each segment has a separate Fieldable Info file. Objects of this class are thread-safe for multiple readers, but only one thread can be adding documents at a time, with no other reader or writer threads accessing this object. | ||
FieldInvertState | This class tracks the number and position / offset parameters of terms being added to the index. The information collected in this class is also used to calculate the normalization factor for a field | ||
FieldReaderException | |||
FieldSortedTermVectorMapper | For each Field, store a sorted collection of TermVectorEntrys This is not thread-safe. | ||
FieldsReader | Class responsible for access to stored document fields. It uses the <segment>.fdt and <segment>.fdx files | ||
FieldsWriter | |||
FilterIndexReader | A FilterIndexReader contains another IndexReader, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality. The class FilterIndexReader itself simply implements all abstract methods of IndexReader with versions that pass all requests to the contained index reader. Subclasses of FilterIndexReader may further override some of these methods and may also provide additional methods and fields. | ||
FilterTermDocs | Base class for filtering Lucene.Net.Index.TermDocs implementations. | ||
FilterTermEnum | Base class for filtering TermEnum implementations. | ||
FilterTermPositions | Base class for filtering TermPositions implementations. | ||
FormatPostingsDocsConsumer | NOTE: this API is experimental and will likely change | ||
FormatPostingsDocsWriter | Consumes doc and freq, writing them using the current index file format | ||
FormatPostingsFieldsConsumer | Abstract API that consumes terms, doc, freq, prox and payloads postings. Concrete implementations of this actually do "something" with the postings (write it into the index in a specific format) | ||
FormatPostingsFieldsWriter | |||
FormatPostingsPositionsConsumer | |||
FormatPostingsPositionsWriter | |||
FormatPostingsTermsConsumer | NOTE: this API is experimental and will likely change | ||
FormatPostingsTermsWriter | |||
FreqProxFieldMergeState | Used by DocumentsWriter to merge the postings from multiple ThreadStates when creating a segment | ||
FreqProxTermsWriter | |||
FreqProxTermsWriterPerField | |||
FreqProxTermsWriterPerThread | |||
IndexCommit | Expert: represents a single commit into an index as seen by the IndexDeletionPolicy or IndexReader. | ||
IndexDeletionPolicy | Expert: policy for deletion of stale index commits | ||
IndexFileDeleter | |||
IndexFileNameFilter | Filename filter that accept filenames and extensions only created by Lucene. | ||
IndexFileNames | Useful constants representing filenames and extensions used by lucene | ||
IndexReader | IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable. Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. Open(Lucene.Net.Store.Directory, bool). For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral; they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions. An IndexReader can be opened on a directory for which an IndexWriter is opened already, but it cannot be used to delete documents from the index then. NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to. NOTE: as of 2.4, it's possible to open a read-only IndexReader using the static open methods that accept the boolean readOnly parameter. Such a reader has better concurrency as it's not necessary to synchronize on the isDeleted method. You must explicitly specify false if you want to make changes with the resulting IndexReader. NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead. | ||
FieldOption | Constants describing field properties, for example used for IndexReader.GetFieldNames(FieldOption). | ||
IndexWriter | An IndexWriter creates and maintains an index. The create argument to the constructor determines whether a new index is created, or whether an existing index is opened. Note that you can open an index with create=true even while readers are using the index. The old readers will continue to search the "point in time" snapshot they had opened, and won't see the newly created index until they re-open. There are also constructors with no create argument which will create a new index if there is not already an index at the provided path and otherwise open the existing index. In either case, documents are added with AddDocument(Document) and removed with DeleteDocuments(Term) or DeleteDocuments(Query). A document can be updated with UpdateDocument(Term, Document) (which just deletes and then adds the entire document). When finished adding, deleting and updating documents, Close() should be called. These changes are buffered in memory and periodically flushed to the Directory (during the above method calls). A flush is triggered when there are enough buffered deletes (see SetMaxBufferedDeleteTerms) or enough added documents since the last flush, whichever is sooner. For the added documents, flushing is triggered either by RAM usage of the documents (see SetRAMBufferSizeMB) or the number of added documents. The default is to flush when RAM usage hits 16 MB. For best indexing speed you should flush by RAM usage with a large RAM buffer. Note that flushing just moves the internal buffered state in IndexWriter into the index, but these changes are not visible to IndexReader until either Commit() or Close() is called. A flush may also trigger one or more segment merges which by default run with a background thread so as not to block the addDocument calls (see below for changing the MergeScheduler). If an index will not have more documents added for a while and optimal search performance is desired, then either the full Optimize() method or partial Optimize(int) method should be called before the index is closed. Opening an IndexWriter creates a lock file for the directory in use. Trying to open another IndexWriter on the same directory will lead to a LockObtainFailedException. The LockObtainFailedException is also thrown if an IndexReader on the same directory is used to delete documents from the index. A minimal usage sketch appears after the MaxFieldLength entry below. | ||
IndexReaderWarmer | If GetReader() has been called (ie, this writer is in near real-time mode), then after a merge completes, this class can be invoked to warm the reader on the newly merged segment, before the merge commits. This is not required for near real-time search, but will reduce search latency on opening a new near real-time reader after a merge completes | ||
MaxFieldLength | Specifies maximum field length (in number of tokens/terms) in IndexWriter constructors. SetMaxFieldLength(int) overrides the value set by the constructor. | ||
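The sketch referenced in the IndexWriter entry above; the directory path and field values are placeholders, and disposal stands in for explicit Close calls:

```csharp
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

// "index-dir" is a placeholder path; the writer creates the index if none exists there.
using (var dir = FSDirectory.Open(new DirectoryInfo("index-dir")))
using (var writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                                    IndexWriter.MaxFieldLength.UNLIMITED))
{
    var doc = new Document();
    doc.Add(new Field("title", "Hello Lucene.Net", Field.Store.YES, Field.Index.ANALYZED));
    writer.AddDocument(doc);
    writer.Commit();   // makes the change visible to newly opened readers
}
```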
IntBlockPool | |||
InvertedDocConsumer | |||
InvertedDocConsumerPerField | |||
InvertedDocConsumerPerThread | |||
InvertedDocEndConsumer | |||
InvertedDocEndConsumerPerField | |||
InvertedDocEndConsumerPerThread | |||
KeepOnlyLastCommitDeletionPolicy | This IndexDeletionPolicy implementation keeps only the most recent commit and immediately removes all prior commits after a new commit is done. This is the default deletion policy. | ||
LogByteSizeMergePolicy | This is a LogMergePolicy that measures size of a segment as the total byte size of the segment's files. | ||
LogDocMergePolicy | This is a LogMergePolicy that measures size of a segment as the number of documents (not taking deletions into account). | ||
LogMergePolicy | This class implements a MergePolicy that tries to merge segments into levels of exponentially increasing size, where each level has fewer segments than the value of the merge factor. Whenever extra segments (beyond the merge factor upper bound) are encountered, all segments within the level are merged. You can get or set the merge factor using the MergeFactor property. | ||
MergeDocIDRemapper | Remaps docIDs after a merge has completed, where the merged segments had at least one deletion. This is used to renumber the buffered deletes in IndexWriter when a merge of segments with deletions commits. | ||
MergePolicy | Expert: a MergePolicy determines the sequence of primitive merge operations to be used for overall merge and optimize operations. | ||
MergeAbortedException | |||
MergeException | Exception thrown if there are any problems while executing a merge. | ||
MergeSpecification | A MergeSpecification instance provides the information necessary to perform multiple merges. It simply contains a list of OneMerge instances. | ||
OneMerge | OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment. The merge spec includes the subset of segments to be merged as well as whether the new segment should use the compound file format. | ||
MergeScheduler | Expert: IndexWriter uses an instance implementing this interface to execute the merges selected by a MergePolicy. The default MergeScheduler is ConcurrentMergeScheduler. | ||
MultiLevelSkipListReader | This abstract class reads skip lists with multiple levels | ||
MultiLevelSkipListWriter | This abstract class writes skip lists with multiple levels | ||
MultipleTermPositions | Allows you to iterate over the TermPositions for multiple Terms as a single TermPositions | ||
MultiReader | An IndexReader which reads multiple indexes, appending their content. | ||
NormsWriter | Writes norms. Each thread X field accumulates the norms for the doc/fields it saw, then the flush method below merges all of these together into a single _X.nrm file. | ||
NormsWriterPerField | Taps into DocInverter, as an InvertedDocEndConsumer, which is called at the end of inverting each field. We just look at the length for the field (docState.length) and record the norm. | ||
NormsWriterPerThread | |||
ParallelReader | An IndexReader which reads multiple, parallel indexes. Each index added must have the same number of documents, but typically each contains different fields. Each document contains the union of the fields of all documents with the same document number. When searching, matches for a query term are from the first index added that has the field | ||
Payload | A Payload is metadata that can be stored together with each occurrence of a term. This metadata is stored inline in the posting list of the specific term. To store payloads in the index a TokenStream has to be used that produces payload data. Use TermPositions.PayloadLength and TermPositions.GetPayload(byte[], int) to retrieve the payloads from the index. | ||
PositionBasedTermVectorMapper | For each Field, store position by position information. It ignores frequency information This is not thread-safe. | ||
TVPositionInfo | Container for a term at a position | ||
RawPostingList | This is the base class for an in-memory posting list, keyed by a Token. TermsHash maintains a hash table holding one instance of this per unique Token. Consumers of TermsHash (TermsHashConsumer) must subclass this class with its own concrete class. FreqProxTermsWriter.PostingList is a private inner class used for the freq/prox postings, and TermVectorsTermsWriter.PostingList is a private inner class used to hold TermVectors postings. | ||
ReadOnlyDirectoryReader | |||
ReadOnlySegmentReader | |||
ReusableStringReader | Used by DocumentsWriter to implement a StringReader that can be reset to a new string; we use this when tokenizing the string value from a Field. | ||
SegmentInfo | Information about a segment such as its name, directory, and files related to the segment | ||
SegmentInfos | A collection of segmentInfo objects with methods for operating on those segments in relation to the file system | ||
FindSegmentsFile | Utility class for executing code that needs to do something with the current segments file. This is necessary with lock-less commits because from the time you locate the current segments file name, until you actually open it, read its contents, or check modified time, etc., it could have been deleted due to a writer commit finishing. | ||
SegmentMergeInfo | |||
SegmentMergeQueue | |||
SegmentMerger | The SegmentMerger class combines two or more Segments, each represented by an IndexReader added with Add, into a single Segment. After adding the appropriate readers, call the merge method to combine the segments. If the compoundFile flag is set, then the segments will be merged into a compound file | ||
SegmentReader | NOTE: This API is new and still experimental (subject to change suddenly in the next release) | ||
CoreReaders | |||
Norm | Byte[] referencing is used because a new norm object needs to be created for each clone, and the byte array is all that is needed for sharing between cloned readers. The current norm referencing is for sharing between readers whereas the byte[] referencing is for copy on write which is independent of reader references (i.e. incRef, decRef). | ||
Ref | |||
SegmentTermPositionVector | |||
SegmentTermVector | |||
SegmentWriteState | |||
SerialMergeScheduler | A MergeScheduler that simply does each merge sequentially, using the current thread. | ||
SnapshotDeletionPolicy | A IndexDeletionPolicy that wraps around any other IndexDeletionPolicy and adds the ability to hold and later release a single "snapshot" of an index. While the snapshot is held, the IndexWriter will not remove any files associated with it even if the index is otherwise being actively, arbitrarily changed. Because we wrap another arbitrary IndexDeletionPolicy, this gives you the freedom to continue using whatever IndexDeletionPolicy you would normally want to use with your index. Note that you can re-use a single instance of SnapshotDeletionPolicy across multiple writers as long as they are against the same index Directory. Any snapshot held when a writer is closed will "survive" when the next writer is opened | ||
SortedTermVectorMapper | Store a sorted collection of Lucene.Net.Index.TermVectorEntrys. Collects all term information into a single SortedSet. NOTE: This Mapper ignores all Field information for the Document. This means that if you are using offset/positions you will not know what Fields they correlate with. This is not thread-safe | ||
StaleReaderException | This exception is thrown when an IndexReader tries to make changes to the index (via IndexReader.DeleteDocument , IndexReader.UndeleteAll or IndexReader.SetNorm(int,string,float)) but changes have already been committed to the index since this reader was instantiated. When this happens you must open a new reader on the current index to make the changes. | ||
StoredFieldsWriter | This is a DocFieldConsumer that writes stored fields. | ||
StoredFieldsWriterPerThread | |||
Term | A Term represents a word from text. This is the unit of search. It is composed of two elements, the text of the word, as a string, and the name of the field that the text occurred in, an interned string. Note that terms may represent more than words from text fields, but also things like dates, email addresses, urls, etc. | ||
TermBuffer | |||
TermDocs | TermDocs provides an interface for enumerating <document, frequency> pairs for a term. The document portion names each document containing the term. Documents are indicated by number. The frequency portion gives the number of times the term occurred in each document. The pairs are ordered by document number. | ||
TermEnum | Abstract class for enumerating terms. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
ITermFreqVector | Provides access to stored term vector of a document field. The vector consists of the name of the field, an array of the terms that occur in the field of the Lucene.Net.Documents.Document and a parallel array of frequencies. Thus, getTermFrequencies()[5] corresponds with the frequency of getTerms()[5], assuming there are at least 5 terms in the Document. | ||
TermInfo | A TermInfo is the record of information stored for a term. | ||
TermInfosReader | This stores a monotonically increasing set of <Term, TermInfo> pairs in a Directory. Pairs are accessed either by Term or by ordinal position in the set. | ||
TermInfosWriter | This stores a monotonically increasing set of <Term, TermInfo> pairs in a Directory. A TermInfos can be written once, in order. | ||
TermPositions | TermPositions provides an interface for enumerating the <document, frequency, <position>* > tuples for a term. The document and frequency are the same as for a TermDocs. The positions portion lists the ordinal positions of each occurrence of a term in a document | ||
TermPositionVector | Extends TermFreqVector to provide additional information about positions in which each of the terms is found. A TermPositionVector does not necessarily contain both positions and offsets, but at least one of these arrays exists. | ||
TermsHash | This class implements InvertedDocConsumer, which is passed each token produced by the analyzer on each field. It stores these tokens in a hash table, and allocates separate byte streams per token. Consumers of this class, eg FreqProxTermsWriter and TermVectorsTermsWriter , write their own byte streams under each term. | ||
TermsHashConsumer | |||
TermsHashConsumerPerField | Implement this class to plug into the TermsHash processor, which inverts and stores Tokens into a hash table and provides an API for writing bytes into multiple streams for each unique Token. | ||
TermsHashConsumerPerThread | |||
TermsHashPerField | |||
TermsHashPerThread | |||
TermVectorEntry | Convenience class for holding TermVector information. | ||
TermVectorEntryFreqSortedComparator | Compares Lucene.Net.Index.TermVectorEntrys first by frequency and then by the term (case-sensitive) | ||
TermVectorMapper | The TermVectorMapper can be used to map Term Vectors into your own structure instead of the parallel array structure used by Lucene.Net.Index.IndexReader.GetTermFreqVector(int,String). It is up to the implementation to make sure it is thread-safe | ||
TermVectorOffsetInfo | The TermVectorOffsetInfo class holds information pertaining to a Term in a Lucene.Net.Index.TermPositionVector's offset information. This offset information is the character offset as set during the Analysis phase (and thus may not be the actual offset in the original content). | ||
TermVectorsReader | |||
ParallelArrayTermVectorMapper | Models the existing parallel array structure | ||
TermVectorsTermsWriter | |||
TermVectorsTermsWriterPerField | |||
TermVectorsTermsWriterPerThread | |||
TermVectorsWriter | |||
Messages | |||
INLSException | Interface that exceptions should implement to support lazy loading of messages | ||
Message | Message interface for lazy loading, used for Native Language Support (NLS), a system of software internationalization. | ||
MessageImpl | Default implementation of the Message interface, used for Native Language Support (NLS), a system of software internationalization. | ||
NLS | MessageBundles classes extend this class, to implement a bundle | ||
IPriviligedAction | |||
QueryParsers | |||
ICharStream | This interface describes a character stream that maintains line and column number positions of the characters. It also has the capability to backup the stream to some extent. An implementation of this interface is used in the TokenManager implementation generated by JavaCCParser | ||
FastCharStream | An efficient implementation of JavaCC's CharStream interface. Note that this does not do line-number counting, but instead keeps track of the character position of the token in the input, as required by Lucene's Lucene.Net.Analysis.Token API | ||
MultiFieldQueryParser | A QueryParser which constructs queries to search multiple fields | ||
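A short sketch of parsing one query string across several fields (the field names are placeholders):

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

var analyzer = new StandardAnalyzer(Version.LUCENE_30);
var parser = new MultiFieldQueryParser(Version.LUCENE_30,
                                       new[] { "title", "body" },   // placeholder field names
                                       analyzer);
Query query = parser.Parse("lucene AND search");
```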
ParseException | This exception is thrown when parse errors are encountered. You can explicitly create objects of this exception type by calling the method generateParseException in the generated parser | ||
QueryParser | This class is generated by JavaCC. The most important method is Parse(String) | ||
QueryParserConstants | Token literal values and constants. Generated by org.javacc.parser.OtherFilesGen::start() | ||
QueryParserTokenManager | Token Manager. | ||
Token | Describes the input token stream. | ||
TokenMgrError | Token Manager Error. | ||
Search | |||
Function | |||
ByteFieldSource | Expert: obtains single byte field values from the FieldCache using getBytes() and makes those values available as other numeric types, casting as needed | ||
CustomScoreProvider | An instance of this subclass should be returned by CustomScoreQuery.GetCustomScoreProvider, if you want to modify the custom score calculation of a CustomScoreQuery | ||
CustomScoreQuery | Query that sets document score as a programmatic function of several (sub) scores: the score of its subQuery and, optionally, the score of its ValueSourceQuery (or queries). Subclasses can modify the computation by overriding GetCustomScoreProvider | ||
DocValues | Expert: represents field values as different types. Normally created via a ValueSource for a particular field and reader | ||
FieldCacheSource | Expert: A base class for ValueSource implementations that retrieve values for a single field from the FieldCache. Fields used herein must be indexed (it doesn't matter if these fields are stored or not). It is assumed that each such indexed field is untokenized, or at least has a single token in a document. For documents with multiple tokens of the same field, behavior is undefined (it is likely that current code would use the value of one of these tokens, but this is not guaranteed). Documents with no tokens in this field are assigned the Zero value | ||
FieldScoreQuery | A query that scores each document as the value of the numeric input field. The query matches all documents, and scores each document according to the numeric value of that field. It is assumed, and expected, that the field is indexed and contains a single parsable token per scored document. | ||
Type | Type of score field, indicating how field values are interpreted/parsed. The type selected at search time should match the data stored in the field. Different types have different RAM requirements. | ||
FloatFieldSource | Expert: obtains float field values from the FieldCache using getFloats() and makes those values available as other numeric types, casting as needed | ||
IntFieldSource | Expert: obtains int field values from the FieldCache using getInts() and makes those values available as other numeric types, casting as needed | ||
OrdFieldSource | Expert: obtains the ordinal of the field value from the default Lucene Fieldcache using getStringIndex(). The native lucene index order is used to assign an ordinal value for each field value. Field values (terms) are lexicographically ordered by unicode value, and numbered starting at 1. Example: If there were only three field values: "apple","banana","pear" then ord("apple")=1, ord("banana")=2, ord("pear")=3 WARNING: ord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a MultiSearcher is used | ||
ReverseOrdFieldSource | Expert: obtains the ordinal of the field value from the default Lucene FieldCache using getStringIndex() and reverses the order. The native lucene index order is used to assign an ordinal value for each field value. Field values (terms) are lexicographically ordered by unicode value, and numbered starting at 1. Example of reverse ordinal (rord): If there were only three field values: "apple","banana","pear" then rord("apple")=3, rord("banana")=2, rord("pear")=1 WARNING: rord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a MultiSearcher is used | ||
ShortFieldSource | Expert: obtains short field values from the FieldCache using getShorts() and makes those values available as other numeric types, casting as needed | ||
ValueSource | Expert: source of values for basic function queries. At its default/simplest form, values - one per doc - are used as the score of that doc. Values are instantiated as DocValues for a particular reader. ValueSource implementations differ in RAM requirements: it would always be a factor of the number of documents, but for each document the number of bytes can be 1, 2, 4, or 8 | ||
ValueSourceQuery | Expert: A Query that sets the scores of documents to the values obtained from a ValueSource. This query provides a score for each and every undeleted document in the index. The value source can be based on a (cached) value of an indexed field, but it can also be based on an external source, e.g. values read from an external database. Score is set as: Score(doc,query) = query.getBoost()^2 * valueSource(doc) | ||
Highlight | |||
DefaultEncoder | Simple IEncoder implementation that does not modify the output | ||
GradientFormatter | Formats text with different color intensity depending on the score of the term. | ||
Highlighter | Class used to markup highlighted terms found in the best sections of a text, using configurable IFragmenter, Scorer, IFormatter, IEncoder and tokenizers. | ||
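A hedged sketch of the typical highlighting flow; the field name, query and text are illustrative, and SimpleHTMLFormatter wraps hits in <B> tags by default:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Search.Highlight;
using Version = Lucene.Net.Util.Version;

string text = "Lucene.Net is a port of the Lucene search engine library.";
Query query = new TermQuery(new Term("body", "lucene"));   // placeholder field and term

var highlighter = new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query));
string fragment = highlighter.GetBestFragment(
    new StandardAnalyzer(Version.LUCENE_30), "body", text);
// fragment contains the best-scoring section with hits wrapped in <B>...</B> tags.
```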
IEncoder | Encodes original text. The IEncoder works with the Formatter to generate the output. | ||
IFormatter | Processes terms found in the original text, typically by applying some form of mark-up to highlight terms in HTML search results pages. | ||
IFragmenter | Implements the policy for breaking text into multiple fragments for consideration by the Highlighter class. A sophisticated implementation may do this on the basis of detecting end of sentences in the text. | ||
InvalidTokenOffsetsException | |||
IScorer | Adds to the score for a fragment based on its tokens | ||
NullFragmenter | IFragmenter implementation which does not fragment the text. This is useful for highlighting the entire content of a document or field. | ||
QueryScorer | IScorer implementation which scores text fragments by the number of unique query terms found. This class converts appropriate Querys to SpanQuerys and attempts to score only those terms that participated in generating the 'hit' on the document. | ||
QueryTermScorer | |||
SimpleFragmenter | IFragmenter implementation which breaks text up into same-size fragments with no concerns over spotting sentence boundaries. | ||
SimpleHTMLEncoder | Simple IEncoder implementation to escape text for HTML output | ||
SimpleHTMLFormatter | Simple IFormatter implementation to highlight terms with a pre and post tag | ||
SimpleSpanFragmenter | |||
SpanGradientFormatter | Formats text with different color intensity depending on the score of the term using the span tag. GradientFormatter uses a bgcolor argument to the font tag which doesn't work in Mozilla, thus this class. | ||
TextFragment | Low-level class used to record information about a section of a document with a score. | ||
TokenGroup | One, or several overlapping tokens, along with the score(s) and the scope of the original text | ||
TokenSources | Hides implementation issues associated with obtaining a TokenStream for use with the highlighter - can obtain from TermFreqVectors with offsets and (optionally) positions, or from the Analyzer class by reparsing the stored content. | ||
StoredTokenStream | |||
WeightedSpanTerm | Lightweight class to hold term, Weight, and positions used for scoring this term. | ||
PositionSpan | |||
WeightedSpanTermExtractor | Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied Analysis.TokenStream. | ||
WeightedTerm | Lightweight class to hold term and a Weight value used for scoring this term | ||
Payloads | |||
AveragePayloadFunction | Calculate the final score as the average score of all payloads seen. Is thread safe and completely reusable | ||
MaxPayloadFunction | Returns the maximum payload score seen, else 1 if there are no payloads on the doc. Is thread safe and completely reusable | ||
MinPayloadFunction | Calculates the minimum payload seen | ||
PayloadFunction | An abstract class that defines a way for Payload*Query instances to transform the cumulative effects of payload scores for a document | ||
PayloadNearQuery | This class is very similar to Lucene.Net.Search.Spans.SpanNearQuery except that it factors in the value of the payloads located at each of the positions where the Lucene.Net.Search.Spans.TermSpans occurs. In order to take advantage of this, you must override Lucene.Net.Search.Similarity.ScorePayload which returns 1 by default. Payload scores are aggregated using a pluggable PayloadFunction | ||
PayloadNearSpanScorer | |||
PayloadNearSpanWeight | |||
PayloadSpanUtil | Experimental class to get the set of payloads for most standard Lucene queries. Operates like the Highlighter - the IndexReader should only contain the doc of interest; it is best to use MemoryIndex | ||
PayloadTermQuery | This class is very similar to Lucene.Net.Search.Spans.SpanTermQuery except that it factors in the value of the payload located at each of the positions where the Lucene.Net.Index.Term occurs. In order to take advantage of this, you must override Lucene.Net.Search.Similarity.ScorePayload(int, String, int, int, byte[],int,int) which returns 1 by default. Payload scores are aggregated using a pluggable PayloadFunction | ||
Similar | |||
MoreLikeThis | Generate "more like this" similarity queries. Based on this mail: | ||
MoreLikeThisQuery | |||
SimilarityQueries | Simple similarity measures | ||
Spans | |||
FieldMaskingSpanQuery | Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field. That is, the masked SpanQuery will function as normal, but SpanQuery.Field simply hands back the value supplied in this class's constructor. | ||
NearSpansOrdered | A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them. The formed spans only contain minimum-slop matches. The matching slop is computed from the distance(s) between the non-overlapping matching Spans. Successive matches are always formed from the successive Spans of the SpanNearQuery. The formed spans may contain overlaps when the slop is at least 1. For example, when querying using t1 t2 t3 with slop at least 1, the fragment: t1 t2 t1 t3 t2 t3 matches twice: t1 t2 .. t3 t1 .. t2 t3 | ||
NearSpansUnordered | Similar to NearSpansOrdered, but for the unordered case | ||
SpanFirstQuery | Matches spans near the beginning of a field. | ||
SpanNearQuery | Matches spans which are near one another. One can specify slop, the maximum number of intervening unmatched positions, as well as whether matches are required to be in-order. | ||
SpanNotQuery | Removes matches which overlap with another SpanQuery. | ||
SpanOrQuery | Matches the union of its clauses. | ||
SpanQuery | Base class for span-based queries. | ||
Spans | Expert: an enumeration of span matches. Used to implement span searching. Each span represents a range of term positions within a document. Matches are enumerated in order, by increasing document number, within that by increasing start position and finally by increasing end position. | ||
SpanScorer | Public for extension only. | ||
SpanTermQuery | Matches spans containing a term. | ||
SpanWeight | Expert-only. Public for use by other weight implementations | ||
TermSpans | Expert: Public for extension only | ||
Vectorhighlight | |||
BaseFragmentsBuilder | |||
FastVectorHighlighter | |||
FieldFragList | FieldFragList has a list of "frag info" that is used by a FragmentsBuilder class to create fragments (snippets). | ||
WeightedFragInfo | |||
FieldPhraseList | FieldPhraseList has a list of WeightedPhraseInfo that is used by FragListBuilder to create a FieldFragList object. | ||
WeightedPhraseInfo | |||
Toffs | |||
FieldQuery | |||
QueryPhraseMap | |||
FieldTermStack | FieldTermStack is a stack that keeps query terms in the specified field of the document to be highlighted. | ||
TermInfo | |||
FragListBuilder | |||
FragmentsBuilder | FragmentsBuilder is an interface for fragments (snippets) builder classes. A FragmentsBuilder class can be plugged in to Highlighter. | ||
ScoreOrderFragmentsBuilder | |||
ScoreComparator | |||
SimpleFragListBuilder | A simple implementation of FragListBuilder. | ||
SimpleFragmentsBuilder | A simple implementation of FragmentsBuilder. | ||
HashMap< K, V > | |||
BooleanFilter | |||
BoostingQuery | The BoostingQuery class can be used to effectively demote results that match a given query. Unlike a "NOT" clause, this still selects documents that contain undesirable terms, but reduces their overall score. | ||
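A sketch of the demotion pattern just described; the three-argument constructor (match query, context query, boost) mirrors the Java contrib class and the 0.1f boost is only illustrative:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // Keep all matches for "laptop", but multiply the score of hits that also
    // mention "refurbished" by 0.1 so they sink toward the bottom.
    Query positive = new TermQuery(new Term("body", "laptop"));
    Query negative = new TermQuery(new Term("body", "refurbished"));
    Query demoting = new BoostingQuery(positive, negative, 0.1f);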
DuplicateFilter | |||
FilterClause | |||
FuzzyLikeThisQuery | Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms. In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis, but with special consideration of fuzzy scoring factors. This generally produces good results for queries where users may provide details in a number of fields, have no knowledge of Boolean query syntax, and also want a degree of fuzzy matching and a fast query | ||
TermsFilter | A filter that contains multiple terms. | ||
SimpleFacetedSearch | |||
FacetName | |||
Hits | |||
HitsPerFacet | |||
FieldValuesBitSets | |||
BooleanClause | A clause in a BooleanQuery. | ||
BooleanQuery | A Query that matches documents matching boolean combinations of other queries, e.g. TermQuerys, PhraseQuerys or other BooleanQuerys. | ||
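A small sketch of combining clauses, assuming the standalone Occur enum of the 3.0.3 port (earlier releases use BooleanClause.Occur) and an already-open IndexSearcher:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    var bq = new BooleanQuery();
    bq.Add(new TermQuery(new Term("body", "lucene")), Occur.MUST);         // required
    bq.Add(new TermQuery(new Term("body", "search")), Occur.SHOULD);       // optional, improves score
    bq.Add(new TermQuery(new Term("body", "deprecated")), Occur.MUST_NOT); // excluded
    TopDocs hits = searcher.Search(bq, 10);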
TooManyClauses | Thrown when an attempt is made to add more than MaxClauseCount clauses. This typically happens if a PrefixQuery, FuzzyQuery, WildcardQuery, or TermRangeQuery is expanded to many terms during search. | ||
BooleanScorer | |||
BooleanScorer2 | An alternative to BooleanScorer that also allows a minimum number of optional scorers that should match. Implements skipTo(), and has no limitations on the numbers of added scorers. Uses ConjunctionScorer, DisjunctionScorer, ReqOptScorer and ReqExclScorer. | ||
CachingSpanFilter | Wraps another SpanFilter's result and caches it. The purpose is to allow filters to simply filter, and then wrap with this class to add caching. | ||
CachingWrapperFilter | Wraps another filter's result and caches it. The purpose is to allow filters to simply filter, and then wrap with this class to add caching. | ||
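A sketch of the filter-then-cache pattern described above; the "category" field and userQuery are placeholders:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // Wrap an ordinary query as a Filter, then cache its DocIdSet per reader so
    // repeated searches do not recompute it.
    Filter categoryFilter = new CachingWrapperFilter(
        new QueryWrapperFilter(new TermQuery(new Term("category", "book"))));
    TopDocs hits = searcher.Search(userQuery, categoryFilter, 20);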
Collector | Expert: Collectors are primarily meant to be used to gather raw results from a search, and implement sorting or custom result filtering, collation, etc. | ||
ComplexExplanation | Expert: Describes the score computation for document and query, and can distinguish a match independent of a positive value. | ||
ConjunctionScorer | Scorer for conjunctions, sets of queries, all of which are required. | ||
ConstantScoreQuery | A query that wraps a filter and simply returns a constant score equal to the query boost for every document in the filter. | ||
DefaultSimilarity | Expert: Default scoring implementation. | ||
DisjunctionMaxQuery | A query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries. This is useful when searching for a word in multiple fields with different boost factors (so that the fields cannot be combined equivalently into a single search field). We want the primary score to be the one associated with the highest boost, not the sum of the field scores (as BooleanQuery would give). If the query is "albino elephant" this ensures that "albino" matching one field and "elephant" matching another gets a higher score than "albino" matching both fields. To get this result, use both BooleanQuery and DisjunctionMaxQuery: for each term a DisjunctionMaxQuery searches for it in each field, while the set of these DisjunctionMaxQuery's is combined into a BooleanQuery. The tie breaker capability allows results that include the same term in multiple fields to be judged better than results that include this term in only the best of those multiple fields, without confusing this with the better case of two different terms in the multiple fields. | ||
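The "albino elephant" arrangement described above, sketched in code (field names and the 0.1f tie-breaker are illustrative; Occur is assumed to be the 3.0.3 port's standalone enum):

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // One DisjunctionMaxQuery per term, searching every field; a BooleanQuery combines the terms.
    var albino = new DisjunctionMaxQuery(0.1f);              // 0.1f = tie-breaker multiplier
    albino.Add(new TermQuery(new Term("title", "albino")));
    albino.Add(new TermQuery(new Term("body", "albino")));

    var elephant = new DisjunctionMaxQuery(0.1f);
    elephant.Add(new TermQuery(new Term("title", "elephant")));
    elephant.Add(new TermQuery(new Term("body", "elephant")));

    var combined = new BooleanQuery();
    combined.Add(albino, Occur.SHOULD);
    combined.Add(elephant, Occur.SHOULD);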
DisjunctionMaxScorer | The Scorer for DisjunctionMaxQuery's. The union of all documents generated by the subquery scorers is generated in document number order. The score for each document is the maximum of the scores computed by the subquery scorers that generate that document, plus tieBreakerMultiplier times the sum of the scores for the other subqueries that generate the document. | ||
DisjunctionSumScorer | A Scorer for OR like queries, counterpart of ConjunctionScorer . This Scorer implements DocIdSetIterator.Advance(int) and uses skipTo() on the given Scorers. | ||
DocIdSet | A DocIdSet contains a set of doc ids. Implementing classes must only implement Iterator to provide access to the set. | ||
AnonymousClassDocIdSet | |||
AnonymousClassDocIdSetIterator | |||
DocIdSetIterator | This abstract class defines methods to iterate over a set of non-decreasing doc ids. Note that this class assumes it iterates on doc Ids, and therefore NO_MORE_DOCS is set to Int32.MaxValue in order to be used as a sentinel object. Implementations of this class are expected to consider int.MaxValue as an invalid value. | ||
ExactPhraseScorer | |||
Explanation | Expert: Describes the score computation for document and query. | ||
IDFExplanation | Small Util class used to pass both an idf factor as well as an explanation for that factor | ||
CreationPlaceholder | Expert: Maintains caches of term values | ||
StringIndex | Expert: Stores term text values and document ordering data. | ||
CacheEntry | EXPERT: A unique Identifier/Description for each item in the FieldCache. Can be useful for logging/debugging. EXPERIMENTAL API: This API is considered extremely advanced and experimental. It may be removed or altered w/o warning in future releases of Lucene. | ||
FieldCache_Fields | |||
AnonymousClassByteParser | |||
AnonymousClassShortParser | |||
AnonymousClassIntParser | |||
AnonymousClassFloatParser | |||
AnonymousClassLongParser | |||
AnonymousClassDoubleParser | |||
AnonymousClassIntParser1 | |||
AnonymousClassFloatParser1 | |||
AnonymousClassLongParser1 | |||
AnonymousClassDoubleParser1 | |||
FieldCache | |||
Parser | Marker interface as super-interface to all parsers. It is used to specify a custom parser to SortField(String, Parser). | ||
ByteParser | Interface to parse bytes from document fields. | ||
ShortParser | Interface to parse shorts from document fields. | ||
IntParser | Interface to parse ints from document fields. | ||
FloatParser | Interface to parse floats from document fields. | ||
LongParser | Interface to parse longs from document fields. | ||
DoubleParser | Interface to parse doubles from document fields. | ||
FieldCacheImpl | Expert: The default cache implementation, storing all values in memory. A WeakDictionary is used for storage | ||
FieldCacheRangeFilter< T > | |||
FieldCacheTermsFilter | A Filter that only accepts documents whose single term value in the specified field is contained in the provided set of allowed terms | ||
FieldComparator | Expert: a FieldComparator compares hits so as to determine their sort order when collecting the top results with TopFieldCollector . The concrete public FieldComparator classes here correspond to the SortField types | ||
ByteComparator | Parses field's values as byte (using FieldCache.GetBytes(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
DocComparator | Sorts by ascending docID | ||
DoubleComparator | Parses field's values as double (using FieldCache.GetDoubles(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
FloatComparator | Parses field's values as float (using FieldCache.GetFloats(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
IntComparator | Parses field's values as int (using FieldCache.GetInts(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
LongComparator | Parses field's values as long (using FieldCache.GetLongs(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
RelevanceComparator | Sorts by descending relevance. NOTE: if you are sorting only by descending relevance and then secondarily by ascending docID, performance is faster using TopScoreDocCollector directly (which Searcher.Search(Query, int) uses when no Sort is specified). | ||
ShortComparator | Parses field's values as short (using FieldCache.GetShorts(IndexReader, string)) and sorts by ascending value | ||
StringComparatorLocale | Sorts by a field's value using the Collator for a given Locale. | ||
StringOrdValComparator | Sorts by a field's natural String sort order, using ordinals. This is functionally equivalent to FieldComparator.StringValComparator, but it first resolves the strings to their relative ordinal positions (using the index returned by FieldCache.GetStringIndex) and does most comparisons using the ordinals. For medium to large results, this comparator will be much faster than FieldComparator.StringValComparator. For very small result sets it may be slower. | ||
StringValComparator | Sorts by a field's natural String sort order. All comparisons are done using String.compareTo, which is slow for medium to large result sets but possibly very fast for very small result sets. | ||
FieldComparatorSource | Provides a FieldComparator for custom field sorting | ||
FieldDoc | Expert: A ScoreDoc which also contains information about how to sort the referenced document. In addition to the document number and score, this object contains an array of values for the document from the field(s) used to sort. For example, if the sort criteria was to sort by fields "a", "b" then "c", the fields object array will have three elements, corresponding respectively to the term values for the document in fields "a", "b" and "c". The class of each element in the array will be either Integer, Float or String depending on the type of values in the terms of each field | ||
FieldDocSortedHitQueue | Expert: Collects sorted results from Searchable's and collates them. The elements put into this queue must be of type FieldDoc | ||
FieldValueHitQueue | Expert: A hit queue for sorting hits by terms in more than one field. Uses FieldCache.DEFAULT for maintaining internal term lookup tables | ||
Entry | |||
Filter | Abstract base class for restricting which documents may be returned during searching. | ||
FilteredDocIdSet | Abstract decorator class for a DocIdSet implementation that provides on-demand filtering/validation mechanism on a given DocIdSet | ||
FilteredDocIdSetIterator | Abstract decorator class of a DocIdSetIterator implementation that provides on-demand filter/validation mechanism on an underlying DocIdSetIterator. See FilteredDocIdSet | ||
FilteredQuery | A query that applies a filter to the results of another query | ||
FilteredTermEnum | Abstract class for enumerating a subset of all terms. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
FilterManager | Filter caching singleton. It can be used to save filters locally for reuse. This class makes it possible to cache Filters even when using RMI, as it keeps the cache on the searcher side of the RMI connection | ||
FuzzyQuery | Implements the fuzzy search query. The similarity measurement is based on the Levenshtein (edit distance) algorithm | ||
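A minimal sketch; 0.7f is an illustrative minimum-similarity threshold:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // Matches terms whose edit-distance-based similarity to "lucene" is at least 0.7,
    // e.g. "lucen" or "lucenes".
    Query fuzzy = new FuzzyQuery(new Term("name", "lucene"), 0.7f);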
FuzzyTermEnum | Subclass of FilteredTermEnum for enumerating all terms that are similar to the specified filter term | ||
HitQueue | |||
IndexSearcher | Implements search over a single IndexReader | ||
MatchAllDocsQuery | A query that matches all documents | ||
MultiPhraseQuery | MultiPhraseQuery is a generalized version of PhraseQuery, with an added method Add(Term[]). To use this class to search for the phrase "Microsoft app*", first use Add(Term) on the term "Microsoft", then find all terms that have "app" as a prefix using IndexReader.Terms(Term), and use MultiPhraseQuery.Add(Term[] terms) to add them to the query | ||
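A sketch of the "Microsoft app*" example from the description; the property-style members (TermEnum.Term, Term.Field, Term.Text) and the IDisposable TermEnum follow the 3.0.3 port's conventions and should be checked against the actual API, and reader is an already-open IndexReader:

    using System.Collections.Generic;
    using Lucene.Net.Index;
    using Lucene.Net.Search;

    var query = new MultiPhraseQuery();
    query.Add(new Term("body", "microsoft"));                 // exact term at the first position

    // Collect every indexed term starting with "app" for the second position.
    var expansions = new List<Term>();
    using (TermEnum terms = reader.Terms(new Term("body", "app")))
    {
        do
        {
            Term t = terms.Term;
            if (t == null || t.Field != "body" || !t.Text.StartsWith("app")) break;
            expansions.Add(t);
        } while (terms.Next());
    }

    query.Add(expansions.ToArray());                          // any of these may occupy position 2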
MultiSearcher | Implements search over a set of Searchables | ||
MultiTermQuery | An abstract Query that matches documents containing a subset of terms provided by a FilteredTermEnum enumeration | ||
AnonymousClassConstantScoreAutoRewrite | |||
ConstantScoreAutoRewrite | A rewrite method that tries to pick the best constant-score rewrite method based on term and document counts from the query. If both the number of terms and documents is small enough, then CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE is used. Otherwise, CONSTANT_SCORE_FILTER_REWRITE is used. | ||
RewriteMethod | Abstract class that defines how the query is rewritten. | ||
MultiTermQueryWrapperFilter< T > | A wrapper for MultiTermQuery, that exposes its functionality as a Filter. MultiTermQueryWrapperFilter is not designed to be used by itself. Normally you subclass it to provide a Filter counterpart for a MultiTermQuery subclass. For example, TermRangeFilter and PrefixFilter extend MultiTermQueryWrapperFilter . This class also provides the functionality behind MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE; this is why it is not abstract. | ||
NumericRangeFilter< T > | A Filter that only accepts numeric values within a specified range. To use this, you must first index the numeric values using NumericField (expert: NumericTokenStream ) | ||
NumericRangeQuery< T > | A Query that matches numeric values within a specified range. To use this, you must first index the numeric values using NumericField (expert: NumericTokenStream ). If your terms are instead textual, you should use TermRangeQuery. NumericRangeFilter{T} is the filter equivalent of this query. | ||
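A sketch of the index-then-query pairing the description calls for; the field name and bounds are illustrative:

    using Lucene.Net.Documents;
    using Lucene.Net.Search;

    // At index time, store the value as a trie-encoded numeric field.
    var doc = new Document();
    doc.Add(new NumericField("price").SetIntValue(42));

    // At search time, query the same field; true/true = inclusive bounds.
    Query inRange = NumericRangeQuery.NewIntRange("price", 10, 100, true, true);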
ParallelMultiSearcher | Implements parallel search over a set of Searchables | ||
PhrasePositions | Position of a term in a document that takes into account the term offset within the phrase. | ||
PhraseQuery | A Query that matches documents containing a particular sequence of terms. A PhraseQuery is built by QueryParser for input like "new york" | ||
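A small sketch for the "new york" example; the Slop property name follows the 3.0.3 port's property conventions and is worth verifying:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    var phrase = new PhraseQuery();
    phrase.Add(new Term("body", "new"));
    phrase.Add(new Term("body", "york"));
    phrase.Slop = 1;    // allow one intervening position between the terms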
PhraseQueue | |||
PhraseScorer | Expert: Scoring functionality for phrase queries. A document is considered matching if it contains the phrase-query terms at "valid" positions. What "valid positions" are depends on the type of the phrase query: for an exact phrase query the terms are required to appear in adjacent locations, while for a sloppy phrase query some distance between the terms is allowed. The abstract method PhraseFreq() of extending classes is invoked for each document containing all the phrase query terms, in order to compute the frequency of the phrase query in that document. A non-zero frequency means a match. | ||
PositiveScoresOnlyCollector | A Collector implementation which wraps another Collector and makes sure only documents with scores > 0 are collected. | ||
PrefixFilter | A Filter that restricts search results to values that have a matching prefix in a given field. | ||
PrefixQuery | A Query that matches documents containing terms with a specified prefix. A PrefixQuery is built by QueryParser for input like app* | ||
PrefixTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified prefix filter term. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it | ||
Query | The abstract base class for queries. Instantiable subclasses include TermQuery, BooleanQuery, PhraseQuery, PrefixQuery, WildcardQuery, FuzzyQuery, TermRangeQuery and NumericRangeQuery. A parser for queries is contained in the Lucene.Net.QueryParsers.QueryParser class. | ||
QueryTermVector | |||
QueryWrapperFilter | Constrains search results to only match those which also match a provided query | ||
ReqExclScorer | A Scorer for queries with a required subscorer and an excluding (prohibited) sub DocIdSetIterator. This Scorer implements DocIdSetIterator.Advance(int), and it uses the skipTo() on the given scorers. | ||
ReqOptSumScorer | A Scorer for queries with a required part and an optional part. Delays skipTo() on the optional part until a score() is needed. This Scorer implements DocIdSetIterator.Advance(int). | ||
ScoreCachingWrappingScorer | A Scorer which wraps another scorer and caches the score of the current document. Successive calls to Score() will return the same result and will not invoke the wrapped Scorer's score() method, unless the current document has changed. This class might be useful due to the changes done to the Collector interface, in which the score is not computed for a document by default, only if the collector requests it. Some collectors may need to use the score in several places, however all they have in hand is a Scorer object, and might end up computing the score of a document more than once. | ||
ScoreDoc | Expert: Returned by low-level search implementations. | ||
Scorer | Expert: Common scoring functionality for different types of queries | ||
Searchable | The interface for search implementations | ||
Searcher | An abstract base class for search implementations. Implements the main search methods | ||
Similarity | Expert: Scoring API. Subclasses implement search scoring | ||
SimilarityDelegator | Expert: Delegating scoring implementation. Useful in Query.GetSimilarity(Searcher) implementations, to override only certain methods of a Searcher's Similarity implementation. | ||
SingleTermEnum | Subclass of FilteredTermEnum for enumerating a single term. This can be used by MultiTermQuerys that need only visit one term, but want to preserve MultiTermQuery semantics such as RewriteMethod. | ||
SloppyPhraseScorer | |||
Sort | Encapsulates sort criteria for returned hits | ||
SortField | Stores information about how to sort documents by terms in an individual field. Fields must be indexed in order to sort by them | ||
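A sketch of a sorted search; the "surname" field must be indexed as a single untokenized term, and the secondary SortField.FIELD_SCORE breaks ties by relevance:

    using Lucene.Net.Search;

    var sort = new Sort(new SortField("surname", SortField.STRING), SortField.FIELD_SCORE);
    TopDocs hits = searcher.Search(query, null, 20, sort);   // null = no Filter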
SpanFilter | Abstract base class providing a mechanism to restrict searches to a subset of an index and also maintains and returns position information. This is useful if you want to compare the positions from a SpanQuery with the positions of items in a filter. For instance, if you had a SpanFilter that marked all the occurrences of the word "foo" in documents, and then you entered a new SpanQuery containing bar, you could not only filter by the word foo, but you could then compare position information for post processing. | ||
SpanFilterResult | The results of a SpanQueryFilter. Wraps the BitSet and the position information from the SpanQuery | ||
PositionInfo | |||
StartEnd | |||
SpanQueryFilter | Constrains search results to only match those which also match a provided query. Also provides position information about where each document matches at the cost of extra space compared with the QueryWrapperFilter. There is an added cost to this above what is stored in a QueryWrapperFilter. Namely, the position information for each matching document is stored. This filter does not cache. See the Lucene.Net.Search.CachingSpanFilter for a wrapper that caches | ||
TermQuery | A Query that matches documents containing a term. This may be combined with other terms with a BooleanQuery. | ||
TermRangeFilter | A Filter that restricts search results to a range of values in a given field | ||
TermRangeQuery | A Query that matches documents within an exclusive range of terms | ||
TermRangeTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified range parameters. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
TermScorer | Expert: A Scorer for documents matching a Term . | ||
TimeLimitingCollector | The TimeLimitingCollector is used to timeout search requests that take longer than the maximum allowed search time limit. After this time is exceeded, the search thread is stopped by throwing a TimeExceededException. | ||
TimeExceededException | Thrown when elapsed search time exceeds allowed search time. | ||
TopDocs | Represents hits returned by Searcher.Search(Query,Filter,int) and Searcher.Search(Query,int) | ||
TopDocsCollector< T > | A base class for all collectors that return a Lucene.Net.Search.TopDocs output. This collector allows easy extension by providing a single constructor which accepts a PriorityQueue{T} as well as protected members for that priority queue and a counter of the number of total hits. Extending classes can override TopDocs(int, int) and TotalHits in order to provide their own implementation. | ||
TopFieldCollector | A Collector that sorts by SortField using FieldComparators. See the Create method for instantiating a TopFieldCollector | ||
TopFieldDocs | Represents hits returned by Searcher.Search(Query,Filter,int,Sort). | ||
TopScoreDocCollector | A Collector implementation that collects the top-scoring hits, returning them as a TopDocs. This is used by IndexSearcher to implement TopDocs-based search. Hits are sorted by score descending and then (when the scores are tied) docID ascending. When you create an instance of this collector you should know in advance whether documents are going to be collected in doc Id order or not | ||
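A sketch of collector-based search, combining TopScoreDocCollector with the TimeLimitingCollector described earlier in this table (the 10-hit limit and 2000 ms budget are illustrative):

    using Lucene.Net.Search;

    // Collect the top 10 hits; 'true' because documents are delivered in increasing doc id order here.
    TopScoreDocCollector collector = TopScoreDocCollector.Create(10, true);

    // Abort the search with TimeExceededException if it runs longer than ~2 seconds.
    Collector limited = new TimeLimitingCollector(collector, 2000);

    searcher.Search(query, limited);
    ScoreDoc[] hits = collector.TopDocs().ScoreDocs;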
Weight | Expert: Calculate query weights and build query scorers. The purpose of Weight is to ensure searching does not modify a Query, so that a Query instance can be reused. Searcher-dependent state of the query should reside in the Weight; IndexReader-dependent state should reside in the Scorer. A Weight is used in the following way: a Weight is constructed for a top-level query given a Searcher, the sum of squared weights is computed and used to derive the query normalization factor, that factor is passed back to normalize the Weight, and finally a Scorer is constructed from the Weight to score documents. | ||
WildcardQuery | Implements the wildcard search query. Supported wildcards are * , which matches any character sequence (including the empty one), and ? , which matches any single character. Note this query can be slow, as it needs to iterate over many terms. In order to prevent extremely slow WildcardQueries, a Wildcard term should not start with one of the wildcards * or ? | ||
WildcardTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified wildcard filter term. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
Spatial | |||
BBox | |||
AreaSimilarity | The algorithm is implemented as envelope-on-envelope overlays rather than complex polygon on complex polygon overlays. Spatial relevance scoring algorithm: queryArea = the area of the input query envelope; targetArea = the area of the target envelope (per Lucene document); intersectionArea = the area of the intersection for the query/target envelopes; queryPower = the weighting power associated with the query envelope (default = 1.0); targetPower = the weighting power associated with the target envelope (default = 1.0); queryRatio = intersectionArea / queryArea; targetRatio = intersectionArea / targetArea; queryFactor = Math.pow(queryRatio, queryPower); targetFactor = Math.pow(targetRatio, targetPower); score = queryFactor * targetFactor. Based on Geoportal's SpatialRankingValueSource | ||
BBoxSimilarity | Abstraction of the calculation used to determine how similar two Bounding Boxes are. | ||
BBoxSimilarityValueSource | |||
BBoxStrategy | |||
DistanceSimilarity | Returns the distance between the center of the indexed rectangle and the query shape. | ||
Prefix | |||
Tree | |||
GeohashPrefixTree | A SpatialPrefixGrid based on Geohashes. Uses GeohashUtils to do all the geohash work. | ||
Factory | Factory for creating GeohashPrefixTree instances with useful defaults | ||
GhCell | |||
Node | |||
QuadPrefixTree | Implementation of SpatialPrefixTree which uses a quad tree (http://en.wikipedia.org/wiki/Quadtree) | ||
Factory | Factory for creating QuadPrefixTree instances with useful defaults | ||
QuadCell | |||
SpatialPrefixTree | A spatial Prefix Tree, or Trie, which decomposes shapes into prefixed strings at variable lengths corresponding to variable precision. Each string corresponds to a spatial region | ||
SpatialPrefixTreeFactory | Abstract Factory for creating SpatialPrefixTree instances with useful defaults and passed on configurations defined in a Map. | ||
PointPrefixTreeFieldCacheProvider | Implementation of ShapeFieldCacheProvider designed for PrefixTreeStrategys | ||
PrefixTreeStrategy | Abstract SpatialStrategy which provides common functionality for those Strategys which use SpatialPrefixTrees | ||
CellTokenStream | Outputs the tokenString of a cell and, if it is a leaf, outputs it again with the leaf byte. | ||
RecursivePrefixTreeFilter | Performs a spatial intersection filter against a field indexed with SpatialPrefixTree, a Trie. SPT yields terms (grids) at length 1 and at greater lengths corresponding to greater precisions. This filter recursively traverses each grid length and uses methods on Shape to efficiently determine whether all points at a prefix fit in the shape or not, either to short-circuit unnecessary traversals or to efficiently load all enclosed points. | ||
RecursivePrefixTreeStrategy | Based on RecursivePrefixTreeFilter. | ||
TermQueryPrefixTreeStrategy | A basic implementation using a large TermsFilter of all the nodes from SpatialPrefixTree#getNodes(com.spatial4j.core.shape.Shape, int, boolean). | ||
Queries | |||
SpatialArgsParser | |||
SpatialOperation | |||
UnsupportedSpatialOperation | |||
Util | |||
IBits | Interface for Bitset-like structures. | ||
Bits | Empty implementation, basically just so we can provide EMPTY_ARRAY | ||
MatchAllBits | Bits impl of the specified length with all bits set. | ||
MatchNoBits | Bits impl of the specified length with no bits set. | ||
CachingDoubleValueSource | |||
CachingDoubleDocValue | |||
FixedBitSet | |||
FixedBitSetIterator | A FixedBitSet Iterator implementation | ||
FunctionQuery | Port of Solr's FunctionQuery (v1.4) | ||
AllScorer | |||
FunctionWeight | |||
ReciprocalFloatFunction | |||
FloatDocValues | |||
ShapeFieldCache< T > | Bounded Cache of Shapes associated with docIds. Note, multiple Shapes can be associated with a given docId | ||
ShapeFieldCacheDistanceValueSource | An implementation of the Lucene ValueSource model to support spatial relevance ranking. | ||
CachedDistanceDocValues | |||
ShapeFieldCacheProvider< T > | Provides access to a ShapeFieldCache for a given AtomicReader | ||
TermsEnumCompatibility | Wraps Lucene 3 TermEnum to make it look like a Lucene 4 TermsEnum SOLR-2155 | ||
TermsFilter | Constructs a filter for docs matching any of the terms added to this class. Unlike a RangeFilter this can be used for filtering on multiple terms that are not necessarily in a sequence. An example might be a collection of primary keys from a database query result or perhaps a choice of "category" labels picked by the end user. As a filter, this is much faster than the equivalent query (a BooleanQuery with many "should" TermQueries) | ||
ValueSourceFilter | Filter that matches all documents where a valuesource is in between a range of min and max inclusive. | ||
ValueSourceFilteredDocIdSet | |||
Vector | |||
DistanceValueSource | An implementation of the Lucene ValueSource model that returns the distance. | ||
DistanceDocValues | |||
PointVectorStrategy | Simple SpatialStrategy which represents Points in two numeric DoubleFields | ||
Store | |||
AlreadyClosedException | This exception is thrown when there is an attempt to access something that has already been closed. | ||
BufferedIndexInput | Base implementation class for buffered IndexInput. | ||
BufferedIndexOutput | Base implementation class for buffered IndexOutput. | ||
ChecksumIndexInput | Reads bytes through to a primary IndexInput, computing checksum as it goes. Note that you cannot use seek(). | ||
ChecksumIndexOutput | Writes bytes through to a primary IndexOutput, computing checksum. Note that you cannot use seek(). | ||
Directory | A Directory is a flat list of files. Files may be written once, when they are created. Once a file is created it may only be opened for read, or deleted. Random access is permitted both when reading and writing | ||
FileSwitchDirectory | Expert: A Directory instance that switches files between two other Directory instances. Files with the specified extensions are placed in the primary directory; others are placed in the secondary directory. The provided Set must not change once passed to this class, and must allow multiple threads to call contains at once. | ||
FSDirectory | Base class for Directory implementations that store index files in the file system. There are currently three core subclasses: | ||
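A sketch of opening a file-system Directory and adding one document; the path is a placeholder, and the using-based cleanup assumes the IDisposable pattern of the 3.0.3 port:

    using System.IO;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.Store;
    using Version = Lucene.Net.Util.Version;

    var dir = FSDirectory.Open(new DirectoryInfo(@"c:\my-index"));
    var analyzer = new StandardAnalyzer(Version.LUCENE_30);

    using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
    {
        var doc = new Document();
        doc.Add(new Field("body", "hello lucene", Field.Store.YES, Field.Index.ANALYZED));
        writer.AddDocument(doc);
    }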
FSLockFactory | Base class for file system based locking implementation. | ||
IndexInput | Abstract base class for input from a file in a Directory. A random-access input stream. Used for all Lucene index input operations. | ||
IndexOutput | Abstract base class for output to a file in a Directory. A random-access output stream. Used for all Lucene index output operations. | ||
Lock | An interprocess mutex lock, typically obtained from a Directory and used through the nested With utility class to run a block of code while holding the lock. | ||
With | Utility class for executing code with exclusive access. | ||
LockFactory | Base class for Locking implementation. Directory uses instances of this class to implement locking. | ||
LockObtainFailedException | This exception is thrown when the write.lock could not be acquired. This happens when a writer tries to open an index that another writer already has open. | ||
LockReleaseFailedException | This exception is thrown when the write.lock could not be released. | ||
LockStressTest | Simple standalone tool that forever acquires & releases a lock using a specific LockFactory. Run without any args to see usage | ||
LockVerifyServer | Simple standalone server that must be running when you use VerifyingLockFactory. This server simply verifies at most one process holds the lock at a time. Run without any args to see usage | ||
MMapDirectory | File-based Directory implementation that uses mmap for reading, and SimpleFSDirectory.SimpleFSIndexOutput for writing | ||
NativeFSLockFactory | Implements LockFactory using native OS file locks. Note that because this LockFactory relies on java.nio.* APIs for locking, any problems with those APIs will cause locking to fail. Specifically, on certain NFS environments the java.nio.* locks will fail (the lock can incorrectly be double acquired) whereas SimpleFSLockFactory worked perfectly in those same environments. For NFS based access to an index, it's recommended that you try SimpleFSLockFactory first and work around the one limitation that a lock file could be left when the JVM exits abnormally. | ||
NativeFSLock | |||
NIOFSDirectory | Not implemented. Waiting for volunteers. | ||
NIOFSIndexInput | Not implemented. Waiting for volunteers. | ||
NoLockFactory | Use this LockFactory to disable locking entirely. Only one instance of this lock is created. You should call Instance to get the instance | ||
NoLock | |||
NoSuchDirectoryException | This exception is thrown when you try to list a non-existent directory. | ||
RAMDirectory | A memory-resident Directory implementation. Locking implementation is by default the SingleInstanceLockFactory but can be changed with Directory.SetLockFactory. | ||
RAMFile | |||
RAMInputStream | A memory-resident IndexInput implementation | ||
RAMOutputStream | A memory-resident IndexOutput implementation | ||
SimpleFSDirectory | A straightforward implementation of FSDirectory using java.io.RandomAccessFile. However, this class has poor concurrent performance (multiple threads will bottleneck) as it synchronizes when multiple threads read from the same file. It's usually better to use NIOFSDirectory or MMapDirectory instead. | ||
SimpleFSIndexOutput | |||
SimpleFSLockFactory | Implements LockFactory using System.IO.FileInfo.Create() . | ||
SimpleFSLock | |||
SingleInstanceLockFactory | Implements LockFactory for a single in-process instance, meaning all locking will take place through this one instance. Only use this LockFactory when you are certain all IndexReaders and IndexWriters for a given index are running against a single shared in-process Directory instance. This is currently the default locking for RAMDirectory | ||
SingleInstanceLock | |||
VerifyingLockFactory | A LockFactory that wraps another LockFactory and verifies that each lock obtain/release is "correct" (never results in two processes holding the lock at the same time). It does this by contacting an external server (LockVerifyServer) to assert that at most one process holds the lock at a time. To use this, you should also run LockVerifyServer on the host & port matching what you pass to the constructor | ||
Support | |||
Compatibility | |||
AppSettings | |||
BitSetSupport | This class provides supporting methods of java.util.BitSet that are not present in System.Collections.BitArray. | ||
BuildType | |||
Character | Mimics Java's Character class. | ||
CloseableThreadLocalProfiler | For debugging purposes. | ||
CollectionsHelper | Support class used to handle Hashtable addition, which does a check first to make sure the added item is unique in the hash. | ||
Compare | Summary description for TestSupportClass. | ||
CRC32 | |||
Deflater | |||
Double | |||
EquatableList< T > | Represents a strongly typed list of objects that can be accessed by index. Provides methods to search, sort, and manipulate lists. Also provides functionality to compare lists against each other through an implementations of IEquatable{T}. | ||
FileSupport | Represents the methods to support some operations over files. | ||
HashMap< TKey, TValue > | A C# emulation of the Java Hashmap | ||
IChecksum | Contains conversion support elements such as classes, interfaces and static methods. | ||
Inflater | |||
IThreadRunnable | This interface should be implemented by any class whose instances are intended to be executed by a thread. | ||
Number | A simple class for number conversions. | ||
OS | Provides platform information. | ||
SharpZipLib | |||
Single | |||
TextSupport | |||
ThreadClass | Support class used to handle threads | ||
ThreadLock | Abstract base class that provides a synchronization interface for derived lock types | ||
WeakDictionary< TKey, TValue > | |||
Util | |||
Cache | |||
AbstractSegmentCache | Root custom cache to allow a factory to retain references to the custom caches without having to be aware of the type. | ||
SegmentCache< T > | Custom cache with two levels of keys, outer key is the IndexReader with the inner key being a string, commonly a field name but can be anything. Refer to the unit tests for an example implementation. | ||
Cache< TKey, TValue > | Base class for cache implementations. | ||
SimpleLRUCache< TKey, TValue > | |||
SimpleMapCache< TKey, TValue > | Simple cache implementation that uses a HashMap to store (key, value) pairs. This cache is not synchronized, use Cache{TKey, TValue}.SynchronizedCache(Cache{TKey, TValue}) if needed. | ||
ArrayUtil | Methods for manipulating arrays. | ||
Attribute | Base class for Attributes that can be added to a Lucene.Net.Util.AttributeSource. Attributes are used to add data in a dynamic, yet type-safe way to a source of usually streamed objects, e. g. a Lucene.Net.Analysis.TokenStream. | ||
AttributeSource | An AttributeSource contains a list of different Attributes, and methods to add and get them. There can only be a single instance of an attribute in the same AttributeSource instance. This is ensured by passing the actual type of the Attribute to AddAttribute{T}(), which then checks if an instance of that type is already present. If yes, it returns the instance; otherwise it creates a new instance and returns it. | ||
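A sketch of consuming attributes from a TokenStream; the ITermAttribute interface and its Term property follow this release's renamed attribute interfaces and are worth verifying:

    using System;
    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Analysis.Tokenattributes;
    using Version = Lucene.Net.Util.Version;

    var analyzer = new StandardAnalyzer(Version.LUCENE_30);
    TokenStream ts = analyzer.TokenStream("body", new StringReader("Hello Lucene.Net"));

    // AddAttribute returns the single per-stream instance, creating it if necessary.
    ITermAttribute term = ts.AddAttribute<ITermAttribute>();
    while (ts.IncrementToken())
    {
        Console.WriteLine(term.Term);    // prints each token's text
    }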
AttributeFactory | An AttributeFactory creates instances of Attributes. | ||
State | This class holds the state of an AttributeSource. | ||
AverageGuessMemoryModel | An average, best guess, MemoryModel that should work okay on most systems | ||
BitUtil | A variety of highly efficient bit-twiddling routines | ||
BitVector | Optimized implementation of a vector of bits. This is more-or-less like java.util.BitSet, but also includes a count() method that efficiently computes the number of one bits, optimized read from and write to disk, an inlinable get() method, and store/load as a bit set or d-gaps depending on sparseness. | ||
CloseableThreadLocal | Java's builtin ThreadLocal has a serious flaw: it can take an arbitrarily long amount of time to dereference the things you had stored in it, even once the ThreadLocal instance itself is no longer referenced. This is because there is a single, master map stored for each thread, which all ThreadLocals share, and that master map only periodically purges "stale" entries | ||
CloseableThreadLocal< T > | Java's builtin ThreadLocal has a serious flaw: it can take an arbitrarily long amount of time to dereference the things you had stored in it, even once the ThreadLocal instance itself is no longer referenced. This is because there is a single, master map stored for each thread, which all ThreadLocals share, and that master map only periodically purges "stale" entries | ||
Constants | Some useful constants. | ||
DocIdBitSet | Simple DocIdSet and DocIdSetIterator backed by a BitSet | ||
FieldCacheSanityChecker | Provides methods for sanity checking that entries in the FieldCache are not wasteful or inconsistent. Lucene 2.9 Introduced numerous enhancements into how the FieldCache is used by the low levels of Lucene searching (for Sorting and ValueSourceQueries) to improve both the speed for Sorting, as well as reopening of IndexReaders. But these changes have shifted the usage of FieldCache from "top level" IndexReaders (frequently a MultiReader or DirectoryReader) down to the leaf level SegmentReaders. As a result, existing applications that directly access the FieldCache may find RAM usage increase significantly when upgrading to 2.9 or Later. This class provides an API for these applications (or their Unit tests) to check at run time if the FieldCache contains "insane" usages of the FieldCache. EXPERIMENTAL API: This API is considered extremely advanced and experimental. It may be removed or altered w/o warning in future releases of Lucene. | ||
Insanity | Simple container for a collection of related CacheEntry objects that in conjunction with each other represent some "insane" usage of the FieldCache. | ||
InsanityType | An enumeration of the different types of "insane" behavior that may be detected in a FieldCache | ||
IAttribute | Base interface for attributes. | ||
IdentityDictionary< TKey, TValue > | A class that mimics Java's IdentityHashMap in that it determines object equality solely on ReferenceEquals rather than (possibly overloaded) object.Equals() | ||
IndexableBinaryStringTools | Provides support for converting byte sequences to Strings and back again. The resulting Strings preserve the original byte sequences' sort order | ||
MapOfSets< TKey, TValue > | Helper class for keeping Lists of Objects associated with keys. WARNING: THIS CLASS IS NOT THREAD SAFE | ||
MemoryModel | Returns primitive memory sizes for estimating RAM usage | ||
NumericUtils | This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs | ||
IntRangeBuilder | Expert: Callback for SplitIntRange. You need to overwrite only one of the methods. <font color="red">NOTE: This is a very low-level interface, the method signatures may change in later versions.</font> | ||
LongRangeBuilder | Expert: Callback for SplitLongRange. You need to overwrite only one of the methods. <font color="red">NOTE: This is a very low-level interface, the method signatures may change in later versions.</font> | ||
OpenBitSet | An "open" BitSet implementation that allows direct access to the array of words storing the bits. Unlike java.util.bitset, the fact that bits are packed into an array of longs is part of the interface. This allows efficient implementation of other algorithms by someone other than the author. It also allows one to efficiently implement alternate serialization or interchange formats. OpenBitSet is faster than java.util.BitSet in most operations and much faster at calculating cardinality of sets and results of set operations. It can also handle sets of larger cardinality (up to 64 * 2**32-1) The goals of OpenBitSet are the fastest implementation possible, and maximum code reuse. Extra safety and encapsulation may always be built on top, but if that's built in, the cost can never be removed (and hence people re-implement their own version in order to get better performance). If you want a "safe", totally encapsulated (and slower and limited) BitSet class, use java.util.BitSet . | ||
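A small sketch of direct bit manipulation with OpenBitSet:

    using Lucene.Net.Util;

    var bits = new OpenBitSet();        // grows as needed; indices are longs
    bits.Set(3);
    bits.Set(64);
    bool isSet = bits.Get(64);          // true
    long count = bits.Cardinality();    // 2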
OpenBitSetDISI | |||
OpenBitSetIterator | An iterator to iterate over set bits in an OpenBitSet. This is faster than nextSetBit() for iterating over the complete set of bits, especially when the density of the bits set is high | ||
PriorityQueue< T > | A PriorityQueue maintains a partial ordering of its elements such that the least element can always be found in constant time. Put()'s and pop()'s require log(size) time | ||
RamUsageEstimator | Estimates the size of a given Object using a given MemoryModel for primitive size information | ||
ReaderUtil | Common util methods for dealing with IndexReaders. | ||
ScorerDocQueue | A ScorerDocQueue maintains a partial ordering of its Scorers such that the least Scorer can always be found in constant time. Put()'s and pop()'s require log(size) time. The ordering is by Scorer.doc(). | ||
SimpleStringInterner | Simple lockless and memory barrier free String intern cache that is guaranteed to return the same String instance as String.intern() does. | ||
SmallFloat | Floating point numbers smaller than 32 bits | ||
SortedVIntList | Stores and iterates over sorted integers in compressed form in RAM. The code for compressing the differences between ascending integers was borrowed from Lucene.Net.Store.IndexInput and Lucene.Net.Store.IndexOutput. NOTE: this class assumes the stored integers are doc Ids (hence why it extends DocIdSet). Therefore its Iterator() assumes DocIdSetIterator.NO_MORE_DOCS can be used as a sentinel. If you intend to use this value, then make sure it's not used during search flow. | ||
SorterTemplate | Borrowed from Cglib. Allows custom swap so that two arrays can be sorted at the same time. | ||
StringHelper | Methods for manipulating strings. | ||
StringInterner | Subclasses of StringInterner are required to return the same single String object for all equal strings. Depending on the implementation, this may not be the same object returned as String.intern() | ||
ToStringUtils | Helper methods to ease implementing Object.ToString(). | ||
LucenePackage | Lucene's package information, including version. | ||
LuceneMonitorInstall | |||
ProjectInstaller | Summary description for ProjectInstaller. | ||
SF | |||
Snowball | |||
Ext | |||
DanishStemmer | Generated class implementing code defined by a snowball script. | ||
DutchStemmer | Generated class implementing code defined by a snowball script. | ||
EnglishStemmer | Generated class implementing code defined by a snowball script. | ||
FinnishStemmer | Generated class implementing code defined by a snowball script. | ||
FrenchStemmer | Generated class implementing code defined by a snowball script. | ||
German2Stemmer | Generated class implementing code defined by a snowball script. | ||
GermanStemmer | Generated class implementing code defined by a snowball script. | ||
HungarianStemmer | |||
ItalianStemmer | Generated class implementing code defined by a snowball script. | ||
KpStemmer | Generated class implementing code defined by a snowball script. | ||
LovinsStemmer | Generated class implementing code defined by a snowball script. | ||
NorwegianStemmer | Generated class implementing code defined by a snowball script. | ||
PorterStemmer | Generated class implementing code defined by a snowball script. | ||
PortugueseStemmer | |||
RomanianStemmer | |||
RussianStemmer | Generated class implementing code defined by a snowball script. | ||
SpanishStemmer | Generated class implementing code defined by a snowball script. | ||
SwedishStemmer | Generated class implementing code defined by a snowball script. | ||
TurkishStemmer | |||
Among | |||
SnowballProgram | This is the rev 500 of the snowball SVN trunk, but modified: made abstract and introduced abstract method stem to avoid expensive reflection in filter class | ||
TestApp | |||
Simplicit | |||
Net | |||
Lzo | |||
LZOCompressor | Wrapper class for the highly performant LZO compression library | ||
Spatial4n | |||
Core | |||
Exceptions | |||
InvalidSpatialArgument | |||
SpellChecker | |||
Net | |||
Search | |||
Spell | |||
IDictionary | A simple interface representing a Dictionary | ||
JaroWinklerDistance | |||
LevenshteinDistance | Levenshtein edit distance | ||
LuceneDictionary | Lucene Dictionary: terms are taken from the given field of a Lucene index. | ||
NGramDistance | |||
PlainTextDictionary | Dictionary represented by a plain text file. Format: one word per line, e.g. word1 word2 word3 | ||
SpellChecker | |||
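A rough sketch of building a spelling index and asking for suggestions; the IndexDictionary and SuggestSimilar member names are assumed to mirror the Java contrib API, reader is an already-open IndexReader, and the spelling index is kept in RAM here:

    using Lucene.Net.Store;
    using Spell = SpellChecker.Net.Search.Spell;

    var spellIndex = new RAMDirectory();
    var spell = new Spell.SpellChecker(spellIndex);

    // Feed it every term of the "title" field from an existing index.
    spell.IndexDictionary(new Spell.LuceneDictionary(reader, "title"));

    string[] suggestions = spell.SuggestSimilar("lucenne", 5);   // e.g. "lucene"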
StringDistance | Interface for string distances. | ||
SuggestWord | SuggestWord class, used by the SuggestSimilar method of the SpellChecker class | ||
SuggestWordQueue | |||
TRStringDistance | Edit distance class | ||
System | |||
WorldNet | |||
Net | |||
SynExpand | Expand a query by looking up synonyms for every term. You need to invoke Syns2Index first to build the synonym index | ||
Syns2Index | From project WordNet.Net.Syns2Index | ||
SynLookup | Test program to look up synonyms. | ||
Syns2Index | From project WordNet.Net.Syns2Index | ||
Syns2Index | Convert the prolog file wn_s.pl from the WordNet prolog download into a Lucene index suitable for looking up synonyms and performing query expansion (SynExpand.Expand) |