Lucene.Net
3.0.3
Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users.
Contrib | |||
Regex | |||
CSharpRegexCapabilities | C# Regex based implementation of IRegexCapabilities. | ||
IRegexCapabilities | Defines basic operations needed by RegexQuery for a regular expression implementation. | ||
IRegexQueryCapable | Defines methods for regular expression supporting queries to use. | ||
RegexQuery | Regular expression based query. | ||
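For orientation, a RegexQuery is built from a Term whose text is treated as a regular expression. The sketch below is illustrative only: the Contrib.Regex namespace is inferred from the Contrib > Regex grouping above, and the field name and pattern are placeholders.

```csharp
using Contrib.Regex;        // namespace inferred from the Contrib > Regex grouping (assumption)
using Lucene.Net.Index;
using Lucene.Net.Search;

// Matches any term in the "url" field accepted by the configured
// IRegexCapabilities implementation (e.g. CSharpRegexCapabilities).
Query query = new RegexQuery(new Term("url", "http.*\\.org"));
```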
RegexTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified regular expression term using the specified regular expression implementation | ||
SpanRegexQuery | A SpanQuery version of RegexQuery allowing regular expression queries to be nested within other SpanQuery subclasses. | ||
Lucene | |||
Net | |||
Analysis | |||
AR | |||
ArabicAnalyzer | |||
ArabicLetterTokenizer | |||
ArabicNormalizationFilter | |||
ArabicNormalizer | |||
ArabicStemFilter | |||
ArabicStemmer | |||
BR | |||
BrazilianAnalyzer | |||
BrazilianStemFilter | |||
BrazilianStemmer | |||
CJK | |||
CJKAnalyzer | Filters CJKTokenizer with StopFilter | ||
CJKTokenizer | |||
Cn | |||
ChineseAnalyzer | An Analyzer that tokenizes text with ChineseTokenizer and filters with ChineseFilter | ||
ChineseFilter | A TokenFilter with a stop word table | ||
ChineseTokenizer | Tokenizes Chinese text into individual Chinese characters | ||
Compound | |||
CompoundWordTokenFilterBase | |||
DictionaryCompoundWordTokenFilter | |||
Cz | |||
CzechAnalyzer | |||
De | |||
GermanAnalyzer | Analyzer for German language. Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will not be stemmed, but indexed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default. | ||
GermanDIN2Stemmer | A stemmer for the German language that uses the DIN 5007-2 "Phone Book" rules for handling umlaut characters. | ||
GermanStemFilter | A filter that stems German words. It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a GermanStemmer). | ||
GermanStemmer | A stemmer for German words. The algorithm is based on the report "A Fast and Simple Stemming Algorithm for German Words" by Jörg Caumanns (joerg.caumanns@isst.fhg.de). | ||
El | |||
GreekAnalyzer | |||
GreekLowerCaseFilter | |||
Ext | |||
SingleCharTokenAnalyzer | This analyzer targets short fields where word like searches are required. [SomeUser@GMAIL.com 1234567890] will be tokenized as [s.o.m.e.u.s.e.r..g.m.a.i.l..com..1.2.3.4.5.6.7.8.9.0] (read .'s as blank) | ||
UnaccentedWordAnalyzer | Another Analyzer. Every char which is not a letter or digit is treated as a word separator. [Name.Surname@gmail.com 123.456 ğüşıöçĞÜŞİÖÇ$ΑΒΓΔΕΖ::АБВГДЕ SSß] will be tokenized as [name surname gmail com 123 456 gusioc gusioc αβγδεζ абвгде ssss] | ||
LetterOrDigitTokenizer | if a char is not a letter or digit, it is a word separator | ||
Fa | |||
PersianAnalyzer | |||
PersianNormalizationFilter | |||
PersianNormalizer | |||
Fr | |||
ElisionFilter | |||
FrenchAnalyzer | |||
FrenchStemFilter | |||
FrenchStemmer | |||
Hunspell | |||
HunspellAffix | Wrapper class representing a hunspell affix. | ||
HunspellDictionary | |||
HunspellStem | |||
HunspellStemFilter | TokenFilter that uses hunspell affix rules and words to stem tokens. Since hunspell supports a word having multiple stems, this filter can emit multiple tokens for each consumed token. | ||
HunspellStemmer | HunspellStemmer uses the affix rules declared in the HunspellDictionary to generate one or more stems for a word. It conforms to the algorithm in the original hunspell algorithm, including recursive suffix stripping. | ||
HunspellWord | |||
Miscellaneous | |||
EmptyTokenStream | An always exhausted token stream | ||
InjectablePrefixAwareTokenFilter | |||
PatternAnalyzer | |||
PrefixAndSuffixAwareTokenFilter | Links two PrefixAwareTokenFilter. NOTE: This filter might not behave correctly if used with custom Attributes, i.e. Attributes other than the ones located in Lucene.Net.Analysis.Tokenattributes. | ||
PrefixAwareTokenFilter | Joins two token streams and leaves the last token of the first stream available to be used when updating the token values in the second stream based on that token | ||
SingleTokenTokenStream | A TokenStream containing a single token. | ||
NGram | |||
EdgeNGramTokenFilter | |||
EdgeNGramTokenizer | |||
NGramTokenFilter | |||
NGramTokenizer | |||
Nl | |||
DutchAnalyzer | |||
DutchStemFilter | |||
DutchStemmer | |||
Payloads | |||
AbstractEncoder | Base class for payload encoders. | ||
DelimitedPayloadTokenFilter | Characters before the delimiter are the "token", those after are the payload. For example, if the delimiter is '|', then for the string "foo|bar", foo is the token and "bar" is a payload. Note, you can also include an org.apache.lucene.analysis.payloads.PayloadEncoder to convert the payload in an appropriate way (from characters to bytes). Note: make sure your Tokenizer doesn't split on the delimiter, or this won't work | ||
FloatEncoder | Encode a character array Float as a org.apache.lucene.index.Payload. | ||
IdentityEncoder | Does nothing other than convert the char array to a byte array using the specified encoding. | ||
IntegerEncoder | Encode a character array Integer as a org.apache.lucene.index.Payload. | ||
NumericPayloadTokenFilter | Assigns a payload to a token based on the Token.Type() | ||
PayloadEncoder | Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to Payload NOTE: this interface is subject to change | ||
TokenOffsetPayloadTokenFilter | Adds the Token.StartOffset and Token.EndOffset as the token's payload; the first 4 bytes are the start offset and the last 4 bytes are the end offset | ||
TypeAsPayloadTokenFilter | Makes the Token.Type() a payload. Encodes the type using System.Text.Encoding.UTF8 as the encoding | ||
Position | |||
PositionFilter | |||
Query | |||
QueryAutoStopWordAnalyzer | |||
Reverse | |||
ReverseStringFilter | |||
Ru | |||
RussianAnalyzer | Analyzer for Russian language. Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified. | ||
RussianLetterTokenizer | A RussianLetterTokenizer is a Tokenizer that extends LetterTokenizer by also allowing the basic latin digits 0-9. | ||
RussianLowerCaseFilter | Normalizes token text to lower case. | ||
RussianStemFilter | |||
RussianStemmer | |||
Shingle | |||
Codec | |||
OneDimensionalNonWeightedTokenSettingsCodec | Using this codec makes a ShingleMatrixFilter act like ShingleFilter. It produces the simplest sort of shingles, ignoring token position increments, etc. | ||
SimpleThreeDimensionalTokenSettingsCodec | A full featured codec not to be used for something serious | ||
TokenSettingsCodec | Strategy used to encode and decode metadata of the tokens from the input stream regarding how to position the tokens in the matrix, set and retrieve weight, etc. | ||
TwoDimensionalNonWeightedSynonymTokenSettingsCodec | A codec that creates a two dimensional matrix by treating tokens from the input stream with 0 position increment as new rows to the current column. | ||
Matrix | |||
Column | |||
Matrix | A column-focused matrix in three dimensions: columns, rows within a column, and token parts within a row. | ||
MatrixPermutationIterator | |||
Row | |||
ShingleAnalyzerWrapper | |||
ShingleFilter | |||
ShingleMatrixFilter | |||
TokenPositioner | |||
Sinks | |||
DateRecognizerSinkFilter | |||
TokenRangeSinkFilter | |||
TokenTypeSinkFilter | |||
Snowball | |||
SnowballAnalyzer | Filters StandardTokenizer with StandardFilter, LowerCaseFilter, StopFilter and SnowballFilter | ||
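For context, SnowballAnalyzer is typically constructed with a match version and the name of a Snowball stemmer; a minimal sketch, assuming the English stemmer:

```csharp
using Lucene.Net.Analysis.Snowball;
using Version = Lucene.Net.Util.Version;

// "English" names the Snowball-generated stemmer to apply after the standard filters.
var analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
```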
SnowballFilter | A filter that stems words using a Snowball-generated stemmer | ||
Standard | |||
StandardAnalyzer | Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words | ||
StandardFilter | Normalizes tokens extracted with StandardTokenizer. | ||
StandardTokenizer | A grammar-based tokenizer constructed with JFlex | ||
StandardTokenizerImpl | This class is a scanner generated by JFlex 1.4.1 on 9/4/08 6:49 PM from the specification file /tango/mike/src/lucene.standarddigit/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex | ||
Th | |||
ThaiAnalyzer | |||
ThaiWordFilter | |||
Tokenattributes | |||
FlagsAttribute | This attribute can be used to pass different flags down the tokenizer chain, eg from one TokenFilter to another one. | ||
IFlagsAttribute | This attribute can be used to pass different flags down the Tokenizer chain, eg from one TokenFilter to another one. | ||
IOffsetAttribute | The start and end character offset of a Token. | ||
IPayloadAttribute | The payload of a Token. See also Payload. | ||
IPositionIncrementAttribute | The positionIncrement determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching | ||
ITermAttribute | The term text of a Token. | ||
ITypeAttribute | A Token's lexical type. The Default value is "word". | ||
OffsetAttribute | The start and end character offset of a Token. | ||
PayloadAttribute | The payload of a Token. See also Payload. | ||
PositionIncrementAttribute | The positionIncrement determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching | ||
TermAttribute | The term text of a Token. | ||
TypeAttribute | A Token's lexical type. The Default value is "word". | ||
ChainedFilter | |||
Analyzer | An Analyzer builds TokenStreams, which analyze text. It thus represents a policy for extracting index terms from text. Typical implementations first build a Tokenizer, which breaks the stream of characters from the Reader into raw Tokens. One or more TokenFilters may then be applied to the output of the Tokenizer. | ||
ASCIIFoldingFilter | This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists | ||
BaseCharFilter | |||
CachingTokenFilter | This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. It caches all token attribute states locally in a List | ||
CharArraySet | A simple class that stores Strings as char[]'s in a hash table. Note that this is not a general purpose class. For example, it cannot remove items from the set, nor does it resize its hash table to be smaller, etc. It is designed to be quick to test if a char[] is in the set without the necessity of converting it to a String first. Please note: This class implements System.Collections.Generic.ISet{T} but does not behave like it should in all cases. The generic type is System.Collections.Generic.ICollection{T}, because you can add any object that has a string representation. The add methods will use object.ToString() and store the result using a char buffer. The Contains(object) methods behave the same way. The GetEnumerator method returns a string IEnumerable. For type safety, stringIterator() is also provided. | ||
CharArraySetEnumerator | The IEnumerator<String> for this set. Strings are constructed on the fly, so use nextCharArray for more efficient access | ||
CharFilter | Subclasses of CharFilter can be chained to filter CharStream. They can be used as System.IO.TextReader with additional offset correction. Tokenizers will automatically use CorrectOffset if a CharFilter/CharStream subclass is used | ||
CharReader | CharReader is a Reader wrapper. It reads chars from Reader and outputs CharStream, defining an identity CorrectOffset method that simply returns the provided offset. | ||
CharStream | CharStream adds CorrectOffset functionality over System.IO.TextReader. All Tokenizers accept a CharStream instead of System.IO.TextReader as input, which enables arbitrary character based filtering before tokenization. The CorrectOffset method fixes offsets to account for removal or insertion of characters, so that the offsets reported in the tokens match the character offsets of the original Reader. | ||
CharTokenizer | An abstract base class for simple, character-oriented tokenizers. | ||
ISOLatin1AccentFilter | A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) by their unaccented equivalent. The case will not be altered. For instance, 'à' will be replaced by 'a'. | ||
KeywordAnalyzer | "Tokenizes" the entire stream as a single token. This is useful for data like zip codes, ids, and some product names. | ||
KeywordTokenizer | Emits the entire input as a single token. | ||
LengthFilter | Removes words that are too long or too short from the stream. | ||
LetterTokenizer | A LetterTokenizer is a tokenizer that divides text at non-letters. That's to say, it defines tokens as maximal strings of adjacent letters, as defined by java.lang.Character.isLetter() predicate. Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces. | ||
LowerCaseFilter | Normalizes token text to lower case. | ||
LowerCaseTokenizer | LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. It divides text at non-letters and converts them to lower case. While it is functionally equivalent to the combination of LetterTokenizer and LowerCaseFilter, there is a performance advantage to doing the two tasks at once, hence this (redundant) implementation. Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces. | ||
MappingCharFilter | Simplistic CharFilter that applies the mappings contained in a NormalizeCharMap to the character stream and corrects the resulting changes to the offsets. | ||
NormalizeCharMap | Holds a map of String input to String output, to be used with MappingCharFilter. | ||
NumericTokenStream | Expert: This class provides a TokenStream for indexing numeric values that can be used by NumericRangeQuery{T} or NumericRangeFilter{T} | ||
PerFieldAnalyzerWrapper | This analyzer is used to facilitate scenarios where different fields require different analysis techniques. Use AddAnalyzer to add a non-default analyzer on a field name basis | ||
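As a sketch of the AddAnalyzer usage mentioned above (the field names are placeholders):

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Version = Lucene.Net.Util.Version;

// StandardAnalyzer handles every field unless an override is registered for it.
var wrapper = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
wrapper.AddAnalyzer("partnum", new KeywordAnalyzer());      // keep part numbers as single tokens
wrapper.AddAnalyzer("comments", new WhitespaceAnalyzer());  // simple whitespace splitting
```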
PorterStemFilter | Transforms the token stream as per the Porter stemming algorithm. Note: the input to the stemming filter must already be in lower case, so you will need to use LowerCaseFilter or LowerCaseTokenizer farther down the Tokenizer chain in order for this to work properly! To use this filter with other analyzers, you'll want to write an Analyzer class that sets up the TokenStream chain as you want it. To use this with LowerCaseTokenizer, for example, you'd write an analyzer like this: | ||
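A minimal sketch of the analyzer alluded to above (the class name is illustrative):

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Illustrative analyzer: lower-case on tokenization, then apply Porter stemming.
class MyPorterStemAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        return new PorterStemFilter(new LowerCaseTokenizer(reader));
    }
}
```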
PorterStemmer | Stemmer, implementing the Porter Stemming Algorithm | ||
SimpleAnalyzer | An Analyzer that filters LetterTokenizer with LowerCaseFilter | ||
StopAnalyzer | Filters LetterTokenizer with LowerCaseFilter and StopFilter | ||
StopFilter | Removes stop words from a token stream. | ||
TeeSinkTokenFilter | This TokenFilter provides the ability to set aside attribute states that have already been analyzed. This is useful in situations where multiple fields share many common analysis steps and then go their separate ways. It is also useful for doing things like entity extraction or proper noun analysis as part of the analysis workflow and saving off those tokens for use in another field | ||
AnonymousClassSinkFilter | |||
SinkFilter | A filter that decides which AttributeSource states to store in the sink. | ||
SinkTokenStream | |||
Token | A Token is an occurrence of a term from the text of a field. It consists of a term's text, the start and end offset of the term in the text of the field, and a type string. The start and end offsets permit applications to re-associate a token with its source text, e.g., to display highlighted query terms in a document browser, or to show matching text fragments in a <abbr title="KeyWord In Context">KWIC</abbr> display, etc. The type is a string, assigned by a lexical analyzer (a.k.a. tokenizer), naming the lexical or syntactic class that the token belongs to. For example an end of sentence marker token might be implemented with type "eos". The default token type is "word". A Token can optionally have metadata (a.k.a. Payload) in the form of a variable length byte array. Use TermPositions.PayloadLength and TermPositions.GetPayload(byte[], int) to retrieve the payloads from the index. | ||
TokenAttributeFactory | Expert: Creates an AttributeFactory returning Token as instance for the basic attributes and for all other attributes calls the given delegate factory. | ||
TokenFilter | A TokenFilter is a TokenStream whose input is another TokenStream. This is an abstract class; subclasses must override TokenStream.IncrementToken() | ||
Tokenizer | A Tokenizer is a TokenStream whose input is a Reader. This is an abstract class; subclasses must override TokenStream.IncrementToken() NOTE: Subclasses overriding TokenStream.IncrementToken() must call AttributeSource.ClearAttributes() before setting attributes. | ||
TokenStream | A TokenStream enumerates the sequence of tokens, either from Fields of a Document or from query text. This is an abstract class; concrete subclasses are Tokenizer, a TokenStream whose input is a Reader, and TokenFilter, a TokenStream whose input is another TokenStream. A new Attribute-based TokenStream API was introduced with Lucene 2.9. To make sure that filters and consumers know which attributes are available, the attributes must be added during instantiation. Filters and consumers are not required to check for availability of attributes in IncrementToken(). You can find some example code for the new API in the analysis package level Javadoc. Sometimes it is desirable to capture the current state of a TokenStream, e.g. for buffering purposes (see CachingTokenFilter and TeeSinkTokenFilter). | ||
WhitespaceAnalyzer | An Analyzer that uses WhitespaceTokenizer. | ||
WhitespaceTokenizer | A WhitespaceTokenizer is a tokenizer that divides text at whitespace. Adjacent sequences of non-Whitespace characters form tokens. | ||
WordlistLoader | Loader for text files that represent a list of stopwords. | ||
Demo | |||
Html | |||
Entities | |||
HTMLParser | |||
HTMLParserConstants_Fields | |||
HTMLParserTokenManager | |||
ParseException | This exception is thrown when parse errors are encountered. You can explicitly create objects of this exception type by calling the method generateParseException in the generated parser | ||
ParserThread | |||
SimpleCharStream | An implementation of interface CharStream, where the stream is assumed to contain only ASCII characters (without unicode processing). | ||
Tags | |||
Test | |||
Token | Describes the input token stream. | ||
TokenMgrError | |||
Distributed | |||
Configuration | |||
CurrentIndex | Definition of current index information managed by the LuceneUpdater windows service. The <copy> node within the <indexset> node represents the information needed to load a CurrentIndex object for a given IndexSet | ||
DistributedSearcher | Definition of a configurable set of search indexes made accessible by the LuceneServer windows service for a consuming application. These search indexes are defined in the configuration file of an application. The locations defined in a DistributedSearcher match the exposed object URIs as defined in the LuceneServer service | ||
DistributedSearcherConfigurationHandler | Implementation of custom configuration handler for the definition of search indexes made accessible by the LuceneServer windows service. This configuration resides in the configuration file of an application consuming the search indexes made accessible by the LuceneServer windows service. | ||
DistributedSearchers | Definition of a configurable set of search indexes made accessible by the LuceneServer windows service for a consuming application. These search indexes are defined in the configuration file of an application. The locations defined in a DistributedSearcher match the exposed object URIs as defined in the LuceneServer service | ||
LuceneServerIndex | Definition of a configurable search index made accessible by the LuceneServer windows service | ||
LuceneServerIndexConfigurationHandler | Implementation of custom configuration handler for the definition of search indexes made accessible by the LuceneServer windows service. | ||
LuceneServerIndexes | Definition of configurable search indexes made accessible by the LuceneServer windows service | ||
Indexing | |||
DeleteIndexDocument | |||
FileNameComparer | Summary description for FileNameComparer. | ||
IndexDocument | Base class representing a record to be added to a Lucene index | ||
IndexSet | Definition of configurable search indexes managed by the LuceneUpdater windows service | ||
IndexSetConfigurationHandler | Implementation of custom configuration handler for the definition of master indexes as managed by the LuceneUpdater windows service. | ||
IndexSets | Definition of configurable search indexes managed by the LuceneUpdater windows service | ||
Operations | |||
LuceneMonitor | A Windows service that provides system ping checking against LuceneServer. | ||
Search | |||
DistributedSearchable | A derived implementation of RemoteSearchable, DistributedSearchable provides additional support for integration with .Net remoting objects and constructs. | ||
Documents | |||
AbstractField | |||
CompressionTools | Simple utility class providing static methods to compress and decompress binary data for stored fields. This class uses java.util.zip.Deflater and Inflater classes to compress and decompress. | ||
DateField | Provides support for converting dates to strings and vice-versa. The strings are structured so that lexicographic sorting orders by date, which makes them suitable for use as field values and search terms | ||
DateTools | Provides support for converting dates to strings and vice-versa. The strings are structured so that lexicographic sorting orders them by date, which makes them suitable for use as field values and search terms | ||
Resolution | Specifies the time granularity. | ||
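A small sketch of the DateTools round trip described above, assuming day resolution:

```csharp
using System;
using Lucene.Net.Documents;

// Produces a lexicographically sortable string such as "20240115" (day resolution).
string indexed = DateTools.DateToString(DateTime.UtcNow, DateTools.Resolution.DAY);
DateTime roundTripped = DateTools.StringToDate(indexed);
```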
Document | Documents are the unit of indexing and search | ||
Field | A field is a section of a Document. Each field has two parts, a name and a value. Values may be free text, provided as a String or as a Reader, or they may be atomic keywords, which are not further processed. Such keywords may be used to represent dates, urls, etc. Fields are optionally stored in the index, so that they may be returned with hits on the document. | ||
IFieldable | Synonymous with Field | ||
FieldSelector | Similar to a java.io.FileFilter, the FieldSelector allows one to make decisions about what Fields get loaded on a Document by Lucene.Net.Index.IndexReader.Document(int,Lucene.Net.Documents.FieldSelector) | ||
LoadFirstFieldSelector | Load the First field and break. See FieldSelectorResult.LOAD_AND_BREAK | ||
MapFieldSelector | A FieldSelector based on a Map of field names to FieldSelectorResults | ||
NumberTools | Provides support for converting longs to Strings, and back again. The strings are structured so that lexicographic sorting order is preserved | ||
NumericField | This class provides a Field that enables indexing of numeric values for efficient range filtering and sorting. Here's an example usage, adding an int value: | ||
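A minimal sketch of the int-value usage mentioned above ("weight" is a placeholder field name):

```csharp
using Lucene.Net.Documents;

// Adds a trie-encoded int value suitable for NumericRangeQuery/NumericRangeFilter and sorting.
var doc = new Document();
doc.Add(new NumericField("weight").SetIntValue(5));
```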
SetBasedFieldSelector | Declare what fields to load normally and what fields to load lazily | ||
Index | |||
Memory | |||
MemoryIndex | High-performance single-document main memory Apache Lucene fulltext search index | ||
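A hedged sketch of typical MemoryIndex usage, assuming the AddField/Search members carried over from the Java original; the field name, text and query are illustrative:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index.Memory;
using Lucene.Net.QueryParsers;
using Version = Lucene.Net.Util.Version;

var analyzer = new StandardAnalyzer(Version.LUCENE_30);
var index = new MemoryIndex();
index.AddField("content", "readings about salmon and other fish", analyzer);

var parser = new QueryParser(Version.LUCENE_30, "content", analyzer);
float score = index.Search(parser.Parse("salmon"));   // > 0 means the in-memory document matches
```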
TermComparer | |||
TermComparer< T > | |||
FieldEnumerator< T > | |||
TermEnumerator | The enumerator over the terms in an index. | ||
TermDocEnumerator | Class to handle creating a TermDocs and allowing for seeking and enumeration. Used when you have a set of one or more terms for which you want to enumerate over the documents that contain those terms. | ||
TermDocUsingTermsEnumerator | Class to handle enumeration over the TermDocs that does NOT close them on a call to Dispose! | ||
StringFieldEnumerator | Implementation for enumerating over terms with a string value. | ||
NumericFieldEnum< T > | Base for enumerating over numeric fields. | ||
IntFieldEnumerator | Implementation for enumerating over all of the terms in an int numeric field. | ||
FloatFieldEnumerator | Implementation for enumerating over all of the terms in a float numeric field. | ||
DoubleFieldEnumerator | Implementation for enumerating over all of the terms in a double numeric field. | ||
LongFieldEnumerator | Implementation for enumerating over all of the terms in a long numeric field. | ||
SegmentsGenCommit | Class that will force an index writer to open an index based on the generation in the segments.gen file as opposed to the highest generation found in a directory listing | ||
TermVectorEnumerator | Class to allow for enumerating over the documents in the index to retrieve the term vector for each one. | ||
EmptyVector | A simple TermFreqVector implementation for an empty vector for use with a deleted document or a document that does not have the field that is being enumerated. | ||
AbstractAllTermDocs | Base class for enumerating all but deleted docs | ||
AllTermDocs | |||
BufferedDeletes | Holds buffered deletes, by docID, term or query. We hold two instances of this class: one for the deletes prior to the last flush, the other for deletes after the last flush. This is so if we need to abort (discard all buffered docs) we can also discard the buffered deletes yet keep the deletes done during previously flushed segments. | ||
ByteBlockPool | |||
Allocator | |||
ByteSliceReader | |||
ByteSliceWriter | Class to write byte streams into slices of shared byte[]. This is used by DocumentsWriter to hold the posting list for many terms in RAM. | ||
CharBlockPool | |||
CheckIndex | Basic tool and API to check the health of an index and write a new segments file that removes reference to problematic segments | ||
Status | Returned from CheckIndex_Renamed_Method() detailing the health and status of the index | ||
FieldNormStatus | Status from testing field norms. | ||
SegmentInfoStatus | Holds the status of each segment in the index. See SegmentInfos | ||
StoredFieldStatus | Status from testing stored fields. | ||
TermIndexStatus | Status from testing term index. | ||
TermVectorStatus | Status from testing term vectors. | ||
CompoundFileReader | Class for accessing a compound stream. This class implements a directory, but is limited to only read operations. Directory methods that would normally modify data throw an exception. | ||
CSIndexInput | Implementation of an IndexInput that reads from a portion of the compound file. The visibility is left as "package" only because this helps with testing since JUnit test cases in a different class can then access package fields of this class. | ||
CompoundFileWriter | Combines multiple files into a single compound file. The file format is a count of entries, followed by a directory of per-file data offsets and file names, followed by the raw contents of each file. | ||
ConcurrentMergeScheduler | A MergeScheduler that runs each merge using a separate thread, up until a maximum number of threads (MaxThreadCount) at which when a merge is needed, the thread(s) that are updating the index will pause until one or more merges completes. This is a simple way to use concurrency in the indexing process without having to create and manage application level threads. | ||
MergeThread | |||
CorruptIndexException | This exception is thrown when Lucene detects an inconsistency in the index. | ||
DefaultSkipListReader | Implements the skip list reader for the default posting list format that stores positions and payloads | ||
DefaultSkipListWriter | Implements the skip list writer for the default posting list format that stores positions and payloads | ||
DirectoryReader | An IndexReader which reads indexes with multiple segments. | ||
DocConsumer | |||
DocConsumerPerThread | |||
DocFieldConsumer | |||
DocFieldConsumerPerField | |||
DocFieldConsumerPerThread | |||
DocFieldConsumers | This is just a "splitter" class: it lets you wrap two DocFieldConsumer instances as a single consumer. | ||
DocFieldConsumersPerField | |||
DocFieldConsumersPerThread | |||
DocFieldProcessor | This is a DocConsumer that gathers all fields under the same name, and calls per-field consumers to process field by field. This class doesn't do any "real" work of its own: it just forwards the fields to a DocFieldConsumer. | ||
DocFieldProcessorPerField | Holds all per thread, per field state. | ||
DocFieldProcessorPerThread | Gathers all Fieldables for a document under the same name, updates FieldInfos, and calls per-field consumers to process field by field | ||
DocInverter | This is a DocFieldConsumer that inverts each field, separately, from a Document, and accepts a InvertedTermsConsumer to process those terms. | ||
DocInverterPerField | Holds state for inverting all occurrences of a single field in the document. This class doesn't do anything itself; instead, it forwards the tokens produced by analysis to its own consumer (InvertedDocConsumerPerField). It also interacts with an endConsumer (InvertedDocEndConsumerPerField). | ||
DocInverterPerThread | This is a DocFieldConsumer that inverts each field, separately, from a Document, and accepts a InvertedTermsConsumer to process those terms. | ||
DocumentsWriter | This class accepts multiple added documents and directly writes a single segment file. It does this more efficiently than creating a single segment per document (with DocumentWriter) and doing standard merges on those segments | ||
DocumentsWriterThreadState | Used by DocumentsWriter to maintain per-thread state. We keep a separate Posting hash and other state for each thread and then merge postings hashes from all threads when writing the segment. | ||
FieldInfo | |||
FieldInfos | Access to the Fieldable Info file that describes document fields and whether or not they are indexed. Each segment has a separate Fieldable Info file. Objects of this class are thread-safe for multiple readers, but only one thread can be adding documents at a time, with no other reader or writer threads accessing this object. | ||
FieldInvertState | This class tracks the number and position / offset parameters of terms being added to the index. The information collected in this class is also used to calculate the normalization factor for a field | ||
FieldReaderException | |||
FieldSortedTermVectorMapper | For each Field, store a sorted collection of TermVectorEntrys This is not thread-safe. | ||
FieldsReader | Class responsible for access to stored document fields. It uses the <segment>.fdt and <segment>.fdx files | ||
FieldsWriter | |||
FilterIndexReader | A FilterIndexReader contains another IndexReader, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality. The class FilterIndexReader itself simply implements all abstract methods of IndexReader with versions that pass all requests to the contained index reader. Subclasses of FilterIndexReader may further override some of these methods and may also provide additional methods and fields. | ||
FilterTermDocs | Base class for filtering Lucene.Net.Index.TermDocs implementations. | ||
FilterTermEnum | Base class for filtering TermEnum implementations. | ||
FilterTermPositions | Base class for filtering TermPositions implementations. | ||
FormatPostingsDocsConsumer | NOTE: this API is experimental and will likely change | ||
FormatPostingsDocsWriter | Consumes doc and freq, writing them using the current index file format | ||
FormatPostingsFieldsConsumer | Abstract API that consumes terms, doc, freq, prox and payloads postings. Concrete implementations of this actually do "something" with the postings (write it into the index in a specific format) | ||
FormatPostingsFieldsWriter | |||
FormatPostingsPositionsConsumer | |||
FormatPostingsPositionsWriter | |||
FormatPostingsTermsConsumer | NOTE: this API is experimental and will likely change | ||
FormatPostingsTermsWriter | |||
FreqProxFieldMergeState | Used by DocumentsWriter to merge the postings from multiple ThreadStates when creating a segment | ||
FreqProxTermsWriter | |||
FreqProxTermsWriterPerField | |||
FreqProxTermsWriterPerThread | |||
IndexCommit | Expert: represents a single commit into an index as seen by the IndexDeletionPolicy or IndexReader. | ||
IndexDeletionPolicy | Expert: policy for deletion of stale index commits | ||
IndexFileDeleter | |||
IndexFileNameFilter | Filename filter that accept filenames and extensions only created by Lucene. | ||
IndexFileNames | Useful constants representing filenames and extensions used by lucene | ||
IndexReader | IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable. Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. Open(Lucene.Net.Store.Directory, bool). For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral; they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions. An IndexReader can be opened on a directory for which an IndexWriter is opened already, but it cannot be used to delete documents from the index then. NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to. NOTE: as of 2.4, it's possible to open a read-only IndexReader using the static open methods that accept the boolean readOnly parameter. Such a reader has better concurrency as it's not necessary to synchronize on the isDeleted method. You must explicitly specify false if you want to make changes with the resulting IndexReader. NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead. | ||
FieldOption | Constants describing field properties, for example used for IndexReader.GetFieldNames(FieldOption). | ||
IndexWriter | An IndexWriter creates and maintains an index. The create argument to the constructor determines whether a new index is created, or whether an existing index is opened. Note that you can open an index with create=true even while readers are using the index. The old readers will continue to search the "point in time" snapshot they had opened, and won't see the newly created index until they re-open. There are also constructors with no create argument which will create a new index if there is not already an index at the provided path and otherwise open the existing index. In either case, documents are added with AddDocument(Document) and removed with DeleteDocuments(Term) or DeleteDocuments(Query). A document can be updated with UpdateDocument(Term, Document) (which just deletes and then adds the entire document). When finished adding, deleting and updating documents, Close() should be called. These changes are buffered in memory and periodically flushed to the Directory (during the above method calls). A flush is triggered when there are enough buffered deletes (see SetMaxBufferedDeleteTerms) or enough added documents since the last flush, whichever is sooner. For the added documents, flushing is triggered either by RAM usage of the documents (see SetRAMBufferSizeMB) or the number of added documents. The default is to flush when RAM usage hits 16 MB. For best indexing speed you should flush by RAM usage with a large RAM buffer. Note that flushing just moves the internal buffered state in IndexWriter into the index, but these changes are not visible to IndexReader until either Commit() or Close() is called. A flush may also trigger one or more segment merges which by default run with a background thread so as not to block the addDocument calls (see below for changing the MergeScheduler). If an index will not have more documents added for a while and optimal search performance is desired, then either the full Optimize() method or partial Optimize(int) method should be called before the index is closed. Opening an IndexWriter creates a lock file for the directory in use. Trying to open another IndexWriter on the same directory will lead to a LockObtainFailedException. The LockObtainFailedException is also thrown if an IndexReader on the same directory is used to delete documents from the index. A minimal usage sketch appears after the MaxFieldLength entry below. | ||
IndexReaderWarmer | If GetReader() has been called (ie, this writer is in near real-time mode), then after a merge completes, this class can be invoked to warm the reader on the newly merged segment, before the merge commits. This is not required for near real-time search, but will reduce search latency on opening a new near real-time reader after a merge completes | ||
MaxFieldLength | Specifies maximum field length (in number of tokens/terms) in IndexWriter constructors. SetMaxFieldLength(int) overrides the value set by the constructor. | ||
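The sketch referenced in the IndexWriter entry above; the directory path and field values are placeholders, and disposal stands in for explicit Close calls:

```csharp
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

// "index-dir" is a placeholder path; the writer creates the index if none exists there.
using (var dir = FSDirectory.Open(new DirectoryInfo("index-dir")))
using (var writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                                    IndexWriter.MaxFieldLength.UNLIMITED))
{
    var doc = new Document();
    doc.Add(new Field("title", "Hello Lucene.Net", Field.Store.YES, Field.Index.ANALYZED));
    writer.AddDocument(doc);
    writer.Commit();   // makes the change visible to newly opened readers
}
```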
IntBlockPool | |||
InvertedDocConsumer | |||
InvertedDocConsumerPerField | |||
InvertedDocConsumerPerThread | |||
InvertedDocEndConsumer | |||
InvertedDocEndConsumerPerField | |||
InvertedDocEndConsumerPerThread | |||
KeepOnlyLastCommitDeletionPolicy | This IndexDeletionPolicy implementation keeps only the most recent commit and immediately removes all prior commits after a new commit is done. This is the default deletion policy. | ||
LogByteSizeMergePolicy | This is a LogMergePolicy that measures size of a segment as the total byte size of the segment's files. | ||
LogDocMergePolicy | This is a LogMergePolicy that measures size of a segment as the number of documents (not taking deletions into account). | ||
LogMergePolicy | This class implements a MergePolicy that tries to merge segments into levels of exponentially increasing size, where each level has fewer segments than the value of the merge factor. Whenever extra segments (beyond the merge factor upper bound) are encountered, all segments within the level are merged. You can get or set the merge factor using the MergeFactor property. | ||
MergeDocIDRemapper | Remaps docIDs after a merge has completed, where the merged segments had at least one deletion. This is used to renumber the buffered deletes in IndexWriter when a merge of segments with deletions commits. | ||
MergePolicy | Expert: a MergePolicy determines the sequence of primitive merge operations to be used for overall merge and optimize operations. | ||
MergeAbortedException | |||
MergeException | Exception thrown if there are any problems while executing a merge. | ||
MergeSpecification | A MergeSpecification instance provides the information necessary to perform multiple merges. It simply contains a list of OneMerge instances. | ||
OneMerge | OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment. The merge spec includes the subset of segments to be merged as well as whether the new segment should use the compound file format. | ||
MergeScheduler | Expert: IndexWriter uses an instance implementing this interface to execute the merges selected by a MergePolicy. The default MergeScheduler is ConcurrentMergeScheduler. | ||
MultiLevelSkipListReader | This abstract class reads skip lists with multiple levels | ||
MultiLevelSkipListWriter | This abstract class writes skip lists with multiple levels | ||
MultipleTermPositions | Allows you to iterate over the TermPositions for multiple Terms as a single TermPositions | ||
MultiReader | An IndexReader which reads multiple indexes, appending their content. | ||
NormsWriter | Writes norms. Each thread X field accumulates the norms for the doc/fields it saw, then the flush method below merges all of these together into a single _X.nrm file. | ||
NormsWriterPerField | Taps into DocInverter, as an InvertedDocEndConsumer, which is called at the end of inverting each field. We just look at the length for the field (docState.length) and record the norm. | ||
NormsWriterPerThread | |||
ParallelReader | An IndexReader which reads multiple, parallel indexes. Each index added must have the same number of documents, but typically each contains different fields. Each document contains the union of the fields of all documents with the same document number. When searching, matches for a query term are from the first index added that has the field | ||
Payload | A Payload is metadata that can be stored together with each occurrence of a term. This metadata is stored inline in the posting list of the specific term. To store payloads in the index a TokenStream has to be used that produces payload data. Use TermPositions.PayloadLength and TermPositions.GetPayload(byte[], int) to retrieve the payloads from the index. | ||
PositionBasedTermVectorMapper | For each Field, store position by position information. It ignores frequency information This is not thread-safe. | ||
TVPositionInfo | Container for a term at a position | ||
RawPostingList | This is the base class for an in-memory posting list, keyed by a Token. TermsHash maintains a hash table holding one instance of this per unique Token. Consumers of TermsHash (TermsHashConsumer) must subclass this class with its own concrete class. FreqProxTermsWriter.PostingList is a private inner class used for the freq/prox postings, and TermVectorsTermsWriter.PostingList is a private inner class used to hold TermVectors postings. | ||
ReadOnlyDirectoryReader | |||
ReadOnlySegmentReader | |||
ReusableStringReader | Used by DocumentsWriter to implement a StringReader that can be reset to a new string; we use this when tokenizing the string value from a Field. | ||
SegmentInfo | Information about a segment such as its name, directory, and files related to the segment | ||
SegmentInfos | A collection of segmentInfo objects with methods for operating on those segments in relation to the file system | ||
FindSegmentsFile | Utility class for executing code that needs to do something with the current segments file. This is necessary with lock-less commits because from the time you locate the current segments file name, until you actually open it, read its contents, or check modified time, etc., it could have been deleted due to a writer commit finishing. | ||
SegmentMergeInfo | |||
SegmentMergeQueue | |||
SegmentMerger | The SegmentMerger class combines two or more Segments, each represented by an IndexReader added with Add, into a single Segment. After adding the appropriate readers, call the merge method to combine the segments. If the compoundFile flag is set, then the segments will be merged into a compound file | ||
SegmentReader | NOTE: This API is new and still experimental (subject to change suddenly in the next release) | ||
CoreReaders | |||
Norm | Byte[] referencing is used because a new norm object needs to be created for each clone, and the byte array is all that is needed for sharing between cloned readers. The current norm referencing is for sharing between readers whereas the byte[] referencing is for copy on write which is independent of reader references (i.e. incRef, decRef). | ||
Ref | |||
SegmentTermPositionVector | |||
SegmentTermVector | |||
SegmentWriteState | |||
SerialMergeScheduler | A MergeScheduler that simply does each merge sequentially, using the current thread. | ||
SnapshotDeletionPolicy | A IndexDeletionPolicy that wraps around any other IndexDeletionPolicy and adds the ability to hold and later release a single "snapshot" of an index. While the snapshot is held, the IndexWriter will not remove any files associated with it even if the index is otherwise being actively, arbitrarily changed. Because we wrap another arbitrary IndexDeletionPolicy, this gives you the freedom to continue using whatever IndexDeletionPolicy you would normally want to use with your index. Note that you can re-use a single instance of SnapshotDeletionPolicy across multiple writers as long as they are against the same index Directory. Any snapshot held when a writer is closed will "survive" when the next writer is opened | ||
SortedTermVectorMapper | Store a sorted collection of Lucene.Net.Index.TermVectorEntrys. Collects all term information into a single SortedSet. NOTE: This Mapper ignores all Field information for the Document. This means that if you are using offset/positions you will not know what Fields they correlate with. This is not thread-safe | ||
StaleReaderException | This exception is thrown when an IndexReader tries to make changes to the index (via IndexReader.DeleteDocument , IndexReader.UndeleteAll or IndexReader.SetNorm(int,string,float)) but changes have already been committed to the index since this reader was instantiated. When this happens you must open a new reader on the current index to make the changes. | ||
StoredFieldsWriter | This is a DocFieldConsumer that writes stored fields. | ||
StoredFieldsWriterPerThread | |||
Term | A Term represents a word from text. This is the unit of search. It is composed of two elements, the text of the word, as a string, and the name of the field that the text occurred in, an interned string. Note that terms may represent more than words from text fields, but also things like dates, email addresses, urls, etc. | ||
TermBuffer | |||
TermDocs | TermDocs provides an interface for enumerating <document, frequency> pairs for a term. The document portion names each document containing the term. Documents are indicated by number. The frequency portion gives the number of times the term occurred in each document. The pairs are ordered by document number. | ||
TermEnum | Abstract class for enumerating terms. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
ITermFreqVector | Provides access to stored term vector of a document field. The vector consists of the name of the field, an array of the terms that occur in the field of the Lucene.Net.Documents.Document and a parallel array of frequencies. Thus, getTermFrequencies()[5] corresponds with the frequency of getTerms()[5], assuming there are at least 5 terms in the Document. | ||
TermInfo | A TermInfo is the record of information stored for a term. | ||
TermInfosReader | This stores a monotonically increasing set of <Term, TermInfo> pairs in a Directory. Pairs are accessed either by Term or by ordinal position in the set. | ||
TermInfosWriter | This stores a monotonically increasing set of <Term, TermInfo> pairs in a Directory. A TermInfos can be written once, in order. | ||
TermPositions | TermPositions provides an interface for enumerating the <document, frequency, <position>* > tuples for a term. The document and frequency are the same as for a TermDocs. The positions portion lists the ordinal positions of each occurrence of a term in a document | ||
TermPositionVector | Extends TermFreqVector to provide additional information about positions in which each of the terms is found. A TermPositionVector does not necessarily contain both positions and offsets, but at least one of these arrays exists. | ||
TermsHash | This class implements InvertedDocConsumer, which is passed each token produced by the analyzer on each field. It stores these tokens in a hash table, and allocates separate byte streams per token. Consumers of this class, eg FreqProxTermsWriter and TermVectorsTermsWriter , write their own byte streams under each term. | ||
TermsHashConsumer | |||
TermsHashConsumerPerField | Implement this class to plug into the TermsHash processor, which inverts and stores Tokens into a hash table and provides an API for writing bytes into multiple streams for each unique Token. | ||
TermsHashConsumerPerThread | |||
TermsHashPerField | |||
TermsHashPerThread | |||
TermVectorEntry | Convenience class for holding TermVector information. | ||
TermVectorEntryFreqSortedComparator | Compares Lucene.Net.Index.TermVectorEntrys first by frequency and then by the term (case-sensitive) | ||
TermVectorMapper | The TermVectorMapper can be used to map Term Vectors into your own structure instead of the parallel array structure used by Lucene.Net.Index.IndexReader.GetTermFreqVector(int,String). It is up to the implementation to make sure it is thread-safe | ||
TermVectorOffsetInfo | The TermVectorOffsetInfo class holds information pertaining to a Term in a Lucene.Net.Index.TermPositionVector's offset information. This offset information is the character offset as set during the Analysis phase (and thus may not be the actual offset in the original content). | ||
TermVectorsReader | |||
ParallelArrayTermVectorMapper | Models the existing parallel array structure | ||
TermVectorsTermsWriter | |||
TermVectorsTermsWriterPerField | |||
TermVectorsTermsWriterPerThread | |||
TermVectorsWriter | |||
Messages | |||
INLSException | Interface that exceptions should implement to support lazy loading of messages | ||
Message | Message interface for lazy loading, used for Native Language Support (NLS), a system of software internationalization. | ||
MessageImpl | Default implementation of the Message interface, used for Native Language Support (NLS), a system of software internationalization. | ||
NLS | MessageBundles classes extend this class, to implement a bundle | ||
IPriviligedAction | |||
QueryParsers | |||
ICharStream | This interface describes a character stream that maintains line and column number positions of the characters. It also has the capability to backup the stream to some extent. An implementation of this interface is used in the TokenManager implementation generated by JavaCCParser | ||
FastCharStream | An efficient implementation of JavaCC's CharStream interface. Note that this does not do line-number counting, but instead keeps track of the character position of the token in the input, as required by Lucene's Lucene.Net.Analysis.Token API | ||
MultiFieldQueryParser | A QueryParser which constructs queries to search multiple fields | ||
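A short sketch of parsing one query string across several fields (the field names are placeholders):

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

var analyzer = new StandardAnalyzer(Version.LUCENE_30);
var parser = new MultiFieldQueryParser(Version.LUCENE_30,
                                       new[] { "title", "body" },   // placeholder field names
                                       analyzer);
Query query = parser.Parse("lucene AND search");
```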
ParseException | This exception is thrown when parse errors are encountered. You can explicitly create objects of this exception type by calling the method generateParseException in the generated parser | ||
QueryParser | This class is generated by JavaCC. The most important method is Parse(String) | ||
QueryParserConstants | Token literal values and constants. Generated by org.javacc.parser.OtherFilesGen::start() | ||
QueryParserTokenManager | Token Manager. | ||
Token | Describes the input token stream. | ||
TokenMgrError | Token Manager Error. | ||
Search | |||
Function | |||
ByteFieldSource | Expert: obtains single byte field values from the FieldCache using getBytes() and makes those values available as other numeric types, casting as needed | ||
CustomScoreProvider | An instance of this subclass should be returned by CustomScoreQuery.GetCustomScoreProvider, if you want to modify the custom score calculation of a CustomScoreQuery | ||
CustomScoreQuery | Query that sets document score as a programmatic function of several (sub) scores: the score of its subQuery and, optionally, the score of its ValueSourceQuery (or queries). Subclasses can modify the computation by overriding GetCustomScoreProvider | ||
DocValues | Expert: represents field values as different types. Normally created via a ValueSource for a particular field and reader | ||
FieldCacheSource | Expert: A base class for ValueSource implementations that retrieve values for a single field from the FieldCache. Fields used herein must be indexed (it doesn't matter if these fields are stored or not). It is assumed that each such indexed field is untokenized, or at least has a single token in a document. For documents with multiple tokens of the same field, behavior is undefined (it is likely that current code would use the value of one of these tokens, but this is not guaranteed). Documents with no tokens in this field are assigned the Zero value | ||
FieldScoreQuery | A query that scores each document as the value of the numeric input field. The query matches all documents, and scores each document according to the numeric value of that field. It is assumed, and expected, that the field is indexed and contains a single parsable token per scored document. | ||
Type | Type of score field, indicating how field values are interpreted/parsed. The type selected at search time should match the data stored in the field. Different types have different RAM requirements. | ||
FloatFieldSource | Expert: obtains float field values from the FieldCache using getFloats() and makes those values available as other numeric types, casting as needed | ||
IntFieldSource | Expert: obtains int field values from the FieldCache using getInts() and makes those values available as other numeric types, casting as needed | ||
OrdFieldSource | Expert: obtains the ordinal of the field value from the default Lucene Fieldcache using getStringIndex(). The native lucene index order is used to assign an ordinal value for each field value. Field values (terms) are lexicographically ordered by unicode value, and numbered starting at 1. Example: If there were only three field values: "apple","banana","pear" then ord("apple")=1, ord("banana")=2, ord("pear")=3 WARNING: ord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a MultiSearcher is used | ||
ReverseOrdFieldSource | Expert: obtains the ordinal of the field value from the default Lucene FieldCache using getStringIndex() and reverses the order. The native lucene index order is used to assign an ordinal value for each field value. Field values (terms) are lexicographically ordered by unicode value, and numbered starting at 1. Example of reverse ordinal (rord): If there were only three field values: "apple","banana","pear" then rord("apple")=3, rord("banana")=2, rord("pear")=1 WARNING: rord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a MultiSearcher is used | ||
ShortFieldSource | Expert: obtains short field values from the FieldCache using getShorts() and makes those values available as other numeric types, casting as needed | ||
ValueSource | Expert: source of values for basic function queries. At its default/simplest form, values - one per doc - are used as the score of that doc. Values are instantiated as DocValues for a particular reader. ValueSource implementations differ in RAM requirements: it would always be a factor of the number of documents, but for each document the number of bytes can be 1, 2, 4, or 8 | ||
ValueSourceQuery | Expert: A Query that sets the scores of documents to the values obtained from a ValueSource. This query provides a score for each and every undeleted document in the index. The value source can be based on a (cached) value of an indexed field, but it can also be based on an external source, e.g. values read from an external database. Score is set as: Score(doc,query) = query.getBoost()^2 * valueSource(doc) | ||
Highlight | |||
DefaultEncoder | Simple IEncoder implementation that does not modify the output | ||
GradientFormatter | Formats text with different color intensity depending on the score of the term. | ||
Highlighter | Class used to markup highlighted terms found in the best sections of a text, using configurable IFragmenter, Scorer, IFormatter, IEncoder and tokenizers. | ||
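A hedged sketch of the typical highlighting flow; the field name, query and text are illustrative, and SimpleHTMLFormatter wraps hits in <B> tags by default:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Search.Highlight;
using Version = Lucene.Net.Util.Version;

string text = "Lucene.Net is a port of the Lucene search engine library.";
Query query = new TermQuery(new Term("body", "lucene"));   // placeholder field and term

var highlighter = new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query));
string fragment = highlighter.GetBestFragment(
    new StandardAnalyzer(Version.LUCENE_30), "body", text);
// fragment contains the best-scoring section with hits wrapped in <B>...</B> tags.
```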
IEncoder | Encodes original text. The IEncoder works with the Formatter to generate the output. | ||
IFormatter | Processes terms found in the original text, typically by applying some form of mark-up to highlight terms in HTML search results pages. | ||
IFragmenter | Implements the policy for breaking text into multiple fragments for consideration by the Highlighter class. A sophisticated implementation may do this on the basis of detecting end of sentences in the text. | ||
InvalidTokenOffsetsException | |||
IScorer | Adds to the score for a fragment based on its tokens | ||
NullFragmenter | IFragmenter implementation which does not fragment the text. This is useful for highlighting the entire content of a document or field. | ||
QueryScorer | IScorer implementation which scores text fragments by the number of unique query terms found. This class converts appropriate Querys to SpanQuerys and attempts to score only those terms that participated in generating the 'hit' on the document. | ||
QueryTermScorer | |||
SimpleFragmenter | IFragmenter implementation which breaks text up into same-size fragments with no concerns over spotting sentence boundaries. | ||
SimpleHTMLEncoder | Simple IEncoder implementation to escape text for HTML output | ||
SimpleHTMLFormatter | Simple IFormatter implementation to highlight terms with a pre and post tag | ||
SimpleSpanFragmenter | |||
SpanGradientFormatter | Formats text with different color intensity depending on the score of the term using the span tag. GradientFormatter uses a bgcolor argument to the font tag which doesn't work in Mozilla, thus this class. | ||
TextFragment | Low-level class used to record information about a section of a document with a score. | ||
TokenGroup | One, or several overlapping tokens, along with the score(s) and the scope of the original text | ||
TokenSources | Hides implementation issues associated with obtaining a TokenStream for use with the highlighter - can obtain from TermFreqVectors with offsets and (optionally) positions, or from the Analyzer class by reparsing the stored content. | ||
StoredTokenStream | |||
WeightedSpanTerm | Lightweight class to hold term, Weight, and positions used for scoring this term. | ||
PositionSpan | |||
WeightedSpanTermExtractor | Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied Analysis.TokenStream. | ||
WeightedTerm | Lightweight class to hold term and a Weight value used for scoring this term | ||
Payloads | |||
AveragePayloadFunction | Calculate the final score as the average score of all payloads seen. Is thread safe and completely reusable | ||
MaxPayloadFunction | Returns the maximum payload score seen, else 1 if there are no payloads on the doc. Is thread safe and completely reusable | ||
MinPayloadFunction | Calculates the minimum payload seen | ||
PayloadFunction | An abstract class that defines a way for Payload*Query instances to transform the cumulative effects of payload scores for a document | ||
PayloadNearQuery | This class is very similar to Lucene.Net.Search.Spans.SpanNearQuery except that it factors in the value of the payloads located at each of the positions where the Lucene.Net.Search.Spans.TermSpans occurs. In order to take advantage of this, you must override Lucene.Net.Search.Similarity.ScorePayload which returns 1 by default. Payload scores are aggregated using a pluggable PayloadFunction | ||
PayloadNearSpanScorer | |||
PayloadNearSpanWeight | |||
PayloadSpanUtil | Experimental class to get the set of payloads for most standard Lucene queries. Operates like the Highlighter - the IndexReader should only contain the doc of interest; it is best to use MemoryIndex | ||
PayloadTermQuery | This class is very similar to Lucene.Net.Search.Spans.SpanTermQuery except that it factors in the value of the payload located at each of the positions where the Lucene.Net.Index.Term occurs. In order to take advantage of this, you must override Lucene.Net.Search.Similarity.ScorePayload(int, String, int, int, byte[],int,int) which returns 1 by default. Payload scores are aggregated using a pluggable PayloadFunction | ||
Similar | |||
MoreLikeThis | Generate "more like this" similarity queries. Based on this mail: | ||
MoreLikeThisQuery | |||
SimilarityQueries | Simple similarity measures | ||
Spans | |||
FieldMaskingSpanQuery | Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field. That is, the masked SpanQuery will function as normal, but SpanQuery.Field simply hands back the value supplied in this class's constructor. | ||
NearSpansOrdered | A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them. The formed spans only contain minimum-slop matches. The matching slop is computed from the distance(s) between the non-overlapping matching Spans. Successive matches are always formed from the successive Spans of the SpanNearQuery. The formed spans may contain overlaps when the slop is at least 1. For example, when querying using t1 t2 t3 with slop at least 1, the fragment: t1 t2 t1 t3 t2 t3 matches twice: t1 t2 .. t3 t1 .. t2 t3 | ||
NearSpansUnordered | Similar to NearSpansOrdered, but for the unordered case | ||
SpanFirstQuery | Matches spans near the beginning of a field. | ||
SpanNearQuery | Matches spans which are near one another. One can specify slop, the maximum number of intervening unmatched positions, as well as whether matches are required to be in-order. | ||
SpanNotQuery | Removes matches which overlap with another SpanQuery. | ||
SpanOrQuery | Matches the union of its clauses. | ||
SpanQuery | Base class for span-based queries. | ||
Spans | Expert: an enumeration of span matches. Used to implement span searching. Each span represents a range of term positions within a document. Matches are enumerated in order, by increasing document number, within that by increasing start position and finally by increasing end position. | ||
SpanScorer | Public for extension only. | ||
SpanTermQuery | Matches spans containing a term. | ||
SpanWeight | Expert-only. Public for use by other weight implementations | ||
TermSpans | Expert: Public for extension only | ||
Vectorhighlight | |||
BaseFragmentsBuilder | |||
FastVectorHighlighter | |||
FieldFragList | FieldFragList has a list of "frag info" that is used by a FragmentsBuilder class to create fragments (snippets). | ||
WeightedFragInfo | |||
FieldPhraseList | FieldPhraseList has a list of WeightedPhraseInfo that is used by FragListBuilder to create a FieldFragList object. | ||
WeightedPhraseInfo | |||
Toffs | |||
FieldQuery | |||
QueryPhraseMap | |||
FieldTermStack | FieldTermStack is a stack that keeps query terms in the specified field of the document to be highlighted. | ||
TermInfo | |||
FragListBuilder | |||
FragmentsBuilder | FragmentsBuilder is an interface for fragments (snippets) builder classes. A FragmentsBuilder class can be plugged in to Highlighter. | ||
ScoreOrderFragmentsBuilder | |||
ScoreComparator | |||
SimpleFragListBuilder | A simple implementation of FragListBuilder. | ||
SimpleFragmentsBuilder | A simple implementation of FragmentsBuilder. | ||
HashMap< K, V > | |||
BooleanFilter | |||
BoostingQuery | The BoostingQuery class can be used to effectively demote results that match a given query. Unlike a "NOT" clause, this still selects documents that contain undesirable terms, but reduces their overall score. | ||
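A sketch of the demotion pattern just described; the three-argument constructor (match query, context query, boost) mirrors the Java contrib class and the 0.1f boost is only illustrative:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // Keep all matches for "laptop", but multiply the score of hits that also
    // mention "refurbished" by 0.1 so they sink toward the bottom.
    Query positive = new TermQuery(new Term("body", "laptop"));
    Query negative = new TermQuery(new Term("body", "refurbished"));
    Query demoting = new BoostingQuery(positive, negative, 0.1f);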
DuplicateFilter | |||
FilterClause | |||
FuzzyLikeThisQuery | Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms. In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis, but with special consideration of fuzzy scoring factors. This generally produces good results for queries where users may provide details in a number of fields, have no knowledge of Boolean query syntax, and also want a degree of fuzzy matching and a fast query | ||
TermsFilter | A filter that contains multiple terms. | ||
SimpleFacetedSearch | |||
FacetName | |||
Hits | |||
HitsPerFacet | |||
FieldValuesBitSets | |||
BooleanClause | A clause in a BooleanQuery. | ||
BooleanQuery | A Query that matches documents matching boolean combinations of other queries, e.g. TermQuerys, PhraseQuerys or other BooleanQuerys. | ||
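A small sketch of combining clauses, assuming the standalone Occur enum of the 3.0.3 port (earlier releases use BooleanClause.Occur) and an already-open IndexSearcher:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    var bq = new BooleanQuery();
    bq.Add(new TermQuery(new Term("body", "lucene")), Occur.MUST);         // required
    bq.Add(new TermQuery(new Term("body", "search")), Occur.SHOULD);       // optional, improves score
    bq.Add(new TermQuery(new Term("body", "deprecated")), Occur.MUST_NOT); // excluded
    TopDocs hits = searcher.Search(bq, 10);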
TooManyClauses | Thrown when an attempt is made to add more than MaxClauseCount clauses. This typically happens if a PrefixQuery, FuzzyQuery, WildcardQuery, or TermRangeQuery is expanded to many terms during search. | ||
BooleanScorer | |||
BooleanScorer2 | An alternative to BooleanScorer that also allows a minimum number of optional scorers that should match. Implements skipTo(), and has no limitations on the numbers of added scorers. Uses ConjunctionScorer, DisjunctionScorer, ReqOptScorer and ReqExclScorer. | ||
CachingSpanFilter | Wraps another SpanFilter's result and caches it. The purpose is to allow filters to simply filter, and then wrap with this class to add caching. | ||
CachingWrapperFilter | Wraps another filter's result and caches it. The purpose is to allow filters to simply filter, and then wrap with this class to add caching. | ||
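A sketch of the filter-then-cache pattern described above; the "category" field and userQuery are placeholders:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // Wrap an ordinary query as a Filter, then cache its DocIdSet per reader so
    // repeated searches do not recompute it.
    Filter categoryFilter = new CachingWrapperFilter(
        new QueryWrapperFilter(new TermQuery(new Term("category", "book"))));
    TopDocs hits = searcher.Search(userQuery, categoryFilter, 20);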
Collector | Expert: Collectors are primarily meant to be used to gather raw results from a search, and implement sorting or custom result filtering, collation, etc. | ||
ComplexExplanation | Expert: Describes the score computation for document and query, and can distinguish a match independent of a positive value. | ||
ConjunctionScorer | Scorer for conjunctions, sets of queries, all of which are required. | ||
ConstantScoreQuery | A query that wraps a filter and simply returns a constant score equal to the query boost for every document in the filter. | ||
DefaultSimilarity | Expert: Default scoring implementation. | ||
DisjunctionMaxQuery | A query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries. This is useful when searching for a word in multiple fields with different boost factors (so that the fields cannot be combined equivalently into a single search field). We want the primary score to be the one associated with the highest boost, not the sum of the field scores (as BooleanQuery would give). If the query is "albino elephant" this ensures that "albino" matching one field and "elephant" matching another gets a higher score than "albino" matching both fields. To get this result, use both BooleanQuery and DisjunctionMaxQuery: for each term a DisjunctionMaxQuery searches for it in each field, while the set of these DisjunctionMaxQuery's is combined into a BooleanQuery. The tie breaker capability allows results that include the same term in multiple fields to be judged better than results that include this term in only the best of those multiple fields, without confusing this with the better case of two different terms in the multiple fields. | ||
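The "albino elephant" arrangement described above, sketched in code (field names and the 0.1f tie-breaker are illustrative; Occur is assumed to be the 3.0.3 port's standalone enum):

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // One DisjunctionMaxQuery per term, searching every field; a BooleanQuery combines the terms.
    var albino = new DisjunctionMaxQuery(0.1f);              // 0.1f = tie-breaker multiplier
    albino.Add(new TermQuery(new Term("title", "albino")));
    albino.Add(new TermQuery(new Term("body", "albino")));

    var elephant = new DisjunctionMaxQuery(0.1f);
    elephant.Add(new TermQuery(new Term("title", "elephant")));
    elephant.Add(new TermQuery(new Term("body", "elephant")));

    var combined = new BooleanQuery();
    combined.Add(albino, Occur.SHOULD);
    combined.Add(elephant, Occur.SHOULD);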
DisjunctionMaxScorer | The Scorer for DisjunctionMaxQuery's. The union of all documents generated by the subquery scorers is generated in document number order. The score for each document is the maximum of the scores computed by the subquery scorers that generate that document, plus tieBreakerMultiplier times the sum of the scores for the other subqueries that generate the document. | ||
DisjunctionSumScorer | A Scorer for OR like queries, counterpart of ConjunctionScorer . This Scorer implements DocIdSetIterator.Advance(int) and uses skipTo() on the given Scorers. | ||
DocIdSet | A DocIdSet contains a set of doc ids. Implementing classes must only implement Iterator to provide access to the set. | ||
AnonymousClassDocIdSet | |||
AnonymousClassDocIdSetIterator | |||
DocIdSetIterator | This abstract class defines methods to iterate over a set of non-decreasing doc ids. Note that this class assumes it iterates on doc Ids, and therefore NO_MORE_DOCS is set to Int32.MaxValue in order to be used as a sentinel object. Implementations of this class are expected to consider int.MaxValue as an invalid value. | ||
ExactPhraseScorer | |||
Explanation | Expert: Describes the score computation for document and query. | ||
IDFExplanation | Small Util class used to pass both an idf factor as well as an explanation for that factor | ||
CreationPlaceholder | Expert: Maintains caches of term values | ||
StringIndex | Expert: Stores term text values and document ordering data. | ||
CacheEntry | EXPERT: A unique Identifier/Description for each item in the FieldCache. Can be useful for logging/debugging. EXPERIMENTAL API: This API is considered extremely advanced and experimental. It may be removed or altered w/o warning in future releases of Lucene. | ||
FieldCache_Fields | |||
AnonymousClassByteParser | |||
AnonymousClassShortParser | |||
AnonymousClassIntParser | |||
AnonymousClassFloatParser | |||
AnonymousClassLongParser | |||
AnonymousClassDoubleParser | |||
AnonymousClassIntParser1 | |||
AnonymousClassFloatParser1 | |||
AnonymousClassLongParser1 | |||
AnonymousClassDoubleParser1 | |||
FieldCache | |||
Parser | Marker interface as super-interface to all parsers. It is used to specify a custom parser to SortField(String, Parser). | ||
ByteParser | Interface to parse bytes from document fields. | ||
ShortParser | Interface to parse shorts from document fields. | ||
IntParser | Interface to parse ints from document fields. | ||
FloatParser | Interface to parse floats from document fields. | ||
LongParser | Interface to parse longs from document fields. | ||
DoubleParser | Interface to parse doubles from document fields. | ||
FieldCacheImpl | Expert: The default cache implementation, storing all values in memory. A WeakDictionary is used for storage | ||
FieldCacheRangeFilter< T > | |||
FieldCacheTermsFilter | A Filter that only accepts documents whose single term value in the specified field is contained in the provided set of allowed terms | ||
FieldComparator | Expert: a FieldComparator compares hits so as to determine their sort order when collecting the top results with TopFieldCollector . The concrete public FieldComparator classes here correspond to the SortField types | ||
ByteComparator | Parses field's values as byte (using FieldCache.GetBytes(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
DocComparator | Sorts by ascending docID | ||
DoubleComparator | Parses field's values as double (using FieldCache.GetDoubles(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
FloatComparator | Parses field's values as float (using FieldCache.GetFloats(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
IntComparator | Parses field's values as int (using FieldCache.GetInts(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
LongComparator | Parses field's values as long (using FieldCache.GetLongs(Lucene.Net.Index.IndexReader, string)) and sorts by ascending value | ||
RelevanceComparator | Sorts by descending relevance. NOTE: if you are sorting only by descending relevance and then secondarily by ascending docID, performance is faster using TopScoreDocCollector directly (which Searcher.Search(Query, int) uses when no Sort is specified). | ||
ShortComparator | Parses field's values as short (using FieldCache.GetShorts(IndexReader, string)) and sorts by ascending value | ||
StringComparatorLocale | Sorts by a field's value using the Collator for a given Locale. | ||
StringOrdValComparator | Sorts by a field's natural String sort order, using ordinals. This is functionally equivalent to FieldComparator.StringValComparator, but it first resolves the strings to their relative ordinal positions (using the index returned by FieldCache.GetStringIndex) and does most comparisons using the ordinals. For medium to large results, this comparator will be much faster than FieldComparator.StringValComparator. For very small result sets it may be slower. | ||
StringValComparator | Sorts by a field's natural String sort order. All comparisons are done using String.compareTo, which is slow for medium to large result sets but possibly very fast for very small result sets. | ||
FieldComparatorSource | Provides a FieldComparator for custom field sorting | ||
FieldDoc | Expert: A ScoreDoc which also contains information about how to sort the referenced document. In addition to the document number and score, this object contains an array of values for the document from the field(s) used to sort. For example, if the sort criteria was to sort by fields "a", "b" then "c", the fields object array will have three elements, corresponding respectively to the term values for the document in fields "a", "b" and "c". The class of each element in the array will be either Integer, Float or String depending on the type of values in the terms of each field | ||
FieldDocSortedHitQueue | Expert: Collects sorted results from Searchable's and collates them. The elements put into this queue must be of type FieldDoc | ||
FieldValueHitQueue | Expert: A hit queue for sorting hits by terms in more than one field. Uses FieldCache.DEFAULT for maintaining internal term lookup tables | ||
Entry | |||
Filter | Abstract base class for restricting which documents may be returned during searching. | ||
FilteredDocIdSet | Abstract decorator class for a DocIdSet implementation that provides on-demand filtering/validation mechanism on a given DocIdSet | ||
FilteredDocIdSetIterator | Abstract decorator class of a DocIdSetIterator implementation that provides on-demand filter/validation mechanism on an underlying DocIdSetIterator. See FilteredDocIdSet | ||
FilteredQuery | A query that applies a filter to the results of another query | ||
FilteredTermEnum | Abstract class for enumerating a subset of all terms. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
FilterManager | Filter caching singleton. It can be used to save filters locally for reuse. This class makes it possible to cache Filters even when using RMI, as it keeps the cache on the searcher side of the RMI connection | ||
FuzzyQuery | Implements the fuzzy search query. The similarity measurement is based on the Levenshtein (edit distance) algorithm | ||
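A minimal sketch; 0.7f is an illustrative minimum-similarity threshold:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    // Matches terms whose edit-distance-based similarity to "lucene" is at least 0.7,
    // e.g. "lucen" or "lucenes".
    Query fuzzy = new FuzzyQuery(new Term("name", "lucene"), 0.7f);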
FuzzyTermEnum | Subclass of FilteredTermEnum for enumerating all terms that are similar to the specified filter term | ||
HitQueue | |||
IndexSearcher | Implements search over a single IndexReader | ||
MatchAllDocsQuery | A query that matches all documents | ||
MultiPhraseQuery | MultiPhraseQuery is a generalized version of PhraseQuery, with an added method Add(Term[]). To use this class to search for the phrase "Microsoft app*", first use Add(Term) on the term "Microsoft", then find all terms that have "app" as a prefix using IndexReader.Terms(Term), and use MultiPhraseQuery.Add(Term[] terms) to add them to the query | ||
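A sketch of the "Microsoft app*" example from the description; the property-style members (TermEnum.Term, Term.Field, Term.Text) and the IDisposable TermEnum follow the 3.0.3 port's conventions and should be checked against the actual API, and reader is an already-open IndexReader:

    using System.Collections.Generic;
    using Lucene.Net.Index;
    using Lucene.Net.Search;

    var query = new MultiPhraseQuery();
    query.Add(new Term("body", "microsoft"));                 // exact term at the first position

    // Collect every indexed term starting with "app" for the second position.
    var expansions = new List<Term>();
    using (TermEnum terms = reader.Terms(new Term("body", "app")))
    {
        do
        {
            Term t = terms.Term;
            if (t == null || t.Field != "body" || !t.Text.StartsWith("app")) break;
            expansions.Add(t);
        } while (terms.Next());
    }

    query.Add(expansions.ToArray());                          // any of these may occupy position 2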
MultiSearcher | Implements search over a set of Searchables | ||
MultiTermQuery | An abstract Query that matches documents containing a subset of terms provided by a FilteredTermEnum enumeration | ||
AnonymousClassConstantScoreAutoRewrite | |||
ConstantScoreAutoRewrite | A rewrite method that tries to pick the best constant-score rewrite method based on term and document counts from the query. If both the number of terms and documents is small enough, then CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE is used. Otherwise, CONSTANT_SCORE_FILTER_REWRITE is used. | ||
RewriteMethod | Abstract class that defines how the query is rewritten. | ||
MultiTermQueryWrapperFilter< T > | A wrapper for MultiTermQuery, that exposes its functionality as a Filter. MultiTermQueryWrapperFilter is not designed to be used by itself. Normally you subclass it to provide a Filter counterpart for a MultiTermQuery subclass. For example, TermRangeFilter and PrefixFilter extend MultiTermQueryWrapperFilter . This class also provides the functionality behind MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE; this is why it is not abstract. | ||
NumericRangeFilter< T > | A Filter that only accepts numeric values within a specified range. To use this, you must first index the numeric values using NumericField (expert: NumericTokenStream ) | ||
NumericRangeQuery< T > | A Query that matches numeric values within a specified range. To use this, you must first index the numeric values using NumericField (expert: NumericTokenStream ). If your terms are instead textual, you should use TermRangeQuery. NumericRangeFilter{T} is the filter equivalent of this query. | ||
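A sketch of the index-then-query pairing the description calls for; the field name and bounds are illustrative:

    using Lucene.Net.Documents;
    using Lucene.Net.Search;

    // At index time, store the value as a trie-encoded numeric field.
    var doc = new Document();
    doc.Add(new NumericField("price").SetIntValue(42));

    // At search time, query the same field; true/true = inclusive bounds.
    Query inRange = NumericRangeQuery.NewIntRange("price", 10, 100, true, true);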
ParallelMultiSearcher | Implements parallel search over a set of Searchables | ||
PhrasePositions | Position of a term in a document that takes into account the term offset within the phrase. | ||
PhraseQuery | A Query that matches documents containing a particular sequence of terms. A PhraseQuery is built by QueryParser for input like "new york" | ||
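A small sketch for the "new york" example; the Slop property name follows the 3.0.3 port's property conventions and is worth verifying:

    using Lucene.Net.Index;
    using Lucene.Net.Search;

    var phrase = new PhraseQuery();
    phrase.Add(new Term("body", "new"));
    phrase.Add(new Term("body", "york"));
    phrase.Slop = 1;    // allow one intervening position between the terms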
PhraseQueue | |||
PhraseScorer | Expert: Scoring functionality for phrase queries. A document is considered matching if it contains the phrase-query terms at "valid" positions. What "valid positions" are depends on the type of the phrase query: for an exact phrase query the terms are required to appear in adjacent locations, while for a sloppy phrase query some distance between the terms is allowed. The abstract method PhraseFreq() of extending classes is invoked for each document containing all the phrase query terms, in order to compute the frequency of the phrase query in that document. A non-zero frequency means a match. | ||
PositiveScoresOnlyCollector | A Collector implementation which wraps another Collector and makes sure only documents with scores > 0 are collected. | ||
PrefixFilter | A Filter that restricts search results to values that have a matching prefix in a given field. | ||
PrefixQuery | A Query that matches documents containing terms with a specified prefix. A PrefixQuery is built by QueryParser for input like app* | ||
PrefixTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified prefix filter term. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it | ||
Query | The abstract base class for queries. Instantiable subclasses include TermQuery, BooleanQuery, PhraseQuery, PrefixQuery, WildcardQuery, FuzzyQuery, TermRangeQuery and NumericRangeQuery. A parser for queries is contained in the Lucene.Net.QueryParsers.QueryParser class. | ||
QueryTermVector | |||
QueryWrapperFilter | Constrains search results to only match those which also match a provided query | ||
ReqExclScorer | A Scorer for queries with a required subscorer and an excluding (prohibited) sub DocIdSetIterator. This Scorer implements DocIdSetIterator.Advance(int), and it uses the skipTo() on the given scorers. | ||
ReqOptSumScorer | A Scorer for queries with a required part and an optional part. Delays skipTo() on the optional part until a score() is needed. This Scorer implements DocIdSetIterator.Advance(int). | ||
ScoreCachingWrappingScorer | A Scorer which wraps another scorer and caches the score of the current document. Successive calls to Score() will return the same result and will not invoke the wrapped Scorer's score() method, unless the current document has changed. This class might be useful due to the changes done to the Collector interface, in which the score is not computed for a document by default, only if the collector requests it. Some collectors may need to use the score in several places, however all they have in hand is a Scorer object, and might end up computing the score of a document more than once. | ||
ScoreDoc | Expert: Returned by low-level search implementations. | ||
Scorer | Expert: Common scoring functionality for different types of queries | ||
Searchable | The interface for search implementations | ||
Searcher | An abstract base class for search implementations. Implements the main search methods | ||
Similarity | Expert: Scoring API. Subclasses implement search scoring | ||
SimilarityDelegator | Expert: Delegating scoring implementation. Useful in Query.GetSimilarity(Searcher) implementations, to override only certain methods of a Searcher's Similarity implementation. | ||
SingleTermEnum | Subclass of FilteredTermEnum for enumerating a single term. This can be used by MultiTermQuerys that need only visit one term, but want to preserve MultiTermQuery semantics such as RewriteMethod. | ||
SloppyPhraseScorer | |||
Sort | Encapsulates sort criteria for returned hits | ||
SortField | Stores information about how to sort documents by terms in an individual field. Fields must be indexed in order to sort by them | ||
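A sketch of a sorted search; the "surname" field must be indexed as a single untokenized term, and the secondary SortField.FIELD_SCORE breaks ties by relevance:

    using Lucene.Net.Search;

    var sort = new Sort(new SortField("surname", SortField.STRING), SortField.FIELD_SCORE);
    TopDocs hits = searcher.Search(query, null, 20, sort);   // null = no Filter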
SpanFilter | Abstract base class providing a mechanism to restrict searches to a subset of an index and also maintains and returns position information. This is useful if you want to compare the positions from a SpanQuery with the positions of items in a filter. For instance, if you had a SpanFilter that marked all the occurrences of the word "foo" in documents, and then you entered a new SpanQuery containing bar, you could not only filter by the word foo, but you could then compare position information for post processing. | ||
SpanFilterResult | The results of a SpanQueryFilter. Wraps the BitSet and the position information from the SpanQuery | ||
PositionInfo | |||
StartEnd | |||
SpanQueryFilter | Constrains search results to only match those which also match a provided query. Also provides position information about where each document matches at the cost of extra space compared with the QueryWrapperFilter. There is an added cost to this above what is stored in a QueryWrapperFilter. Namely, the position information for each matching document is stored. This filter does not cache. See the Lucene.Net.Search.CachingSpanFilter for a wrapper that caches | ||
TermQuery | A Query that matches documents containing a term. This may be combined with other terms with a BooleanQuery. | ||
TermRangeFilter | A Filter that restricts search results to a range of values in a given field | ||
TermRangeQuery | A Query that matches documents within an exclusive range of terms | ||
TermRangeTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified range parameters. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
TermScorer | Expert: A Scorer for documents matching a Term . | ||
TimeLimitingCollector | The TimeLimitingCollector is used to timeout search requests that take longer than the maximum allowed search time limit. After this time is exceeded, the search thread is stopped by throwing a TimeExceededException. | ||
TimeExceededException | Thrown when elapsed search time exceeds allowed search time. | ||
TopDocs | Represents hits returned by Searcher.Search(Query,Filter,int) and Searcher.Search(Query,int) | ||
TopDocsCollector< T > | A base class for all collectors that return a Lucene.Net.Search.TopDocs output. This collector allows easy extension by providing a single constructor which accepts a PriorityQueue{T} as well as protected members for that priority queue and a counter of the number of total hits. Extending classes can override TopDocs(int, int) and TotalHits in order to provide their own implementation. | ||
TopFieldCollector | A Collector that sorts by SortField using FieldComparators. See the Create method for instantiating a TopFieldCollector | ||
TopFieldDocs | Represents hits returned by Searcher.Search(Query,Filter,int,Sort). | ||
TopScoreDocCollector | A Collector implementation that collects the top-scoring hits, returning them as a TopDocs. This is used by IndexSearcher to implement TopDocs-based search. Hits are sorted by score descending and then (when the scores are tied) docID ascending. When you create an instance of this collector you should know in advance whether documents are going to be collected in doc Id order or not | ||
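A sketch of collector-based search, combining TopScoreDocCollector with the TimeLimitingCollector described earlier in this table (the 10-hit limit and 2000 ms budget are illustrative):

    using Lucene.Net.Search;

    // Collect the top 10 hits; 'true' because documents are delivered in increasing doc id order here.
    TopScoreDocCollector collector = TopScoreDocCollector.Create(10, true);

    // Abort the search with TimeExceededException if it runs longer than ~2 seconds.
    Collector limited = new TimeLimitingCollector(collector, 2000);

    searcher.Search(query, limited);
    ScoreDoc[] hits = collector.TopDocs().ScoreDocs;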
Weight | Expert: Calculate query weights and build query scorers. The purpose of Weight is to ensure searching does not modify a Query, so that a Query instance can be reused. Searcher-dependent state of the query should reside in the Weight; IndexReader-dependent state should reside in the Scorer. A Weight is used in the following way: a Weight is constructed for a top-level query given a Searcher, the sum of squared weights is computed and used to derive the query normalization factor, that factor is passed back to normalize the Weight, and finally a Scorer is constructed from the Weight to score documents. | ||
WildcardQuery | Implements the wildcard search query. Supported wildcards are * , which matches any character sequence (including the empty one), and ? , which matches any single character. Note this query can be slow, as it needs to iterate over many terms. In order to prevent extremely slow WildcardQueries, a Wildcard term should not start with one of the wildcards * or ? | ||
WildcardTermEnum | Subclass of FilteredTermEnum for enumerating all terms that match the specified wildcard filter term. Term enumerations are always ordered by Term.compareTo(). Each term in the enumeration is greater than all that precede it. | ||
Spatial | |||
BBox | |||
AreaSimilarity | The algorithm is implemented as envelope-on-envelope overlays rather than complex polygon on complex polygon overlays. Spatial relevance scoring algorithm: queryArea = the area of the input query envelope; targetArea = the area of the target envelope (per Lucene document); intersectionArea = the area of the intersection for the query/target envelopes; queryPower = the weighting power associated with the query envelope (default = 1.0); targetPower = the weighting power associated with the target envelope (default = 1.0); queryRatio = intersectionArea / queryArea; targetRatio = intersectionArea / targetArea; queryFactor = Math.pow(queryRatio, queryPower); targetFactor = Math.pow(targetRatio, targetPower); score = queryFactor * targetFactor. Based on Geoportal's SpatialRankingValueSource | ||
BBoxSimilarity | Abstraction of the calculation used to determine how similar two Bounding Boxes are. | ||
BBoxSimilarityValueSource | |||
BBoxStrategy | |||
DistanceSimilarity | Returns the distance between the center of the indexed rectangle and the query shape. | ||
Prefix | |||
Tree | |||
GeohashPrefixTree | A SpatialPrefixGrid based on Geohashes. Uses GeohashUtils to do all the geohash work. | ||
Factory | Factory for creating GeohashPrefixTree instances with useful defaults | ||
GhCell | |||
Node | |||
QuadPrefixTree | Implementation of SpatialPrefixTree which uses a quad tree (http://en.wikipedia.org/wiki/Quadtree) | ||
Factory | Factory for creating QuadPrefixTree instances with useful defaults | ||
QuadCell | |||
SpatialPrefixTree | A spatial Prefix Tree, or Trie, which decomposes shapes into prefixed strings at variable lengths corresponding to variable precision. Each string corresponds to a spatial region | ||
SpatialPrefixTreeFactory | Abstract Factory for creating SpatialPrefixTree instances with useful defaults and passed on configurations defined in a Map. | ||
PointPrefixTreeFieldCacheProvider | Implementation of ShapeFieldCacheProvider designed for PrefixTreeStrategys | ||
PrefixTreeStrategy | Abstract SpatialStrategy which provides common functionality for those Strategys which use SpatialPrefixTrees | ||
CellTokenStream | Outputs the tokenString of a cell and, if it is a leaf, outputs it again with the leaf byte. | ||
RecursivePrefixTreeFilter | Performs a spatial intersection filter against a field indexed with SpatialPrefixTree, a Trie. SPT yields terms (grids) at length 1 and at greater lengths corresponding to greater precisions. This filter recursively traverses each grid length and uses methods on Shape to efficiently determine whether all points at a prefix fit in the shape or not, either to short-circuit unnecessary traversals or to efficiently load all enclosed points. | ||
RecursivePrefixTreeStrategy | Based on RecursivePrefixTreeFilter. | ||
TermQueryPrefixTreeStrategy | A basic implementation using a large TermsFilter of all the nodes from SpatialPrefixTree#getNodes(com.spatial4j.core.shape.Shape, int, boolean). | ||
Queries | |||
SpatialArgsParser | |||
SpatialOperation | |||
UnsupportedSpatialOperation | |||
Util | |||
IBits | Interface for Bitset-like structures. | ||
Bits | Empty implementation, basically just so we can provide EMPTY_ARRAY | ||
MatchAllBits | Bits impl of the specified length with all bits set. | ||
MatchNoBits | Bits impl of the specified length with no bits set. | ||
CachingDoubleValueSource | |||
CachingDoubleDocValue | |||
FixedBitSet | |||
FixedBitSetIterator | A FixedBitSet Iterator implementation | ||
FunctionQuery | Port of Solr's FunctionQuery (v1.4) | ||
AllScorer | |||
FunctionWeight | |||
ReciprocalFloatFunction | |||
FloatDocValues | |||
ShapeFieldCache< T > | Bounded Cache of Shapes associated with docIds. Note, multiple Shapes can be associated with a given docId | ||
ShapeFieldCacheDistanceValueSource | An implementation of the Lucene ValueSource model to support spatial relevance ranking. | ||
CachedDistanceDocValues | |||
ShapeFieldCacheProvider< T > | Provides access to a ShapeFieldCache for a given AtomicReader | ||
TermsEnumCompatibility | Wraps Lucene 3 TermEnum to make it look like a Lucene 4 TermsEnum SOLR-2155 | ||
TermsFilter | Constructs a filter for docs matching any of the terms added to this class. Unlike a RangeFilter this can be used for filtering on multiple terms that are not necessarily in a sequence. An example might be a collection of primary keys from a database query result or perhaps a choice of "category" labels picked by the end user. As a filter, this is much faster than the equivalent query (a BooleanQuery with many "should" TermQueries) | ||
ValueSourceFilter | Filter that matches all documents where a valuesource is in between a range of min and max inclusive. | ||
ValueSourceFilteredDocIdSet | |||
Vector | |||
DistanceValueSource | An implementation of the Lucene ValueSource model that returns the distance. | ||
DistanceDocValues | |||
PointVectorStrategy | Simple SpatialStrategy which represents Points in two numeric DoubleFields | ||
Store | |||
AlreadyClosedException | This exception is thrown when there is an attempt to access something that has already been closed. | ||
BufferedIndexInput | Base implementation class for buffered IndexInput. | ||
BufferedIndexOutput | Base implementation class for buffered IndexOutput. | ||
ChecksumIndexInput | Reads bytes through to a primary IndexInput, computing checksum as it goes. Note that you cannot use seek(). | ||
ChecksumIndexOutput | Writes bytes through to a primary IndexOutput, computing checksum. Note that you cannot use seek(). | ||
Directory | A Directory is a flat list of files. Files may be written once, when they are created. Once a file is created it may only be opened for read, or deleted. Random access is permitted both when reading and writing | ||
FileSwitchDirectory | Expert: A Directory instance that switches files between two other Directory instances. Files with the specified extensions are placed in the primary directory; others are placed in the secondary directory. The provided Set must not change once passed to this class, and must allow multiple threads to call contains at once. | ||
FSDirectory | Base class for Directory implementations that store index files in the file system. There are currently three core subclasses: | ||
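A sketch of opening a file-system Directory and adding one document; the path is a placeholder, and the using-based cleanup assumes the IDisposable pattern of the 3.0.3 port:

    using System.IO;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.Store;
    using Version = Lucene.Net.Util.Version;

    var dir = FSDirectory.Open(new DirectoryInfo(@"c:\my-index"));
    var analyzer = new StandardAnalyzer(Version.LUCENE_30);

    using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
    {
        var doc = new Document();
        doc.Add(new Field("body", "hello lucene", Field.Store.YES, Field.Index.ANALYZED));
        writer.AddDocument(doc);
    }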
FSLockFactory | Base class for file system based locking implementation. | ||
IndexInput | Abstract base class for input from a file in a Directory. A random-access input stream. Used for all Lucene index input operations. | ||
IndexOutput | Abstract base class for output to a file in a Directory. A random-access output stream. Used for all Lucene index output operations. | ||
Lock | An interprocess mutex lock, typically obtained from a Directory and used through the nested With utility class to run a block of code while holding the lock. | ||
With | Utility class for executing code with exclusive access. | ||
LockFactory | Base class for Locking implementation. Directory uses instances of this class to implement locking. | ||
LockObtainFailedException | This exception is thrown when the write.lock could not be acquired. This happens when a writer tries to open an index that another writer already has open. | ||
LockReleaseFailedException | This exception is thrown when the write.lock could not be released. | ||
LockStressTest | Simple standalone tool that forever acquires & releases a lock using a specific LockFactory. Run without any args to see usage | ||
LockVerifyServer | Simple standalone server that must be running when you use VerifyingLockFactory. This server simply verifies at most one process holds the lock at a time. Run without any args to see usage | ||
MMapDirectory | File-based Directory implementation that uses mmap for reading, and SimpleFSDirectory.SimpleFSIndexOutput for writing | ||
NativeFSLockFactory | Implements LockFactory using native OS file locks. Note that because this LockFactory relies on java.nio.* APIs for locking, any problems with those APIs will cause locking to fail. Specifically, on certain NFS environments the java.nio.* locks will fail (the lock can incorrectly be double acquired) whereas SimpleFSLockFactory worked perfectly in those same environments. For NFS based access to an index, it's recommended that you try SimpleFSLockFactory first and work around the one limitation that a lock file could be left when the JVM exits abnormally. | ||
NativeFSLock | |||
NIOFSDirectory | Not implemented. Waiting for volunteers. | ||
NIOFSIndexInput | Not implemented. Waiting for volunteers. | ||
NoLockFactory | Use this LockFactory to disable locking entirely. Only one instance of this lock is created. You should call Instance to get the instance | ||
NoLock | |||
NoSuchDirectoryException | This exception is thrown when you try to list a non-existent directory. | ||
RAMDirectory | A memory-resident Directory implementation. Locking implementation is by default the SingleInstanceLockFactory but can be changed with Directory.SetLockFactory. | ||
RAMFile | |||
RAMInputStream | A memory-resident IndexInput implementation | ||
RAMOutputStream | A memory-resident IndexOutput implementation | ||
SimpleFSDirectory | A straightforward implementation of FSDirectory using java.io.RandomAccessFile. However, this class has poor concurrent performance (multiple threads will bottleneck) as it synchronizes when multiple threads read from the same file. It's usually better to use NIOFSDirectory or MMapDirectory instead. | ||
SimpleFSIndexOutput | |||
SimpleFSLockFactory | Implements LockFactory using System.IO.FileInfo.Create() . | ||
SimpleFSLock | |||
SingleInstanceLockFactory | Implements LockFactory for a single in-process instance, meaning all locking will take place through this one instance. Only use this LockFactory when you are certain all IndexReaders and IndexWriters for a given index are running against a single shared in-process Directory instance. This is currently the default locking for RAMDirectory | ||
SingleInstanceLock | |||
VerifyingLockFactory | A LockFactory that wraps another LockFactory and verifies that each lock obtain/release is "correct" (never results in two processes holding the lock at the same time). It does this by contacting an external server (LockVerifyServer) to assert that at most one process holds the lock at a time. To use this, you should also run LockVerifyServer on the host & port matching what you pass to the constructor | ||
Support | |||
Compatibility | |||
AppSettings | |||
BitSetSupport | This class provides supporting methods of java.util.BitSet that are not present in System.Collections.BitArray. | ||
BuildType | |||
Character | Mimics Java's Character class. | ||
CloseableThreadLocalProfiler | For debugging purposes. | ||
CollectionsHelper | Support class used to handle Hashtable addition, which does a check first to make sure the added item is unique in the hash. | ||
Compare | Summary description for TestSupportClass. | ||
CRC32 | |||
Deflater | |||
Double | |||
EquatableList< T > | Represents a strongly typed list of objects that can be accessed by index. Provides methods to search, sort, and manipulate lists. Also provides functionality to compare lists against each other through an implementations of IEquatable{T}. | ||
FileSupport | Represents the methods to support some operations over files. | ||
HashMap< TKey, TValue > | A C# emulation of the Java Hashmap | ||
IChecksum | Contains conversion support elements such as classes, interfaces and static methods. | ||
Inflater | |||
IThreadRunnable | This interface should be implemented by any class whose instances are intended to be executed by a thread. | ||
Number | A simple class for number conversions. | ||
OS | Provides platform information. | ||
SharpZipLib | |||
Single | |||
TextSupport | |||
ThreadClass | Support class used to handle threads | ||
ThreadLock | Abstract base class that provides a synchronization interface for derived lock types | ||
WeakDictionary< TKey, TValue > | |||
Util | |||
Cache | |||
AbstractSegmentCache | Root custom cache to allow a factory to retain references to the custom caches without having to be aware of the type. | ||
SegmentCache< T > | Custom cache with two levels of keys, outer key is the IndexReader with the inner key being a string, commonly a field name but can be anything. Refer to the unit tests for an example implementation. | ||
Cache< TKey, TValue > | Base class for cache implementations. | ||
SimpleLRUCache< TKey, TValue > | |||
SimpleMapCache< TKey, TValue > | Simple cache implementation that uses a HashMap to store (key, value) pairs. This cache is not synchronized, use Cache{TKey, TValue}.SynchronizedCache(Cache{TKey, TValue}) if needed. | ||
ArrayUtil | Methods for manipulating arrays. | ||
Attribute | Base class for Attributes that can be added to a Lucene.Net.Util.AttributeSource. Attributes are used to add data in a dynamic, yet type-safe way to a source of usually streamed objects, e. g. a Lucene.Net.Analysis.TokenStream. | ||
AttributeSource | An AttributeSource contains a list of different Attributes, and methods to add and get them. There can only be a single instance of an attribute in the same AttributeSource instance. This is ensured by passing the actual type of the Attribute to AddAttribute{T}(), which then checks if an instance of that type is already present. If yes, it returns the instance; otherwise it creates a new instance and returns it. | ||
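A sketch of consuming attributes from a TokenStream; the ITermAttribute interface and its Term property follow this release's renamed attribute interfaces and are worth verifying:

    using System;
    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Analysis.Tokenattributes;
    using Version = Lucene.Net.Util.Version;

    var analyzer = new StandardAnalyzer(Version.LUCENE_30);
    TokenStream ts = analyzer.TokenStream("body", new StringReader("Hello Lucene.Net"));

    // AddAttribute returns the single per-stream instance, creating it if necessary.
    ITermAttribute term = ts.AddAttribute<ITermAttribute>();
    while (ts.IncrementToken())
    {
        Console.WriteLine(term.Term);    // prints each token's text
    }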
AttributeFactory | An AttributeFactory creates instances of Attributes. | ||
State | This class holds the state of an AttributeSource. | ||
AverageGuessMemoryModel | An average, best guess, MemoryModel that should work okay on most systems | ||
BitUtil | A variety of highly efficient bit-twiddling routines | ||
BitVector | Optimized implementation of a vector of bits. This is more-or-less like java.util.BitSet, but also includes a count() method that efficiently computes the number of one bits, optimized read from and write to disk, an inlinable get() method, and store/load as a bit set or d-gaps depending on sparseness. | ||
CloseableThreadLocal | Java's builtin ThreadLocal has a serious flaw: it can take an arbitrarily long amount of time to dereference the things you had stored in it, even once the ThreadLocal instance itself is no longer referenced. This is because there is a single, master map stored for each thread, which all ThreadLocals share, and that master map only periodically purges "stale" entries | ||
CloseableThreadLocal< T > | Java's builtin ThreadLocal has a serious flaw: it can take an arbitrarily long amount of time to dereference the things you had stored in it, even once the ThreadLocal instance itself is no longer referenced. This is because there is a single, master map stored for each thread, which all ThreadLocals share, and that master map only periodically purges "stale" entries | ||
Constants | Some useful constants. | ||
DocIdBitSet | Simple DocIdSet and DocIdSetIterator backed by a BitSet | ||
FieldCacheSanityChecker | Provides methods for sanity checking that entries in the FieldCache are not wasteful or inconsistent. Lucene 2.9 Introduced numerous enhancements into how the FieldCache is used by the low levels of Lucene searching (for Sorting and ValueSourceQueries) to improve both the speed for Sorting, as well as reopening of IndexReaders. But these changes have shifted the usage of FieldCache from "top level" IndexReaders (frequently a MultiReader or DirectoryReader) down to the leaf level SegmentReaders. As a result, existing applications that directly access the FieldCache may find RAM usage increase significantly when upgrading to 2.9 or Later. This class provides an API for these applications (or their Unit tests) to check at run time if the FieldCache contains "insane" usages of the FieldCache. EXPERIMENTAL API: This API is considered extremely advanced and experimental. It may be removed or altered w/o warning in future releases of Lucene. | ||
Insanity | Simple container for a collection of related CacheEntry objects that in conjunction with each other represent some "insane" usage of the FieldCache. | ||
InsanityType | An enumeration of the different types of "insane" behavior that may be detected in a FieldCache | ||
IAttribute | Base interface for attributes. | ||
IdentityDictionary< TKey, TValue > | A class that mimics Java's IdentityHashMap in that it determines object equality solely on ReferenceEquals rather than (possibly overloaded) object.Equals() | ||
IndexableBinaryStringTools | Provides support for converting byte sequences to Strings and back again. The resulting Strings preserve the original byte sequences' sort order | ||
MapOfSets< TKey, TValue > | Helper class for keeping Lists of Objects associated with keys. WARNING: THIS CLASS IS NOT THREAD SAFE | ||
MemoryModel | Returns primitive memory sizes for estimating RAM usage | ||
NumericUtils | This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs | ||
IntRangeBuilder | Expert: Callback for SplitIntRange. You need to overwrite only one of the methods. <font color="red">NOTE: This is a very low-level interface, the method signatures may change in later versions.</font> | ||
LongRangeBuilder | Expert: Callback for SplitLongRange. You need to overwrite only one of the methods. <font color="red">NOTE: This is a very low-level interface, the method signatures may change in later versions.</font> | ||
OpenBitSet | An "open" BitSet implementation that allows direct access to the array of words storing the bits. Unlike java.util.bitset, the fact that bits are packed into an array of longs is part of the interface. This allows efficient implementation of other algorithms by someone other than the author. It also allows one to efficiently implement alternate serialization or interchange formats. OpenBitSet is faster than java.util.BitSet in most operations and much faster at calculating cardinality of sets and results of set operations. It can also handle sets of larger cardinality (up to 64 * 2**32-1) The goals of OpenBitSet are the fastest implementation possible, and maximum code reuse. Extra safety and encapsulation may always be built on top, but if that's built in, the cost can never be removed (and hence people re-implement their own version in order to get better performance). If you want a "safe", totally encapsulated (and slower and limited) BitSet class, use java.util.BitSet . | ||
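A small sketch of direct bit manipulation with OpenBitSet:

    using Lucene.Net.Util;

    var bits = new OpenBitSet();        // grows as needed; indices are longs
    bits.Set(3);
    bits.Set(64);
    bool isSet = bits.Get(64);          // true
    long count = bits.Cardinality();    // 2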
OpenBitSetDISI | |||
OpenBitSetIterator | An iterator to iterate over set bits in an OpenBitSet. This is faster than nextSetBit() for iterating over the complete set of bits, especially when the density of the bits set is high | ||
PriorityQueue< T > | A PriorityQueue maintains a partial ordering of its elements such that the least element can always be found in constant time. Put()'s and pop()'s require log(size) time | ||
RamUsageEstimator | Estimates the size of a given Object using a given MemoryModel for primitive size information | ||
ReaderUtil | Common util methods for dealing with IndexReaders. | ||
ScorerDocQueue | A ScorerDocQueue maintains a partial ordering of its Scorers such that the least Scorer can always be found in constant time. Put()'s and pop()'s require log(size) time. The ordering is by Scorer.doc(). | ||
SimpleStringInterner | Simple lockless and memory barrier free String intern cache that is guaranteed to return the same String instance as String.intern() does. | ||
SmallFloat | Floating point numbers smaller than 32 bits | ||
SortedVIntList | Stores and iterates over sorted integers in compressed form in RAM. The code for compressing the differences between ascending integers was borrowed from Lucene.Net.Store.IndexInput and Lucene.Net.Store.IndexOutput. NOTE: this class assumes the stored integers are doc Ids (hence why it extends DocIdSet). Therefore its Iterator() assumes DocIdSetIterator.NO_MORE_DOCS can be used as a sentinel. If you intend to use this value, then make sure it's not used during search flow. | ||
SorterTemplate | Borrowed from Cglib. Allows custom swap so that two arrays can be sorted at the same time. | ||
StringHelper | Methods for manipulating strings. | ||
StringInterner | Subclasses of StringInterner are required to return the same single String object for all equal strings. Depending on the implementation, this may not be the same object returned as String.intern() | ||
ToStringUtils | Helper methods to ease implementing Object.ToString(). | ||
LucenePackage | Lucene's package information, including version. | ||
LuceneMonitorInstall | |||
ProjectInstaller | Summary description for ProjectInstaller. | ||
SF | |||
Snowball | |||
Ext | |||
DanishStemmer | Generated class implementing code defined by a snowball script. | ||
DutchStemmer | Generated class implementing code defined by a snowball script. | ||
EnglishStemmer | Generated class implementing code defined by a snowball script. | ||
FinnishStemmer | Generated class implementing code defined by a snowball script. | ||
FrenchStemmer | Generated class implementing code defined by a snowball script. | ||
German2Stemmer | Generated class implementing code defined by a snowball script. | ||
GermanStemmer | Generated class implementing code defined by a snowball script. | ||
HungarianStemmer | |||
ItalianStemmer | Generated class implementing code defined by a snowball script. | ||
KpStemmer | Generated class implementing code defined by a snowball script. | ||
LovinsStemmer | Generated class implementing code defined by a snowball script. | ||
NorwegianStemmer | Generated class implementing code defined by a snowball script. | ||
PorterStemmer | Generated class implementing code defined by a snowball script. | ||
PortugueseStemmer | |||
RomanianStemmer | |||
RussianStemmer | Generated class implementing code defined by a snowball script. | ||
SpanishStemmer | Generated class implementing code defined by a snowball script. | ||
SwedishStemmer | Generated class implementing code defined by a snowball script. | ||
TurkishStemmer | |||
Among | |||
SnowballProgram | This is the rev 500 of the snowball SVN trunk, but modified: made abstract and introduced abstract method stem to avoid expensive reflection in filter class | ||
TestApp | |||
Simplicit | |||
Net | |||
Lzo | |||
LZOCompressor | Wrapper class for the highly performant LZO compression library | ||
Spatial4n | |||
Core | |||
Exceptions | |||
InvalidSpatialArgument | |||
SpellChecker | |||
Net | |||
Search | |||
Spell | |||
IDictionary | A simple interface representing a Dictionary | ||
JaroWinklerDistance | |||
LevenshteinDistance | Levenshtein edit distance | ||
LuceneDictionary | Lucene Dictionary: terms are taken from the given field of a Lucene index. | ||
NGramDistance | |||
PlainTextDictionary | Dictionary represented by a plain text file. Format: one word per line, e.g. word1 word2 word3 | ||
SpellChecker | |||
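A rough sketch of building a spelling index and asking for suggestions; the IndexDictionary and SuggestSimilar member names are assumed to mirror the Java contrib API, reader is an already-open IndexReader, and the spelling index is kept in RAM here:

    using Lucene.Net.Store;
    using Spell = SpellChecker.Net.Search.Spell;

    var spellIndex = new RAMDirectory();
    var spell = new Spell.SpellChecker(spellIndex);

    // Feed it every term of the "title" field from an existing index.
    spell.IndexDictionary(new Spell.LuceneDictionary(reader, "title"));

    string[] suggestions = spell.SuggestSimilar("lucenne", 5);   // e.g. "lucene"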
StringDistance | Interface for string distances. | ||
SuggestWord | SuggestWord class, used by the SuggestSimilar method of the SpellChecker class | ||
SuggestWordQueue | |||
TRStringDistance | Edit distance class | ||
System | |||
WorldNet | |||
Net | |||
SynExpand | Expand a query by looking up synonyms for every term. You need to invoke Syns2Index first to build the synonym index | ||
Syns2Index | From project WordNet.Net.Syns2Index | ||
SynLookup | Test program to look up synonyms. | ||
Syns2Index | From project WordNet.Net.Syns2Index | ||
Syns2Index | Convert the prolog file wn_s.pl from the WordNet prolog download into a Lucene index suitable for looking up synonyms and performing query expansion (SynExpand.Expand) |