Apache Lucene.Net 2.4.0 Class Library API

IndexWriter Methods

The methods of the IndexWriter class are listed below. For a complete list of IndexWriter class members, see the IndexWriter Members topic.

Public Static Methods

GetDefaultInfoStream Returns the current default infoStream for newly instantiated IndexWriters.
GetDefaultWriteLockTimeout Returns default write lock timeout for newly instantiated IndexWriters.
IsLocked Overloaded. (See the lock-recovery sketch following this list.)
SetDefaultInfoStream If non-null, this will be the default infoStream used by a newly instantiated IndexWriter.
SetDefaultWriteLockTimeout Sets the default (for any instance of IndexWriter) maximum time to wait for a write lock (in milliseconds).
Unlock 
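
The three static lock helpers above (SetDefaultWriteLockTimeout, IsLocked, Unlock) are typically used for recovery after a process died while holding the write lock. The lock-recovery sketch below is only illustrative: the index path is hypothetical, the 10-second timeout is an arbitrary example, and FSDirectory.GetDirectory is assumed as the way to open a file-system directory in this release.

    using System;
    using Lucene.Net.Index;
    using Lucene.Net.Store;

    class LockRecoveryExample
    {
        static void Main()
        {
            // Hypothetical index location; adjust for your environment.
            Directory dir = FSDirectory.GetDirectory("C:\\indexes\\products");

            // Writers created after this call wait up to 10 seconds for the write lock.
            IndexWriter.SetDefaultWriteLockTimeout(10000);

            // If a previous process crashed while holding the write lock, the index may
            // still appear locked. Only unlock when you are certain no other writer is
            // actually using the index.
            if (IndexWriter.IsLocked(dir))
            {
                Console.WriteLine("Index is locked; releasing the stale write lock.");
                IndexWriter.Unlock(dir);
            }
        }
    }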

Public Instance Methods

Abort Obsolete.
AddDocument Overloaded.
AddIndexes Overloaded.
AddIndexesNoOptimize 
Close Overloaded.
Commit 
DeleteDocuments Overloaded. Deletes the document(s) containing term.
DocCount Obsolete. Returns the number of documents currently in this index, not counting deletions.
Equals (inherited from Object) Determines whether the specified Object is equal to the current Object.
ExpungeDeletes Overloaded. Just like ExpungeDeletes(), except you can specify whether the call should block until the operation completes. This is only meaningful with a MergeScheduler that is able to run merges in background threads.
Flush Overloaded.
GetAnalyzer Returns the analyzer used by this index.
GetBufferedDeleteTermsSize 
GetDirectory Returns the Directory used by this index.
GetDocCount 
GetFlushCount 
GetFlushDeletesCount 
GetHashCode (inherited from Object) Serves as a hash function for a particular type. GetHashCode is suitable for use in hashing algorithms and data structures like a hash table.
GetInfoStream Returns the current infoStream in use by this writer.
GetMaxBufferedDeleteTerms Returns the number of buffered deleted terms that will trigger a flush if enabled.
GetMaxBufferedDocs Returns the number of buffered added documents that will trigger a flush if enabled.
GetMaxFieldLength Returns the maximum number of terms that will be indexed for a single field in a document.
GetMaxMergeDocs Returns the largest segment (measured by document count) that may be merged with other segments. Note that this method is a convenience method: it just calls mergePolicy.GetMaxMergeDocs as long as mergePolicy is an instance of LogMergePolicy. Otherwise a System.ArgumentException is thrown.
GetMaxSyncPauseSeconds Obsolete. Expert: returns the max delay inserted before syncing a commit point. On Windows, at least, pausing before syncing can increase net indexing throughput. The delay is variable based on the size of the segment's files, and is only inserted when using ConcurrentMergeScheduler for merges.
GetMergeFactor Returns the number of segments that are merged at once and also controls the total number of segments allowed to accumulate in the index. Note that this method is a convenience method: it just calls mergePolicy.GetMergeFactor as long as mergePolicy is an instance of LogMergePolicy. Otherwise a System.ArgumentException is thrown.
GetMergePolicy Expert: returns the current MergePolicy in use by this writer.
GetMergeScheduler Expert: returns the current MergeScheduler in use by this writer.
GetNextMerge Expert: the MergeScheduler calls this method to retrieve the next merge requested by the MergePolicy.
GetNumBufferedDeleteTerms 
GetNumBufferedDocuments 
GetRAMBufferSizeMB Returns the value set by SetRAMBufferSizeMB if enabled.
GetSegmentCount 
GetSimilarity 
GetTermIndexInterval Expert: Returns the interval between indexed terms.
GetType (inherited from Object) Gets the Type of the current instance.
GetUseCompoundFile Gets the current setting of whether newly flushed segments will use the compound file format. Note that this just returns the value previously set with SetUseCompoundFile(bool), or the default value (true). You cannot use this to query the status of previously flushed segments. Note that this method is a convenience method: it just calls mergePolicy.GetUseCompoundFile as long as mergePolicy is an instance of LogMergePolicy. Otherwise a System.ArgumentException is thrown.
GetWriteLockTimeout Returns allowed timeout when acquiring the write lock.
HasDeletions 
MaxDoc Returns total number of docs in this index, including docs not yet flushed (still in the RAM buffer), without regard for deletions (see NumDocs()).
MaybeMerge Expert: asks the mergePolicy whether any merges are necessary now and, if so, runs the requested merges and then iterates (testing again whether merges are needed) until no more merges are returned by the mergePolicy. Explicit calls to MaybeMerge() are usually not necessary. The most common case is when merge policy parameters have changed.
Merge 
Message Prints a message to the infoStream (if non-null), prefixed with the identifying information for this writer and the thread that's calling it.
NewestSegment 
NumDocs Returns total number of docs in this index, including docs not yet flushed (still in the RAM buffer), with regard for deletions. NOTE: Buffered deletions are not excluded. If these need to be excluded, call Commit() first.
NumRamDocs 
OptimizeOverloaded.  
PrepareCommit 
RamSizeInBytes 
Rollback Closes the IndexWriter without committing any of the changes that have occurred since it was opened. This removes any temporary files that had been created, after which the state of the index will be the same as it was when this writer was first opened. This can only be called when this IndexWriter was opened with autoCommit=false. This also clears a previous call to PrepareCommit(). (See the two-phase commit sketch following this list.)
SegString 
SetInfoStream If non-null, information about merges, deletes, and a message when maxFieldLength is reached will be printed to this stream.
SetMaxBufferedDeleteTerms Determines the minimal number of delete terms required before the buffered in-memory delete terms are applied and flushed. If there are documents buffered in memory at the time, they are merged and a new segment is created. Disabled by default (writer flushes by RAM usage).
SetMaxBufferedDocs Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new segment. Large values generally give faster indexing. When this is set, the writer will flush every maxBufferedDocs added documents. Pass in DISABLE_AUTO_FLUSH to prevent triggering a flush due to the number of buffered documents. Note that if flushing by RAM usage is also enabled, then the flush will be triggered by whichever comes first. Disabled by default (writer flushes by RAM usage).
SetMaxFieldLength The maximum number of terms that will be indexed for a single field in a document. This limits the amount of memory required for indexing, so that collections with very large files will not crash the indexing process by running out of memory. This setting refers to the number of running terms, not to the number of different terms. Note: this silently truncates large documents, excluding from the index all terms that occur further in the document. If you know your source documents are large, be sure to set this value high enough to accommodate the expected size. If you set it to int.MaxValue, then the only limit is your memory, but you should anticipate a System.OutOfMemoryException. By default, no more than DEFAULT_MAX_FIELD_LENGTH terms will be indexed for a field.
SetMaxMergeDocs Determines the largest segment (measured by document count) that may be merged with other segments. Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches. The default value is int.MaxValue. Note that this method is a convenience method: it just calls mergePolicy.SetMaxMergeDocs as long as mergePolicy is an instance of LogMergePolicy. Otherwise a System.ArgumentException is thrown. The default merge policy (LogByteSizeMergePolicy) also allows you to set this limit by net size (in MB) of the segment, using LogByteSizeMergePolicy.SetMaxMergeMB. (See the merge-tuning sketch following this list.)
SetMaxSyncPauseSeconds Obsolete. Expert: sets the max delay before syncing a commit point.
SetMergeFactor 
SetMergePolicy Expert: set the merge policy used by this writer.
SetMergeScheduler Expert: set the merge scheduler used by this writer.
SetRAMBufferSizeMB Determines the amount of RAM that may be used for buffering added documents before they are flushed as a new segment. Generally for faster indexing performance it's best to flush by RAM usage instead of document count and use as large a RAM buffer as you can. When this is set, the writer will flush whenever buffered documents use this much RAM. Pass in DISABLE_AUTO_FLUSH to prevent triggering a flush due to RAM usage. Note that if flushing by document count is also enabled, then the flush will be triggered by whichever comes first. The default value is DEFAULT_RAM_BUFFER_SIZE_MB. (See the flush-tuning sketch following this list.)
SetSimilarity Expert: Set the Similarity implementation used by this IndexWriter.
SetTermIndexInterval Expert: Set the interval between indexed terms. Large values cause less memory to be used by IndexReader, but slow random-access to terms. Small values cause more memory to be used by an IndexReader, and speed random-access to terms. This parameter determines the amount of computation required per query term, regardless of the number of documents that contain that term. In particular, it is the maximum number of other terms that must be scanned before a term is located and its frequency and position information may be processed. In a large index with user-entered query terms, query processing time is likely to be dominated not by term lookup but rather by the processing of frequency and positional data. In a small index or when many uncommon query terms are generated (e.g., by wildcard queries) term lookup may become a dominant cost. In particular, numUniqueTerms/interval terms are read into memory by an IndexReader, and, on average, interval/2 terms must be scanned for each random term access.
SetUseCompoundFile Setting to turn on usage of a compound file. When on, multiple files for each segment are merged into a single file when a new segment is flushed. Note that this method is a convenience method: it just calls mergePolicy.SetUseCompoundFile as long as mergePolicy is an instance of LogMergePolicy. Otherwise a System.ArgumentException is thrown.
SetWriteLockTimeout 
ToString (inherited from Object) Returns a String that represents the current Object.
UpdateDocument Overloaded. Updates a document by first deleting the document(s) containing term and then adding the new document. The delete and then add are atomic as seen by a reader on the same index (flush may happen only after the add). (See the basic indexing sketch following this list.)
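
Taken together, AddDocument, UpdateDocument, DeleteDocuments, Commit, and Close cover the basic indexing cycle described above. The basic indexing sketch below is not authoritative: the index path, field names, and values are invented, and it assumes the 2.4 constructor that takes an IndexWriter.MaxFieldLength argument and the 2.4 Field.Index.ANALYZED/NOT_ANALYZED flag names.

    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.Store;

    class BasicIndexingExample
    {
        static void Main()
        {
            // Hypothetical index location; adjust for your environment.
            Directory dir = FSDirectory.GetDirectory("C:\\indexes\\products");

            // true = create a new index (overwriting any existing one).
            IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true,
                                                 IndexWriter.MaxFieldLength.LIMITED);
            try
            {
                Document doc = new Document();
                doc.Add(new Field("id", "42", Field.Store.YES, Field.Index.NOT_ANALYZED));
                doc.Add(new Field("body", "first version", Field.Store.YES, Field.Index.ANALYZED));
                writer.AddDocument(doc);

                // Atomically replace the document(s) whose "id" term matches.
                Document newDoc = new Document();
                newDoc.Add(new Field("id", "42", Field.Store.YES, Field.Index.NOT_ANALYZED));
                newDoc.Add(new Field("body", "second version", Field.Store.YES, Field.Index.ANALYZED));
                writer.UpdateDocument(new Term("id", "42"), newDoc);

                // Buffer a delete; it is applied when the writer flushes or commits.
                writer.DeleteDocuments(new Term("id", "41"));

                writer.Commit();   // make the changes visible to newly opened readers
            }
            finally
            {
                writer.Close();    // also releases the write lock
            }
        }
    }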
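
SetRAMBufferSizeMB, SetMaxBufferedDocs, SetMaxBufferedDeleteTerms, and SetMaxFieldLength decide when buffered changes are flushed and how much of each field is indexed. The flush-tuning sketch below only illustrates the two documented strategies (flush by RAM usage or by document count); the numbers are arbitrary examples, not recommendations.

    using Lucene.Net.Index;

    static class FlushTuning
    {
        // Bulk indexing: let RAM usage alone decide when to flush.
        public static void ConfigureForBulkIndexing(IndexWriter writer)
        {
            writer.SetRAMBufferSizeMB(48.0);                                  // flush after ~48 MB of buffered documents
            writer.SetMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);        // do not also flush by document count
            writer.SetMaxBufferedDeleteTerms(IndexWriter.DISABLE_AUTO_FLUSH); // do not flush by buffered delete count
            writer.SetMaxFieldLength(50000);                                  // silently truncate fields beyond 50,000 terms
        }

        // Alternative: flush by document count instead of RAM usage.
        public static void ConfigureForDocCountFlush(IndexWriter writer)
        {
            writer.SetMaxBufferedDocs(1000);                                  // flush every 1,000 buffered documents
            writer.SetRAMBufferSizeMB(IndexWriter.DISABLE_AUTO_FLUSH);        // disable the RAM-usage trigger
        }
    }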
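
SetMergeFactor, SetMaxMergeDocs, and SetUseCompoundFile are, as noted above, convenience methods that only work while the merge policy is a LogMergePolicy. The merge-tuning sketch below assumes the default LogByteSizeMergePolicy is still installed; the values are illustrative only.

    using Lucene.Net.Index;

    static class MergeTuning
    {
        public static void Configure(IndexWriter writer)
        {
            // These three setters simply forward to the current LogMergePolicy and
            // throw System.ArgumentException if a different MergePolicy is installed.
            writer.SetMergeFactor(20);        // merge 20 segments at a time (favors indexing speed)
            writer.SetMaxMergeDocs(100000);   // never merge segments past 100,000 documents
            writer.SetUseCompoundFile(true);  // write newly flushed segments as compound files

            // The default merge policy is LogByteSizeMergePolicy, which can also cap
            // merged segment size in megabytes.
            LogByteSizeMergePolicy policy = writer.GetMergePolicy() as LogByteSizeMergePolicy;
            if (policy != null)
            {
                policy.SetMaxMergeMB(512.0);
            }
        }
    }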
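
PrepareCommit, Commit, and Rollback allow a simple two-phase commit against some other resource. The two-phase commit sketch below assumes the writer was opened so that it does not auto-commit (which Rollback requires); the coordination step is just a placeholder comment.

    using System;
    using Lucene.Net.Index;

    static class TwoPhaseCommitExample
    {
        public static void CommitOrRollback(IndexWriter writer)
        {
            try
            {
                // Phase one: flush and sync all pending changes without
                // making them visible to readers yet.
                writer.PrepareCommit();

                // ... coordinate with the other participant (e.g. a database
                //     transaction) here ...

                // Phase two: make the prepared changes visible.
                writer.Commit();
            }
            catch (Exception)
            {
                // Discard everything done since the writer was opened; this also
                // clears the pending PrepareCommit() and closes the writer.
                writer.Rollback();
                throw;
            }
        }
    }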

Protected Instance Methods

EnsureOpen Overloaded.
Finalize Releases the write lock, if needed.
MemberwiseClone (inherited from Object) Creates a shallow copy of the current Object.

Protected Internal Instance Methods

DoAfterFlush 
EnsureOpen Overloaded. Used internally to throw an AlreadyClosedException if this IndexWriter has been closed.
TestPoint 

See Also

IndexWriter Class | Lucene.Net.Index Namespace