/// The score of query q for document d correlates to the
/// cosine-distance or dot-product between document and query vectors in a
/// Vector Space Model (VSM) of Information Retrieval.
/// A document whose vector is closer to the query vector in that model is scored higher.
///
/// The score is computed as follows:
///
/// {@link Lucene.Net.Search.DefaultSimilarity#Tf(float) tf(t in d)} = $\mathrm{frequency}^{1/2}$
///
/// {@link Lucene.Net.Search.DefaultSimilarity#Idf(int, int) idf(t)} = $1 + \log\left( \frac{\mathrm{numDocs}}{\mathrm{docFreq} + 1} \right)$
///
/// queryNorm(q) = {@link Lucene.Net.Search.DefaultSimilarity#QueryNorm(float) queryNorm(sumOfSquaredWeights)} = $\frac{1}{\mathrm{sumOfSquaredWeights}^{1/2}}$
///
/// {@link Lucene.Net.Search.Weight#SumOfSquaredWeights() sumOfSquaredWeights} = $q.\mathrm{getBoost}()^2 \cdot \sum_{t \text{ in } q} \left( \mathrm{idf}(t) \cdot t.\mathrm{getBoost}() \right)^2$
///
/// See {@link Lucene.Net.Search.Query#GetBoost() q.getBoost()}.
///
/// $\mathrm{norm}(t,d) = \mathrm{doc.getBoost}() \cdot \mathrm{lengthNorm}(\mathrm{field}) \cdot \prod_{\text{field } f \text{ in } d \text{ named as } t} f.\mathrm{getBoost}()$
///
/// See {@link Lucene.Net.Documents.Document#GetBoost() doc.getBoost()},
/// {@link #LengthNorm(String, int) lengthNorm(field)}, and
/// {@link Lucene.Net.Documents.Fieldable#GetBoost() f.getBoost()}.
///
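/// The factor definitions above can be sketched in Java as follows. This is a
/// minimal illustration, not the Lucene.Net implementation: the class and
/// method names are hypothetical, and the idf and queryNorm bodies assume the
/// forms 1 + ln(numDocs / (docFreq + 1)) and 1 / sqrt(sumOfSquaredWeights).
///

```java
/** Illustrative sketch only; class and method names are hypothetical. */
final class ScoreFactorsSketch {

    /** tf(t in d): square root of the term frequency. */
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    /** idf(t), assumed here to be 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    /** sumOfSquaredWeights = queryBoost^2 * sum over query terms of (idf * termBoost)^2. */
    static double sumOfSquaredWeights(double queryBoost, double[] idfs, double[] termBoosts) {
        double sum = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double weight = idfs[i] * termBoosts[i];
            sum += weight * weight;
        }
        return queryBoost * queryBoost * sum;
    }

    /** queryNorm, assumed here to be 1 / sqrt(sumOfSquaredWeights). */
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /** norm(t,d) = docBoost * lengthNorm * product of the boosts of the fields named as t. */
    static double norm(double docBoost, double lengthNorm, double[] fieldBoosts) {
        double product = 1.0;
        for (double boost : fieldBoosts) {
            product *= boost;
        }
        return docBoost * lengthNorm * product;
    }

    public static void main(String[] args) {
        // A single-term query with default boosts against a 1000-document index.
        double termIdf = idf(9, 1000);
        double ssw = sumOfSquaredWeights(1.0, new double[] {termIdf}, new double[] {1.0});
        System.out.println("tf(4)     = " + tf(4.0));
        System.out.println("idf       = " + termIdf);
        System.out.println("queryNorm = " + queryNorm(ssw));
        System.out.println("norm      = " + norm(1.0, 0.5, new double[] {1.0}));
    }
}
```

/// Because queryNorm depends only on the query, it rescales every document's
/// score by the same amount and does not change their relative order.
///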
/// Implementations typically return smaller values when numTokens is large,
/// and larger values when numTokens is small.
///
/// Note that the return values are computed during
/// {@link Lucene.Net.Index.IndexWriter#AddDocument(Lucene.Net.Documents.Document)}
/// and then stored using
/// {@link #EncodeNorm(float)}.
/// Thus they have limited precision, and documents
/// must be re-indexed if this method is altered.
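///
/// A rough Java sketch of why stored norms lose precision. It assumes a length
/// norm of 1 / sqrt(numTokens) and uses a deliberately simplified one-byte
/// quantizer. Both the class and the quantizer are hypothetical stand-ins and
/// do not reproduce the actual EncodeNorm encoding.
///

```java
/** Illustrative sketch only; names and the one-byte quantizer are hypothetical. */
final class LengthNormSketch {

    /** Assumed length norm: 1 / sqrt(numTokens), so longer fields get smaller values. */
    static double lengthNorm(int numTokens) {
        return 1.0 / Math.sqrt(numTokens);
    }

    /** Hypothetical lossy encoder: maps a value in [0, 1] onto 0..255. Not EncodeNorm. */
    static int encode(double value) {
        double clamped = Math.max(0.0, Math.min(1.0, value));
        return (int) Math.round(clamped * 255.0);
    }

    /** Inverse of the hypothetical encoder above. */
    static double decode(int stored) {
        return stored / 255.0;
    }

    public static void main(String[] args) {
        for (int numTokens : new int[] {4, 100, 1000, 1024}) {
            double exact = lengthNorm(numTokens);
            int stored = encode(exact);
            System.out.println(numTokens + " tokens: exact=" + exact
                    + " stored=" + stored + " decoded=" + decode(stored));
        }
    }
}
```

/// In this toy encoding, fields of 1000 and 1024 tokens end up with the same
/// stored value, and the decoded value differs from the exact one. Since the
/// stored values are fixed at indexing time, changing the norm computation
/// has no effect on documents that are already indexed.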
///
/// Implementations typically return larger values when freq is large,
/// and smaller values when freq is small.
///
/// The default implementation calls {@link #Tf(float)}.
///
/// Implementations typically return larger values when freq is large,
/// and smaller values when freq is small.
///
/// return idf(searcher.docFreq(term), searcher.maxDoc());
///
/// Note that {@link Searcher#MaxDoc()} is used instead of
/// {@link Lucene.Net.Index.IndexReader#NumDocs()} because it is proportional to
/// {@link Searcher#DocFreq(Term)}, i.e., when one is inaccurate,
/// so is the other, and in the same direction.
///
/// idf(searcher.docFreq(term), searcher.maxDoc());
///
/// Note that {@link Searcher#MaxDoc()} is used instead of
/// {@link Lucene.Net.Index.IndexReader#NumDocs()} because it is
/// proportional to {@link Searcher#DocFreq(Term)}, i.e., when one is
/// inaccurate, so is the other, and in the same direction.
///
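/// The note about MaxDoc() can be illustrated with a toy calculation in Java.
/// The sketch assumes that docFreq and maxDoc both still count deleted
/// documents while numDocs counts only live ones, and that idf has the form
/// 1 + ln(numDocs / (docFreq + 1)). The class name and the index statistics
/// are made up for illustration.
///

```java
/** Illustrative sketch only; the class and the index statistics are hypothetical. */
final class IdfStatsSketch {

    /** idf, assumed here to be 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        int maxDoc = 10;      // all document slots, deleted documents included
        int numDocs = 8;      // live documents only (2 have been deleted)
        int docFreq = 4;      // documents containing the term, deleted ones included
        int liveDocFreq = 2;  // live documents containing the term

        double exact = idf(liveDocFreq, numDocs);  // what a deletion-free index would give
        double consistent = idf(docFreq, maxDoc);  // both inputs include deletions
        double mixed = idf(docFreq, numDocs);      // inputs disagree about deletions

        System.out.println("exact      = " + exact);
        System.out.println("consistent = " + consistent);
        System.out.println("mixed      = " + mixed);
    }
}
```

/// With these made-up numbers, pairing docFreq with maxDoc lands closer to the
/// deletion-free value than pairing it with numDocs, because both inputs are
/// inflated by the same deleted documents.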