Contents of /lucene/java/trunk/CHANGES.txt


Revision 416111 - (show annotations)
Wed Jun 21 21:11:53 2006 UTC (3 years, 5 months ago) by yonik
File MIME type: text/plain
File size: 50121 byte(s)
disambiguate inner class call to doc() for ecj compilation: LUCENE-610
1 Lucene Change Log
2
3 $Id$
4
5 Trunk (not yet released)
6
7 New features
8
9 1. LUCENE-503: New ThaiAnalyzer and ThaiWordFilter in contrib/analyzers
10 (Samphan Raruenrom via Chris Hostetter)
11
12 2. LUCENE-545: New FieldSelector API and associated changes to IndexReader and implementations.
13 New Fieldable interface for use with the lazy field loading mechanism. (Grant Ingersoll and Chuck Williams via Grant Ingersoll)
14
15 API Changes
16
17 1. LUCENE-438: Remove "final" from Token, implement Cloneable, allow
18 changing of termText via setTermText(). (Yonik Seeley)
19
20 2. org.apache.lucene.analysis.nl.WordlistLoader has been deprecated
21 and is supposed to be replaced with the WordlistLoader class in
22 package org.apache.lucene.analysis (Daniel Naber)
23
24 3. LUCENE-609: Revert return type of Document.getField(s) to Field
25 for backward compatibility, added new Document.getFieldable(s)
26 for access to new lazy loaded fields. (Yonik Seeley)
27
28 Bug fixes
29
30 1. Fixed the web application demo (built with "ant war-demo") which
31 didn't work because it used a QueryParser method that had
32 been removed (Daniel Naber)
33
34 2. LUCENE-583: ISOLatin1AccentFilter fails to preserve positionIncrement
35 (Yonik Seeley)
36
37 3. LUCENE-575: SpellChecker min score is incorrectly changed by suggestSimilar
38 (Karl Wettin via Yonik Seeley)
39
40 4. LUCENE-587: Explanation.toHtml was producing malformed HTML
41 (Chris Hostetter)
42
43 5. Fix to allow MatchAllDocsQuery to be used with RemoteSearcher (Yonik Seeley)
44
45 6. LUCENE-601: RAMDirectory and RAMFile made Serializable
46 (Karl Wettin via Otis Gospodnetic)
47
48 7. LUCENE-557: Fixes to BooleanQuery and FilteredQuery so that the score
49 Explanations match up with the real scores.
50 (Chris Hostetter)
51
52 8. LUCENE-607: ParallelReader's TermEnum fails to advance properly to
53 new fields (Chuck Williams, Christian Kohlschuetter via Yonik Seeley)
54
55 9. LUCENE-415: A previously unclean shutdown during indexing can cause
56 a non-empty segment file to be re-used, causing index corruption.
57 (Andy Hind via Yonik Seeley)
58
59 10. LUCENE-610: Simple syntax change to allow compilation with ecj :
60 disambiguate inner class scorer's use of doc() in BooleanScorer2.
61 (DM Smith via Yonik Seeley)
62
63 Optimizations
64
65 1. LUCENE-586: TermDocs.skipTo() is now more efficient for multi-segment
66 indexes. This will improve the performance of many types of queries
67 against a non-optimized index. (Andrew Hudson via Yonik Seeley)
68
69
70
71 Release 2.0.0 2006-05-26
72
73 API Changes
74
75 1. All deprecated methods and fields have been removed, except
76 DateField, which will still be supported for some time
77 so Lucene can read its date fields from old indexes
78 (Yonik Seeley & Grant Ingersoll)
79
80 2. DisjunctionSumScorer is no longer public.
81 (Paul Elschot via Otis Gospodnetic)
82
83 3. Creating a Field with both an empty name and an empty value
84 now throws an IllegalArgumentException
85 (Daniel Naber)
86
87 New features
88
89 1. LUCENE-496: Command line tool for modifying the field norms of an
90 existing index; added to contrib/miscellaneous. (Chris Hostetter)
91
92 2. LUCENE-577: SweetSpotSimilarity added to contrib/miscellaneous.
93 (Chris Hostetter)
94
95 Bug fixes
96
97 1. LUCENE-330: Fix issue of FilteredQuery not working properly within
98 BooleanQuery. (Paul Elschot via Erik Hatcher)
99
100 2. LUCENE-515: Make ConstantScoreRangeQuery and ConstantScoreQuery work
101 with RemoteSearchable. (Philippe Laflamme via Yonik Seeley)
102
103 3. Added methods to get/set writeLockTimeout and commitLockTimeout in
104 IndexWriter. These could be set in Lucene 1.4 using a system property.
105 This feature had been removed without adding the corresponding
106 getter/setter methods. (Daniel Naber)
107
108 4. LUCENE-413: Fixed ArrayIndexOutOfBoundsExceptions
109 when using SpanQueries. (Paul Elschot via Yonik Seeley)
110
111 5. Implemented FilterIndexReader.getVersion() and isCurrent()
112 (Yonik Seeley)
113
114 6. LUCENE-540: Fixed a bug with IndexWriter.addIndexes(Directory[])
115 that sometimes caused the index order of documents to change.
116 (Yonik Seeley)
117
118 7. LUCENE-526: Fixed a bug in FieldSortedHitQueue that caused
119 subsequent String sorts with different locales to sort identically.
120 (Paul Cowan via Yonik Seeley)
121
122 8. LUCENE-541: Add missing extractTerms() to DisjunctionMaxQuery
123 (Stefan Will via Yonik Seeley)
124
125 9. LUCENE-514: Added getTermArrays() and extractTerms() to
126 MultiPhraseQuery (Eric Jain & Yonik Seeley)
127
128 10. LUCENE-512: Fixed ClassCastException in ParallelReader.getTermFreqVectors
129 (frederic via Yonik)
130
131 11. LUCENE-352: Fixed bug in SpanNotQuery that manifested as
132 NullPointerException when "exclude" query was not a SpanTermQuery.
133 (Chris Hostetter)
134
135 12. LUCENE-572: Fixed bug in SpanNotQuery hashCode, was ignoring exclude clause
136 (Chris Hostetter)
137
138 13. LUCENE-561: Fixed some ParallelReader bugs. NullPointerException if the reader
139 didn't know about the field yet, reader didn't keep track if it had deletions,
140 and deleteDocument calls could circumvent synchronization on the subreaders.
141 (Chuck Williams via Yonik Seeley)
142
143 14. LUCENE-556: Added empty extractTerms() implementation to MatchAllDocsQuery and
144 ConstantScoreQuery in order to allow their use with a MultiSearcher.
145 (Yonik Seeley)
146
147 15. LUCENE-546: Removed 2GB file size limitations for RAMDirectory.
148 (Peter Royal, Michael Chan, Yonik Seeley)
149
150 16. LUCENE-485: Don't hold commit lock while removing obsolete index
151 files. (Luc Vanlerberghe via cutting)
152
153
154 1.9.1
155
156 Bug fixes
157
158 1. LUCENE-511: Fix a bug in the BufferedIndexOutput optimization
159 introduced in 1.9-final. (Shay Banon & Steven Tamm via cutting)
160
161 1.9 final
162
163 Note that this release is mostly but not 100% source compatible with
164 the previous release of Lucene (1.4.3). In other words, you should
165 make sure your application compiles with this version of Lucene before
166 you replace the old Lucene JAR with the new one. Many methods have
167 been deprecated in anticipation of release 2.0, so deprecation
168 warnings are to be expected when upgrading from 1.4.3 to 1.9.
169
170 Bug fixes
171
172 1. The fix that made IndexWriter.setMaxBufferedDocs(1) work had negative
173 effects on indexing performance and has thus been reverted. The
174 argument for setMaxBufferedDocs(int) must now be at least 2, otherwise
175 an exception is thrown. (Daniel Naber)
176
177 Optimizations
178
179 1. Optimized BufferedIndexOutput.writeBytes() to use
180 System.arraycopy() in more cases, rather than copying byte-by-byte.
181 (Lukas Zapletal via Cutting)
182
183 1.9 RC1
184
185 Requirements
186
187 1. To compile and use Lucene you now need Java 1.4 or later.
188
189 Changes in runtime behavior
190
191 1. FuzzyQuery can no longer throw a TooManyClauses exception. If a
192 FuzzyQuery expands to more than BooleanQuery.maxClauseCount
193 terms only the BooleanQuery.maxClauseCount most similar terms
194 go into the rewritten query and thus the exception is avoided.
195 (Christoph)
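
The truncation described in item 1 above can be pictured as a bounded selection of the best-scoring candidates. The following is only a minimal sketch under stated assumptions: TopTermsSketch, ScoredTerm and mostSimilar are hypothetical names invented for this illustration, and none of it is taken from Lucene's FuzzyQuery source.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration; not Lucene's FuzzyQuery code. */
final class TopTermsSketch {
    /** A candidate term together with its similarity to the query term. */
    static final class ScoredTerm {
        final String text;
        final float similarity;
        ScoredTerm(String text, float similarity) {
            this.text = text;
            this.similarity = similarity;
        }
    }

    /** Returns at most maxClauseCount term texts, most similar first. */
    static List<String> mostSimilar(List<ScoredTerm> candidates, int maxClauseCount) {
        // Min-heap on similarity: the head is always the weakest term kept so far.
        PriorityQueue<ScoredTerm> heap =
            new PriorityQueue<>(Comparator.comparingDouble((ScoredTerm t) -> t.similarity));
        for (ScoredTerm t : candidates) {
            heap.add(t);
            if (heap.size() > maxClauseCount) {
                heap.poll(); // evict the weakest term instead of failing
            }
        }
        List<ScoredTerm> kept = new ArrayList<>(heap);
        kept.sort(Comparator.comparingDouble((ScoredTerm t) -> t.similarity).reversed());
        List<String> texts = new ArrayList<>();
        for (ScoredTerm t : kept) {
            texts.add(t.text);
        }
        return texts;
    }
}
```

The heap never holds more than maxClauseCount + 1 entries, so the expansion stays bounded no matter how many terms the index contains.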
196
197 2. Changed system property from "org.apache.lucene.lockdir" to
198 "org.apache.lucene.lockDir", so that its casing follows the existing
199 pattern used in other Lucene system properties. (Bernhard)
200
201 3. The terms of RangeQueries and FuzzyQueries are now converted to
202 lowercase by default (as has been the case for PrefixQueries
203 and WildcardQueries before). Use setLowercaseExpandedTerms(false)
204 to disable that behavior but note that this also affects
205 PrefixQueries and WildcardQueries. (Daniel Naber)
206
207 4. Document frequency that is computed when MultiSearcher is used is now
208 computed correctly and "globally" across subsearchers and indices, while
209 before it used to be computed locally to each index, which caused
210 ranking across multiple indices not to be equivalent.
211 (Chuck Williams, Wolf Siberski via Otis, bug #31841)
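
Item 4 above is easier to follow with numbers. The sketch below sums per-index document frequencies before computing an inverse document frequency, so every subsearcher would score a term with the same weight. The class and method names are hypothetical, and the idf formula is an assumption used only for illustration.

```java
/** Hypothetical illustration of "global" document frequency. */
final class GlobalDocFreqSketch {
    /** Sums the per-index counts (document frequencies or document totals). */
    static int sum(int[] perIndexCounts) {
        int total = 0;
        for (int count : perIndexCounts) {
            total += count;
        }
        return total;
    }

    /** Assumed idf formula for this sketch: ln(numDocs / (docFreq + 1)) + 1. */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** The idf every subsearcher would use once frequencies are aggregated. */
    static double globalIdf(int[] docFreqPerIndex, int[] numDocsPerIndex) {
        return idf(sum(docFreqPerIndex), sum(numDocsPerIndex));
    }
}
```

With local statistics, an index in which the term is rare would weight it more heavily than an index in which it is common, so scores from the two indexes would not be comparable.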
212
213 5. When opening an IndexWriter with create=true, Lucene now only deletes
214 its own files from the index directory (looking at the file name suffixes
215 to decide if a file belongs to Lucene). The old behavior was to delete
216 all files. (Daniel Naber and Bernhard Messer, bug #34695)
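
A minimal sketch of the suffix check described in item 5 above. The suffix list and the two whole-file names are assumptions chosen for illustration and are not the definitive set Lucene consults; IndexFileFilterSketch is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of deleting only files that look like index files. */
final class IndexFileFilterSketch {
    // Assumed, incomplete suffix list used only for this example.
    private static final Set<String> SUFFIXES = new HashSet<>(Arrays.asList(
        "cfs", "fnm", "fdt", "fdx", "frq", "prx", "tis", "tii", "del"));
    // Assumed names of index files that have no suffix.
    private static final Set<String> WHOLE_NAMES = new HashSet<>(Arrays.asList(
        "segments", "deletable"));

    static boolean looksLikeIndexFile(String name) {
        if (WHOLE_NAMES.contains(name)) {
            return true;
        }
        int dot = name.lastIndexOf('.');
        return dot >= 0 && SUFFIXES.contains(name.substring(dot + 1));
    }

    /** Returns the subset of a directory listing that would be removed. */
    static List<String> filesToDelete(List<String> directoryListing) {
        List<String> doomed = new ArrayList<>();
        for (String name : directoryListing) {
            if (looksLikeIndexFile(name)) {
                doomed.add(name);
            }
        }
        return doomed;
    }
}
```

Unrelated files that happen to live in the index directory, such as notes.txt, are left untouched.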
217
218 6. The version of an IndexReader, as returned by getCurrentVersion()
219 and getVersion() doesn't start at 0 anymore for new indexes. Instead, it
220 is now initialized by the system time in milliseconds.
221 (Bernhard Messer via Daniel Naber)
222
223 7. Several default values cannot be set via system properties anymore, as
224 this has been considered inappropriate for a library like Lucene. For
225 most properties there are set/get methods available in IndexWriter which
226 you should use instead. This affects the following properties:
227 See IndexWriter for getter/setter methods:
228 org.apache.lucene.writeLockTimeout, org.apache.lucene.commitLockTimeout,
229 org.apache.lucene.minMergeDocs, org.apache.lucene.maxMergeDocs,
230 org.apache.lucene.maxFieldLength, org.apache.lucene.termIndexInterval,
231 org.apache.lucene.mergeFactor,
232 See BooleanQuery for getter/setter methods:
233 org.apache.lucene.maxClauseCount
234 See FSDirectory for getter/setter methods:
235 disableLuceneLocks
236 (Daniel Naber)
237
238 8. Fixed FieldCacheImpl to use user-provided IntParser and FloatParser,
239 instead of using Integer and Float classes for parsing.
240 (Yonik Seeley via Otis Gospodnetic)
241
242 9. Expert level search routines returning TopDocs and TopFieldDocs
243 no longer normalize scores. This also fixes bugs related to
244 MultiSearchers and score sorting/normalization.
245 (Luc Vanlerberghe via Yonik Seeley, LUCENE-469)
246
247 New features
248
249 1. Added support for stored compressed fields (patch #31149)
250 (Bernhard Messer via Christoph)
251
252 2. Added support for binary stored fields (patch #29370)
253 (Drew Farris and Bernhard Messer via Christoph)
254
255 3. Added support for position and offset information in term vectors
256 (patch #18927). (Grant Ingersoll & Christoph)
257
258 4. A new class DateTools has been added. It allows you to format dates
259 in a readable format adequate for indexing. Unlike the existing
260 DateField class DateTools can cope with dates before 1970 and it
261 forces you to specify the desired date resolution (e.g. month, day,
262 second, ...) which can make RangeQuerys on those fields more efficient.
263 (Daniel Naber)
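
To illustrate the idea behind item 4 above, the sketch below formats a timestamp at a chosen resolution so that the resulting strings sort chronologically and coarser resolutions yield fewer distinct terms. It uses java.time rather than DateTools itself; the pattern strings, the UTC zone and all names are assumptions made for this example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical illustration of resolution-based date strings. */
final class DateResolutionSketch {
    enum Resolution {
        MONTH("yyyyMM"),
        DAY("yyyyMMdd"),
        SECOND("yyyyMMddHHmmss");

        final DateTimeFormatter formatter;

        Resolution(String pattern) {
            // A fixed zone keeps the output independent of the local time zone.
            this.formatter = DateTimeFormatter.ofPattern(pattern).withZone(ZoneOffset.UTC);
        }
    }

    /** Formats the instant so that string order equals time order. */
    static String toIndexString(Instant when, Resolution resolution) {
        return resolution.formatter.format(when);
    }
}
```

Because the most significant unit comes first and every unit is zero-padded, a lexicographic range over these strings selects the same documents as a range over the underlying dates, including dates before 1970.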
264
265 5. QueryParser now correctly works with Analyzers that can return more
266 than one token per position. For example, a query "+fast +car"
267 would be parsed as "+fast +(car automobile)" if the Analyzer
268 returns "car" and "automobile" at the same position whenever it
269 finds "car" (Patch #23307).
270 (Pierrick Brihaye, Daniel Naber)
271
272 6. Permit unbuffered Directory implementations (e.g., using mmap).
273 InputStream is replaced by the new classes IndexInput and
274 BufferedIndexInput. OutputStream is replaced by the new classes
275 IndexOutput and BufferedIndexOutput. InputStream and OutputStream
276 are now deprecated and FSDirectory is now subclassable. (cutting)
277
278 7. Add native Directory and TermDocs implementations that work under
279 GCJ. These require GCC 3.4.0 or later and have only been tested
280 on Linux. Use 'ant gcj' to build demo applications. (cutting)
281
282 8. Add MMapDirectory, which uses nio to mmap input files. This is
283 still somewhat slower than FSDirectory. However it uses less
284 memory per query term, since a new buffer is not allocated per
285 term, which may help applications which use, e.g., wildcard
286 queries. It may also someday be faster. (cutting & Paul Elschot)
287
288 9. Added javadocs-internal to build.xml - bug #30360
289 (Paul Elschot via Otis)
290
291 10. Added RangeFilter, a more generically useful filter than DateFilter.
292 (Chris M Hostetter via Erik)
293
294 11. Added NumberTools, a utility class for indexing numeric fields.
295 (adapted from code contributed by Matt Quail; committed by Erik)
296
297 12. Added public static IndexReader.main(String[] args) method.
298 IndexReader can now be used directly at command line level
299 to list and optionally extract the individual files from an existing
300 compound index file.
301 (adapted from code contributed by Garrett Rooney; committed by Bernhard)
302
303 13. Add IndexWriter.setTermIndexInterval() method. See javadocs.
304 (Doug Cutting)
305
306 14. Added LucenePackage, whose static get() method returns java.util.Package,
307 which lets the caller get the Lucene version information specified in
308 the Lucene Jar.
309 (Doug Cutting via Otis)
310
311 15. Added Hits.iterator() method and corresponding HitIterator and Hit objects.
312 This provides standard java.util.Iterator iteration over Hits.
313 Each call to the iterator's next() method returns a Hit object.
314 (Jeremy Rayner via Erik)
315
316 16. Add ParallelReader, an IndexReader that combines separate indexes
317 over different fields into a single virtual index. (Doug Cutting)
318
319 17. Add IntParser and FloatParser interfaces to FieldCache, so that
320 fields in arbitrary formats can be cached as ints and floats.
321 (Doug Cutting)
322
323 18. Added class org.apache.lucene.index.IndexModifier which combines
324 IndexWriter and IndexReader, so you can add and delete documents without
325 worrying about synchronisation/locking issues.
326 (Daniel Naber)
327
328 19. Lucene can now be used inside an unsigned applet, as Lucene's access
329 to system properties will not cause a SecurityException anymore.
330 (Jon Schuster via Daniel Naber, bug #34359)
331
332 20. Added a new class MatchAllDocsQuery that matches all documents.
333 (John Wang via Daniel Naber, bug #34946)
334
335 21. Added ability to omit norms on a per field basis to decrease
336 index size and memory consumption when there are many indexed fields.
337 See Field.setOmitNorms()
338 (Yonik Seeley, LUCENE-448)
339
340 22. Added NullFragmenter to contrib/highlighter, which is useful for
341 highlighting entire documents or fields.
342 (Erik Hatcher)
343
344 23. Added regular expression queries, RegexQuery and SpanRegexQuery.
345 Note the same term enumeration caveats apply with these queries as
346 apply to WildcardQuery and other term expanding queries.
347 These two new queries are not currently supported via QueryParser.
348 (Erik Hatcher)
349
350 24. Added ConstantScoreQuery which wraps a filter and produces a score
351 equal to the query boost for every matching document.
352 (Yonik Seeley, LUCENE-383)
353
354 25. Added ConstantScoreRangeQuery which produces a constant score for
355 every document in the range. One advantage over a normal RangeQuery
356 is that it doesn't expand to a BooleanQuery and thus doesn't have a maximum
357 number of terms the range can cover. Both endpoints may also be open.
358 (Yonik Seeley, LUCENE-383)
359
360 26. Added ability to specify a minimum number of optional clauses that
361 must match in a BooleanQuery. See BooleanQuery.setMinimumNumberShouldMatch().
362 (Paul Elschot, Chris Hostetter via Yonik Seeley, LUCENE-395)
363
364 27. Added DisjunctionMaxQuery which provides the maximum score across its clauses.
365 It's very useful for searching across multiple fields.
366 (Chuck Williams via Yonik Seeley, LUCENE-323)
367
368 28. New class ISOLatin1AccentFilter that replaces accented characters in the ISO
369 Latin 1 character set by their unaccented equivalent.
370 (Sven Duzont via Erik Hatcher)
371
372 29. New class KeywordAnalyzer. "Tokenizes" the entire stream as a single token.
373 This is useful for data like zip codes, ids, and some product names.
374 (Erik Hatcher)
375
376 30. Copied LengthFilter from contrib area to core. Removes words that are too
377 long or too short from the stream.
378 (David Spencer via Otis and Daniel)
379
380 31. Added getPositionIncrementGap(String fieldName) to Analyzer. This allows
381 custom analyzers to put gaps between Field instances with the same field
382 name, preventing phrase or span queries crossing these boundaries. The
383 default implementation issues a gap of 0, allowing the default token
384 position increment of 1 to put the next field's first token into a
385 successive position.
386 (Erik Hatcher, with advice from Yonik)
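
The effect of the gap in item 31 above can be shown with plain arithmetic. This sketch assigns positions to the tokens of several Field instances that share one name, assuming every token has a position increment of 1. PositionGapSketch and its method are hypothetical and do not reproduce Lucene's indexing code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a position increment gap between field instances. */
final class PositionGapSketch {
    /**
     * tokensPerInstance[i] is the token count of the i-th instance of the field.
     * Returns the position assigned to each token, in order.
     */
    static List<Integer> positions(int[] tokensPerInstance, int gap) {
        List<Integer> out = new ArrayList<>();
        int position = -1; // so that the very first token lands on position 0
        for (int count : tokensPerInstance) {
            if (!out.isEmpty()) {
                position += gap; // the gap applies only between instances
            }
            for (int t = 0; t < count; t++) {
                position += 1; // default token position increment
                out.add(position);
            }
        }
        return out;
    }
}
```

With a gap of 0, two instances of two tokens each occupy positions 0, 1, 2, 3, so a phrase can span the boundary. With a gap of 100 they occupy 0, 1, 102, 103, which keeps any phrase or span query with a smaller slop from matching across instances.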
387
388 32. StopFilter can now ignore case when checking for stop words.
389 (Grant Ingersoll via Yonik, LUCENE-248)
390
391 33. Add TopDocCollector and TopFieldDocCollector. These simplify the
392 implementation of hit collectors that collect only the
393 top-scoring or top-sorting hits.
394
395 API Changes
396
397 1. Several methods and fields have been deprecated. The API documentation
398 contains information about the recommended replacements. It is planned
399 that most of the deprecated methods and fields will be removed in
400 Lucene 2.0. (Daniel Naber)
401
402 2. The Russian and the German analyzers have been moved to contrib/analyzers.
403 Also, the WordlistLoader class has been moved one level up in the
404 hierarchy and is now org.apache.lucene.analysis.WordlistLoader
405 (Daniel Naber)
406
407 3. The API contained methods that were declared to throw an IOException
408 but never did so. These declarations have been removed. If
409 your code tries to catch these exceptions you might need to remove
410 those catch clauses to avoid compile errors. (Daniel Naber)
411
412 4. Add a serializable Parameter Class to standardize parameter enum
413 classes in BooleanClause and Field. (Christoph)
414
415 5. Added rewrite methods to all SpanQuery subclasses that nest other SpanQuerys.
416 This allows custom SpanQuery subclasses that rewrite (for term expansion, for
417 example) to nest within the built-in SpanQuery classes successfully.
418
419 Bug fixes
420
421 1. The JSP demo page (src/jsp/results.jsp) now properly closes the
422 IndexSearcher it opens. (Daniel Naber)
423
424 2. Fixed a bug in IndexWriter.addIndexes(IndexReader[] readers) that
425 prevented deletion of obsolete segments. (Christoph Goller)
426
427 3. Fix in FieldInfos to avoid the return of an extra blank field in
428 IndexReader.getFieldNames() (Patch #19058). (Mark Harwood via Bernhard)
429
430 4. Some combinations of BooleanQuery and MultiPhraseQuery (formerly
431 PhrasePrefixQuery) could provoke UnsupportedOperationException
432 (bug #33161). (Rhett Sutphin via Daniel Naber)
433
434 5. Small bug in skipTo of ConjunctionScorer that caused NullPointerException
435 if skipTo() was called without prior call to next() fixed. (Christoph)
436
437 6. Disable Similarity.coord() in the scoring of most automatically
438 generated boolean queries. The coord() score factor is
439 appropriate when clauses are independently specified by a user,
440 but is usually not appropriate when clauses are generated
441 automatically, e.g., by a fuzzy, wildcard or range query. Matches
442 on such automatically generated queries are no longer penalized
443 for not matching all terms. (Doug Cutting, Patch #33472)
444
445 7. Getting a lock file with Lock.obtain(long) was supposed to wait for
446 a given amount of milliseconds, but this didn't work.
447 (John Wang via Daniel Naber, Bug #33799)
448
449 8. Fix FSDirectory.createOutput() to always create new files.
450 Previously, existing files were overwritten, and an index could be
451 corrupted when the old version of a file was longer than the new.
452 Now any existing file is first removed. (Doug Cutting)
453
454 9. Fix BooleanQuery containing nested SpanTermQuerys, which previously
455 could return an incorrect number of hits.
456 (Reece Wilton via Erik Hatcher, Bug #35157)
457
458 10. Fix NullPointerException that could occur with a MultiPhraseQuery
459 inside a BooleanQuery.
460 (Hans Hjelm and Scotty Allen via Daniel Naber, Bug #35626)
461
462 11. Fixed SnowballFilter to pass through the position increment from
463 the original token.
464 (Yonik Seeley via Erik Hatcher, LUCENE-437)
465
466 12. Added Unicode range of Korean characters to StandardTokenizer,
467 grouping contiguous characters into a token rather than one token
468 per character. This change also changes the token type to "<CJ>"
469 for Chinese and Japanese character tokens (previously it was "<CJK>").
470 (Cheolgoo Kang via Otis and Erik, LUCENE-444 and LUCENE-461)
471
472 13. FieldsReader now looks at FieldInfo.storeOffsetWithTermVector and
473 FieldInfo.storePositionWithTermVector and creates the Field with
474 correct TermVector parameter.
475 (Frank Steinmann via Bernhard, LUCENE-455)
476
477 14. Fixed WildcardQuery to prevent "cat" matching "ca??".
478 (Xiaozheng Ma via Bernhard, LUCENE-306)
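
The rule behind item 14 above is that each '?' must consume exactly one character, so a three-letter term cannot satisfy a four-character pattern. The matcher below is a minimal recursive sketch of that rule with hypothetical names; it is not WildcardQuery's implementation.

```java
/** Hypothetical illustration of '?' and '*' wildcard matching. */
final class WildcardSketch {
    /** '?' matches exactly one character; '*' matches any run, including an empty one. */
    static boolean matches(String pattern, String text) {
        return matches(pattern, 0, text, 0);
    }

    private static boolean matches(String p, int pi, String s, int si) {
        if (pi == p.length()) {
            return si == s.length(); // pattern exhausted: text must be exhausted too
        }
        char c = p.charAt(pi);
        if (c == '*') {
            // Try every possible length for the run consumed by '*'.
            for (int k = si; k <= s.length(); k++) {
                if (matches(p, pi + 1, s, k)) {
                    return true;
                }
            }
            return false;
        }
        if (si == s.length()) {
            return false; // '?' or a literal needs a character to consume
        }
        if (c == '?' || c == s.charAt(si)) {
            return matches(p, pi + 1, s, si + 1);
        }
        return false;
    }
}
```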
479
480 15. Fixed a bug where MultiSearcher and ParallelMultiSearcher could
481 change the sort order when sorting by string for documents without
482 a value for the sort field.
483 (Luc Vanlerberghe via Yonik, LUCENE-453)
484
485 16. Fixed a sorting problem with MultiSearchers that can lead to
486 missing or duplicate docs due to equal docs sorting in an arbitrary order.
487 (Yonik Seeley, LUCENE-456)
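
One common way to avoid the problem in item 16 above is to break ties on a secondary key, so that hits with equal sort values always come out in the same order regardless of which searcher returned them first. The sketch below uses the document number as that tie-breaker. All names are hypothetical, and this is not presented as the actual fix applied in LUCENE-456.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of deterministic merging with a tie-breaker. */
final class TieBreakSketch {
    static final class Hit {
        final int doc;
        final String sortKey;
        Hit(int doc, String sortKey) {
            this.doc = doc;
            this.sortKey = sortKey;
        }
    }

    // Primary order by sort key; equal keys fall back to the document number.
    static final Comparator<Hit> BY_KEY_THEN_DOC =
        Comparator.comparing((Hit h) -> h.sortKey).thenComparingInt(h -> h.doc);

    /** Merges the per-searcher hit lists and returns the first n document numbers. */
    static List<Integer> mergeTopN(List<List<Hit>> perSearcherHits, int n) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perSearcherHits) {
            all.addAll(hits);
        }
        all.sort(BY_KEY_THEN_DOC);
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < Math.min(n, all.size()); i++) {
            docs.add(all.get(i).doc);
        }
        return docs;
    }
}
```

Without the second comparison, which of several equal-keyed documents falls inside the top n depends on arrival order, and paging through results can then skip or repeat documents.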
488
489 17. A single hit using the expert level sorted search methods
490 resulted in the score not being normalized.
491 (Yonik Seeley, LUCENE-462)
492
493 18. Fixed inefficient memory usage when loading an index into RAMDirectory.
494 (Volodymyr Bychkoviak via Bernhard, LUCENE-475)
495
496 19. Corrected term offsets returned by ChineseTokenizer.
497 (Ray Tsang via Erik Hatcher, LUCENE-324)
498
499 20. Fixed MultiReader.undeleteAll() to correctly update numDocs.
500 (Robert Kirchgessner via Doug Cutting, LUCENE-479)
501
502 21. Race condition in IndexReader.getCurrentVersion() and isCurrent()
503 fixed by acquiring the commit lock.
504 (Luc Vanlerberghe via Yonik Seeley, LUCENE-481)
505
506 22. IndexWriter.setMaxBufferedDocs(1) didn't have the expected effect;
507 this has now been fixed. (Daniel Naber)
508
509 23. Fixed QueryParser when called with a date in local form like
510 "[1/16/2000 TO 1/18/2000]". This query did not include the documents
511 of 1/18/2000, i.e. the last day was not included. (Daniel Naber)
512
513 24. Removed sorting constraint that threw an exception if there were
514 not yet any values for the sort field (Yonik Seeley, LUCENE-374)
515
516 Optimizations
517
518 1. Disk usage (peak requirements during indexing and optimization)
519 in case of compound file format has been improved.
520 (Bernhard, Dmitry, and Christoph)
521
522 2. Optimize the performance of certain uses of BooleanScorer,
523 TermScorer and IndexSearcher. In particular, a BooleanQuery
524 composed of TermQuery, with not all terms required, that returns a
525 TopDocs (e.g., through a Hits with no Sort specified) runs much
526 faster. (cutting)
527
528 3. Removed synchronization from reading of term vectors with an
529 IndexReader (Patch #30736). (Bernhard Messer via Christoph)
530
531 4. Optimize term-dictionary lookup to allocate far fewer terms when
532 scanning for the matching term. This speeds searches involving
533 low-frequency terms, where the cost of dictionary lookup can be
534 significant. (cutting)
535
536 5. Optimize fuzzy queries so the standard fuzzy queries with a prefix
537 of 0 now run 20-50% faster (Patch #31882).
538 (Jonathan Hager via Daniel Naber)
539
540 6. A version of BooleanScorer (BooleanScorer2) has been added that
541 delivers documents in increasing order and implements skipTo. For
542 queries with required or forbidden clauses it may be faster than the
543 old BooleanScorer; for BooleanQueries consisting only of optional
544 clauses it is probably slower. The new BooleanScorer is now the
545 default. (Patch 31785 by Paul Elschot via Christoph)
546
547 7. Use uncached access to norms when merging to reduce RAM usage.
548 (Bug #32847). (Doug Cutting)
549
550 8. Don't read term index when random-access is not required. This
551 reduces time to open IndexReaders and they use less memory when
552 random access is not required, e.g., when merging segments. The
553 term index is now read into memory lazily at the first
554 random-access. (Doug Cutting)
555
556 9. Optimize IndexWriter.addIndexes(Directory[]) when the number of
557 added indexes is larger than mergeFactor. Previously this could
558 result in quadratic performance. Now performance is n log(n).
559 (Doug Cutting)
560
561 10. Speed up the creation of TermEnum for indices with multiple
562 segments and deleted documents, and thus speed up PrefixQuery,
563 RangeQuery, WildcardQuery, FuzzyQuery, RangeFilter, DateFilter,
564 and sorting the first time on a field.
565 (Yonik Seeley, LUCENE-454)
566
567 11. Optimized and generalized 32 bit floating point to byte
568 (custom 8 bit floating point) conversions. Increased the speed of
569 Similarity.encodeNorm() anywhere from 10% to 250%, depending on the JVM.
570 (Yonik Seeley, LUCENE-467)
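
Item 11 above mentions a custom 8 bit floating point format. One way such a conversion can work is to keep eight consecutive bits of the IEEE 754 bit pattern and rebase them so that useful magnitudes fit in a byte. The sketch below does that with a shift of 21 bits and an offset of 48. Both constants, and the class itself, are assumptions chosen for illustration and are not quoted from the changelog.

```java
/** Hypothetical illustration of packing a non-negative float into one byte. */
final class SmallFloatSketch {
    private static final int SHIFT = 21;        // assumed: low-order bits discarded
    private static final int OFFSET = 48;       // assumed: rebasing of the exponent
    private static final int FLOOR = OFFSET << 3; // smallest shifted pattern with its own code

    static byte floatToByte(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> SHIFT; // discard the 21 lowest bits of the pattern
        if (small < FLOOR) {
            // Zero and negative values map to 0; tiny positive values map to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= FLOOR + 0x100) {
            return (byte) -1; // too large: saturate at the biggest code (255 unsigned)
        }
        return (byte) (small - FLOOR);
    }

    static float byteToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << SHIFT; // treat the byte as unsigned
        bits += OFFSET << 24;           // undo the rebasing applied when encoding
        return Float.intBitsToFloat(bits);
    }
}
```

The conversion is lossy: values that differ only in the discarded low-order bits decode to the same float, while values that are exactly representable, such as 0.5 and 1.0, survive a round trip unchanged.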
571
572 Infrastructure
573
574 1. Lucene's source code repository has converted from CVS to
575 Subversion. The new repository is at
576 http://svn.apache.org/repos/asf/lucene/java/trunk
577
578 2. Lucene's issue tracker has migrated from Bugzilla to JIRA.
579 Lucene's JIRA is at http://issues.apache.org/jira/browse/LUCENE
580 The old issues are still available at
581 http://issues.apache.org/bugzilla/show_bug.cgi?id=xxxx
582 (use the bug number instead of xxxx)
583
584
585 1.4.3
586
587 1. The JSP demo page (src/jsp/results.jsp) now properly escapes error
588 messages which might contain user input (e.g. error messages about
589 query parsing). If you used that page as a starting point for your
590 own code please make sure your code also properly escapes HTML
591 characters from user input in order to avoid so-called cross site
592 scripting attacks. (Daniel Naber)
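
For readers adapting the demo page as item 1 above suggests, a minimal escaping helper might look like the following. It is a sketch only: the class and method names are hypothetical and the code is not taken from results.jsp.

```java
/** Hypothetical illustration of escaping text before writing it into HTML. */
final class HtmlEscapeSketch {
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Any string that can contain user input, including a parse error message that echoes the query, would pass through such a helper before being written to the page.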
593
594 2. QueryParser changes in 1.4.2 broke the QueryParser API. Now the old
595 API is supported again. (Christoph)
596
597
598 1.4.2
599
600 1. Fixed bug #31241: Sorting could lead to incorrect results (documents
601 missing, others duplicated) if the sort keys were not unique and there
602 were more than 100 matches. (Daniel Naber)
603
604 2. Memory leak in Sort code (bug #31240) eliminated.
605 (Rafal Krzewski via Christoph and Daniel)
606
607 3. FuzzyQuery now takes an additional parameter that specifies the
608 minimum similarity that is required for a term to match the query.
609 The QueryParser syntax for this is term~x, where x is a floating
610 point number >= 0 and < 1 (a bigger number means that a higher
611 similarity is required). Furthermore, a prefix can be specified
612 for FuzzyQuerys so that only those terms are considered similar that
613 start with this prefix. This can speed up FuzzyQuery greatly.
614 (Daniel Naber, Christoph Goller)
615
616 4. PhraseQuery and PhrasePrefixQuery now allow the explicit specification
617 of relative positions. (Christoph Goller)
618
619 5. QueryParser changes: Fix for ArrayIndexOutOfBoundsExceptions
620 (patch #9110); some unused method parameters removed; The ability
621 to specify a minimum similarity for FuzzyQuery has been added.
622 (Christoph Goller)
623
624 6. IndexSearcher optimization: a new ScoreDoc is no longer allocated
625 for every non-zero-scoring hit. This makes 'OR' queries that
626 contain common terms substantially faster. (cutting)
627
628
629 1.4.1
630
631 1. Fixed a performance bug in hit sorting code, where values were not
632 correctly cached. (Aviran via cutting)
633
634 2. Fixed errors in file format documentation. (Daniel Naber)
635
636
637 1.4 final
638
639 1. Added "an" to the list of stop words in StopAnalyzer, to complement
640 the existing "a" there. Fix for bug 28960
641 (http://issues.apache.org/bugzilla/show_bug.cgi?id=28960). (Otis)
642
643 2. Added new class FieldCache to manage in-memory caches of field term
644 values. (Tim Jones)
645
646 3. Added overloaded getFieldQuery method to QueryParser which
647 accepts the slop factor specified for the phrase (or the default
648 phrase slop for the QueryParser instance). This allows overriding
649 methods to replace a PhraseQuery with a SpanNearQuery instead,
650 keeping the proper slop factor. (Erik Hatcher)
651
652 4. Changed the encoding of GermanAnalyzer.java and GermanStemmer.java to
653 UTF-8 and changed the build encoding to UTF-8, to make changed files
654 compile. (Otis Gospodnetic)
655
656 5. Removed synchronization from term lookup under IndexReader methods
657 termFreq(), termDocs() or termPositions() to improve
658 multi-threaded performance. (cutting)
659
660 6. Fix a bug where obsolete segment files were not deleted on Win32.
661
662
663 1.4 RC3
664
665 1. Fixed several search bugs introduced by the skipTo() changes in
666 release 1.4RC1. The index file format was changed a bit, so
667 collections must be re-indexed to take advantage of the skipTo()
668 optimizations. (Christoph Goller)
669
670 2. Added new Document methods, removeField() and removeFields().
671 (Christoph Goller)
672
673 3. Fixed inconsistencies with index closing. Indexes and directories
674 are now only closed automatically by Lucene when Lucene opened
675 them automatically. (Christoph Goller)
676
677 4. Added new class: FilteredQuery. (Tim Jones)
678
679 5. Added a new SortField type for custom comparators. (Tim Jones)
680
681 6. Lock obtain timed out message now displays the full path to the lock
682 file. (Daniel Naber via Erik)
683
684 7. Fixed a bug in SpanNearQuery when ordered. (Paul Elschot via cutting)
685
686 8. Fixed so that FSDirectory's locks still work when the
687 java.io.tmpdir system property is null. (cutting)
688
689 9. Changed FilteredTermEnum's constructor to take no parameters,
690 as the parameters were ignored anyway (bug #28858)
691
692 1.4 RC2
693
694 1. GermanAnalyzer now throws an exception if the stopword file
695 cannot be found (bug #27987). It now uses LowerCaseFilter
696 (bug #18410) (Daniel Naber via Otis, Erik)
697
698 2. Fixed a few bugs in the file format documentation. (cutting)
699
700
701 1.4 RC1
702
703 1. Changed the format of the .tis file, so that:
704
705 - it has a format version number, which makes it easier to
706 back-compatibly change file formats in the future.
707
708 - the term count is now stored as a long. This was the one aspect
709 of Lucene's file formats which limited index size.
710
711 - a few internal index parameters are now stored in the index, so
712 that they can (in theory) now be changed from index to index,
713 although there is not yet an API to do so.
714
715 These changes are back compatible. The new code can read old
716 indexes. But old code will not be able to read new indexes. (cutting)
717
718 2. Added an optimized implementation of TermDocs.skipTo(). A skip
719 table is now stored for each term in the .frq file. This only
720 adds a percent or two to overall index size, but can substantially
721 speed up many searches. (cutting)
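
The skip table idea in item 2 above can be sketched in memory: alongside the sorted document numbers of a term, keep every k-th document number in a small side table, then use that table to jump over whole blocks before scanning. The class below is a hypothetical illustration. It does not model the .frq file layout, and the interval is a free parameter.

```java
/** Hypothetical in-memory illustration of skipTo() with a skip table. */
final class SkipListPostingsSketch {
    private final int[] docs;      // sorted, strictly increasing document numbers
    private final int skipInterval;
    private final int[] skipDocs;  // skipDocs[i] == docs[i * skipInterval]
    private int pos = -1;          // index of the current entry, -1 before the first

    SkipListPostingsSketch(int[] sortedDocs, int skipInterval) {
        this.docs = sortedDocs.clone();
        this.skipInterval = skipInterval;
        int blocks = (docs.length + skipInterval - 1) / skipInterval;
        this.skipDocs = new int[blocks];
        for (int i = 0; i < blocks; i++) {
            skipDocs[i] = docs[i * skipInterval];
        }
    }

    /** Moves to the first entry beyond the current one that is >= target; -1 if none. */
    int skipTo(int target) {
        int block = Math.max(pos, 0) / skipInterval;
        // A block can be skipped when the next block starts at or below the target.
        while (block + 1 < skipDocs.length && skipDocs[block + 1] <= target) {
            block++;
        }
        int start = Math.max(pos + 1, block * skipInterval);
        for (int i = start; i < docs.length; i++) {
            if (docs[i] >= target) {
                pos = i;
                return docs[i];
            }
        }
        pos = docs.length; // exhausted
        return -1;
    }
}
```

The side table costs one extra entry per skipInterval postings, which is why the size overhead stays small while long runs of non-matching documents are skipped quickly.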
722
723 3. Restructured the Scorer API and all Scorer implementations to take
724 advantage of an optimized TermDocs.skipTo() implementation. In
725 particular, PhraseQuerys and conjunctive BooleanQuerys are
726 faster when one clause has substantially fewer matches than the
727 others. (A conjunctive BooleanQuery is a BooleanQuery where all
728 clauses are required.) (cutting)
729
730 4. Added new class ParallelMultiSearcher. Combined with
731 RemoteSearchable this makes it easy to implement distributed
732 search systems. (Jean-Francois Halleux via cutting)
733
734 5. Added support for hit sorting. Results may now be sorted by any
735 indexed field. For details see the javadoc for
736 Searcher#search(Query, Sort). (Tim Jones via Cutting)
737
738 6. Changed FSDirectory to auto-create a full directory tree that it
739 needs by using mkdirs() instead of mkdir(). (Mladen Turk via Otis)
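
The difference between the two java.io.File calls can be sketched as follows. DirectoryTree and ensureDirectory are hypothetical names used for illustration and are not part of FSDirectory: mkdir() creates only the last path element and fails when a parent is missing, while mkdirs() creates any missing parents as well.

```java
import java.io.File;

/** Hypothetical helper, not part of Lucene. */
final class DirectoryTree {
    /**
     * Creates dir together with any missing parent directories.
     * Returns true if dir exists as a directory afterwards.
     */
    static boolean ensureDirectory(File dir) {
        // mkdirs() returns false when the directory already exists,
        // so fall back to checking for its presence.
        return dir.mkdirs() || dir.isDirectory();
    }
}
```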
740
741 7. Added a new span-based query API. This implements, among other
742 things, nested phrases. See javadocs for details. (Doug Cutting)
743
744 8. Added new method Query.getSimilarity(Searcher), and changed
745 scorers to use it. This permits one to subclass a Query class so
746 that it can specify its own Similarity implementation, perhaps
747 one that delegates through that of the Searcher. (Julien Nioche
748 via Cutting)
749
750 9. Added MultiReader, an IndexReader that combines multiple other
751 IndexReaders. (Cutting)
752
753 10. Added support for term vectors. See Field#isTermVectorStored().
754 (Grant Ingersoll, Cutting & Dmitry)
755
756 11. Fixed the old bug with escaping of special characters in query
757 strings: http://issues.apache.org/bugzilla/show_bug.cgi?id=24665
758 (Jean-Francois Halleux via Otis)
759
760 12. Added support for overriding default values for the following,
761 using system properties:
762 - default commit lock timeout
763 - default maxFieldLength
764 - default maxMergeDocs
765 - default mergeFactor
766 - default minMergeDocs
767 - default write lock timeout
768 (Otis)
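
A minimal sketch of how such an override can be read, assuming a made-up property name. The entry above does not list the actual property names, so "example.lucene.mergeFactor" and the IndexDefaults class are hypothetical, and the fallback value is only illustrative.

```java
/** Hypothetical holder for an overridable default; not Lucene code. */
final class IndexDefaults {
    /** Made-up property name, used only for illustration. */
    static final String MERGE_FACTOR_PROPERTY = "example.lucene.mergeFactor";

    /** Illustrative fallback used when the property is unset or malformed. */
    static final int BUILT_IN_MERGE_FACTOR = 10;

    /** Reads the override from the system properties, else the fallback. */
    static int mergeFactor() {
        // Integer.getInteger returns the default when the property is
        // missing or cannot be parsed as an integer.
        return Integer.getInteger(MERGE_FACTOR_PROPERTY, BUILT_IN_MERGE_FACTOR);
    }
}
```

With this pattern a value could be supplied at JVM startup, for example with -Dexample.lucene.mergeFactor=50, without recompiling the application.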
769
770 13. Changed QueryParser.jj to allow '-' and '+' within tokens:
771 http://issues.apache.org/bugzilla/show_bug.cgi?id=27491
772 (Morus Walter via Otis)
773
774 14. Changed so that the compound index format is used by default.
775 This makes indexing a bit slower, but vastly reduces the chances
776 of file handle problems. (Cutting)
777
778
779 1.3 final
780
781 1. Added catch of BooleanQuery$TooManyClauses in QueryParser to
782 throw ParseException instead. (Erik Hatcher)
783
784 2. Fixed a NullPointerException in Query.explain(). (Doug Cutting)
785
786 3. Added a new method IndexReader.setNorm(), that permits one to
787 alter the boosting of fields after an index is created.
788
789 4. Distinguish between the final position and length when indexing a
790 field. The length is now defined as the total number of tokens,
791 instead of the final position, as it was previously. Length is
792 used for score normalization (Similarity.lengthNorm()) and for
793 controlling memory usage (IndexWriter.maxFieldLength). In both of
794 these cases, the total number of tokens is a better value to use
795 than the final token position. Position is used in phrase
796 searching (see PhraseQuery and Token.setPositionIncrement()).
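
The distinction can be sketched with plain arrays. FieldStats is a hypothetical helper, and the sketch assumes zero-based positions where each token's position is the previous position plus its increment.

```java
/** Hypothetical helper contrasting field length with final position. */
final class FieldStats {
    /** Field length: the total number of tokens, regardless of position gaps. */
    static int length(int[] positionIncrements) {
        return positionIncrements.length;
    }

    /** Zero-based position of the last token, or -1 for an empty field. */
    static int finalPosition(int[] positionIncrements) {
        int position = -1;
        for (int increment : positionIncrements) {
            position += increment;
        }
        return position;
    }
}
```

For the increments {1, 1, 0, 5} the length is 4 while the final position is 6, so a large gap no longer inflates the value used for normalization and for the maxFieldLength limit.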
797
798 5. Fix StandardTokenizer's handling of CJK characters (Chinese,
799 Japanese and Korean ideograms). Previously contiguous sequences
800 were combined in a single token, which is not very useful. Now
801 each ideogram generates a separate token, which is more useful.
802
803
804 1.3 RC3
805
806 1. Added minMergeDocs in IndexWriter. This can be raised to speed
807 indexing without altering the number of files, but only using more
808 memory. (Julien Nioche via Otis)
809
810 2. Fix bug #24786, in query rewriting. (bschneeman via Cutting)
811
812 3. Fix bug #16952, in demo HTML parser, skip comments in
813 javascript. (Christoph Goller)
814
815 4. Fix bug #19253, in demo HTML parser, add whitespace as needed to
816 output (Daniel Naber via Christoph Goller)
817
818 5. Fix bug #24301, in demo HTML parser, long titles no longer
819 hang things. (Christoph Goller)
820
821 6. Fix bug #23534, Replace use of file timestamp of segments file
822 with an index version number stored in the segments file. This
823 resolves problems when running on file systems with low-resolution
824 timestamps, e.g., HFS under MacOS X. (Christoph Goller)
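
The idea can be sketched with a counter. VersionedSegments is a hypothetical class and says nothing about the actual segments file layout: every commit bumps a stored number, so two commits made within the same timestamp tick are still distinguishable.

```java
/** Hypothetical model of a commit counter kept with the segment list. */
final class VersionedSegments {
    private long version; // would be persisted with the segment list

    /** Version a reader records when it opens the index. */
    long version() {
        return version;
    }

    /** Called once per commit; never depends on the file system clock. */
    void commit() {
        version++;
    }

    /** True if no commit has happened since the reader recorded its version. */
    boolean isCurrent(long versionSeenByReader) {
        return version == versionSeenByReader;
    }
}
```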
825
826 7. Fix QueryParser so that TokenMgrError is not thrown, only
827 ParseException. (Erik Hatcher)
828
829 8. Fix some bugs introduced by change 11 of RC2. (Christoph Goller)
830
831 9. Fixed a problem compiling TestRussianStem. (Christoph Goller)
832
833 10. Cleaned up some build stuff. (Erik Hatcher)
834
835
836 1.3 RC2
837
838 1. Added getFieldNames(boolean) to IndexReader, SegmentReader, and
839 SegmentsReader. (Julien Nioche via otis)
840
841 2. Changed file locking to place lock files in
842 System.getProperty("java.io.tmpdir"), where all users are
843 permitted to write files. This way folks can open and correctly
844 lock indexes which are read-only to them.
845
846 3. IndexWriter: added a new method, addDocument(Document, Analyzer),
847 permitting one to easily use different analyzers for different
848 documents in the same index.
849
850 4. Minor enhancements to FuzzyTermEnum.
851 (Christoph Goller via Otis)
852
853 5. PriorityQueue: added insert(Object) method and adjusted IndexSearcher
854 and MultiIndexSearcher to use it.
855 (Christoph Goller via Otis)
856
857 6. Fixed a bug in IndexWriter that returned incorrect docCount().
858 (Christoph Goller via Otis)
859
860 7. Fixed SegmentsReader to eliminate the confusing and slightly different
861 behaviour of TermEnum when dealing with an enumeration of all terms,
862 versus an enumeration starting from a specific term.
863 This patch also fixes incorrect term document frequencies when the same term
864 is present in multiple segments.
865 (Christoph Goller via Otis)
866
867 8. Added CachingWrapperFilter and PerFieldAnalyzerWrapper. (Erik Hatcher)
868
869 9. Added support for the new "compound file" index format (Dmitry
870 Serebrennikov)
871
872 10. Added Locale setting to QueryParser, for use by date range parsing.
873
874 11. Changed IndexReader so that it can be subclassed by classes
875 outside of its package. Previously it had package-private
876 abstract methods. Also modified the index merging code so that it
877 can work on an arbitrary IndexReader implementation, and added a
878 new method, IndexWriter.addIndexes(IndexReader[]), to take
879 advantage of this. (cutting)
880
881 12. Added a limit to the number of clauses which may be added to a
882 BooleanQuery. The default limit is 1024 clauses. This should
883 stop most OutOfMemoryExceptions by prefix, wildcard and fuzzy
884 queries which run amok. (cutting)
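
A minimal sketch of such a guard. BoundedClauses is a hypothetical stand-in for a query's clause list, not BooleanQuery itself; the point is that expansion fails fast with an exception instead of growing until memory runs out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical clause container with a hard upper bound. */
final class BoundedClauses {
    /** Thrown when adding a clause would exceed the configured limit. */
    static final class TooManyClauses extends RuntimeException {
        TooManyClauses(int max) {
            super("more than " + max + " clauses");
        }
    }

    private final int maxClauses;
    private final List<String> clauses = new ArrayList<>();

    BoundedClauses(int maxClauses) {
        this.maxClauses = maxClauses;
    }

    /** Adds a clause, or throws if the limit has already been reached. */
    void add(String clause) {
        if (clauses.size() >= maxClauses) {
            throw new TooManyClauses(maxClauses);
        }
        clauses.add(clause);
    }

    int size() {
        return clauses.size();
    }
}
```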
885
886 13. Add new method: IndexReader.undeleteAll(). This undeletes all
887 deleted documents which still remain in the index. (cutting)
888
889
890 1.3 RC1
891
892 1. Fixed PriorityQueue's clear() method.
893 Fix for bug 9454, http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9454
894 (Matthijs Bomhoff via otis)
895
896 2. Changed StandardTokenizer.jj grammar for EMAIL tokens.
897 Fix for bug 9015, http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9015
898 (Dale Anson via otis)
899
900 3. Added the ability to disable lock creation by using disableLuceneLocks
901 system property. This is useful for read-only media, such as CD-ROMs.
902 (otis)
903
904 4. Added id method to Hits to be able to access the index global id.
905 Required for sorting options.
906 (carlson)
907
908 5. Added support for new range query syntax to QueryParser.jj.
909 (briangoetz)
910
911 6. Added the ability to retrieve HTML documents' META tag values to
912 HTMLParser.jj.
913 (Mark Harwood via otis)
914
915 7. Modified QueryParser to make it possible to programmatically specify the
916 default Boolean operator (OR or AND).
917 (Péter Halácsy via otis)
918
919 8. Made many search methods and classes non-final, per requests.
920 This includes IndexWriter and IndexSearcher, among others.
921 (cutting)
922
923 9. Added class RemoteSearchable, providing support for remote
924 searching via RMI. The test class RemoteSearchableTest.java
925 provides an example of how this can be used. (cutting)
926
927 10. Added PhrasePrefixQuery (and supporting MultipleTermPositions). The
928 test class TestPhrasePrefixQuery provides the usage example.
929 (Anders Nielsen via otis)
930
931 11. Changed the German stemming algorithm to ignore case while
932 stripping. The new algorithm is faster and produces more equal
933 stems from nouns and verbs derived from the same word.
934 (gschwarz)
935
936 12. Added support for boosting the score of documents and fields via
937 the new methods Document.setBoost(float) and Field.setBoost(float).
938
939 Note: This changes the encoding of an indexed value. Indexes
940 should be re-created from scratch in order for search scores to
941 be correct. With the new code and an old index, searches will
942 yield very large scores for shorter fields, and very small scores
943 for longer fields. Once the index is re-created, scores will be
944 as before. (cutting)
945
946 13. Added new method Token.setPositionIncrement().
947
948 This permits, for the purpose of phrase searching, placing
949 multiple terms in a single position. This is useful with
950 stemmers that produce multiple possible stems for a word.
951
952 This also permits the introduction of gaps between terms, so that
953 terms which are adjacent in a token stream will not be matched by
954 an exact phrase query. This makes it possible, e.g., to build
955 an analyzer where phrases are not matched over stop words which
956 have been removed.
957
958 Finally, repeating a token with an increment of zero can also be
959 used to boost scores of matches on that token. (cutting)
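
The effect of increments on exact phrase matching can be sketched without any Lucene classes. PositionedField is a hypothetical in-memory model that assumes zero-based positions: an increment of zero stacks a term on the previous position, and an increment greater than one leaves a gap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory model of term positions within one field. */
final class PositionedField {
    private final Map<String, Set<Integer>> positions = new HashMap<>();
    private int position = -1;

    /** Adds a term whose position is the previous position plus increment. */
    void add(String term, int increment) {
        position += increment;
        positions.computeIfAbsent(term, t -> new TreeSet<>()).add(position);
    }

    /** True if the terms occur at consecutive positions, in order. */
    boolean matchesPhrase(String... terms) {
        if (terms.length == 0) {
            return false;
        }
        for (int start : positions.getOrDefault(terms[0], Collections.emptySet())) {
            boolean all = true;
            for (int i = 1; i < terms.length && all; i++) {
                all = positions.getOrDefault(terms[i], Collections.emptySet())
                        .contains(start + i);
            }
            if (all) {
                return true;
            }
        }
        return false;
    }
}
```

Adding "quick" with increment 1 and "fast" with increment 0 puts both at the same position, so either matches as the word before "fox". Adding "lazy" with increment 3 after "jumps" models two removed stop words, and the phrase "jumps lazy" does not match.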
960
961 14. Added new Filter class, QueryFilter. This constrains search
962 results to only match those which also match a provided query.
963 Results are cached, so that searches after the first on the same
964 index using this filter are very fast.
965
966 This could be used, for example, with a RangeQuery on a formatted
967 date field to implement date filtering. One could re-use a
968 single QueryFilter that matches, e.g., only documents modified
969 within the last week. The QueryFilter and RangeQuery would only
970 need to be reconstructed once per day. (cutting)
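
The caching behaviour can be sketched generically. CachingFilter is a hypothetical class, and the weak map keyed by the index object is an assumption about one reasonable design rather than a description of QueryFilter's internals. The expensive step, running the query to produce a bit set, happens once per index.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical filter that caches one bit set per index object. */
final class CachingFilter<I> {
    private final Function<I, BitSet> compute; // runs the wrapped query
    private final Map<I, BitSet> cache = new WeakHashMap<>();
    private int computations;

    CachingFilter(Function<I, BitSet> compute) {
        this.compute = compute;
    }

    /** Matching documents for index; computed on first use, then reused. */
    synchronized BitSet bits(I index) {
        BitSet cached = cache.get(index);
        if (cached == null) {
            cached = compute.apply(index);
            cache.put(index, cached);
            computations++;
        }
        return cached;
    }

    /** Number of times the wrapped query was actually executed. */
    synchronized int computations() {
        return computations;
    }
}
```

Weak keys let an entry disappear once its index object is no longer referenced, so a long-lived filter does not pin closed indexes in memory.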
971
972 15. Added a new IndexWriter method, getAnalyzer(). This returns the
973 analyzer used when adding documents to this index. (cutting)
974
975 16. Fixed a bug with IndexReader.lastModified(). Before, document
976 deletion did not update this. Now it does. (cutting)
977
978 17. Added Russian Analyzer.
979 (Boris Okner via otis)
980
981 18. Added a public, extensible scoring API. For details, see the
982 javadoc for org.apache.lucene.search.Similarity.
983
984 19. Fixed return of Hits.id() from float to int. (Terry Steichen via Peter).
985
986 20. Added getFieldNames() to IndexReader and Segment(s)Reader classes.
987 (Peter Mularien via otis)
988
989 21. Added getFields(String) and getValues(String) methods.
990 Contributed by Rasik Pandey on 2002-10-09
991 (Rasik Pandey via otis)
992
993 22. Revised internal search APIs. Changes include:
994
995 a. Queries are no longer modified during a search. This makes
996 it possible, e.g., to reuse the same query instance with
997 multiple indexes from multiple threads.
998
999 b. Term-expanding queries (e.g. PrefixQuery, WildcardQuery,
1000 etc.) now work correctly with MultiSearcher, fixing bugs 12619
1001 and 12667.
1002
1003 c. Boosting BooleanQuery's now works, and is supported by the
1004 query parser (problem reported by Lee Mallabone). Thus a query
1005 like "(+foo +bar)^2 +baz" is now supported and equivalent to
1006 "(+foo^2 +bar^2) +baz".
1007
1008 d. New method: Query.rewrite(IndexReader). This permits a
1009 query to re-write itself as an alternate, more primitive query.
1010 Most of the term-expanding query classes (PrefixQuery,
1011 WildcardQuery, etc.) are now implemented using this method.
1012
1013 e. New method: Searchable.explain(Query q, int doc). This
1014 returns an Explanation instance that describes how a particular
1015 document is scored against a query. An explanation can be
1016 displayed as either plain text, with the toString() method, or
1017 as HTML, with the toHtml() method. Note that computing an
1018 explanation is as expensive as executing the query over the
1019 entire index. This is intended to be used in developing
1020 Similarity implementations, and, for good performance, should
1021 not be displayed with every hit.
1022
1023 f. Scorer and Weight are public, not package protected. It is now
1024 possible for someone to write a Scorer implementation that is
1025 not in the org.apache.lucene.search package. This is still
1026 fairly advanced programming, and I don't expect anyone to do
1027 this anytime soon, but at least now it is possible.
1028
1029 g. Added public accessors to the primitive query classes
1030 (TermQuery, PhraseQuery and BooleanQuery), permitting access to
1031 their terms and clauses.
1032
1033 Caution: These are extensive changes and they have not yet been
1034 tested extensively. Bug reports are appreciated.
1035 (cutting)
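
The rewriting described in item 22(d) can be sketched for the prefix case. PrefixRewrite is a hypothetical class and a sorted set of strings stands in for the index's term dictionary: the prefix query expands into the list of concrete terms that a more primitive query (one clause per term) would then search for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

/** Hypothetical expansion of a prefix into concrete terms. */
final class PrefixRewrite {
    /** All terms starting with prefix, in dictionary order. */
    static List<String> expand(SortedSet<String> terms, String prefix) {
        List<String> expanded = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, so start
        // at the prefix itself and stop at the first term that lacks it.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            expanded.add(term);
        }
        return expanded;
    }
}
```

Because the expansion depends only on the term dictionary passed in, the original query object stays unmodified, which is what allows one query instance to be reused across several indexes as described in item 22(a).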
1036
1037 23. Added convenience RAMDirectory constructors taking File and String
1038 arguments, for easy FSDirectory to RAMDirectory conversion.
1039 (otis)
1040
1041 24. Added code for manual renaming of files in FSDirectory, since it
1042 has been reported that java.io.File's renameTo(File) method sometimes
1043 fails on Windows JVMs.
1044 (Matt Tucker via otis)
1045
1046 25. Refactored QueryParser to make it easier for people to extend it.
1047 Added the ability to automatically lower-case Wildcard terms in
1048 the QueryParser.
1049 (Tatu Saloranta via otis)
1050
1051
1052 1.2 RC6
1053
1054 1. Changed QueryParser.jj to have "?" be a special character which
1055 allowed it to be used as a wildcard term. Updated TestWildcard
1056 unit test also. (Ralf Hettesheimer via carlson)
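
Together with 1.2 RC5 item 8 below, the wildcard semantics are: "?" matches exactly one character and "*" matches zero or more. A minimal matcher sketch under those rules, using a hypothetical Wildcards class that is not the WildcardQuery implementation:

```java
/** Hypothetical matcher for '?' (one char) and '*' (zero or more chars). */
final class Wildcards {
    static boolean matches(String pattern, String text) {
        int p = 0;          // index into pattern
        int t = 0;          // index into text
        int star = -1;      // position of the most recent '*', if any
        int mark = 0;       // text position to retry from after that '*'
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++; // first try matching zero characters
                mark = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;
                t++;
            } else if (star >= 0) {
                p = star + 1; // backtrack: let the last '*' absorb one more char
                t = ++mark;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++; // trailing stars match the empty remainder
        }
        return p == pattern.length();
    }
}
```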
1057
1058 1.2 RC5
1059
1060 1. Renamed build.properties to default.properties and updated
1061 the BUILD.txt document to describe how to override the
1062 default.property settings without having to edit the file. This
1063 brings the build process closer to Scarab's build process.
1064 (jon)
1065
1066 2. Added MultiFieldQueryParser class. (Kelvin Tan, via otis)
1067
1068 3. Updated "powered by" links. (otis)
1069
1070 4. Fixed instruction for setting up JavaCC - Bug #7017 (otis)
1071
1072 5. Added throwing exception if FSDirectory could not create directory
1073 - Bug #6914 (Eugene Gluzberg via otis)
1074
1075 6. Update MultiSearcher, MultiFieldParse, Constants, DateFilter,
1076 LowerCaseTokenizer javadoc (otis)
1077
1078 7. Added fix to avoid NullPointerException in results.jsp
1079 (Mark Hayes via otis)
1080
1081 8. Changed Wildcard search to find 0 or more char instead of 1 or more
1082 (Lee Mallobone, via otis)
1083
1084 9. Fixed error in offset issue in GermanStemFilter - Bug #7412
1085 (Rodrigo Reyes, via otis)
1086
1087 10. Added unit tests for wildcard search and DateFilter (otis)
1088
1089 11. Allow co-existence of indexed and non-indexed fields with the same name
1090 (cutting/casper, via otis)
1091
1092 12. Add escape character to query parser.
1093 (briangoetz)
1094
1095 13. Applied a patch that ensures that searches that use DateFilter
1096 don't throw an exception when no matches are found. (David Smiley, via
1097 otis)
1098
1099 14. Fixed bugs in DateFilter and wildcardquery unit tests. (cutting, otis, carlson)
1100
1101
1102 1.2 RC4
1103
1104 1. Updated contributions section of website.
1105 Add XML Document #3 implementation to Document Section.
1106 Also added Term Highlighting to Misc Section. (carlson)
1107
1108 2. Fixed NullPointerException for phrase searches containing
1109 unindexed terms, introduced in 1.2RC3. (cutting)
1110
1111 3. Changed document deletion code to obtain the index write lock,
1112 enforcing the fact that document addition and deletion cannot be
1113 performed concurrently. (cutting)
1114
1115 4. Various documentation cleanups. (otis, acoliver)
1116
1117 5. Updated "powered by" links. (cutting, jon)
1118
1119 6. Fixed a bug in the GermanStemmer. (Bernhard Messer, via otis)
1120
1121 7. Changed Term and Query to implement Serializable. (scottganyo)
1122
1123 8. Fixed to never delete indexes added with IndexWriter.addIndexes().
1124 (cutting)
1125
1126 9. Upgraded to JUnit 3.7. (otis)
1127
1128 1.2 RC3
1129
1130 1. IndexWriter: fixed a bug where adding an optimized index to an
1131 empty index failed. This was encountered using addIndexes to copy
1132 a RAMDirectory index to an FSDirectory.
1133
1134 2. RAMDirectory: fixed a bug where RAMInputStream could not read
1135 across more than a single buffer boundary.
1136
1137 3. Fix query parser so it accepts queries with unicode characters.
1138 (briangoetz)
1139
1140 4. Fix query parser so that PrefixQuery is used in preference to
1141 WildcardQuery when there's only an asterisk at the end of the
1142 term. Previously PrefixQuery would never be used.
1143
1144 5. Fix tests so they compile; fix ant file so it compiles tests
1145 properly. Added test cases for Analyzers and PriorityQueue.
1146
1147 6. Updated demos, added Getting Started documentation. (acoliver)
1148
1149 7. Added 'contributions' section to website & docs. (carlson)
1150
1151 8. Removed JavaCC from source distribution for copyright reasons.
1152 Folks must now download this separately from metamata in order to
1153 compile Lucene. (cutting)
1154
1155 9. Substantially improved the performance of DateFilter by adding the
1156 ability to reuse TermDocs objects. (cutting)
1157
1158 10. Added IndexReader methods:
1159 public static boolean indexExists(String directory);
1160 public static boolean indexExists(File directory);
1161 public static boolean indexExists(Directory directory);
1162 public static boolean isLocked(Directory directory);
1163 public static void unlock(Directory directory);
1164 (cutting, otis)
1165
1166 11. Fixed bugs in GermanAnalyzer (gschwarz)
1167
1168
1169 1.2 RC2, 19 October 2001:
1170 - added sources to distribution
1171 - removed broken build scripts and libraries from distribution
1172 - SegmentsReader: fixed potential race condition
1173 - FSDirectory: fixed so that getDirectory(xxx,true) correctly
1174 erases the directory contents, even when the directory
1175 has already been accessed in this JVM.
1176 - RangeQuery: Fix issue where an inclusive range query would
1177 include the nearest term in the index above a non-existent
1178 specified upper term.
1179 - SegmentTermEnum: Fix NullPointerException in clone() method
1180 when the Term is null.
1181 - JDK 1.1 compatibility fix: disabled lock files for JDK 1.1,
1182 since they rely on a feature added in JDK 1.2.
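
The RangeQuery fix listed above can be illustrated with a small sketch. InclusiveRange is a hypothetical class and a sorted set stands in for the term dictionary. The loop compares each term against the upper bound itself, so a bound that does not occur in the index cannot pull in the next term above it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

/** Hypothetical inclusive term range over a sorted term dictionary. */
final class InclusiveRange {
    /** Terms t with lower <= t <= upper, whether or not the bounds exist. */
    static List<String> between(SortedSet<String> terms, String lower, String upper) {
        List<String> matched = new ArrayList<>();
        for (String term : terms.tailSet(lower)) {
            if (term.compareTo(upper) > 0) {
                break; // past the upper bound, even if upper itself is absent
            }
            matched.add(term);
        }
        return matched;
    }
}
```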
1183
1184 1.2 RC1 (first Apache release), 2 October 2001:
1185 - packages renamed from com.lucene to org.apache.lucene
1186 - license switched from LGPL to Apache
1187 - ant-only build -- no more makefiles
1188 - addition of lock files--now fully thread & process safe
1189 - addition of German stemmer
1190 - MultiSearcher now supports low-level search API
1191 - added RangeQuery, for term-range searching
1192 - Analyzers can choose tokenizer based on field name
1193 - misc bug fixes.
1194
1195 1.01b (last Sourceforge release), 2 July 2001
1196 . a few bug fixes
1197 . new Query Parser
1198 . new prefix query (search for "foo*" matches "food")
1199
1200 1.0, 2000-10-04
1201
1202 This release fixes a few serious bugs and also includes some
1203 performance optimizations, a stemmer, and a few other minor
1204 enhancements.
1205
1206 0.04 2000-04-19
1207
1208 Lucene now includes a grammar-based tokenizer, StandardTokenizer.
1209
1210 The only tokenizer included in the previous release (LetterTokenizer)
1211 identified terms consisting entirely of alphabetic characters. The
1212 new tokenizer uses a regular-expression grammar to identify more
1213 complex classes of terms, including numbers, acronyms, email
1214 addresses, etc.
1215
1216 StandardTokenizer serves two purposes:
1217
1218 1. It is a much better, general purpose tokenizer for use by
1219 applications as is.
1220
1221 The easiest way for applications to start using
1222 StandardTokenizer is to use StandardAnalyzer.
1223
1224 2. It provides a good example of grammar-based tokenization.
1225
1226 If an application has special tokenization requirements, it can
1227 implement a custom tokenizer by copying the directory containing
1228 the new tokenizer into the application and modifying it
1229 accordingly.
1230
1231 0.01, 2000-03-30
1232
1233 First open source release.
1234
1235 The code has been re-organized into a new package and directory
1236 structure for this release. It builds OK, but has not been tested
1237 beyond that since the re-organization.
