$Id: CHANGES,v 1.5 2001/05/09 18:18:50 dfs Exp $ Version 2.0.2 o Fixed default behavior of '.' in awk package. Previously it wouldn't match newlines, which isn't how AWK behaves. The default behavior now matches any character, but a compilation mask (MULTILINE_MASK) has been added to AwkCompiler to enable the old behavior. o Replaced the use of Vector with ArrayList in Perl5Substitution. o Replaced the use of deprecated Perl5Util split method in printPasswd example with newer method. Also updated splitExample to use ArrayList instead of Vector. o Updated RegexFilenameFilter to also implement the Java 1.1 FileFilter interface. Version 2.0.2-dev-2 o Applied a modified version of Takashi Okamoto's unicode/posix patch. It adds unicode support to character classes and adds partial support for posix classes (it supports things like [:digit:] and [:print:], but not [:^digit:] and [:^print:]). It will be improved/optimized later, but gives people the functionality they need today. Version 2.0.2-dev-1 o Removed commented out code and changed OpCode._isWordCharacter() to use Character.isLetterOrDigit() o Some documentation fixes. o Perl5Matcher was not properly tracking the current offset in __interpret so PatternMatcherInput would not have its current offset properly updated. This bug was introduced when fixing the PatternMatcherInput anchor bug. When the currentOffset parameter was added to __interpret, not all of the necessary __currentOffset = beginOffset assignments were changed to __currentOffset = currentOffset assignments. This has been fixed by updating the remaining assignments. o The org.apache.oro.text.regex.Util.split() method was further generalized to accept a Collection reference argument to store the result. Version 2.0.1 o Jeff ? (jlb@houseofdistraction.com) and Peter Kronenberg (pkronenberg@predictivetechnologies.com) identified a bug in the behavior of PatternMatcherInput matching methods with respect to anchors (^). Matches were always being performed interpreting the beginning of the string from either a 0 or a current offset rather than from the beginOffset. Essentially, the beginning of the string for purposes of matching ^ wasn't being associated with beginOffset. This problem has been fixed. o A problem with multiline matches that was introduced in the transition from OROMatcher/Perltools/etc. to Jakarta-ORO was fixed. o The org.apache.oro.text.regex.Util.split() method was generalized to accept a List reference argument to store the result. Version 2.0 Many small, but important, changes have been made to this software between its last release as ORO software and its first release as part of the Apache Jakarta project. The last versions of the ORO software were: OROMatcher 1.1 (com.oroinc.text.regex) PerlTools 1.2 (com.oroinc.text.perl) AwkTools 1.0 (com.oroinc.text.awk) TextTools 1.1 (com.oroinc.io, com.oroinc.text, com.oroinc.util) OROMatcher 1.1 was fully compatible with Perl 5.003 regular expressions. Many changes have been made to the regular expression behavior in successive versions of Perl. The goal is to update org.apache.oro.text.regex to the latest version of Perl (5.6 at this time), but only some of this work has been done. o Guaranteed compatibility with JDK 1.1 is discontinued. JDK 1.2 features may be used indiscriminately, even though they may not have been used at this time. o Perl5StreamInput and methods manipulating Perl5StreamInput have been removed. For the technical reasons behind this decision see "On the Use of Regular Expressions for Searching Text", Clark and Cormack, ACM Transactions on Programming Languages and Systems, Vol 19, No. 3, pp 413-426. o Default behavior of '^' and '$' as matched by Perl5Matcher has changed to match as though in single line mode, which appears to be the Perl default. Previously if you did not specify single line or multi-line mode, '^' and '$' would act as though in multi-line mode. Now they act as though in single line mode, unless you specify otherwise with setMultiline() or by compiling the expression with an appropriate mask. However, '.' acts as in multi-line mode by default and its behavior can only be altered by compiling an expression with SINGLELINE_MASK. setMultiline() emulates the Perl $* variable, whereas the MASK variables emulate the ismx modifiers. o All deprecated methods have been removed. This includes various substitute() convenience methods from com.oroinc.text.regex.Util o Javadoc comments have been updated to use 1.2 standard doclet features.