The various Java compilers that have been interfaced to Cocoon offer no unified way to specify the encoding of the Java source. Some compilers always assume UTF-8, others deduce it from the system's locale settings.
If the encodings do not match, umlauts, accents, and other international characters will be garbled in the output. To avoid this problem, XSP preprocesses all Java code and converts every non-ASCII character to its \u1234-style Unicode escape. Thus the Java compiler sees only ASCII characters, and the encoding becomes irrelevant (at least for character sets such as ISO-8859-1, which coincide with ASCII in the lower 128 characters).
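The conversion described above can be sketched in a few lines of Java. This is only an illustration of the idea, not Cocoon's actual preprocessing code; the class name `UnicodeEscaper` and the method `escape` are hypothetical names chosen for this example.

```java
// Hypothetical helper, not part of Cocoon: illustrates replacing
// non-ASCII characters in Java source text with Unicode escapes.
public final class UnicodeEscaper {

    private UnicodeEscaper() {
    }

    /**
     * Returns a copy of the given source text in which every char
     * outside the 7-bit ASCII range is written as a six-character
     * Unicode escape (backslash, 'u', four hex digits).
     * ASCII characters are copied unchanged.
     */
    public static String escape(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                // Supplementary characters arrive as two surrogate chars,
                // each of which is escaped on its own; the Java compiler
                // accepts that form.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input is written with escapes so that this file itself
        // stays pure ASCII and compiles the same under any encoding.
        System.out.println(escape("String s = \"H\u00e9ll\u00f4 w\u00f6rld\";"));
    }
}
```

The output of `escape` contains only ASCII characters, so it compiles to the same string constants whether the compiler assumes UTF-8, ISO-8859-1, or another ASCII-compatible encoding.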
Older Cocoon versions did not perform the \u1234 conversion in some of the XSP contexts where Java string constants can be used.
This is fixed in Cocoon 2.1.8, as the following test is meant to show:
Context | Result |
---|---|
xsp:page/xsp:logic | |
xsp:page/xsp:init-page | |
xsp:init-page/xsp:expr | |
xsp:page/xsp:exit-page | |
xsp:exit-page/xsp:expr | |
xsp:logic | |
xsp:logic/xsp:expr | |
xsp:content/xsp:expr | |
xsp:expr | |
text() | Héllô "wörld"! 10 |
attribute | |
xsp:attribute/text() | |
xsp:attribute/xsp:expr | |