Constraints
Item Names
The JCR specification already restricts what can go into the “local” part of an item name - see JCR v2.0 Specification, Section 3.2. In particular:
- Characters not allowed in XML are forbidden in names as well; this affects most control characters plus unpaired surrogates; see Extensible Markup Language (XML) 1.0 (Fifth Edition), Section 2.2.
- Furthermore, the names “.” and “..” cannot be used.
- Finally, the characters /, :, [, ], |, and * are forbidden. For these, the JCR v2.0 Specification, Section 3.2.5.4 proposes a mapping to “private-use” code points.
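The idea of mapping forbidden characters to private-use code points can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the sketch assumes that the proposed mapping shifts each forbidden character by 0xF000 (for example, / at U+002F to U+F02F). Verify the exact table against Section 3.2.5.4 before relying on it.

```java
// Hypothetical helper, not part of the JCR API or of Oak.
final class NonJcrNameMapping {
    // The six characters that are forbidden in a local name.
    private static final String FORBIDDEN = "/:[]|*";
    // ASSUMPTION: the proposed mapping adds 0xF000 to each forbidden character.
    private static final int PRIVATE_USE_OFFSET = 0xF000;

    // Replaces each forbidden character by its private-use substitute.
    static String escape(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            sb.append(FORBIDDEN.indexOf(c) >= 0 ? (char) (c + PRIVATE_USE_OFFSET) : c);
        }
        return sb.toString();
    }

    // Reverses escape(): maps the private-use substitutes back to the originals.
    static String unescape(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean mapped = c >= PRIVATE_USE_OFFSET
                    && FORBIDDEN.indexOf((char) (c - PRIVATE_USE_OFFSET)) >= 0;
            sb.append(mapped ? (char) (c - PRIVATE_USE_OFFSET) : c);
        }
        return sb.toString();
    }
}
```

Under this assumption, a name coming from an external system can be escaped before it is used as an item name and unescaped again when it is exposed outside the repository.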
On top of that, Oak implements several additional restrictions (as per JCR v2.0 Specification, Section 3.2.4):
- The space character (U+0020) is disallowed at the beginning and the end of a (local) name (see JCR v2.0 Specification, Section 5.2.2.1 for motivation).
- Other ASCII whitespace characters (CR, LF, TAB) are always disallowed (before Oak 1.10, more characters were disallowed; see OAK-4857).
Finally, the chosen persistence implementation might restrict node names even further. See Node Name Length Limit.
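The rules for local names listed above can be combined into a single check, roughly as in the following sketch. It is not Oak's actual validation code: the class and method names are hypothetical, it also rejects the empty string, and it ignores any limits imposed by the persistence implementation.

```java
// Hypothetical helper that mirrors the documented rules; not Oak's own code.
final class LocalNameCheck {

    static boolean isValidLocalName(String name) {
        // Empty names and the special names "." and ".." are rejected.
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return false;
        }
        // No space (U+0020) at the beginning or the end.
        if (name.charAt(0) == ' ' || name.charAt(name.length() - 1) == ' ') {
            return false;
        }
        for (int i = 0; i < name.length(); ) {
            // For an unpaired surrogate, codePointAt returns the surrogate
            // itself, which is then rejected by isXmlChar.
            int cp = name.codePointAt(i);
            if (!isXmlChar(cp)
                    || "/:[]|*".indexOf(cp) >= 0
                    || cp == '\t' || cp == '\n' || cp == '\r') {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    // The "Char" production of XML 1.0, Section 2.2.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

With this sketch, isValidLocalName("foo bar") returns true, while " foo", "a/b", "a\tb" and ".." are rejected.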
The namespace for the prefix rep (internal) is not a valid URI; therefore, you can only use qualified names, not expanded names (JCR v2.0 Specification, Section 3.2.5), when addressing items in that namespace (OAK-74).
Invalid Java Strings
Due to the way Java represents characters in strings, not every String is a valid sequence of Unicode code points. This is because two char values (a “surrogate” pair) are needed to represent each Unicode “supplementary character”. If these surrogates do not appear as a well-formed pair, the Java string cannot be serialized to a sequence of Unicode characters, nor to a byte sequence (using the UTF-8 character encoding).
The system behaviour for these strings is currently undefined. This means that they might get rejected, that they might get accepted but information is lost when they are stored, or they might be stored and retrieved faithfully.
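Because the behaviour is undefined, callers may prefer to detect such strings before passing them to the repository. The following is a minimal sketch of two ways to do that using only the standard Java library; the class and method names are hypothetical and are not part of Oak.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Oak.
final class SurrogateCheck {

    // Returns true if every surrogate in s is part of a well-formed pair
    // (a high surrogate immediately followed by a low surrogate).
    static boolean isWellFormed(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false; // high surrogate without a following low surrogate
                }
                i++; // skip the low surrogate of the pair
            } else if (Character.isLowSurrogate(c)) {
                return false; // low surrogate without a preceding high surrogate
            }
        }
        return true;
    }

    // Alternative: a freshly created encoder reports malformed input,
    // so encoding fails with an exception instead of silently substituting.
    static boolean canEncodeUtf8(String s) {
        try {
            StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Note that String.getBytes(StandardCharsets.UTF_8) does not fail on such input; it replaces the malformed part with a replacement byte sequence, which is one way the information loss mentioned above can occur.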
See OAK-5505 for further information.