[Standards] Proposed XMPP Extension: Character counting in message bodies

Ralph Meijer ralphm at ik.nu
Sat Dec 21 12:12:30 UTC 2019

On December 21, 2019 12:32:03 PM GMT+01:00, Andrew Nenakhov <andrew.nenakhov at redsolution.com> wrote:
>Sat, 21 Dec 2019 at 16:21, Ralph Meijer <ralphm at ik.nu>:
>> Just making sure everyone has the same interpretation:
>> Case 1) The text has the sequence ]]>. In this case, in XML the > MUST be
>> escaped (with &gt;, or an equivalent character reference).
>> Case 2) All occurrences of > not preceded by ]]. Here > MAY appear unescaped
>> or escaped. Both are valid.
>Well. We diverge here, and read it differently. The MUST be escaped clause
>follows the AND; it is not optional. The reason it MUST be escaped is _for
>compatibility_, and we are in a compatibility game, aren't we?

If this were the case, there'd be no reason for having the 'may' earlier in the sentence. The compatibility clause refers to case 1 above. FWIW, it would be entirely possible for a parser to detect whether it is inside a CDATA section, but the authors chose to make it explicit that you must escape > in this case. I am going to assume this is an artifact of XML's SGML ancestry and that the rule exists to make parsing easier.

So, having unescaped > is valid for case 2, and serializers may choose to do so.
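
To illustrate, here is a minimal sketch of a serializer that follows this reading. It assumes Python; the helper name escape_text is hypothetical and not taken from any XMPP library. It always escapes & and <, escapes > only where it closes the sequence ]]>, and then checks that a standard parser recovers the original text.

```python
import xml.etree.ElementTree as ET


def escape_text(text: str) -> str:
    """Escape character data for use as XML element content.

    Hypothetical helper for illustration only.
    """
    # & and < must always be escaped in element content.
    # & goes first so the entities added afterwards are not re-escaped.
    escaped = text.replace("&", "&amp;").replace("<", "&lt;")
    # Case 1: > must be escaped when it completes the sequence ]]>.
    # Case 2: every other > is left unescaped, which is also valid.
    return escaped.replace("]]>", "]]&gt;")


body = "a > b and x[y[0]]> 1"
xml = "<body>" + escape_text(body) + "</body>"

# The first > stays literal; only the one ending ]]> is escaped.
print(xml)

# A conforming parser yields the original text either way.
assert ET.fromstring(xml).text == body
```

A serializer that escapes every > produces different bytes on the wire, but a conforming parser yields the same text for both.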



