# [Standards] Some issues in XEP-0106: JID Escaping

Peter Saint-Andre stpeter at stpeter.im
Mon Feb 15 17:06:12 UTC 2010

On 2/15/10 9:46 AM, Florian Zeitz wrote:
> Peter Saint-Andre wrote:
>> On 2/13/10 1:42 PM, Waqas Hussain wrote:
>>> Guus der Kinderen
>>> <http://jabber.markmail.org/message/kihf36azo2kvmczf?q=guus> pointed out
>>> some problems in XEP-0106 in the JDev chatroom.
>>>
>>> Section 1. of the XEP lists nine code points which are forbidden in a
>>> node identifier. The character '\' is not in that list, and is used as
>>> the escape character in the algorithms described in the XEP.
>>>
>>> Section 4.3 states:
>>>
>>>     "In order to maintain as much backward compatibility as possible,
>>>     partial escape sequences and escape sequences corresponding to
>>>     characters not on the list of disallowed characters MUST be ignored."
>
>> EXCEPT as described under the business rules. There is a special case
>> for '\5c'.
>
>>> The sequence '\5c' (corresponding to the character '\') as input to the
>>> escaping algorithm needs to be escaped. But it isn't a partial escape
>>> sequence, and it doesn't correspond to a disallowed character, so the
>>> above text dictates that it not be escaped.
>
>
>>> This breaks the algorithm.
>
>> There is a special case.
>
> As I was part of the JDev discussion waqas mentioned and brought this
> one up: There is of course a special case, our point is just that the
> XEP is contradicting itself as is right now. Therefore section 4.3
> should be changed.
>
>> The first paragraph of section 4.3 currently reads:
>
>> In order to maintain as much backward compatibility as possible, partial
>> escape sequences and escape sequences corresponding to characters not on
>> the list of disallowed characters MUST be ignored.
>
>> I suggest changing it to:
>
>> In order to maintain as much backward compatibility as possible, partial
>> escape sequences and escape sequences corresponding to characters not on
>> the list of disallowed characters MUST be ignored (with the exception of
>> the escaping character '\' itself when the source address includes the
>> sequence '\5c').
>
> That would be a welcome change. To be very clear it should probably
> state that not only does '\' need to be escaped, but also '\5c' needs to
> be unescaped. (The examples in this section all refer to escaping AND
> unescaping transformations, so this seems a necessary nitpick to me).

Correct.

In my working copy I have:

***

4.3 Exceptions

In order to maintain as much backward compatibility as possible, partial
escape sequences and escape sequences corresponding to characters not on
the list of disallowed characters MUST be ignored (with the exception of
the escaping character '\' itself in the rare case when the source
address includes the sequence '\5c').

Example 3. Partial escape sequence

\2plus\2is\4 is not modified by escaping or unescaping transformations.

Example 4. Invalid escape sequence 1

foo\bar is not modified (to fooºr) by escaping or unescaping
transformations.

Example 5. Invalid escape sequence 2

foob\41r is not modified (to foobAr) by escaping or unescaping
transformations.

However, \5c would be escaped if found in the source address (e.g., a
source address of "c:\5commas@example.com" would be escaped to
"c\3a\5c5commas@example.com") and unescaped if contained in the
JID-on-the-wire (e.g., a JID-on-the-wire of "c\3a\5c5commas@example.com"
would be unescaped back to "c:\5commas@example.com").

***
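
To make the intended behavior concrete, here is a rough Python sketch of
the rules as worded in the excerpt above. It is not normative text and not
taken from any implementation. The function names (escape_node,
unescape_node) and the table layout are made up for illustration, the nine
disallowed characters are the ones listed in Section 1 of the XEP, and the
sketch assumes lowercase hex digits and operates on the node part only
(the part before the "@"). It handles only the '\5c' special case discussed
in this thread.

```python
# Illustrative sketch only; all names here are hypothetical.
# The nine disallowed node characters and their escape sequences.
DISALLOWED = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}

# Two-character codes recognized when unescaping, plus the special case
# '5c' for the escaping character itself.
UNESCAPE = {seq[1:]: ch for ch, seq in DISALLOWED.items()}
UNESCAPE["5c"] = "\\"


def escape_node(node):
    """Escape a source node identifier for use on the wire."""
    out = []
    for i, ch in enumerate(node):
        if ch in DISALLOWED:
            out.append(DISALLOWED[ch])
        elif ch == "\\" and node[i + 1:i + 3] == "5c":
            # Special case: the source contains the literal sequence '\5c',
            # so the backslash itself is escaped.
            out.append("\\5c")
        else:
            # Everything else, including partial or unrecognized
            # sequences, is passed through untouched.
            out.append(ch)
    return "".join(out)


def unescape_node(node):
    """Unescape a node identifier received on the wire."""
    out = []
    i = 0
    while i < len(node):
        code = node[i + 1:i + 3]
        if node[i] == "\\" and code in UNESCAPE:
            out.append(UNESCAPE[code])
            i += 3
        else:
            # Partial or unrecognized sequences are ignored (left as-is).
            out.append(node[i])
            i += 1
    return "".join(out)
```

With this sketch, the node "c:\5commas" escapes to "c\3a\5c5commas" and
unescapes back again, while the strings in Examples 3 through 5 pass
through both functions unchanged.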
