[Standards] JID Escaping

Matthias Wimmer m at tthias.eu
Mon Jul 30 18:19:13 UTC 2007

Hi Peter!

Peter Saint-Andre wrote:
> Well, it's interesting, on the ejabberd list today someone said they
> have an existing database of 45k email users and they want to offer
> Jabber services to that user population, but re-use the same usernames.
> I'm sure they have some users in there with addresses containing
> characters like single quote, e.g., tim.o'reilly at domain.tld. In which
> case I bet that they'll be interested in using JID Escaping.
> I really feel that this discussion is not going anywhere. The spec is
> IMHO pretty clear. If you don't like the spec, don't implement it.

I am not at all against reconsidering which characters are allowed in
the node part of a JID. But I think the proposed approach takes us in
the wrong direction.

I think the arguments against this approach have already been made. In
short:
- We are used to typing full JIDs as a whole for addressing. Therefore,
with full escaping (including '@' and '/'), we get ambiguous addresses.
- \20stpeter@jabber.org is a valid JID in XMPP, but is somehow
prohibited by the JID Escaping XEP. It is not clear how to handle it
when unescaping. Leave it unescaped? Then we need yet another exception
to the escaping rules, so that when re-escaping it does not have to be
encoded as \5c20stpeter@jabber.org.
- We need even more careful exceptions and handling rules, so that
\73tpeter@jabber.org (which is valid and, at the XMPP level, different
from stpeter@jabber.org) does not get unescaped to
stpeter@jabber.org.
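
To make the three cases above concrete, here is a minimal Python sketch.
It is only an illustration, not an implementation of the XEP: the
escape table is the one from XEP-0106, but the function names and the
"sloppy" unescaper are made up for this example, and the escaper
deliberately applies no exceptions at all.

```python
import re

# The ten characters XEP-0106 escapes, with their escape sequences.
ESCAPES = {
    " ": r"\20", '"': r"\22", "&": r"\26", "'": r"\27", "/": r"\2f",
    ":": r"\3a", "<": r"\3c", ">": r"\3e", "@": r"\40", "\\": r"\5c",
}
UNESCAPES = {seq: ch for ch, seq in ESCAPES.items()}
KNOWN_SEQ = re.compile(r"\\(?:20|22|26|27|2f|3a|3c|3e|40|5c)")
ANY_HEX = re.compile(r"\\([0-9a-f]{2})")


def escape_node(node):
    """Naively escape every listed character, including the backslash."""
    return "".join(ESCAPES.get(ch, ch) for ch in node)


def unescape_node(node):
    """Decode only the ten known sequences."""
    return KNOWN_SEQ.sub(lambda m: UNESCAPES[m.group(0)], node)


def sloppy_unescape_node(node):
    """Hypothetical careless client: decode any \\xx hex pair."""
    return ANY_HEX.sub(lambda m: chr(int(m.group(1), 16)), node)


def split_typed_jid(text):
    """Split a typed full JID at the first '@' and the first '/' after it.

    Assumes a node part is present; this is not a complete JID parser.
    """
    node, _, rest = text.partition("@")
    domain, _, resource = rest.partition("/")
    return node, domain, resource


# Case 1: an unescaped '@' in the node makes a typed address ambiguous.
print(split_typed_jid("tim@work@example.com/home"))

# Case 2: \20stpeter is a valid node on the wire. Unescaping turns it into
# " stpeter"; treating it as a literal and re-escaping yields a different JID.
print(repr(unescape_node(r"\20stpeter")), escape_node(r"\20stpeter"))

# Case 3: \73tpeter must stay distinct from stpeter, but the sloppy
# unescaper merges the two.
print(unescape_node(r"\73tpeter"), sloppy_unescape_node(r"\73tpeter"))
```

Each case needs its own rule to get right, which is exactly the list of
exceptions I am worried about.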

A proper way to do JID escaping at this level (introducing new
characters) would therefore require a very long list of exceptions and
recommendations on how to do it. That is not how good standards are
made. Even if we managed to get the standard itself free of bugs, I am
sure that sooner or later there would be clients that do not do
everything correctly and run into one of the three problem cases
mentioned above.

That's why my recommendation is:
- Use XEP-0106 only for gatewaying external identifiers to traditional
JID addresses, and don't do unescaping at any point other than the
gateway itself. (Especially not in clients or other user interfaces.)
- For introducing needed characters in node parts (the "'" is indeed a
good example of what could be needed), find a better and cleaner way to
introduce them.

My proposal for the second part (introducing new characters in the node
part) would be to go the same way the IETF is currently going with the
local-part of internationalized e-mail addresses: just extend the
standard to allow them. This could be done very easily in RFC3920bis. We
would just define a new stream feature (e.g. "<extended-addresses/>")
that signals to the other end of an XMPP stream that the extended range
of characters is allowed in the node part. If a from-JID containing such
characters (a to-JID probably never has to be delivered to such a host)
had to be delivered to a host not offering <extended-addresses/>, either
the message would be bounced or the sending server would map the address
using its own built-in mapping. (That mapping, again, would not be
reversed by the recipient. So recipients on servers that do not upgrade
to <extended-addresses/> would see mapped addresses, while the new
addresses would be delivered natively to servers supporting them.)
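
As a rough sketch of the sending server's decision, assuming things the
proposal does not define yet: the namespace of <extended-addresses/>,
the set of newly allowed characters, and the fallback mapping below are
all placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; the proposal only names the element itself.
EXT_NS = "urn:example:extended-addresses"
# Illustrative set of "extended" characters; the real set would be
# defined by the new stringprep profile.
EXTENDED_CHARS = set("'&")


def peer_supports_extended(features_xml):
    """True if the peer's <stream:features/> offers <extended-addresses/>."""
    root = ET.fromstring(features_xml)
    return root.find(f"{{{EXT_NS}}}extended-addresses") is not None


def map_node(node):
    """Placeholder for the sending server's own built-in mapping."""
    return "".join(
        f"\\{ord(ch):02x}" if ch in EXTENDED_CHARS else ch for ch in node
    )


def outbound_from(node, domain, features_xml, bounce=False):
    """Return the from-JID to send, or None if the stanza is bounced."""
    uses_extended = any(ch in EXTENDED_CHARS for ch in node)
    if not uses_extended or peer_supports_extended(features_xml):
        return f"{node}@{domain}"          # delivered natively
    if bounce:
        return None                        # policy: bounce instead of mapping
    return f"{map_node(node)}@{domain}"    # mapped, never unmapped by the peer


UPGRADED = (
    '<stream:features xmlns:stream="http://etherx.jabber.org/streams">'
    '<extended-addresses xmlns="urn:example:extended-addresses"/>'
    "</stream:features>"
)
LEGACY = '<stream:features xmlns:stream="http://etherx.jabber.org/streams"/>'

print(outbound_from("tim.o'reilly", "domain.tld", UPGRADED))
print(outbound_from("tim.o'reilly", "domain.tld", LEGACY))
```

The point is that all of the logic lives in the sending server; a
receiving server or client never has to guess whether to unescape.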

Going that way would AFAICS only require us to define a new stringprep
profile allowing the new characters and to define the new stream
feature. I consider this cleaner, easier to implement, and more solid.

