[Standards-JIG] Still not sure ...

Justin Karneges justin-keyword-jabber.093179 at affinix.com
Thu Sep 9 08:39:36 UTC 2004

Yes, I believe this was intended to favor servers that don't actually decode 
the text (read: jabberd), so that they can quickly determine if a JID is too 
long.

Since I'm working with a Unicode-aware XML parser, I must convert JIDs back 
into UTF-8 in order to check them for validity.  Incidentally, I also use 
libidn for stringprep, and it wants input in UTF-8, so I have to do this 
conversion anyway...
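
For what it's worth, the UTF-8 byte count can also be worked out from the
decoded characters directly, without building the encoded string. A rough
sketch in Python, just to illustrate the idea; the names utf8_len, part_fits
and MAX_PART_BYTES are made up for this example, and the 1023 figure is the
per-part limit quoted below. It only covers the length check, not stringprep.

```python
# Hypothetical helper names; the limit is the one discussed in this thread.
MAX_PART_BYTES = 1023


def utf8_len(text: str) -> int:
    """Return the number of bytes `text` would occupy when encoded as UTF-8."""
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:        # ASCII range: 1 byte
            total += 1
        elif cp < 0x800:     # up to U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:   # rest of the Basic Multilingual Plane: 3 bytes
            total += 3
        else:                # supplementary planes: 4 bytes
            total += 4
    return total


def part_fits(part: str) -> bool:
    """Check one JID part (node, domain or resource) against the byte limit."""
    return utf8_len(part) <= MAX_PART_BYTES
```

So a part made of 1023 ASCII characters passes, while 512 two-byte characters
(1024 bytes) does not, even though it is only 512 characters long.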


On Thursday 09 September 2004 1:08 am, Matthias Wimmer wrote:
> Hi list!
> I am still not sure if it has been a good idea, that xmpp core 3.1
> limits the length of the portions in a JID to 1023 B in UTF-8 encoding.
> This might seem to be a good choice for programs using 8 bit character
> types ... but it makes it hard to check if a JID is valid if you use
> wide character types like wchar_t in modern C/C++ or the standard character
> type of Java.
> For most modern languages it seems to be easier to check the number of
> characters in a string than the number of bytes in a corresponding UTF-8
> byte sequence.
> Tot kijk
>     Matthias
