[Standards] roster schema
Joe Hildebrand
hildjj at gmail.com
Sun Jun 24 11:00:36 CDT 2007
What do you mean by character?
- Glyph?
- Codepoint?
Do you have to perform some sort of canonicalization before counting?
Combining characters make this particularly difficult, which is why
for JIDs we settled on octets: something easy to describe and understand.
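
To make the ambiguity concrete, here is a minimal Python sketch (not part of the original message). It counts the same visible letter as code points and as UTF-8 octets, before and after NFC normalization. The helper name `counts` and the sample strings are illustrative assumptions.

```python
import unicodedata


def counts(text):
    """Return (code points, UTF-8 octets) for text. Hypothetical helper."""
    return len(text), len(text.encode("utf-8"))


# The same visible letter, written two ways.
composed = "\u00e9"      # precomposed LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(counts(composed))    # (1, 2)
print(counts(decomposed))  # (2, 3)

# After NFC normalization both spellings count the same.
nfc = unicodedata.normalize("NFC", decomposed)
print(counts(nfc))         # (1, 2)
```

A limit stated in "characters" therefore gives different answers depending on whether the input is normalized first, while the octet count of the bytes actually sent is unambiguous.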
On Jun 24, 2007, at 7:39 AM, Matthias Wimmer wrote:
> Hi Joe!
>
> Joe Hildebrand wrote:
>> +1 for limiting it.
>> However, 1024 octets please, rather than characters, like JIDs.
>
> +1 for limiting it
>
> ... but please based on characters, not on octets. (I also voted
> against limiting JIDs based on octets.)
>
> Reasons:
> - Modern database systems as well as modern programming languages
> store characters, not bytes.
> - XMPP is built on top of XML, and XML handles characters, not
> bytes. (e.g. you cannot store a NULL byte in XML, not even as an
> entity)
> - A limitation based on characters is what a user will expect.
> (e.g. "Why can I enter 1024 times the letter 'a' here but only 341
> times the character €?")
> - In GUI forms you can often already limit the number of characters
> a user can enter, but mostly you cannot limit the number of octets
> in the UTF-8 representation of the string the user has entered.
>
> ... I'd even propose that the JID limitation should be changed to
> characters in RFC3920bis.
>
>
> Matthias
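
The figure of 341 in the quoted example follows from the UTF-8 width of each character. As a rough check, assuming the 1024-octet limit discussed in this thread, a short Python sketch (the names `LIMIT_OCTETS` and `max_repeats` are illustrative, not from any specification):

```python
LIMIT_OCTETS = 1024  # octet limit proposed in this thread (assumption for the sketch)


def max_repeats(ch, limit=LIMIT_OCTETS):
    """How many copies of ch fit within `limit` UTF-8 octets. Hypothetical helper."""
    return limit // len(ch.encode("utf-8"))


print(max_repeats("a"))       # 1024: "a" is one octet in UTF-8
print(max_repeats("\u20ac"))  # 341: the euro sign is three octets in UTF-8
```

Under a character-based limit both values would be 1024, which is the user-visible difference the quoted message points out.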