[Standards] Proposed XMPP Extension: Character counting in message bodies

Florian Schmaus flo at geekplace.eu
Wed Dec 18 10:59:58 UTC 2019


On 12/17/19 12:18 PM, pep at bouah.net wrote:
> The XMPP Extensions Editor has received a proposal for a new XEP.
> 
> Title: Character counting in message bodies
> Abstract:
> This document describes how to correctly count characters in message
> bodies. This is required when referencing a position in the body.
> 
> URL: https://xmpp.org/extensions/inbox/charcount.html

As others have already said, this is something we need. So thanks,
Marvin, for submitting this.

I would like to point out that it is probably not really XMPP-specific
(similar to XEP-0392: Consistent Color Generation), but I don't see a
reason why this shouldn't get XEP'ed up.

Code points as the unit would have been my first choice too. But I
wonder if we shouldn't require Unicode normalization, i.e. the sender
and receiver MUST normalize prior to counting.

Given that nothing in XMPP guarantees that the Unicode data is not
transformed somewhere during stanza processing and routing, e.g. that
decomposed sequences get combined, this would be required so that
sender and receiver operate on the same Unicode data.
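
To illustrate the concern, here is a minimal Python sketch. The helper
name count_code_points and the choice of NFC as the default form are
assumptions of mine for the example; the proposal does not specify
either. It shows the same visible text yielding different code point
counts until both sides normalize.

```python
import unicodedata


def count_code_points(body: str, form: str = "NFC") -> int:
    """Count code points after normalizing body to the given form.

    Hypothetical helper: the name and the NFC default are illustrative.
    """
    return len(unicodedata.normalize(form, body))


# The same visible text in two Unicode representations.
composed = "caf\u00e9"     # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# Python's len() counts code points, so the raw counts disagree.
raw_counts = (len(composed), len(decomposed))

# After normalizing both sides to the same form, the counts agree.
normalized_counts = (
    count_code_points(composed),
    count_code_points(decomposed),
)
```

Any position reference computed by the sender on one representation
could be off by one or more on the receiver's side if an intermediary
rewrote the body into the other representation.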

And I believe there could be cases where such transformations really
do happen, e.g. message archives which persist the Unicode data in
combined form for efficiency reasons.

- Florian

