[Standards] Proposed XMPP Extension: Character counting in message bodies

Florian Schmaus flo at geekplace.eu
Fri Dec 4 19:00:28 UTC 2020


On 12/4/20 7:29 PM, Sam Whited wrote:
> I don't understand this, if you get out bytes why would they be
> different to what was in the stream?

Often your XML parser does not hand you raw bytes, but an instance of
your programming language's native String type. Usually, though, your
programming language provides an API to encode that String as UTF-8
bytes, which *should* exactly match the bytes on the wire.
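
A minimal sketch of that encoding step, assuming Python 3. The body
string is made up for illustration, and the sketch assumes the body was
sent without XML escapes or character references, since those would make
the wire bytes differ from the encoded text:

```python
# Hypothetical message body, as an XML parser would hand it to the
# application: a native str, not raw bytes.
body = "na\u00efve \U0001F600"  # "naïve" followed by a space and an emoji

# Encode the native string to UTF-8 bytes.
wire_bytes = body.encode("utf-8")

# Decoding the bytes again gives back the same string.
assert wire_bytes.decode("utf-8") == body

print(len(wire_bytes))  # 11
```

The 11 bytes break down as five one-byte ASCII characters, one space,
two bytes for U+00EF, and four bytes for U+1F600.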

My problem with your proposal is that it uses bytes, and I don't see why
you want to use bytes here. Your XML parser will almost certainly give
you a type that can be converted to a sequence of Unicode code points.
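
To illustrate the difference between the two counts, here is a rough
sketch assuming Python 3, where iterating over a str yields one item per
code point. The function names and the sample body are hypothetical and
not taken from either proposal:

```python
def code_points(text: str) -> list[int]:
    """Convert a native string to its sequence of Unicode code points."""
    return [ord(ch) for ch in text]


def count_code_points(text: str) -> int:
    """Count Unicode code points (what len() reports for a Python 3 str)."""
    return len(code_points(text))


def count_utf8_bytes(text: str) -> int:
    """Count the bytes of the UTF-8 encoding of the string."""
    return len(text.encode("utf-8"))


body = "na\u00efve \U0001F600"  # made-up body with non-ASCII characters

print(count_code_points(body))  # 7
print(count_utf8_bytes(body))   # 11
```

For pure ASCII bodies both counts agree. They diverge as soon as a
character needs more than one byte in UTF-8.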

Hence I think your proposal should use code points instead. Then, if I
am not mistaken, your proposal matches my proposal for opportunistic
interoperability as a fallback.

- Florian
