[Standards] Proposed XMPP Extension: Character counting in message bodies

Sam Whited sam at samwhited.com
Fri Dec 4 18:29:24 UTC 2020


I don't understand this. If you get bytes out, why would they be
different from what was in the stream? If you get a string in a language
that assumes strings have some specific format (i.e. are valid UTF-8 or
UTF-16 or something), it makes sense that they might have had to
change, but would anything change a raw byte slice before handing it
to you? That seems like a recipe for disaster that we can't (and
shouldn't) work around at the protocol level, unless I'm seriously
misunderstanding something.
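
To make the distinction concrete, here is a rough sketch in Python (purely
illustrative; `body_lengths` is a made-up helper and not part of the
proposal). It assumes the body has already been decoded into a string, and
shows that the same text has a different length depending on whether you
count UTF-8 bytes, UTF-16 code units, or code points:

```python
def body_lengths(body: str) -> dict:
    """Return the length of a message body in three common units.

    Hypothetical helper for illustration only.
    """
    return {
        # Bytes as they would appear in a UTF-8 encoded stream.
        "utf8_bytes": len(body.encode("utf-8")),
        # UTF-16 code units: two bytes each, no byte order mark.
        "utf16_units": len(body.encode("utf-16-le")) // 2,
        # Unicode code points, which is what len() counts on a Python str.
        "code_points": len(body),
    }


if __name__ == "__main__":
    # "naïve" followed by a space and an emoji outside the BMP.
    print(body_lengths("na\u00efve \U0001F600"))
```

The three numbers only agree for pure ASCII text, so whichever unit is
picked, both sides have to count in that same unit.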

—Sam

On Fri, Dec 4, 2020, at 17:09, Kevin Smith wrote:
> Except that bytes are making significant assumptions about the
> libraries and languages being used. It’s assuming that what you get
> out of your parser corresponds to the same bytes that were on the
> stream, which seems particularly unlikely in languages that aren’t C
> at heart (C, C++, Go…).

