[Standards] RTT, take 2
remko at el-tramo.be
Fri Jun 24 08:13:41 UTC 2011
[ I don't like writing me-too e-mails, but you beat me by a minute to
sending the exact same mail, so I'm doing it anyway ;-) ]
> So I'd say that we should refer to characters in a string, and deal with
> Unicode code-points in the abstract. I'd expect that implementations would
> convert this internally into whatever made sense for them.
I think it would be the first protocol to depend on knowing how to
count code points (I haven't needed it before), but I also think it's
the only sensible thing to do, because otherwise you could end up with
incorrectly encoded text when using the protocol.
Anyway, for applications that don't use Unicode libraries, rolling
your own code point count isn't very hard, at least for UTF-8.
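
To illustrate, here is a minimal sketch of such a hand-rolled count. It
assumes the input is already well-formed UTF-8 and relies on the fact
that continuation bytes always have the form 10xxxxxx, so every other
byte starts a new code point. The function name is made up for this
example and is not taken from any specification or library.

```python
def count_code_points(data: bytes) -> int:
    """Count code points in a byte string assumed to be well-formed UTF-8."""
    # Continuation bytes match 10xxxxxx; any other byte begins a code point.
    # No validation is done here: malformed input gives a meaningless count.
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)
```

For example, count_code_points("héllo".encode("utf-8")) counts 5 code
points even though the string occupies 6 bytes.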