[Standards] RTT, take 2
gunnar.hellstrom at omnitor.se
Fri Jun 24 10:10:18 UTC 2011
>> I'm wondering whether 'code points' are any better than UTF-8 based
>> positioning. Isn't it possible that a codepoint position also points
>> inside a character/glyph/...?
> A codepoint is the fundamental thing defined by Unicode, but there is a
> related concept which could be called a character (or grapheme?), consisting
> of one or more codepoints (a codepoint representing a non-combining character,
> followed by zero or more codepoints representing combining characters).
Yes, this is why counting Unicode code points is the solution.
But it needs to be done at a sufficiently low level, close to the
transmission of messages.
For example, to erase one combined character consisting of two code
points, the user interface action should, at a low level, result in
the erasure of two code points. That fact can be captured and sent in
the RTT erasure element as an order to erase two code points.
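
To illustrate the sending side, here is a minimal sketch in Python. It
assumes the client knows which combined character the user erased and
only needs the number of code points to put in the erasure element.
The names code_point_count and make_erase_action, and the dict used to
stand in for the erasure element, are made up for this example and are
not taken from any specification.

```python
def code_point_count(text: str) -> int:
    # A Python 3 str is a sequence of Unicode code points,
    # so len() counts code points, not bytes or glyphs.
    return len(text)


def make_erase_action(erased_text: str) -> dict:
    # Hypothetical in-memory stand-in for the RTT erasure element:
    # it only records how many code points the receiver should erase.
    return {"action": "erase", "count": code_point_count(erased_text)}


# One user-visible character built from two code points:
# 'e' followed by U+0301 COMBINING ACUTE ACCENT.
combined = "e\u0301"
action = make_erase_action(combined)

# The same text is three bytes in UTF-8, which is why a byte-based
# count would not match the code point count.
utf8_length = len(combined.encode("utf-8"))
```

Here the user erased one character on screen, but the action carries a
count of two, which is what the receiver needs.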
The receiving client uses the RTT messages it has received as its
reference: it performs the action on the received string and then
passes the result on for presentation. The operation is then
independent of any local Unicode habits in the receiving environment.
Two code points are still two code points at that level, and the
operation can be done without ambiguity.
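
A rough sketch of the receiving side, again in Python, may make the
point clearer. It assumes the erasure applies to the end of the
received text; apply_erase is a hypothetical helper, not part of any
specification. Unicode normalization (NFC) is used here only as one
example of a "local Unicode habit" that would change the code point
count if the erasure were applied to the presentation copy instead of
the received string.

```python
import unicodedata


def apply_erase(received: str, count: int) -> str:
    # Erase `count` code points from the end of the text exactly as it
    # was received, before any local processing for presentation.
    if count < 0 or count > len(received):
        raise ValueError("erase count out of range")
    return received[: len(received) - count]


# Text as received: "caf" + 'e' + U+0301 COMBINING ACUTE ACCENT.
received = "cafe\u0301"

# Correct: operate on the received string, then hand over for display.
after_erase = apply_erase(received, 2)

# What goes wrong if the local environment first normalizes to NFC:
# 'e' + U+0301 becomes the single code point U+00E9, so erasing two
# code points now removes one character too many.
normalized = unicodedata.normalize("NFC", received)
wrong_result = apply_erase(normalized, 2)
```

With the received string as reference the result is "caf"; with the
normalized copy the same order to erase two code points leaves "ca".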