[Standards] RTT, take 2

Gunnar Hellström gunnar.hellstrom at omnitor.se
Fri Jun 24 10:10:18 UTC 2011


>> I'm wondering whether 'code points' are any better than UTF-8 based
>> positioning. Isn't it possible that a codepoint position also points
>> inside a character/glyph/...?
> A codepoint is the fundamental thing defined by Unicode, but there is a
> related concept which could be called a character (or grapheme?), consisting
> of one or more codepoints (a codepoint representing a non-combining character,
> followed by zero or more codepoints representing combining characters).
>
Yes, this is why counting Unicode code points is the solution.
But it needs to be done at a sufficiently low level, close to the
transmission of messages.
For example, when the user erases one combined character consisting of
two code points, the user interface action should, at a low level,
result in erasure of two code points. That fact can be captured and sent
in the RTT erasure element as an order to erase two code points.
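
As a rough illustration of the sender side, here is a minimal Python
sketch. It assumes the simple definition quoted above (one non-combining
code point followed by zero or more combining code points), which is not
full grapheme cluster segmentation. The function name
trailing_cluster_length is hypothetical and not part of any RTT
specification.

```python
import unicodedata


def trailing_cluster_length(text: str) -> int:
    """Count the code points in the last combined character of text.

    Hypothetical helper: walks backwards over combining code points
    until it has included one non-combining (base) code point.
    """
    count = 0
    for ch in reversed(text):
        count += 1
        if unicodedata.combining(ch) == 0:
            # Reached the base code point, so the cluster is complete.
            break
    return count


# "e" followed by U+0301 COMBINING ACUTE ACCENT is one user-visible
# character but two code points (and three bytes in UTF-8).
text = "cafe\u0301"
erase_count = trailing_cluster_length(text)
```

One backspace in the user interface would then be sent as an order to
erase erase_count code points, which is 2 for the string above.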

The receiving client keeps the received RTT messages as its reference,
applies the action to the received string, and then passes the result on
for presentation. The operation is then independent of any local Unicode
conventions in the receiving environment. Two code points are still two
code points at that level, and the operation can be done without
ambiguity.
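
The receiving side could be sketched like this in Python, where string
indexing already works in code points. The function erase_before and its
parameters are hypothetical, chosen only for illustration; the actual
wire format of the erasure element is not modelled here. The second half
shows what can go wrong if the receiver applies the same order to a
locally normalized copy instead of the string as received.

```python
import unicodedata


def erase_before(text: str, end: int, count: int) -> str:
    """Remove count code points ending just before code point index end.

    Hypothetical helper; positions and counts are in code points.
    """
    if count < 0 or end - count < 0 or end > len(text):
        raise ValueError("erase range is outside the received string")
    return text[:end - count] + text[end:]


# Reference string exactly as received: c, a, f, e, U+0301, !
received = "cafe\u0301!"

# Order from the sender: erase 2 code points ending before position 5.
reference_result = erase_before(received, 5, 2)

# Same order applied to an NFC-normalized copy, where "e" + U+0301 has
# been composed into the single code point U+00E9.
normalized = unicodedata.normalize("NFC", received)
normalized_result = erase_before(normalized, 5, 2)
```

Here reference_result is "caf!", as the sender intended, while
normalized_result is "caf", because the composed copy has one code point
fewer and the exclamation mark is erased as well. Any normalization for
display is therefore best done on the result, after the edit has been
applied to the string as received.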

Gunnar


