[xmppwg] Character set support & declaration

Iain Shigeoka iain.shigeoka at messaginglogic.com
Wed Oct 30 23:26:01 CST 2002

On 10/30/02 9:00 AM, "Lisa Dusseault" <lisa at xythos.com> wrote:

> Suggested tweaks to the xmpp core draft section 8.1:
> 1. Refer to XML [8] not to XML Namespaces [12].
> 2. Be more clear that the sender of the XML document must declare the
> encoding if it is not UTF-8 or UTF-16.

Editor... :)

> More possible changes:
> 3. Require the text declaration to be there?  I'm unclear on whether XML
> does (the language says "should" in 1.0).  At least show the text
> declaration in examples.
> 4. Say that if the text declaration is missing, or the encoding isn't
> specified, the format will be either UTF-8 or UTF-16 (and the recipient
> must deduce).

I think this points to the need for more specific/thorough coverage of
encoding issues in the doc.

IMO IETF standardization will encourage implementations in "non-UTF-8"
countries (something we haven't seen much of yet), so these issues will become
more significant in the future. Would it be feasible to include a full
pseudo-code algorithm for determining the encoding? That way implementers can
just follow the algorithm and be sure of getting it right.
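
To make the idea concrete, here is a rough sketch of what such an algorithm
might look like, loosely modelled on the (non-normative) autodetection notes
in Appendix F of the XML 1.0 spec. The function name detect_xml_encoding is
made up for illustration, and the sketch assumes only UTF-8, UTF-16, and
ASCII-compatible declared encodings matter; UCS-4 and EBCDIC are left out.
It is not proposed draft text.

```python
import re

# Matches an XML declaration that carries an encoding pseudo-attribute.
_DECL = re.compile(
    rb'<\?xml[^>]*?\bencoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\1'
)


def detect_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML byte stream from its first bytes.

    Hypothetical helper: handles UTF-8, UTF-16, and ASCII-compatible
    encodings named in the XML declaration. UCS-4 and EBCDIC are ignored.
    """
    # 1. A byte order mark settles the question.
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8"
    if data.startswith(b"\xfe\xff"):
        return "UTF-16BE"
    if data.startswith(b"\xff\xfe"):
        return "UTF-16LE"

    # 2. No BOM: the byte pattern of "<?" reveals UTF-16 and its byte order.
    if data.startswith(b"\x00<\x00?"):
        return "UTF-16BE"
    if data.startswith(b"<\x00?\x00"):
        return "UTF-16LE"

    # 3. ASCII-compatible start: trust the declared encoding if present.
    match = _DECL.match(data)
    if match:
        return match.group(2).decode("ascii")

    # 4. No declaration, or no encoding in it: fall back to UTF-8.
    return "UTF-8"
```

For example, a stream starting with <?xml version="1.0"
encoding="ISO-8859-1"?> would be reported as ISO-8859-1, while a bare
<stream/> with no declaration falls through to the UTF-8 default.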


