Character set support & declaration

Lisa Dusseault lisa at xythos.com
Wed Oct 30 11:00:51 CST 2002

As I understand it, the <?xml version="1.0"?> line (the "text
declaration"; XML 1.0 itself calls it the "XML declaration" when it
opens a document) is optional at the beginning of a stream (before the
stream element) in XMPP.  Normally in XML documents the text declaration
can be used to specify the encoding (e.g. UTF-8 or UTF-16) as well as
the XML version.
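
To illustrate how a declaration carries the encoding, here is a minimal
Python sketch that pulls the encoding name out of the first bytes of a
stream.  The helper name declared_encoding is hypothetical, and the
sketch assumes the bytes are in an ASCII-compatible encoding such as
UTF-8 (a UTF-16 stream would have to be decoded before matching).

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the optional encoding pseudo-attribute. Assumes an ASCII-compatible
# byte encoding; this is an illustrative pattern, not a full XML parser.
_DECLARATION = re.compile(
    rb'^<\?xml\s+version\s*=\s*([\'"])[^\'"]*\1'
    rb'(?:\s+encoding\s*=\s*([\'"])([A-Za-z][A-Za-z0-9._-]*)\2)?'
)


def declared_encoding(head: bytes):
    """Return the declared encoding name, or None if none is declared."""
    match = _DECLARATION.match(head)
    if match is None or match.group(3) is None:
        return None
    return match.group(3).decode("ascii")
```

With this sketch, a stream opening with
<?xml version="1.0" encoding="UTF-8"?> yields "UTF-8", while a stream
with no declaration, or a declaration without an encoding, yields None.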

Section 8.1 of draft-miller-xmpp.core-01.txt only says that the
recipient should deduce the character encoding (UTF-8 or UTF-16) and
refers to [12], which is the XML Namespaces recommendation.  However,
the Namespaces document says nothing about encoding.  Should it instead
refer to [8], which is the XML recommendation?  That document does
discuss how to specify the character encoding in the text declaration.

Suggested tweaks to the XMPP core draft, section 8.1:
1. Refer to XML [8], not to XML Namespaces [12].
2. State more clearly that the sender of the XML document must declare
the encoding if it is not UTF-8 or UTF-16.

More possible changes:
3. Require the text declaration to be there?  I'm unclear on whether XML
itself requires it (the language in XML 1.0 says "should").  At least
show the text declaration in examples.
4. Say that if the text declaration is missing, or the encoding isn't
specified, the encoding will be either UTF-8 or UTF-16 (and the
recipient must deduce which).
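
For item 4, here is a rough Python sketch of how a recipient might
deduce the encoding when nothing is declared.  It follows the byte-order
mark approach described in the (non-normative) autodetection appendix of
XML 1.0.  The function name guess_encoding is hypothetical, and the
sketch assumes the stream can only be UTF-8 or UTF-16, so it ignores
UTF-32 and other encodings.

```python
import codecs


def guess_encoding(head: bytes) -> str:
    """Guess UTF-8 vs UTF-16 from the first bytes of a stream."""
    # A byte-order mark, if present, settles the question.
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    # No BOM: the stream should start with '<', so check how that first
    # character is laid out. This is a heuristic that assumes only
    # UTF-8 or UTF-16 can occur.
    if head.startswith(b"\x00<"):
        return "utf-16-be"
    if head.startswith(b"<\x00"):
        return "utf-16-le"
    # Otherwise fall back to UTF-8.
    return "utf-8"
```

A recipient could call this on the first few bytes it reads and then
decode the rest of the stream with the returned codec name.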


