[Standards] Handling for characters that have entities, but XML does not require them to be escaped

Matthias Wimmer m at tthias.eu
Sun Jul 22 14:30:13 UTC 2007


There are several characters that have predefined entities in XML but
do not need to be escaped.
Examples of such characters are >, ' and " in text nodes.

E.g. according to the XML standard the following stanza would be
well-formed XML:

<message to='user at example.com' from='user at example.net'><body>Yes, a >

... while RFC 3920 forbids generating such XML when it is used in XMPP.
RFC3920bis even requires a server to check that this kind of XML is not
used, and to generate a stream error if it is received.

So I have two questions regarding this:

Why do these characters have to be escaped at all?

Is it really necessary that RFC 3920bis mandates that a server reject
such XMPP streams? I very much dislike this requirement, as it would
require me to implement my own XML parser: I don't know of any parser
that could be configured to notify me that these characters have been
received unescaped.
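
To illustrate the problem, here is a minimal sketch using Python's
xml.etree.ElementTree. It is only an example of a typical parser API, not
the parser of any particular XMPP server, and the sample strings are made
up for this illustration. Both inputs parse without error, and the
application sees identical text in both cases, so it cannot tell which
form was on the wire:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same text, once with the characters
# written literally and once with the predefined entities.
unescaped = "<body>a > b, 'c' and \"d\"</body>"
escaped = "<body>a &gt; b, &apos;c&apos; and &quot;d&quot;</body>"

# Both documents are accepted by the parser.
text_unescaped = ET.fromstring(unescaped).text
text_escaped = ET.fromstring(escaped).text

# The parser resolves the entities before handing text to the
# application, so the two results are indistinguishable.
same = text_unescaped == text_escaped
print(same)
```

A server that wants to reject the first form would therefore have to look
at the raw input itself, before or in place of such a parser.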


Matthias Wimmer      Fon +49-700 77 00 77 70
Züricher Str. 243    Fax +49-89 95 89 91 56
81476 München        http://ma.tthias.eu/
