[Standards] Handling for characters that have entities, but XML does not require them to be escaped

Matthias Wimmer m at tthias.eu
Mon Jul 23 10:58:01 UTC 2007


Hi Peter!

Peter Saint-Andre wrote:
>> RFC3920bis even requires a server to check that this type of XML is not
>> used and that a stream error has to be generated, if it is received.
> We tried to clarify the error handling in rfc3920bis, and that text
> reflected list consensus.

Yes, I even followed that discussion. I did not write anything earlier
because I did not realize that this text, together with the rule that
forbids ' " and > from appearing unescaped in the XML document, would
prevent us from using standard XML parsers.
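
To illustrate the problem, here is a minimal sketch in Python using the
standard library's xml.sax module. The language and parser are assumptions
chosen for illustration (no particular parser is named here), and
TextCollector and parsed_text are hypothetical helper names. It shows that
the handler receives identical character data whether or not ' " and > were
escaped on the wire, so the application has nothing to base a stream error on.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records the character data it is given."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser has already resolved any predefined entities here.
        self.chunks.append(content)


def parsed_text(document: str) -> str:
    """Parse a small XML document and return its concatenated text."""
    handler = TextCollector()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.chunks)


# Same content, once with the characters escaped and once without.
escaped = parsed_text("<body>it&apos;s &quot;a&quot; &gt; b</body>")
unescaped = parsed_text("<body>it's \"a\" > b</body>")

# The application cannot tell the two inputs apart.
print(escaped == unescaped)
```

Both documents are well-formed XML, so a conforming parser accepts both and
reports the same text; detecting the unescaped form would need access to the
raw input, which this kind of API does not expose.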

>> Why do these characters have to be escaped at all?
> 
> I don't know. IIRC, that was text from an early version of
> draft-ietf-xmpp-core and I think I agree with you that it should not be
> necessary to escape those characters in XMPP.
> 
>> Is it really necessary that RFC 3920bis mandates a server to reject such
>> XMPP streams? I very much dislike this requirement, as it would require
>> me to implement my own XML parser, as I don't know of any parser I could
>> use that could be configured to notify me that these characters have
>> been received unescaped.
> 
> If we change the text regarding restricted XML features (i.e., say that
> the characters that don't need to be escaped in XML don't need to be
> escaped in XMPP), would you still object to the error handling?

Yes. In that case we would be able to use most (push) SAX parsers.

(Well, one question is left: does RFC3920bis forbid numeric character
references? AFAIK numeric character references are NOT entities and are
therefore not forbidden, but if they were, I would have a problem
generating the <restricted-xml/> error in all cases as well.)
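
As a rough sketch of why numeric character references would be just as hard
to detect, the following Python snippet feeds data incrementally to the
standard library's expat binding, the way a stream would arrive. The choice
of Python and expat is an assumption for illustration, and push_parse is a
hypothetical helper name. A numeric reference, a predefined entity, and the
literal character all reach the application as the same text.

```python
import xml.parsers.expat


def push_parse(chunks):
    """Feed byte chunks to an expat parser and return the text it reports."""
    parser = xml.parsers.expat.ParserCreate()
    text = []
    # Called with already-decoded character data; references are resolved.
    parser.CharacterDataHandler = text.append
    for chunk in chunks:
        parser.Parse(chunk, False)   # more data may follow
    parser.Parse(b"", True)          # signal end of input
    return "".join(text)


# The numeric reference is deliberately split across two chunks.
numeric = push_parse([b"<body>a &#", b"62; b</body>"])
named = push_parse([b"<body>a &gt; b</body>"])
literal = push_parse([b"<body>a > b</body>"])

# All three forms are indistinguishable to the application.
print(numeric == named == literal)
```

With an interface like this, the application never learns which of the three
forms was sent, so it could not raise <restricted-xml/> for one of them
without inspecting the raw bytes itself.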


Matthias

-- 
Matthias Wimmer      Fon +49-700 77 00 77 70
Züricher Str. 243    Fax +49-89 95 89 91 56
81476 München        http://ma.tthias.eu/



