[Standards] About stream namespaces

Robin Redeker elmex at x-paste.de
Sun Mar 18 22:48:05 UTC 2007


First, before anyone gets something wrong...

I don't want to offend the people who are working hard to specify XMPP and
extend it. They are all doing hard work and doing their best. And the
deployment of XMPP throughout the world is amazing. An open standard is better
than a closed one.

On Mon, Mar 19, 2007 at 07:23:46AM +1100, Daniel Noll wrote:
> On Monday 19 March 2007 00:52, Robin Redeker wrote:
> > The difference, which matters, is that when you read from harddisk you
> > are not forced to process the document before you have read the full
> > file. You are not _forced_ to parse a partial XML document when you read
> > a file.  If you parse it anyway (without special chunked parsing modes
> > which some sophisticated parsers have, which are by no means required by
> > the XML recommendation), the XML parser is allowed to bail out and call
> > it a 'not-well-formed XML document'.
> 
> Is it?  I don't know of any parsers which do, and I certainly don't know of 
> anything in the XML specification which demands that the entire document be 
> available up-front.

You may have missed my quote from the XML recommendation earlier:
( '5 Conformance' http://www.w3.org/TR/2006/REC-xml-20060816/#sec-conformance )

   Non-validating processors are REQUIRED to check only the document entity,
   including the entire internal DTD subset, for well-formedness.
                                                 ^^^^^^^^^^^^^^^
   ...
   Note that when processing invalid documents with a non-validating processor the
   application may not be presented with consistent information. For example,
   several requirements for uniqueness within the document may not be met,
   including more than one element with the same id, duplicate declarations of
   elements or notations with the same name, etc. In these cases the behavior of
                                                                     ^^^^^^^^^^^
   the parser with respect to reporting such information to the application is
   ^^^^^^^^^^                                                               ^^
   undefined.
   ^^^^^^^^^

(It reads even stricter for validating parsers.)

The whole section '5 Conformance' in
    http://www.w3.org/TR/2006/REC-xml-20060816/#sec-conformance
says that any parser has to check for well-formedness.

To check an XML document for well-formedness, the parser first needs a
_complete_ XML document (the recommendation does not define 'partial' XML
documents).

And the quoted paragraph above says that the behavior is undefined for any
information retrieved from unchecked or even erroneous documents.
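
As an illustration of what that means in practice (a minimal sketch, assuming
Python's standard xml.etree.ElementTree as the parser; the element names and
the helper is_well_formed are made up for this example): handed a stream whose
root element is still open, a parser that checks the whole document has to
reject it.

```python
import xml.etree.ElementTree as ET

# Hypothetical stream fragment: the root element is opened but never closed,
# as on a live connection that is still in use.
partial = "<stream><message>hi</message>"


def is_well_formed(text):
    """Return True if text parses as one complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(partial))                # the stream while it is open
print(is_well_formed(partial + "</stream>"))  # the same stream once closed
```

The first call reports False and the second True. The only difference is the
closing root tag, which an open stream has not sent yet.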

> Most parsers will just sit there and do nothing until 
> they get all the data (which in this situation would never happen, if that 
> were the only parser reading from the stream.  If you had two parsers and one 
> of them was making responses, it might be a different story.)

I know that there are implementations of XML parsers that are quite liberal in
what they accept. The point is: XMPP exploits this beyond what the XML
recommendation covers. It makes it _necessary_ to have such parsers.
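
For comparison, here is the kind of parser use that this depends on (only a
sketch, assuming Python's xml.etree.ElementTree.XMLPullParser; the function
stanzas and the sample chunks are hypothetical). It hands completed child
elements to the application while the root element is still open, that is,
before a well-formedness check of the whole document is possible.

```python
import xml.etree.ElementTree as ET


def stanzas(chunks):
    """Yield each direct child of the root element as soon as it closes."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)  # incremental feed; the document is never finished
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # a child of the still-open root just closed
                    yield elem


# Hypothetical stream: <stream> is opened and never closed.
chunks = ["<stream>", "<message>hi</message>", "<presence/>"]
received = [elem.tag for elem in stanzas(chunks)]
print(received)
```

The application gets the message and presence elements even though </stream>
never arrives, so nothing has confirmed that the document as a whole is
well-formed.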

> While we're discussing things which will likely not happen for ages,
[.snip.]

Hehe, indeed. I don't expect anything to actually happen, except that maybe a
few (probably close to zero) people will stop calling what XMPP does XML.  Stop
using the term XML for something that does not really conform to the XML
recommendation and even relies on the implementation behaviour of some XML
parsers.

All this is of course just a theoretical issue, and people also start calling
me a nitpicker, but I can't stand people calling this an application of XML.
It's a hack that bends the XML recommendation, rips out large parts and deals
with not-well-formed XML documents in the end.

At least for me as a developer, the usage of the term XML in the RFC was highly
confusing. And I didn't feel good implementing XMPP.  I get the impression that
I'm mostly alone with this experience and opinion. Or maybe people with similar
experiences or opinions don't speak out.

I realize that I can't stop people from crying out "XML" on stuff that looks
roughly like: <foo></foo>

I just present some facts and some quotes from the XML recommendation.  If
people think that recommendations and standards can be ignored, they obviously
know _very well_ what they are doing. But people who know what they are doing
when breaking standards usually don't call the result standards-conformant
afterwards.


Robin


