[standards-jig] JNG Ramblings.

Nathaniel Borenstein nsb at guppylake.com
Wed Aug 14 05:09:13 UTC 2002

David Waite asked:

> I am a bit unclear: are you stating a case for binary framing, a full 
> binary protocol, or just for the capability of being able to send and 
> receive 8-bit clean data?

Actually, none of the above, but a variant on the 3rd choice:  the
capability of being able to send and receive pure binary data.  I think
the protocol as a whole should remain textual, but with an option for
negotiating/effecting the efficient delivery of chunks of pure binary
data.  (I differentiate between binary and 8-bit-clean in the same sense
of the MIME Content-Transfer-Encoding values -- 8-bit-clean data can
still be line oriented and subject to translation between newline
conventions.  I also include the word "negotiating" quite deliberately
-- the idea of continuing to interoperate with code that doesn't
implement binary transport is very appealing to me as likely to
significantly speed the adoption of any JNG.)
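
To make the binary versus 8-bit-clean distinction concrete, here is a
minimal Python sketch.  It assumes a hypothetical gateway step,
`translate_newlines`, that normalizes line endings; the name and the
behavior are invented for illustration and are not taken from any MIME
or Jabber implementation.  Line-oriented 8-bit text survives such a
step, while arbitrary binary data does not.

```python
def translate_newlines(data: bytes) -> bytes:
    """Hypothetical gateway step: rewrite every line ending as CRLF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

# 8-bit-clean text: high-bit bytes survive, and only line endings change.
text = "caf\u00e9\nna\u00efve\n".encode("latin-1")
text_out = translate_newlines(text)

# Pure binary: a 0x0A byte inside the payload is data, not a line ending,
# so the same translation silently changes the content.
binary = bytes([0x00, 0x0A, 0xFF, 0x0A])
binary_out = translate_newlines(binary)

# The text is recoverable by undoing the line-ending change; the binary
# payload no longer matches what was sent.
text_survives = text_out.replace(b"\r\n", b"\n") == text
binary_survives = binary_out == binary
```

This is why a transport that is merely 8-bit clean is not enough for
binary attachments: the payload has to be exempt from any line-oriented
processing.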

Mike Lin summarized his very clear exposition with:

> In conclusion, the key reason that I feel MIME is inappropriate at this
> level is that it is too hard to write something to parse MIME under
> asynchronous input conditions.

This is a fair point.  To avoid the problems you mention, we would
indeed have to move to a pure binary protocol.  However, I share the
concerns David Waite and others have expressed about the inflexibility
of binary protocols, and the corresponding difficulty of allowing such
protocols to evolve gradually.

Really, I think there are two separate issues:  1)  the inherent
tradeoff between the ease of parsing a fixed binary protocol and the
ease of evolving (and debugging) a more textual protocol, and 2) the
inefficiency of base64 transport for large binary objects.  My main
point was (is) that #2 can be addressed simply by augmenting the textual
protocol to permit the negotiated delivery of binary data, as in FTP,
and that #2 is therefore a red herring in discussions of the deeper and
tougher tradeoff of issue #1.
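
As a rough sketch of what "negotiated delivery of binary data" inside a
textual protocol could look like, consider the Python below.  The
`BINARY` and `BASE64` header lines, and the `write_chunk` and
`read_chunk` helpers, are assumptions made up for this example; they
are not FTP commands or part of any Jabber proposal.  A textual header
announces the encoding and an exact byte count, and a peer that did not
negotiate binary transport still gets the base64 fallback.

```python
import base64
import io

def write_chunk(out, payload: bytes, peer_supports_binary: bool) -> None:
    """Write one payload, raw if the peer negotiated it, else base64."""
    if peer_supports_binary:
        body, kind = payload, b"BINARY"
    else:
        body, kind = base64.b64encode(payload), b"BASE64"
    # Hypothetical textual header line: encoding name and body byte count.
    out.write(kind + b" " + str(len(body)).encode("ascii") + b"\n")
    out.write(body)

def read_chunk(stream) -> bytes:
    """Read one header line, then exactly the announced number of bytes."""
    kind, length = stream.readline().split()
    body = stream.read(int(length))
    return body if kind == b"BINARY" else base64.b64decode(body)

payload = bytes(range(256)) * 12  # 3072 bytes of arbitrary binary data
raw, encoded = io.BytesIO(), io.BytesIO()
write_chunk(raw, payload, peer_supports_binary=True)
write_chunk(encoded, payload, peer_supports_binary=False)
raw_size, encoded_size = len(raw.getvalue()), len(encoded.getvalue())
```

With this payload the base64 body is one third larger than the raw one
(4096 bytes against 3072), which is the inefficiency in question.  On a
real socket the read would have to loop until the full count arrives;
the in-memory streams here sidestep that.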

For my part, I think that issue #2 is the critical one, and that as long
as efficient transport of large objects is *possible*, there's probably
no completely "right" answer to #1.  However, this leaves me rather
strongly biased towards a textual protocol that resembles all the other
successful Internet protocols.  I don't think the potential advantages
of a general binary framing protocol are sufficient to outweigh the
proven successes of primarily text-based protocols on the Internet.  

> Get over this notion that I'm dumping XML. I'm not. I'm sprinkling 4
> bytes here and there that vastly simplify things for the machine.
> Everything else that's not a binary attachment is XML.

Well, actually, Mike, I think you're advocating not just binary
attachments but also binary framing, so that's arguably 2 different
things that aren't XML.  But I'm being really nitpicky here, I know.

Mike also asks:

> For me, it's partly about
> having a framing protocol that is so simple as to have mathematical
> elegance.

At the risk of sounding flippant, can you point to one truly elegant
protocol that has succeeded on the Internet?  Elegance is often the
enemy of practicality.  To paraphrase Mao, when I hear people talk about
elegance in protocol design, I reach for my revolver....

> What are the chances that software
> designed today is going to properly handle a 16.01MB chunk of XML
> without blocking unacceptably, crashing, or otherwise choking?

If one of my students were writing it, I would expect that it would work
as long as there was enough room on the disk!   (Real men don't use
fixed size *anything*!   :-)

However.... just to fan the flames and to prove that I don't agree
completely with *anyone*, I'll come down on Mike's side on this question
of philosophy:  Jabber's protocols should *not* be held hostage to
absolute standards of XML purity.  I distrust that brand of extremism at
least as much as I distrust designing protocols around a notion of
mathematical elegance.  Good protocols (like all good political
endeavors) are full of messy tradeoffs between competing pure
ideologies, as I think Jeremie also implied in his message.

Finally, I should say that I do in fact share some of Mike's frustration
at the idea of not being able to efficiently handle asynchronous XML
receipt in non-reentrant XML parsers.  Would it be totally outrageous to
consider some kind of clear "end of document fragment" marker (coming
*after* a completed XML fragment) that would allow the reassembly of
the complete XML document (before passing it off to the XML parser)
without having to fully parse the XML?  -- Nathaniel
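
One possible reading of such a marker, sketched in Python under stated
assumptions: the marker is a single NUL byte, chosen here only because
it cannot occur in UTF-8 encoded XML, and the `FragmentFramer` class is
hypothetical.  The framer buffers whatever each network read delivers
and hands back only complete fragments, without inspecting the XML.

```python
class FragmentFramer:
    """Reassemble marker-terminated fragments from arbitrary network reads."""

    MARKER = b"\x00"  # assumption: a byte that never occurs in UTF-8 XML

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list:
        """Accept one read's worth of bytes; return any completed fragments."""
        self._buffer.extend(data)
        # Everything before the last marker is complete; the rest stays buffered.
        *complete, rest = bytes(self._buffer).split(self.MARKER)
        self._buffer = bytearray(rest)
        return complete

framer = FragmentFramer()
first = framer.feed(b"<message><bo")                      # nothing complete yet
second = framer.feed(b"dy>hi</body></message>\x00<pre")   # one fragment done
third = framer.feed(b"sence/>\x00")                       # second fragment done
```

Each returned fragment is a whole document fragment, so it could be
passed in one call to a parser that cannot resume partway through its
input.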
