[Standards] Binary data over XMPP

Dawid Toton d0 at wp.pl
Wed Nov 7 23:36:46 UTC 2007

Robin Redeker wrote:
> On Mon, Nov 05, 2007 at 09:56:12AM -0800, Justin Karneges wrote:
>>  1) XML element to indicate binary mode: this is probably the least 
>> destructive approach.  Keep in mind that we already have an XML to binary 
>> protocol change in XMPP: the TLS and SASL encryption layers.  Your XML parser 
>> needs to be able to stop on a dime when it sees that final '>' character, so 
>> asking for that in this discussion should not be a big deal.
> Just keep in mind that we don't have a way to change "back".
> The current change is a very drastic one, like "flush the whole
> parser state and begin from start".
We can do exactly the same when switching to binary mode (I put 
comments in [] brackets):

[We are in XMPP stream. Here the new stanza begins:]
<message ...>
<binary>[at this point the parser drops its state; stops precisely after 
the closing '>' ]random-bytes-as-under-TLS-layer[we know which byte is 
the last - the length could be written as a prefix of the blob or as an 
attribute of opening XML tags]
</binary></message>[These two closing tags mark the end of a stanza and 
must always be the same - we can merely look for the '</iq>' string or 
whatever. No XML parser intervention is needed here.]

The XML parser has lost its state (probably just by removing the 
<stream><message><binary> openings from the stack), but the XMPP layer 
still remembers that the stream is open and is able to receive the next 
stanzas.
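
A minimal sketch of the framing described above, in Python. It assumes the 
length is carried in a hypothetical length="N" attribute on <binary> (the 
post leaves open whether it is an attribute or a prefix of the blob) and 
that the stanza always ends with the fixed string </binary></message>. 
The names OPEN_RE, TRAILER and split_binary_stanza are made up for 
illustration.

```python
import re

# Assumed framing (not specified in the post): <binary length="N">, then
# exactly N raw bytes, then the fixed trailer </binary></message>.
OPEN_RE = re.compile(rb'<binary length="(\d+)">')
TRAILER = b"</binary></message>"


def split_binary_stanza(buf: bytes):
    """Return (header, blob, rest) for the first binary stanza in buf.

    Returns None while buf does not yet hold a complete stanza.
    """
    m = OPEN_RE.search(buf)
    if m is None:
        return None
    length = int(m.group(1))
    start = m.end()                  # first byte after the closing '>'
    end = start + length             # the length tells us the last blob byte
    if len(buf) < end + len(TRAILER):
        return None                  # need more bytes from the socket
    # The trailer is a plain string comparison; no XML parser is involved.
    if buf[end:end + len(TRAILER)] != TRAILER:
        raise ValueError("malformed binary stanza trailer")
    return buf[:start], buf[start:end], buf[end + len(TRAILER):]
```

The blob may contain '<', '>' or even the text '</message>'; because the 
reader skips exactly N bytes, none of that is ever seen by the XML layer.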

Suppose client and server agreed to use such a protocol as a replacement 
for base64. Since we can efficiently send binary data only as a topmost 
XML chunk, additional identifiers are needed that indicate which blob 
goes where. I mean, instead of:


we could send two stanzas:

<message id="99" to="..."><binary>arbitrary-bytes</binary></message>
<someXML><reallyNested><binary id="99"/></reallyNested></someXML>
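
On the receiving side, the two stanzas could be tied together roughly like 
this. It is only a sketch: it assumes the blobs that arrived earlier are 
kept in a dict keyed by the message id, and resolve_blobs is a 
hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def resolve_blobs(xml_text: str, blobs: dict) -> dict:
    """Map each <binary id="..."/> placeholder to the blob received under that id."""
    root = ET.fromstring(xml_text)
    resolved = {}
    for ref in root.iter("binary"):
        blob_id = ref.get("id")
        if blob_id not in blobs:
            # The referenced blob stanza has not arrived (or never will).
            raise KeyError(f"no blob received for id {blob_id!r}")
        resolved[blob_id] = blobs[blob_id]
    return resolved


# Blob taken from the first stanza, stored under its id.
received = {"99": b"arbitrary-bytes"}
stanza = '<someXML><reallyNested><binary id="99"/></reallyNested></someXML>'
print(resolve_blobs(stanza, received))
```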

The overhead is roughly a few hundred bytes, so for <1kB base64 works 
better. It doesn't matter, since we are looking for a way for midsize 
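
To put a number on the break-even point: base64 turns every 3 bytes into 
4, so its cost grows with the payload, while the extra stanza costs a 
fixed amount. The sketch below assumes 300 bytes for that fixed cost, 
which is only a stand-in for "a few hundred"; the function names are 
made up.

```python
def base64_overhead(n: int) -> int:
    """Extra bytes base64 (with padding) adds to an n-byte payload."""
    return 4 * ((n + 2) // 3) - n


def break_even(fixed_overhead: int) -> int:
    """Smallest payload size for which base64 costs more than fixed_overhead."""
    n = 0
    while base64_overhead(n) <= fixed_overhead:
        n += 1
    return n


# 300 is an assumed figure for the separate-stanza overhead.
print(break_even(300))
```

With that assumption the crossover lands a little below 1 kB, which 
agrees with the estimate above.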

If we didn't care about breaking current implementations, a good 
solution would be to require everyone to parse in the natural multilayer 
way - as AFAIK some XMPP software already does. I mean: when TLS starts,
* suspend the outer XML parser (don't flush)
* intercept the following bytes and feed them into a new inner TLS/XML stack
* continue with the outer parser - it may then parse the closing </stream>
This way we could do the binary mode switching just in places where 
base64 data would otherwise appear.

