[Standards] Binary data over XMPP

Justin Karneges justin-keyword-jabber.093179 at affinix.com
Mon Nov 5 17:56:12 UTC 2007


On Monday 05 November 2007 3:40 am, Dave Cridland wrote:
> Now, we can't expect that the entire Internet will bend to our will
> and instantly upgrade, so we need a sane fallback - probably to IBB,
> or something fairly similar. The interesting question is whether we
> choose to have this negotiated end to end (which means we'll need to
> have each hop along the route tested), or whether we say that this
> down-conversion happens within servers.

Binary over XMPP has been on my TODO list for a while now, and I have some 
notes written up about it but nothing published.  I think a hop-by-hop 
approach is best, if we want to have any hope for compatibility.

Comments on the two formatting approaches:

 1) XML element to indicate binary mode: this is probably the least 
destructive approach.  Keep in mind that we already have an XML to binary 
protocol change in XMPP: the TLS and SASL encryption layers.  Your XML parser 
needs to be able to stop on a dime when it sees that final '>' character, so 
asking for that in this discussion should not be a big deal.

 2) Framing mode: this is probably the most optimized approach, but then the 
protocol becomes very unlike XMPP, and yes it may be worth using BEEP then 
(although honestly I haven't read the BEEP RFC in a while, and it probably 
does more than we need).
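
To make approach 1 a bit more concrete, here is a minimal Python sketch of a 
reader that stops right after the final '>' of a marker element and then 
reads raw bytes.  The <binary/> element, its 'ns:binary' namespace and its 
length attribute are made up for illustration; nothing in this thread defines 
them.  It also assumes the marker contains no '>' inside an attribute value.

```python
import io
import re


def read_binary_marker(stream):
    """Read one byte at a time, up to and including the first '>'.

    Reading byte by byte means nothing past the marker is consumed, so the
    raw payload that follows is left untouched in the stream.
    """
    buf = bytearray()
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("stream ended inside the marker element")
        buf += b
        if b == b">":
            return bytes(buf)


def read_binary_blob(stream):
    """Parse the hypothetical marker, then read exactly `length` raw bytes."""
    marker = read_binary_marker(stream)
    match = re.search(rb"length='(\d+)'", marker)
    if match is None:
        raise ValueError("marker has no length attribute")
    length = int(match.group(1))
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside the binary payload")
    return payload


# Hypothetical wire data: marker, 11 raw bytes, then XML resumes.
wire = io.BytesIO(b"<binary xmlns='ns:binary' length='11'/>Hello world<presence/>")
payload = read_binary_blob(wire)
rest = wire.read()
```

After the call, `payload` holds the raw bytes and `rest` is the XML that 
follows, which would go back to the normal parser.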

For framing, I came up with two approaches: "interleaved binary" and "stream 
multiplexing".  Either way you have your TLV framing, and a very tight 
binding to what we're trying to accomplish.

For the interleaved binary, there are two types: XML (0) and binary (1). :)  
Either packet type can contain arbitrary amounts of data.  It would not be 
required for the XML type to contain a complete element, for example.

The following two transmissions would be equivalent (whitespace added for 
clarity).

C0: <iq from='romeo@montague.net/orchard' to='juliet@capulet.com/balcony'
C0:     type='set' id='ibb1'>
C0:   <data xmlns='http://jabber.org/protocol/ibb' sid='mySID' seq='0'>
C0:     SGVsbG8gd29ybGQ=
C0:   </data>
C0: </iq>

C0: <iq from='romeo@montague.net/orchard' to='juliet@capulet.com/balcony'
C0:     type='set' id='ibb1'>
C0:   <data xmlns='http://jabber.org/protocol/ibb' sid='mySID' seq='0'>
C1:     Hello world
C0:   </data>
C0: </iq>
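
As a rough sketch of what the interleaved framing could look like on the 
wire, here is some Python.  The two type codes (0 for XML, 1 for binary) come 
from the description above; the header layout (1-byte type followed by a 
4-byte big-endian length) is purely my assumption for illustration, since no 
wire format has been specified.

```python
import struct

XML, BINARY = 0, 1  # the two packet types described above

# Assumed header layout (not specified anywhere in this thread):
# 1-byte type, then 4-byte big-endian payload length.
HEADER = struct.Struct("!BI")


def encode_frame(ftype, payload):
    """Prefix a payload with its type and length."""
    return HEADER.pack(ftype, len(payload)) + payload


def decode_frames(data):
    """Split a byte string into a list of (type, payload) tuples."""
    frames = []
    offset = 0
    while offset < len(data):
        ftype, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        payload = data[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated frame")
        frames.append((ftype, payload))
        offset += length
    return frames


# The second transmission above: XML, raw binary, XML.
wire = (
    encode_frame(XML, b"<iq type='set' id='ibb1'>"
                      b"<data xmlns='http://jabber.org/protocol/ibb'"
                      b" sid='mySID' seq='0'>")
    + encode_frame(BINARY, b"Hello world")
    + encode_frame(XML, b"</data></iq>")
)
frames = decode_frames(wire)
```

Note that the XML frames here do not each hold a complete element, which is 
allowed by the scheme as described.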

The binary type could be converted to and from Base64 by any hop.  So the 
important thing to keep in mind with this protocol is that you're not sending 
a random blob of binary; you're sending Base64'd CDATA, just in a more 
optimized format.  This simplifies integration into existing XMPP 
applications.  Stanza input and output would look exactly as they do today 
(containing binary that is Base64 encoded).  Only the transport layer would 
worry about converting back and forth.  Indeed, this means that if binary 
data is received on the network, it would probably be Base64 encoded and 
plugged into the stanza as CDATA before passing upwards to the application 
(to then be decoded again :) ).
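
The down-conversion a hop would do can be sketched in a few lines of Python.  
This assumes frames have already been parsed into (type, payload) tuples; the 
function name and that representation are mine, not part of any proposal.  
One detail worth noting: adjacent binary frames are joined before encoding, 
because Base64-encoding them separately can put padding in the middle of the 
character data.

```python
import base64

XML, BINARY = 0, 1  # packet types from the interleaved scheme


def to_plain_xml(frames):
    """Flatten (type, payload) frames into the bytes of an ordinary XML stream.

    Binary payloads are replaced by their Base64 text, which is what a hop
    would forward to a peer that does not understand the framing.
    """
    out = bytearray()
    pending = bytearray()  # adjacent binary frames, joined before encoding
    for ftype, payload in frames:
        if ftype == BINARY:
            pending += payload
            continue
        if ftype != XML:
            raise ValueError(f"unknown frame type {ftype}")
        if pending:
            out += base64.b64encode(bytes(pending))
            pending.clear()
        out += payload
    if pending:
        out += base64.b64encode(bytes(pending))
    return bytes(out)


frames = [
    (XML, b"<data xmlns='http://jabber.org/protocol/ibb' sid='mySID' seq='0'>"),
    (BINARY, b"Hell"),
    (BINARY, b"o world"),
    (XML, b"</data>"),
]
plain = to_plain_xml(frames)
```

Here `plain` ends up with SGVsbG8gd29ybGQ= as the character data, the same 
as in the first transmission shown earlier.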

The advantage of the interleaved approach is that anywhere there is Base64 we 
could do a binary transfer.  So not just IBB, but a presence signature, a 
vcard avatar, etc.

For the stream multiplexing approach, there would be a number of "channels".  
Channel 0 would be the XML stream, and would operate like normal.  Channel 1 
would carry IBB packets.  This gives us a very tight binding to IBB, but that 
may be fine since that's the main way you'd want to transfer binary anyway.

Typical IBB handshake:

C0: <iq type='set'
C0:     from='romeo@montague.net/orchard'
C0:     to='juliet@capulet.com/balcony'
C0:     id='inband_1'>
C0:   <open sid='mySID'
C0:       block-size='4096'
C0:       xmlns='http://jabber.org/protocol/ibb'/>
C0: </iq>

S0: <iq type='result'
S0:     from='juliet@capulet.com/balcony'
S0:     to='romeo@montague.net/orchard'
S0:     id='inband_1'/>

Client sets channel 1 to be used for this IBB stream:

C0: <ibbbind xmlns='ns:multiplex'
C0:     from='romeo@montague.net/orchard'
C0:     to='juliet@capulet.com/balcony'
C0:     sid='mySID'
C0:     channel='1'/>

Client sends some IBB packets:

C1: Hello world
C1: Data sent on this channel is not Base64 encoded

Server replies also using a channel:

S0: <ibbbind xmlns='ns:multiplex'
S0:     from='juliet@capulet.com/balcony'
S0:     to='romeo@montague.net/orchard'
S0:     sid='mySID'
S0:     channel='1'/>

S1: You're right, and neither is this data!

If the next hop does not support ibbbind, then you would transmit as a regular 
IBB packet.  Yes, this means a server supporting ibbbind would have to know 
the IBB protocol (it would not be enough to expand the binary back into 
Base64 and send; it would truly have to reconstruct the IBB iq packet with 
the right sequence number, etc).  However, this intimate binding would end up 
being very optimized.
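
For a feel of what that reconstruction involves, here is a small Python 
sketch of the per-channel state a hop might keep.  The class name, the 
'ibbbind_N' id scheme and the choice of an iq wrapper (following the examples 
above) are assumptions of mine; the 16-bit wrap of the sequence number is how 
I understand the IBB seq attribute to behave.

```python
import base64
from xml.sax.saxutils import quoteattr

IBB_NS = "http://jabber.org/protocol/ibb"


class IbbChannel:
    """State a hop keeps for one bound channel so it can rebuild IBB stanzas."""

    def __init__(self, sender, recipient, sid):
        self.sender = sender
        self.recipient = recipient
        self.sid = sid
        self.seq = 0  # sequence number of the next IBB packet

    def to_ibb_stanza(self, payload):
        """Wrap one raw channel payload in a regular IBB <data/> iq."""
        b64 = base64.b64encode(payload).decode("ascii")
        stanza_id = f"ibbbind_{self.seq}"  # hypothetical id scheme
        stanza = (
            f"<iq type='set' from={quoteattr(self.sender)}"
            f" to={quoteattr(self.recipient)} id={quoteattr(stanza_id)}>"
            f"<data xmlns='{IBB_NS}' sid={quoteattr(self.sid)}"
            f" seq='{self.seq}'>{b64}</data></iq>"
        )
        # Assumed: seq is a 16-bit counter that wraps back to 0.
        self.seq = (self.seq + 1) % 65536
        return stanza


channel = IbbChannel("romeo@montague.net/orchard",
                     "juliet@capulet.com/balcony", "mySID")
first = channel.to_ibb_stanza(b"Hello world")
second = channel.to_ibb_stanza(b"Data sent on this channel is not Base64 encoded")
```

The hop has to remember the from, to and sid from the ibbbind element plus a 
running counter, which is exactly the "knows the IBB protocol" cost mentioned 
above.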

-Justin


