[Standards] Do we need STUN?

Rachel Blackman rcb at ceruleanstudios.com
Thu Mar 8 18:31:50 UTC 2007

> Bluntly put, it is outright impossible to create a high-quality,
> peer-to-peer, out-of-band, UDP connection entirely within a
> server-centralized, in-band, TCP, XMPP stream. The most popular
> complaints about ICE (it's complicated, it's taking so long to develop
> within IETF), are due to the fact that NAT traversal is a complicated
> problem. ICE is currently the best solution, and trying to reinvent it
> would be a fool's errand.

I think this whole recent Jingle kerfuffle comes down to about two
or three things.

Firstly, we presently have a (server-proxied) TCP stream initiation  
protocol for doing file transfers and client-to-client TCP.  It's not  
a perfect system (far from it, as it basically falls apart if both  
people are behind NAT and neither of them has a server running an S5B
proxy), but it's a functional (and save for the mutant SOCKS5  
portion, wholly XMPP) solution which is already in use by many clients.

Secondly, we have a (NAT-traversal) UDP stream initiation protocol  
for punching through NAT and firewalls and doing audio/video  
streaming.  It's a highly reliable system but also seems to  
intimidate many people with the apparent complexity of initiating it.

I don't think anyone will debate that we need Jingle in some form;  
it's our only presently-viable method for UDP communication between  
clients.  But it seems like lately we're debating several points at once.

Some folks are debating what the UDP stream protocol should be.  I  
don't think that part's up for debate; Jingle is already accepted by  
the council, and fills that role spectacularly.  Jingle's here;  
cope.  It uses STUN/ICE, and it works.  And there's even an open- 
source implementation that Google is helpfully providing.  I'm not  
sure why this part is up for debate... :)

Other folks are debating if Jingle needs a file transfer  
specification.  This one's a bit of a headache since, really, it used  
to seem clearer that S5B and stream-profile were how you did client- 
to-client TCP (such as file transfer), while Jingle was client-to- 
client UDP.  Now we're defining a second completely incompatible file- 
transfer specification atop Jingle, which takes us back to the Dark  
Old Days of three different file transfer specifications.  (Anyone  
remember when we had different clients all doing iq:oob, DTCP or S5B?  
Whee!  Please let's not spend much time there again if we can avoid it.)

This sort of puts a gun to the head of existing clients, largely.  It  
forces them to adopt Jingle now even if they do not want to do AV, or  
they cannot exchange files with Google Talk clients.  Granted, people  
can probably use libjingle to get some of it up and running, but  
libjingle isn't a silver bullet.  (For instance, I'll have to roll my  
own Jingle implementation in Astra for various reasons.  I'm going to  
do it nonetheless, but I'll have to roll my own.)  So it does create  
a fair amount of work for client authors in the short (or maybe not- 
so-short) run.

That said, the advantages of using Jingle for client-to-client TCP  
(or pseudo-TCP, I suppose) are that down the road, future client
authors only need to write ONE stream initiation system (Jingle) and  
they can do all the client-to-client stuff.  It should spur greater  
adoption of Jingle voice (and potentially Jingle video, which I'm  
certain someone out there is dreaming up a spec for *innocent  
whistle*), as well as making file transfers more reliable (after all,  
a fair number of servers still don't have S5B proxies running even now).

Right now, if you want both file transfer and AV, you have to  
implement two; deprecating S5B and using just Jingle (if it has a  
fleshed-out TCP method) simplifies the task for future client  
authors, at the cost of much more work right now for current client  
authors.  But the debate, as far as I can tell, should be about
/that/ -- about what the state of XMPP stream initiation is -- not
about whether or not Jingle should use STUN and ICE.  Those are part  
of firewall/NAT traversal, and the entire point of Jingle is that it  
negotiates streams that traverse a firewall or NAT.

I'm more concerned, personally, about a) what happens to S5B once the
other stream system provides a second, incompatible file-transfer
system, and b) if we do deprecate S5B in favor of just Jingle for all
streaming, how do we handle the transition in a clean manner to avoid
horrible incompatible-file-transfer issues all over?

Rachel Blackman <rcb at ceruleanstudios.com>
Trillian Messenger - http://www.trillianastra.com/
