[Standards] Re: Jingle bootstrapping

Scott Ludwig scottlu at google.com
Sat Mar 3 21:24:57 UTC 2007

Good mail, Matt.

Having gone through the creation of libjingle, I can comment on the
reasons for certain approaches and the implementation difficulties of
the various pieces.

For us it's important that the client can discover which servers to
talk to for derived address discovery (addresses on the outside NAT /
firewall), and for relay services. We use / propose a standard to
discover these through simple XMPP.

Once the server names are discovered, it is important for us that the
client do name resolution on the servers. This way clients can be
directed to the closest data center automatically (because the IP
returned is based on geographic proximity). If you go with a pure XMPP
approach to talking with the relay server for example, it is more
difficult to get geographic distribution. I understand in the case
where there is only one data center, doing it the XMPP way is easier.
But it doesn't scale as easily. It would be possible to mix the two -
have the client discover the server name over XMPP, do local name
resolution, and then negotiate over XMPP (passing the IP as an

From a difficulty point of view, STUN is a very simple protocol. It
doesn't rank very high on the complexity scale. The relay server on
the other hand is more complex. Some providers want a way to allocate
relay resources only to users they know about. In our system, built
into the relay discovery protocol is a way to pass down a relay token
to the client, which the client can pass when allocating a relay binding.
The server can authenticate that. I understand this wouldn't be
required if you talk to the relay server over XMPP directly, since
the user would already be authenticated with the server.
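
To make the flow above concrete, here is a rough sketch of the client side:
take a relay name and token learned during discovery, resolve the name
locally, and build an allocation request that carries the token. This is only
an illustration under assumptions, not the libjingle code or wire format. The
names RelayInfo, resolve_locally and build_allocate_request, and the fields of
the request, are all made up for the example.

```python
import socket
from dataclasses import dataclass


@dataclass
class RelayInfo:
    """What the client is assumed to learn over XMPP (hypothetical fields)."""
    hostname: str   # relay server name, not an IP
    udp_port: int
    token: str      # opaque relay token passed down during discovery


def resolve_locally(hostname, port, resolver=socket.getaddrinfo):
    """Resolve the name on the client, so DNS can answer with a nearby data center."""
    results = resolver(hostname, port, type=socket.SOCK_DGRAM)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return [sockaddr[0] for (_family, _type, _proto, _canon, sockaddr) in results]


def build_allocate_request(info, resolver=socket.getaddrinfo):
    """Build a (hypothetical) allocation request that presents the relay token."""
    addresses = resolve_locally(info.hostname, info.udp_port, resolver)
    if not addresses:
        raise LookupError("no address found for " + info.hostname)
    return {"relay_ip": addresses[0], "port": info.udp_port, "token": info.token}
```

The resolver is a parameter only so the sketch can be exercised without a
network; a real client would just use the system resolver. The relay would
then check the token before handing out a binding.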

Once the relay and STUN services are available, the complex piece is
the ICE transport implementation. While other transports are possible
(and encouraged), we do need a basic ICE design that is agreed on. I
don't think we need to track ICE X standard, where X is always
increasing in number. The basic ICE algorithm hasn't changed much for
a while. We make a cut, and we get clients to use that cut. The way
Jingle is designed, if there is a desire to make another ICE design at
some point in the future, it can be referred to as a different
transport. The important thing to do now is to standardize around an
ICE design and call it "ICE for XMPP v1.0". Google Talk uses a design
that I think is more than adequate. The important thing about ICE vs.
other approaches is that it attempts connectivity interactively. This
makes sure direct connectivity is possible when one side is behind a
symmetric NAT and the other isn't. It results in a high traversal rate
(90%+). Pure STUN or STUN-like approaches typically don't handle this
case and get lower traversal rates (70%+). This is why I think the XMPP ICE
transport, once agreed on, should be a requirement for Jingle apps
wishing to negotiate common media sessions. This should be discussed
of course.
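
As a rough illustration of what "attempts connectivity interactively" means:
each side gathers candidate addresses (local, derived via STUN, relay), the
candidates are paired up, and the pairs are actually probed in preference
order until one works. The sketch below is not the Google Talk algorithm or
the priority formula from any ICE draft. The candidate kinds, the PREFERENCE
weights and the check callback are assumptions made up for the example.

```python
from itertools import product

# Hypothetical preference weights: direct paths first, relay as the fallback.
PREFERENCE = {"host": 3, "derived": 2, "relay": 1}


def candidate_pairs(local, remote):
    """Pair every local candidate with every remote one, best pairs first.

    Candidates are (kind, address) tuples, where kind is a PREFERENCE key.
    """
    pairs = list(product(local, remote))
    pairs.sort(key=lambda p: PREFERENCE[p[0][0]] + PREFERENCE[p[1][0]], reverse=True)
    return pairs


def first_working_pair(local, remote, check):
    """Probe pairs in order and return the first one the check reports as reachable."""
    for local_cand, remote_cand in candidate_pairs(local, remote):
        if check(local_cand, remote_cand):
            return local_cand, remote_cand
    return None
```

Here check stands in for a real connectivity probe sent between the two
addresses. The point is that the decision comes from probing actual paths
rather than from guessing based on the NAT type alone.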

I don't think it is necessary for the XMPP community to "be compatible
with company X's ICE/STUN/TURN design" as a design requirement. From
my experience, if we can build momentum around a standard, the
companies are incented to build in compatibility with that standard.
This is something they typically have full time engineers working on.
I think taking this to heart would help Jingle reach v1.0.

Matt Tucker <matt at jivesoftware.com> wrote:
> Alex and all,
> I'm a bit behind technically on the latest work Thiago has been doing.
> So, my apologies for not bringing it up during the DevCon meeting. One
> thing I did bring up was the alternative media proxy approach. If you
> remember that discussion, the gist was that since we already have a
> signalling mechanism with the XMPP server, implementing the media proxy
> is *very* simple --  the XMPP server can just open the necessary ports
> and tell the client what to use. Robert McQueen was going to investigate
> the difficulty of actually implementing a TURN-based server over the
> next several weeks so that we have it as a point of comparison.
> > the problem is to pass NAT and firewalls. Currently it looks
> > like STUN and ICE are the best protocols out there. The problem
> > is that they are complex, and no ICE libraries are available yet.
> >
> > We also discussed this in detail on the DevCon in Brussels.
> > Our conclusion was that we should evaluate all possibilities
> > out there, and also take a closer look at Teredo.
> From what Thiago told me yesterday, he's tested his technique to pass
> through three levels of NAT devices. That's not bad. :) In any case, the
> main point is that we see value in choosing techniques that work best
> for the XMPP community. Just because it was defined somewhere else,
> doesn't mean we have to use it (that's why we're not all using SIP).
> I see several major (good) arguments for why we should focus on
>  1) The standards work is already being done by smart people.
>  2) Libraries will be created to do this stuff, which will hide the
> complexity.
>  3) For things like media proxies, it's reasonable to assume that
> vendors like Cisco will make some standard hardware we can interop with.
> At the end of the day, those are all good arguments, but let me pick
> them apart a bit.
> > 1) The standards work is already being done.
> a) The standards are still highly in flux and a moving target.
> b) The existing standards work is bound by the constraints of SIP. That
> may mean that we can create alternatives that are much simpler given an
> existing XMPP stack.
> c) From a market perspective, we're in a relatively narrow time band to
> establish our relevancy. The SIP juggernaut marches on and we need to
> clearly articulate the place for Jingle and its advantages. The argument
> that we could exploit XMPP to create something better and simpler than
> what the SIP community is doing for NAT traversal really resonates with
> me for that reason.
> > 2) Libraries will be created to do this stuff, which will hide the
> > complexity.
> These just don't exist yet. :) I think there are a couple reasons.
> First, the standards like ICE are a moving target and people haven't
> caught up yet. Second, it's mainly telecoms and large orgs that are using
> this stuff. I think that's why there's still such relatively poor open
> source support for STUN, which is a technology that's been around for a
> long time. Finally, the complexity of the protocols is keeping people
> away from doing implementations.
> > 3) For things like media proxies, it's reasonable to assume that
> > vendors like Cisco will make some standard hardware we can interop with.
> Given the points above, who knows when this will happen. :) By that
> time, we'll likely have lost our chance to make Jingle relevant.
> -------------------
> I'd like to suggest the following:
>  1) More people dive deep into the Jingle issues as soon as possible.
> Yes, it's quite complicated, but having more voices at the table would
> make the effort worth it.
>  2) Continue experiments down all possible paths. We'll document some of
> the techniques we're trying so that others can understand them more
> clearly and try them as well. If Robert is able to do the TURN work,
> we'll have that info as well.
>  3) Set some clear goals for what we're trying to do with Jingle so that
> we have some criteria to evaluate the different approaches. Just to get
> the conversation started, here are some ideas (nothing particularly new):
>  * Make Jingle NAT traversal easier to implement than any competing
> technology.
>  * Deliver working implementations and standards ahead of everybody
> else.
>  * Push Jingle to support a broad range of real-time interactions (not
> just VoIP). That could include file transfer, screen share, etc.
> Thanks,
> Matt
