[Jingle] <iq/> versus <message/> (Was: Tomorrow's meeting)

Emil Ivov emcho at jitsi.org
Wed Jul 24 09:55:19 UTC 2013

Hey Dave, all,

I suppose the main issue I have with migrating toward <message/> is 
that it is a lot of work and the chances of everyone rushing to do this 
are not exactly outstanding.

Jingle adoption hasn't been quite what we've hoped for and there are 
only a handful of clients that support it out there. Many (most?) of 
those are stuck with the early Gingle variant and never bothered to 
update to Jingle.

With the above in mind, I am not particularly optimistic about the 
adoption of yet another major shift.

I think it would be a lot safer if we tried to resolve the issues we 
have by building on top of the current spec rather than starting from 
scratch.

More below.

On 24.07.13, 10:30, Dave Cridland wrote:
> On Wed, Jul 24, 2013 at 6:26 AM, Philipp Hancke
> <fippo at goodadvice.pages.de> wrote:
>     On 24.07.2013 00:23, Emil Ivov wrote:
>             Right. We can discuss those sdp-over-jingle variants. I'd
>             also love to
>             see some proposals of why one would use <message/> instead
>             of <iq/>.
>             Likely reasons: carbons and via
>         Could you please expand a bit more on these?
>     I think we need full protocol flows. Any takers?
> I don't think it'll be much different. As a startpoint to the discussion...
> I call Fippo:
> <message to='fippo@fippomatic.example'
> from='dwd@dave.cridland.net/Office' id='jingle-init'>
>    <jingle action='session-initiate' sid='blah' xmlns='jingle'/>
> </message>
> Fippo's devices respond if they're ringing (rather, willing to ring -
> of course, there's session-info to indicate they're really ringing), as
> before, but with a new action type to replace the iq/result we had
> before - perhaps we could actually use session-info for this from the
> outset, which'd be neater:
> <message from='fippo@fippomatic.example/poolside-phone'
> to='dwd@dave.cridland.net/Office' id='jingle-ring-1'>
>    <jingle action='session-acknowledge' sid='blah' xmlns='jingle'/>
> </message>
> <message from='fippo@fippomatic.example/private-jet'
> to='dwd@dave.cridland.net/Office' id='jingle-ring-2'>
>    <jingle action='session-acknowledge' sid='blah' xmlns='jingle'/>
> </message>
> Perhaps also - or instead - a device might explicitly reject the call.
> <message from='fippo@fippomatic.example/cinema'
> to='dwd@dave.cridland.net/Office' id='jingle-init' type='error'>
>    <error type='cancel'>
>      <service-unavailable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
>    </error>
> </message>
> The nice thing about a bare jid, here, is that even if Fippo is in his
> personal cinema, unable to take the call, or perhaps simply doesn't
> answer, the server can respond to take it to voicemail.

This does look appealing but I am afraid that the feeling might be a bit 
superficial. One has two ways of handling forking:

A) you either leave it entirely to the originating endpoint,
B) or you do it in the server, hiding it from the client and only 
letting it see the leg that produced a successful call.

We can already do "A" with IQs: A client can loop through available 
resources and send session-initiates to everyone who's there. It would 
then need to properly handle all the resulting call legs ... I don't 
think many clients would go for this but if someone wants to: they have 
the option.
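
A minimal sketch of what option "A" asks of a client, in Python. It assumes the client already knows the callee's online resources (for example from presence) and builds one session-initiate <iq/> per full JID. The function name, the per-leg sid suffix, and the placeholder 'jingle' namespace (copied from the quoted example, not the real Jingle namespace) are illustrative assumptions, not part of any spec.

```python
import xml.etree.ElementTree as ET


def fork_session_initiate(caller, bare_jid, resources, sid):
    """Build one session-initiate <iq/> per resource (hypothetical helper).

    Returns {full_jid: (leg_sid, stanza_xml)} so each leg can be tracked.
    """
    legs = {}
    for n, resource in enumerate(resources, start=1):
        full_jid = f"{bare_jid}/{resource}"
        leg_sid = f"{sid}-{n}"  # assumption: a distinct sid per leg
        iq = ET.Element("iq", {
            "type": "set",
            "from": caller,
            "to": full_jid,
            "id": f"jingle-init-{n}",
        })
        # 'jingle' is the placeholder namespace used in the quoted example.
        ET.SubElement(iq, "jingle", {
            "xmlns": "jingle",
            "action": "session-initiate",
            "sid": leg_sid,
        })
        legs[full_jid] = (leg_sid, ET.tostring(iq, encoding="unicode"))
    return legs
```

Building the stanzas is the easy part. The client would still have to process the replies on every one of these legs, which is the cost that makes this option unattractive.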

"B" is how most SIP deployments do this and it implies specifically 
implementing this on the server. The promise of <message/> is that we 
might be able to pull it off without server modifications.

I believe however that this is deceptive because <message/> would only 
auto fork the "session-initiate" and it wouldn't mask all the replies. 
This means that clients would need to have the code that handles all the 
legs that a call produced, which brings us back to the "A" case. As 
mentioned above, the "A" case is already implementable with IQs.

If, on the other hand, a server is willing to actually do the work so 
that forking would be hidden from the client then wouldn't it be able to 
also do the same with IQs?

I haven't thought that much about this, but what if a server that 
understands Jingle simply "ignores" the resource indicated at the origin 
and does the fork? Would it be a big issue if a session-accept arrived 
from a resource different from the one we addressed the 
session-initiate to?
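
To make the question concrete: a client that tolerated server-side forking of IQs would have to match an incoming session-accept on the bare JID and sid rather than on the full JID it originally addressed. A hypothetical Python sketch of that check follows; the function names are made up for illustration, and the naive split ignores JID normalization.

```python
def bare_jid(jid):
    """Strip the resource part from a JID (naive split, no normalization)."""
    return jid.split("/", 1)[0]


def accept_matches(initiate_to, initiate_sid, accept_from, accept_sid):
    """True if a session-accept belongs to our session-initiate,
    even when it comes from a different resource of the same account."""
    return (accept_sid == initiate_sid
            and bare_jid(accept_from) == bare_jid(initiate_to))
```

With this check, an accept from fippo@fippomatic.example/poolside-phone would be taken as the answer to an initiate sent to fippo@fippomatic.example/cinema, as long as the sid is the same.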

> I imagine the
> session-acknowledge could be elided entirely here, but maybe it should
> always be sent early to ensure the caller knows not to time out too quick.

Having acks is actually quite nice and an advantage over SDP 
offer/answer. Getting an ACK means that you know your offer or answer 
has been received. You know that everything in there is now understood, 
which is quite important, especially in the case of an answer.

SIP+SDP offer/answer don't give you this. You never know whether what 
you just said has already been received, and that has come up as a 
problem in a number of the recent WebRTC discussions.
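
As a rough illustration of what the IQ acknowledgement buys a client, here is a small Python sketch that tracks which Jingle actions have been sent but not yet acknowledged with an <iq type='result'/>. The class and method names are hypothetical and only meant to show the bookkeeping.

```python
class PendingAcks:
    """Track Jingle <iq type='set'/> stanzas awaiting an <iq type='result'/>."""

    def __init__(self):
        self._pending = {}  # iq id -> Jingle action

    def sent(self, iq_id, action):
        """Record that an action was sent under the given iq id."""
        self._pending[iq_id] = action

    def on_result(self, iq_id):
        """Handle a result; return the action it acknowledges, or None."""
        return self._pending.pop(iq_id, None)

    def unacked(self):
        """Actions the peer has not confirmed receiving yet."""
        return sorted(self._pending.values())
```

Anything still listed by unacked() is an offer or answer the other side may not have seen yet.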

> <message from='fippo@fippomatic.example'
> to='dwd@dave.cridland.net/Office' id='jingle-voicemail'>
>    <jingle action='session-acknowledge' sid='blah' xmlns='jingle'/>
> </message>
> <iq from='fippo@fippomatic.example' to='dwd@dave.cridland.net/Office'
> id='jingle-voicemail-setup'>
>    <jingle action='session-accept' sid='blah' xmlns='jingle'/>
> </iq>
> Of course, we need additional rules to ensure that multiple devices
> don't accept the call, but this is where PEP or Carbons come in.

How do PEP and Carbons help here? More specifically, how are they better 
than simply ending the call to everyone who's still ringing as soon as 
we get the first answer?
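
The "end the call to everyone who's still ringing" alternative can be sketched in a few lines of Python. The class and method names are hypothetical; the JIDs returned are the legs to which the caller would send a session-terminate.

```python
class ForkedCall:
    """First answer wins; every other ringing leg gets terminated."""

    def __init__(self, full_jids):
        self.state = {jid: "ringing" for jid in full_jids}
        self.winner = None

    def on_session_accept(self, jid):
        """Return the full JIDs that should now receive session-terminate."""
        if self.state.get(jid) != "ringing":
            # Unknown leg, or an accept that raced with our terminate.
            return []
        self.winner = jid
        self.state[jid] = "accepted"
        losers = [j for j, s in self.state.items() if s == "ringing"]
        for j in losers:
            self.state[j] = "terminated"
        return losers
```

A late session-accept from a leg that has already been terminated is simply ignored here; a real client would also need to decide what to tell that device.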


