[Jingle] <iq/> versus <message/> (Was: Tomorrow's meeting)

Emil Ivov emcho at jitsi.org
Wed Jul 24 11:17:34 UTC 2013

On 24.07.13, 12:22, Dave Cridland wrote:
> On Wed, Jul 24, 2013 at 10:55 AM, Emil Ivov <emcho at jitsi.org> wrote:
>     I suppose the main issue I have with migrating toward <message/>  is
>     that it is a lot of work and the chances of everyone rushing to do
>     this are not exactly outstanding.
>     Jingle adoption hasn't been quite what we've hoped for and there are
>     only a handful of clients that support it out there. Many (most?) of
>     those are stuck with the early Gingle variant and never bothered to
>     update to Jingle.
> Right, and adoption is something we've got to work on.

Agreed, and just deprecating all existing implementations with a wave of 
our collective hand does not sound much like nurturing adoption.

> I suspect we need
> to do some interop work here.

Any time.

>     With the above in mind, I am not particularly optimistic about the
>     adoption of yet another major shift.
>     I think it would be a lot safer if we tried to resolve the issues we
>     have by building on top of the current spec rather than starting
>     from scratch.
> Absolutely, but I don't think this is a major shift.

I guess we all have our perceptions. I am just afraid that if we were to 
implement something like this in Jitsi it could end up lingering on our 
todo list for quite a long while. I think we are not the only ones in 
that position.

I'd really like it noted that I wouldn't be making *any* of these 
objections if we were currently designing Jingle 1.0 and there weren't 
any implementations out there. In fact I would be rooting for <message/> 
myself. I do see the benefits.

Unfortunately that's not where we are.

>     On 24.07.13, 10:30, Dave Cridland wrote:
>     This does look appealing but I am afraid that the feeling might be a
>     bit superficial. One has two ways of handling forking:
> "forking" seems like an overloaded term here. Is there a term of art we
> could use specifically for multiple devices ringing for the same
> (conceptual) call? If not, can we coin one?

I feel "forking" fits rather well. Do you think that it could be 
interpreted as referring to something different from what we are 

>     A) you either leave it entirely to the originating endpoint
>     B) or you do it in the server, hiding it from the client and only
>     letting it see the leg that produced a successful call.
>     We can already do "A" with IQs: A client can loop through available
>     resources and send session-initiates to everyone who's there. It
>     would then need to properly handle all the resulting call legs ... I
>     don't think many clients would go for this but if someone wants to:
>     they have the option.
> Right, you can do this, but only if you're already sharing presence. So
> either you need to prod the other side into sharing presence first via
> XEP-0276, or else you need to get a subscription done.

True. Although people have been known to work around this. Jingle calls 
to PSTN numbers through Google Voice are one example. I am not saying 
everyone should do this, but I am confident we could find solutions.

Also ... can we consider allowing IQs to bare JIDs? Let those server 
guys do some of the VoIP work for a change ;).

Seriously though, is this conceivable?

>     "B" is how most SIP deployments do this and it implies specifically
>     implementing this on the server. The promise of <message/> is that
>     we might be able to pull it off without server modifications.
>     I believe however that this is deceptive because <message/> would
>     only auto fork the "session-initiate" and it wouldn't mask all the
>     replies. This means that clients would need to have the code that
>     handles all the legs that a call produced, which brings us back to
>     the "A" case. As mentioned above, the "A" case is already
>     implementable with IQs.
>     If, on the other hand, a server is willing to actually do the work
>     so that forking would be hidden from the client then wouldn't it be
>     able to also do the same with IQs?
> Not legally.
> Well, there's two issues there.
> Firstly, it breaks <iq/> routing rules, which'd be painful,

It doesn't need to break routing as in "require different routing 
rules". It simply won't be routed by the servers that support this and 
will be replaced with valid IQs by the server that's performing the fork.

Another option would be to have the server return an error indicating 
that the <iq/> will not be routed as requested but then adding 
additional elements indicating that the <jingle/> call is still in 
progress so there's nothing to worry about.

> but even if
> not, you'd need to spoof the session-initiate to each called device, and
> moderate the replies.

You would have to do this with any kind of "B" style implementation. 
That was my point.

> If you stretch things a bit, you could do a bare-jid <iq/>, but then you
> still have the problem of handling session-accept from an unexpected
> source - it's not clear to me that would work reliably, since it's not
> documented behaviour.

I must be missing something, but how is this different from a 
session-initiate which is just as much of an <iq/> from an unexpected 

You could argue that it is a "session-accept" from an unexpected source 
but that's something a spec update could take care of (and the 
implementation work is not comparable to a shift toward <message/>s)

> However, we can mandate that kind of handling with <message/> - new
> protocol construct, new behaviour - and so we can move from an A-style
> to a B-style with considerable ease.

That would also be true for IQs if forking is a discoverable feature. 
Clients can check if servers have it and rely on them to do it when 
that's the case, while looping through resources when it's not.
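
To make that fallback concrete, here is a rough sketch of how a client 
might pick its targets. It is only an illustration: the feature name 
urn:example:jingle:forking and both function names are hypothetical 
(nothing like them is specified anywhere), and the stanza is skeletal, 
with no content or transport elements.

```python
import uuid
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:jingle:1"                 # XEP-0166 namespace
FORKING_FEATURE = "urn:example:jingle:forking"  # hypothetical disco feature


def initiate_targets(bare_jid, resources, server_features):
    """Return the JIDs that should receive a session-initiate."""
    if FORKING_FEATURE in server_features:
        # The server advertises forking: send one stanza to the bare JID
        # and let the server fan it out.
        return [bare_jid]
    # No server support: loop through the known resources (the "A" case).
    return [f"{bare_jid}/{resource}" for resource in resources]


def build_session_initiate(initiator, target):
    """Build a skeletal session-initiate <iq/> addressed to target."""
    iq = ET.Element("iq", {"type": "set", "to": target, "id": uuid.uuid4().hex})
    ET.SubElement(iq, f"{{{JINGLE_NS}}}jingle", {
        "action": "session-initiate",
        "initiator": initiator,
        "sid": uuid.uuid4().hex,
    })
    return iq
```

A client would call initiate_targets() with whatever feature set it got 
back from service discovery, then build one stanza per returned JID.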

>         I imagine the
>         session-acknowledge could be elided entirely here, but maybe it
>         should
>         always be sent early to ensure the caller knows not to time out
>         too quick.
>     Having acks is actually quite nice and an advantage to SDP offer
>     answer. Getting an ACK means that you know your offer or answer has
>     been received. You know that everything in there is now understood
>     which is quite important especially in the case of an answer.
> I meant that in the specific case of a server stepping in with a
> voicemail system, but yes, I think ACKs are generally useful. The
> question is really whether you want the voicemail system to ack the call
> before your devices have had a chance to ring - I doubt it'd do any
> harm, mind.
>         Of course, we need additional rules to ensure that multiple devices
>         don't accept the call, but this is where PEP or Carbons come in.
>     How do PEP and Carbons help here? More specifically, how are they
>     better than simply ending the call to everyone who's still ringing
>     as soon as we get the first answer?
> My under-considered opinion is that there may be races involved, and
> that called endpoints would benefit from telling each other about
> answers rather than hoping that the caller's device is well-behaved.

Unless you implement some sort of locking (which I don't think would be 
reasonable), you would still have racing. I don't think this is a big 
problem. SIP servers have this working, and they are in a worse position 
given how a server needs to choose between CANCEL and BYE depending on 
whether the remote party answered.

In our case it's simply a "session-terminate" no matter what.
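
Roughly, the caller-side bookkeeping could look like the sketch below. 
The ForkedCall class and its method are made up for illustration, and it 
assumes one leg per callee full JID; it only decides who gets a 
"session-terminate", it does not build or send any stanzas.

```python
class ForkedCall:
    """Tracks the legs of one outgoing call, one leg per callee full JID."""

    def __init__(self, leg_jids):
        self.ringing = set(leg_jids)   # legs that have not answered yet
        self.answered_by = None        # full JID of the winning leg

    def on_session_accept(self, from_jid):
        """Return (jid, action) pairs to send in reaction to an accept."""
        if from_jid == self.answered_by:
            # Duplicate accept from the winner: nothing to do.
            return []
        if self.answered_by is not None:
            # This device lost the race; no CANCEL/BYE distinction,
            # it simply gets a session-terminate.
            self.ringing.discard(from_jid)
            return [(from_jid, "session-terminate")]
        # First answer wins; every other leg that is still ringing is ended.
        self.answered_by = from_jid
        losers = sorted(self.ringing - {from_jid})
        self.ringing.clear()
        return [(jid, "session-terminate") for jid in losers]
```

A late accept from a device that lost the race gets exactly the same 
treatment as one that was still ringing, which is the point above.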

> I
> think there's UX benefits in having your devices aware of your personal
> call state anyway.

Definitely, but I see this fitting better as an optional feature, rather 
than a requirement in order to have forking work.


