[Standards] Jingle drafts

Paul Witty paulrw at codian.com
Fri Apr 11 10:43:11 UTC 2008

Olivier Crête wrote:
> Hello,
> I'm one of the developers of Farsight, a media streaming library.
> Farsight is used as part of Telepathy to implement Jingle audio/video.
> I've recently read the jingle draft and I have a few questions and
> suggestions.
> Jingle ICE-UDP
> Is it really required to send candidates separately instead of sending
> them in one batch? Sending them in one batch like the ICE-19 draft says
> would make having a single implementation for Jingle/SIP simpler.
> Also, ICE-19 needs to order all of the candidate pairs before it does
> anything.
The spec doesn't make it clear if it is acceptable to send multiple 
candidates in one message; I can't see any reason why it shouldn't be 
permitted.  However, ICE will inevitably cause candidates to be 
generated in multiple events (some instantly, some waiting for responses 
from STUN and TURN servers).  Because the instantly generated candidates 
will be local, and therefore the highest priority, if an aggressive 
implementation of ICE is used, when the two clients are on the same 
network, it would be possible for ICE to complete before a STUN binding 
response is ever received.
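
To illustrate why host (local) candidates end up first in the ordering, here is a minimal sketch of the candidate priority formula as I understand it from the ICE specification. The type preference values are the recommended ones; the function name and the default local preference of 65535 are assumptions for the example only.

```python
# Recommended type preferences from the ICE specification (higher wins).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """Hypothetical helper: priority of a single ICE candidate.

    priority = 2^24 * type preference
             + 2^8  * local preference
             + (256 - component ID)
    """
    type_pref = TYPE_PREFERENCE[cand_type]
    return (2 ** 24) * type_pref + (2 ** 8) * local_preference + (256 - component_id)


# Rank the candidate types from most to least preferred.
priorities = {t: candidate_priority(t) for t in TYPE_PREFERENCE}
ranked = sorted(priorities, key=priorities.get, reverse=True)
print(ranked)
```

Host candidates are available immediately and also sort first, which is why connectivity checks between two clients on the same network can succeed before any server reflexive candidate has even been gathered.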
> Jingle audio
> 4. Application format
> Why make the name attribute of the payload-type tag optional at all? Why
> is the profile optional? And if it stays optional, a default should be
> specified (probably RTP/AVP).
The name is optional for static payload types because we know the codec 
simply from the payload number.
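
As a rough sketch of what a receiver might do, the lookup below resolves a codec name from the payload number when no name attribute is given. The table is only a subset of the static assignments in RFC 3551, and resolve_codec_name is a hypothetical helper, not something defined by the Jingle drafts.

```python
# Subset of the static RTP payload type assignments from RFC 3551.
STATIC_PAYLOAD_TYPES = {
    0: "PCMU",
    3: "GSM",
    4: "G723",
    8: "PCMA",
    9: "G722",
    18: "G729",
    31: "H261",
    34: "H263",
}


def resolve_codec_name(payload_id, name=None):
    """Return the codec name for a payload-type element.

    An explicit name always wins; otherwise fall back to the static table.
    Payload numbers outside the table have no implied codec, so a missing
    name is treated as an error.
    """
    if name is not None:
        return name
    if payload_id in STATIC_PAYLOAD_TYPES:
        return STATIC_PAYLOAD_TYPES[payload_id]
    raise ValueError("payload type %d requires a name attribute" % payload_id)


print(resolve_codec_name(0))             # name omitted, static type
print(resolve_codec_name(96, "speex"))   # dynamic type, name required
```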

I agree that we need to always know the profile type.  I'd prefer to 
have it a required attribute.
> 5. Negotiation
> Why make the semantics slightly different from those proposed in RFC
> 3264 (SDP Offer/Answer) ? The "declare what we can receive" differs from
> how SOA is used with some codecs (eg. H.264, see RFC 3984 section
> 8.2.2). That also means that it does not accommodate codecs such as
> H.264 that have config-data that has to be sent from the sender to the
> receiver.
I believe that it should be possible to do H.264 without any information 
being sent from the sender to the receiver, although this means forgoing 
the symmetry in capabilities which RFC 3984 mandates.
> I'm very much in favor of recommending PCMA/U, but mandating it would be
> a problem because of its relatively high bandwidth. And RFC4733 should
> probably be mandated for audio/tone and audio/telephone-event. In the
> case of audio/telephone-event, the optional properties (the fmtp line in
> SDP) do not have the a=b format, so we should probably mandate the
> parameter name "event" for the list of supported event types.
There's no need to mandate the "events" parameter; if absent, we assume 
0-9, *, # and A-D.  It should be possible to restrict this though, 
(probably to 0-9, * and #), in which case putting:
<parameter name='events' value='0-11'/>
within the payload type tag would be the way to do this.  Note 'events', 
not 'event', as in 2.4.1 of RFC 4733.
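
For illustration, a minimal sketch of how a receiver could interpret the value of that parameter. It assumes the RFC 4733 DTMF event codes (0-9 for the digits, 10 for *, 11 for #, 12-15 for A-D); parse_events and events_to_keys are hypothetical helper names.

```python
# RFC 4733 DTMF event codes 0-15, indexed by code.
DTMF_EVENT_CHARS = "0123456789*#ABCD"


def parse_events(value=None):
    """Parse an 'events' value such as '0-11' or '0-9,11' into a set of codes.

    If the parameter is absent (None), assume the default of 0-15.
    """
    if value is None:
        return set(range(16))
    events = set()
    for part in value.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            events.update(range(int(low), int(high) + 1))
        else:
            events.add(int(part))
    return events


def events_to_keys(events):
    """Map DTMF event codes (0-15 only) to their keypad characters."""
    return "".join(DTMF_EVENT_CHARS[e] for e in sorted(events) if e < 16)


print(events_to_keys(parse_events("0-11")))
print(events_to_keys(parse_events()))
```

With '0-11' this yields the digits plus * and #, and with the parameter absent it yields the full set including A-D.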
> 4. Application format
> Why is the height/width specified? With most payload types, it can change
> dynamically without the signalling being notified, for example in the
> case of H.263. How does width/height relate to x/y? Are x/y coordinates
> inside a width/height sized area or is width/height the size of the
> rectangle displayed at x/y ? In either case, both the size of the
> picture and of the full frame should probably be included? And what is
> the use case for these?
Height and width are required for some codecs (H.261) to specify the 
maximum we can receive, while others do crazier things (H.264).  In 
fact, most of the non-required attributes seem to be codec-specific, 
and should probably be outside the scope of XEP-0180.
> 7. Error Handling
> Why is unsupported-codecs here but not in Jingle audio ?
Because everything will have G.711 in common? :-D
> Jingle DTMF
> Why is RFC4733 negotiated separately from other audio codecs? It seems
> to be redundant with the regular negotiation of codecs.
> Maybe there should just be an "on/off" negotiation of the XMPP DTMF
> method separate from the use of RFC 4733. Also, since XMPP DTMF does
> not include any timing information, it could be argued that it is
> actually less real-time than RFC 4733 DTMF.
Because we negotiate one audio channel, one video channel, and one DTMF 

XMPP DTMF has timing information: all the messages are sent in real time 
(within the constraints of TCP), so button press durations can be 
reasonably accurately recovered.
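
A rough sketch of that recovery on the receiving side, assuming each DTMF message is recorded as (arrival time in milliseconds, key, action). The action names 'button-down' and 'button-up' and the function press_durations are assumptions for the example, not taken from the spec text.

```python
def press_durations(events):
    """Pair each button-down with the next button-up for the same key.

    events: iterable of (arrival_ms, key, action) in arrival order.
    Returns a list of (key, duration_ms) in the order the keys were released.
    """
    pressed = {}      # key -> arrival time of its pending button-down
    durations = []
    for arrival_ms, key, action in events:
        if action == "button-down":
            pressed[key] = arrival_ms
        elif action == "button-up" and key in pressed:
            durations.append((key, arrival_ms - pressed.pop(key)))
    return durations


received = [
    (1000, "7", "button-down"),
    (1180, "7", "button-up"),
    (1500, "#", "button-down"),
    (1620, "#", "button-up"),
]
print(press_durations(received))
```

The accuracy is bounded by network and TCP delivery jitter, since the durations come from arrival times rather than from timestamps carried in the messages.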


