[Standards-JIG] RE: Jingle channel naming (was Jingle extensibility)

Jean-Louis Seguineau jean-louis.seguineau at laposte.net
Wed Apr 5 19:02:14 UTC 2006

Joe, don't get me wrong, these are not 'my definitions' but those I have
copied from the RFC (except media group). I believe that a proper agreement
on complex matters always starts by speaking the same language ;)

In addition, far be it from me to suggest 'binding ourselves to RTP'. I frankly
don't care what the actual media transport protocol is. It is just the way
things were defined in this RFC. I took this RFC as a basis because it
is about SDP, which is an established protocol, and one we would have to deal
with if we want to bridge Jingle and SIP. But it is not the only one. I
believe we just need a common set of definitions, something that would
not conflict too much with other existing technologies. This was my only

JEP-166 Media Type is defined as a formal declaration of the purpose of the
session. Common session types might be "voice", "voice+video", and "file
sharing". A session consists of one active negotiated media type at a time.
Depending on the media type and the media description, 1+ channels will be
negotiated and used. This is the 'what' of the session. In Jingle XML syntax
this is the namespace of the <description/> element.

This definition implies we can only have one <description/> element
at a given time in a Jingle construct, and this element is qualified by a
namespace. In the JEPs specifying some of these <description/> elements I
only see individual media types being defined (such as audio in JEP-167 and
video in JEP-180), each in its own namespace.
So how would we go about describing audio+video? If the answer is to allow
several media types (i.e. several <description/> elements) in a Jingle
construct, won't we have to amend the definitions in JEP-166? Am I missing
anything here?

The original point for me was trying to understand what a channel is, and
I'm still in the dark ;) You use the words stream and channel several times
in your post, but I still cannot figure out what they mean, because
I cannot relate them to any definition, and my other knowledge does not help
either. The definition I used for "media stream" from RFC3388 is
definitely what we call "media type" in JEP-166, my bad. But in your
post you said that this is what you call a channel. So what is the difference
between a channel and a media type?

Maybe I should list my understanding of the different cases we have
to take into account to establish a Jingle multimedia session (i.e.
identified by one Jingle sid):

Case 1: one media type/one physical transport
Case 2: one media type/many physical transports (voice and DTMF from two
different sources, for example)
Case 3: many media types/one physical transport (I don't know if this is
valid?)
Case 4: many media types/many physical transports (traditional RTP/SDP usage)
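
To make the four cases concrete, here is a minimal sketch in Python that
classifies a session by counting its media types and physical transports.
The function name classify_session and the use of plain counts are
illustrative assumptions, not anything defined in JEP-166.

```python
def classify_session(media_types: int, transports: int) -> int:
    """Return the case number (1-4) for a Jingle session.

    Both arguments are hypothetical inputs: the number of media types
    (<description/> elements) and the number of physical transports
    used by one session (one Jingle sid).
    """
    if media_types < 1 or transports < 1:
        raise ValueError("need at least one media type and one transport")
    many_media = media_types > 1
    many_transports = transports > 1
    # Numbering follows the case list: media types select the pair
    # (1-2 or 3-4), transports select the member within that pair.
    return 1 + (2 if many_media else 0) + (1 if many_transports else 0)
```

For instance, voice plus DTMF over two transports is classify_session(1, 2),
which falls under case 2.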

Do we agree on these use cases, or am I completely off the subject?
If these are the different use cases, shouldn't we make sure we do not overlook
our definitions in Jingle JEP-166 and leave all the extension possibilities

If these cases are what we want to represent, what is a channel: a
combination of one <description/> (media type) and a <transport/>, a
combination of one <description/> (media type) and several <transport/>, or
something else?
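
The first two candidate readings can be written down as data structures,
which may make the question easier to answer. This is only a sketch in
Python: the class names ChannelA and ChannelB are hypothetical and appear in
no JEP, and the namespaces are the ones used in the examples quoted further
down.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Description:
    xmlns: str  # the media type, i.e. the namespace of <description/>


@dataclass
class Transport:
    xmlns: str  # the physical transport, i.e. the namespace of <transport/>


@dataclass
class ChannelA:
    """Reading 1: one <description/> combined with one <transport/>."""
    description: Description
    transport: Transport


@dataclass
class ChannelB:
    """Reading 2: one <description/> combined with several <transport/>."""
    description: Description
    transports: List[Transport] = field(default_factory=list)


audio = Description("http://jabber.org/protocol/jingle/media/audio")
udp = Transport("http://jabber.org/protocol/jingle/transport/raw-udp")

# Case 1 fits either reading. Case 2 (one media type, many transports)
# needs reading 2 if it is to remain a single channel.
case1 = ChannelA(audio, udp)
case2 = ChannelB(audio, [udp, udp])
```

Under reading 1, case 2 would have to be expressed as two channels sharing
the same description, which is part of what I am trying to clarify.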

Don't get me wrong either on the nature of these XML elements. I am not
implying that they are static and immutable definitions. I completely
agree that media and physical transport parameters may change during the
lifetime of the Jingle session.

Once again, I'm not saying anything you said is wrong, I'm just trying to
figure out what you mean when you write 'stream', 'channel', etc... so you
and I, like everybody else, are on the same page. And we'll rely on StPeter
to diligently add the definitions to the appropriate JEP...



-----Original Message-----
Message: 3
Date: Wed, 5 Apr 2006 09:53:29 -0700
From: "Joe Beda" <jbeda at google.com>
Subject: Re: [Standards-JIG] RE: Jingle channel naming (was Jingle
	schema	extensibility)
To: "Jabber protocol discussion list" <standards-jig at jabber.org>
	<dbfdfee20604050953k2eb6e452rfb8289578d7a9d11 at mail.google.com>
Content-Type: text/plain; charset="iso-8859-1"

I don't necessarily agree with your definitions.  I think we need to answer
some questions before we go too deep.  Some questions/discussion inline.

I'll follow this mail up with a more concrete example for my option #3,
which, after talking with Scott, we think is a promising way to go.


On 4/5/06, Jean-Louis Seguineau <jean-louis.seguineau at laposte.net> wrote:
> Let's try to dig into this further. But first, I believe we should agree on
> wording definition. I believe RFC3388 http://www.ietf.org/rfc/rfc3388.txt
> addresses some of these naming definitions.
> Media stream is defined as a single media instance, e.g., an audio stream or
> a video stream as well as a single whiteboard or shared application group.
> This is mapped to the Jingle <description/> element. A media stream is
> identified by a mid (media ID)

This implies that each description element is bound to one and only one
"media stream" (what we've been calling a channel).  I don't think that this
is necessarily what we want.  It should be possible for one description to
define multiple streams/channels.  I also think that we don't want to bind
ourselves too tightly to RTP.  I would argue that a stream consists of a way
to unreliably get small packets of information between endpoints (similar to
UDP, but could be carried over TCP).  I do agree that we need a way to name
and refer to these streams.

> Media flow is defined as the association of a single media instance, e.g.,
> an audio stream or a video stream as well as a single whiteboard or shared
> application group.  When using RTP, a media flow comprises one or more RTP
> sessions. This can be mapped to a tuple comprised of a Jingle <description/>
> element and one or more <transport/> elements. A media flow is identified by
> a fid (flow ID)

There are two issues we need to clear up in this paragraph.
1) Up until now, we've specified that there is one and only one transport
negotiated per session.  This transport is responsible for implementing all
of the streams/channels specified in various <description/> elements.
2) I'm not convinced that we need to define anything like a flow in the
Jingle spec proper.  If we want to support more than one physical
stream/channel being merged to support a single virtual stream/channel we
should probably do so in the transport.  The idea of a flow (binding
multiple physical means of transmitting packets into one combined stream)
should be completely defined by the transport.  If there is a compelling
scenario for this now we should look at adding it to an existing
transport or perhaps creating a new transport.  I'm inclined to wait and get
to it later.

> This is not enough to address the cases you mention in the context of
> Jingle. So I believe we need to introduce another definition to group Media
> Flows together.
> Media group is defined as the association of two or more media flows. A
> media group is identified by a gid (group ID)
> The way Jingle is defined in JEP-166 implies that Media Streams i.e.
> <description/> are defined by a single xmlns. This is in line with XMPP.
> That makes it difficult "for a single description to define more than one
> stream" (quote). But using the definition above we can have the following
> construct to achieve the expected description.
> <description gid='0' mid='0' fid='0'
>    xmlns='http://jabber.org/protocol/jingle/media/audio'>
> ... description for audio media
> </description>
> <description gid='0' mid='1' fid='0'
>    xmlns='http://jabber.org/protocol/jingle/media/video'>
> ... description for video media
> </description>
> <transport gid='0' fid='0'
>    xmlns='http://jabber.org/protocol/jingle/transport/raw-udp'>
> </transport>
> We would use the same gid to bind the two atomic descriptions together. This
> is in line with the current JEPs, where descriptions are bound to a single
> media. In the above example, we have only one transport for the two media
> streams. If we try to apply the same approach to your example, we may end up
> with something like:
> <description gid='0' fid='0'
>    xmlns='http://jabber.org/protocol/jingle/media/audio'>
> ... description for audio media
> </description>
> <description gid='0' fid='1'
>    xmlns='http://jabber.org/protocol/jingle/media/video'>
> ... description for video media
> </description>
> <description fid='2'
>    xmlns='http://jabber.org/protocol/jingle/media/wb'>
> ... description for whiteboard media
> </description>
> <transport gid='0' fid='0'
>    xmlns='http://jabber.org/protocol/jingle/transport/raw-udp'>
>    ... transport for audio stream
> </transport>
> <transport gid='0' fid='1'
>    xmlns='http://jabber.org/protocol/jingle/transport/raw-udp'>
>    ... transport for video stream
> </transport>
> <transport fid='2'
>    xmlns='http://jabber.org/protocol/jingle/transport/raw-udp'>
>    ... transport for whiteboard stream
> </transport>
> Does this approach address your concerns? Is it fair to say in this case the
> channel is also a Media group?

I don't think I understand why we would need a media group here.  From what
you've said here, I would say that a media stream and a channel are
essentially the same thing.  I've hesitated to use the term "media stream"
before as we are not bound to RTP and can carry more than just media.
