[JDEV] Videoconferencing with jabber / Re:[speex-dev]Videoconferencing with speex and jabber
thoutbeckers at splendo.com
Sun Nov 30 18:28:15 CST 2003
On Sun, 30 Nov 2003 22:36:38 -0000, Richard Dobson <richard at dobson-i.net> wrote:
>> I had a larger reply to this, but somewhere it got lost.
>> Using a client/server model has no heavier requirements than a p2p based
>> model. Nor is it any more complex, but it will make it much easier to
>> participate in client/server based conferencing.
> I still dispute your opinion on this, a client having to act as a server
> creates more problems than it solves IMO (as already discussed), and IMO
> it's not really any easier to participate in.
Having one user assume the role of server, and the other that of client,
is really no harder than a model in which you assume both are equal peers.
It's simply a matter of different roles. If you can think of any reason why
this is not true, please share it with the rest of us!
However, using a client/server model will allow you to participate in a
conference on a server with more people *with no extra effort at all*. Yet
you still state you don't believe it will be easier?
>> I also think building p2p based conferencing into the protocol from day
>> one is unnecessarily complex, that should belong in an extension in any
> Have you changed your mind on the complexity? You say at the top of this
> email that it's no more complex to implement p2p than it is client server,
> a bit of a contradiction.
No contradiction at all. I've never disputed that for 2 persons talking
over a direct link p2p is just as easy as a client server model. In fact
I've done quite the opposite, I've stated over and over it's not quite the
same thing, but neither one is more complex than the other. Read back my
previous posts and you'll see.
What I *am* saying is that an entirely p2p based conferencing model (with
more than 2 persons involved) is a lot more complex than a client/server
model. Even more so if you only have to implement the client portion.
That's why this allows "thin" clients to still participate. It was you
yourself who argued against mixing and bandwidth requirements on thin
clients such as a Pocket PC.
I think from the discussion it's pretty obvious what's needed/wanted most
are 2 things:
- person to person over a direct link
- conferencing with multiple persons on a server
Both can be handled, without overlap, with a simple JEP based on a
c/s model. P2P won't cover this, nor will it be any simpler.
Conferencing over individual direct links between persons is interesting
too, but too complex to be included in the basic JEP if you ask me.
Conferencing over direct links doesn't have to be p2p either. You can base
it on the c/s JEP with every individual participant acting as a server.
That's not much more complex than doing this on a p2p based model.
>> It's odd though, that you completely put aside the arguments you tried to
>> make about 1 on 1 chat, when you talk about conferencing. Namely,
>> bandwidth (in the case of 20 people talking this would be almost 10 times
>> as much as c/s based) and lack of mixing capabilities (even if your
>> Pocket PC does have that much bandwidth it'll have to mix 20 channels!).
> It's just as bad for the client acting as the server (in bandwidth terms)
> as it is to go p2p,
With conferencing the requirement of a (fast enough) server is way more
reasonable than for a person to person conversation (I completely agree
with you there, a direct link should be used when possible!). However, by
going with a c2s model you'll still provide a fallback method for when a
direct link fails, by using a component that hosts a conference.
The total amount of bandwidth used in a c/s conference is always smaller
than in a conference based on direct links between all participants, for
obvious reasons of course, which I don't need to explain here.
Another difference is that with c/s you'll need very little bandwidth on
all machines, except for the server.
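
To put rough numbers on that, here is a minimal sketch that counts
one-way audio streams in both layouts. It assumes every participant sends
continuously (no silence detection), that every stream costs the same,
and that in the c/s case one of the participants hosts the conference.
The function names are made up for illustration.

```python
def mesh_streams(n):
    """Direct links: every participant sends one stream to each of the others."""
    per_node = n - 1
    return {"node_up": per_node, "node_down": per_node, "total": n * (n - 1)}


def cs_streams(n):
    """Client/server: one participant hosts, the other n-1 are clients.

    Each client sends one stream up and receives one mixed stream back.
    """
    clients = n - 1
    return {
        "client_up": 1,
        "client_down": 1,
        "server_up": clients,
        "server_down": clients,
        "total": 2 * clients,
    }


for n in (4, 20):
    mesh, cs = mesh_streams(n), cs_streams(n)
    print(n, mesh["total"], cs["total"], mesh["total"] / cs["total"])
```

Under these assumptions a 20-person conference needs 380 streams over
direct links against 38 with c/s, and the hosting machine carries the same
19 streams in each direction that every node carries in the direct-link
case.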
The server will require the same amount of bandwidth as the peak
requirement of bandwidth for a single p2p node. This is assuming you use
silence detection, else we're not talking about the peak requirement but
just the normal requirement. Of course, silence detection can also be used
in c/s, but for the server it will be only half as effective as in p2p (the
server will only benefit on incoming connections).
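
The "only half as effective" point can be sketched like this. The model is
an assumption for illustration: a silent participant sends nothing, a
direct-link node that is talking sends to all others, and the server
(participant 0 here) sends one shared mix to every client as long as
anybody is talking.

```python
def mesh_node_load(n, speakers, node):
    """(outbound, inbound) stream counts for one node in a direct-link conference."""
    outbound = (n - 1) if node in speakers else 0
    inbound = len(speakers - {node})
    return outbound, inbound


def server_load(n, speakers, server=0):
    """(outbound, inbound) stream counts for the participant hosting a c/s conference."""
    # Inbound shrinks with silence detection: only talking clients send.
    inbound = len(speakers - {server})
    # Outbound does not: the mix goes to every client while anyone talks.
    outbound = (n - 1) if speakers else 0
    return outbound, inbound


n = 6
speakers = {1, 2}
print("silent mesh node :", mesh_node_load(n, speakers, node=3))
print("talking mesh node:", mesh_node_load(n, speakers, node=1))
print("server           :", server_load(n, speakers))
```

With 2 of 6 people talking, a silent direct-link node sends nothing and
receives 2 streams, while the server still sends 5 streams out and only
saves on the inbound side.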
Basically this means that in a direct-link based conference "weak" clients
with limited bandwidth and limited CPU will not be able to participate (or
will have a bad user experience). As opposed to this, if you want to do c/s
conferencing, you'll need *1* server with the same *peak* requirements as a
single node in a direct-link based conference (where it's the same for all
nodes), and generally around 50% more bandwidth usage on *average*.
So let's apply this to some real world situations. In how many cases do
all the clients have about the same available bandwidth, CPU, etc.? With
Joe Consumer this is unlikely.. it's a mix of dialup and broadband users.
If I wanted to talk to my mother, sister and brother at the same time, I
have a 1 Mbit link, 1 will have a cheap DSL account, and the other 2 will
be on dialup most likely.
In a corporate environment having a dedicated component for conferences is
much more likely than in consumerland (for benefits mentioned already here),
and even if not, bandwidth availability would generally be high enough for
most users to host.
So what's a situation where all users have about the same specs concerning
bandwidth and CPU, and there is no 1 machine that sticks out? Well, XBox
Live! of course. All machines are identical, and broadband is required to
participate I think.
Again I don't think direct-link style conferencing is uninteresting or
unneeded, but it's a much more specific application than c/s conferencing.
And *again*, a c/s style approach will not prevent this from being an
> also I would dispute that it would be 10 times as much
> bandwidth for the rest, adding silence detection (which you seem to have
> oddly put aside and ignored) reduces the p2p bandwidth use massively,
Hopefully I addressed this now to your liking.
> as I have shown previously the mixing requirements are less on p2p
> than on the "server client".
And how's that? When 4 people talk at once, *all* clients will have to mix
4 streams in the case of direct links. In the case of c/s only the server
will have to mix 4 streams. Explain..
(The only thing I could think of is if you want to create a separate mix
for each client, without their own channel in it to prevent echo. Rather
than mixing new streams for each client you should just suppress echo for
each client. Admittedly, it increases demands on the server if you want
this, but not as bad as having to mix a new stream for each client.)
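
For what it's worth, here is a rough sketch of what server-side mixing
could look like, including the "separate mix without your own channel"
variant. This is not the echo suppression approach suggested above; it is
the per-client mix option, done by summing all channels once and
subtracting each client's own samples. It assumes decoded, time-aligned
16-bit PCM frames as plain lists, and it leaves out the cost of encoding
each distinct mix again, which is likely where most of the extra server
load would be.

```python
def clip16(x):
    """Clamp a sample to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def mix(streams):
    """One shared mix: add aligned frames sample by sample, then clip."""
    return [clip16(sum(samples)) for samples in zip(*streams)]


def mix_minus(streams):
    """One mix per participant that leaves out that participant's own channel.

    The unclipped total is computed once; each personal mix is the total
    minus the participant's own samples.
    """
    total = [sum(samples) for samples in zip(*streams)]
    return [[clip16(t - s) for t, s in zip(total, stream)] for stream in streams]


frames = [[100, -200], [300, 400], [-50, 50]]  # three talkers, two samples each
print(mix(frames))
print(mix_minus(frames))
```

In a direct-link conference every node would have to run something like
`mix` itself on whatever streams it receives; with c/s only the server does.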
> Also having a server client creates a single
> point of failure which you also seem to have completely put aside,
Yes, when the server quits the conference the others will get booted. If
this is a big issue for you, you could devise a fallback system to another
server (one of the clients for example) and still have a massively less
complex system than direct-link based conferencing. Since servers are most
likely to be the best machines with the best connections this isn't such a
big problem, but it's still easily solved if you want.
When there are a few clients with bad connections in the conversation
reliability will probably improve a bit too. Bad connection <-> Good
connection <-> bad connection is generally more reliable than bad
connection <-> bad connection. Especially when you consider bandwidth usage
> there is
> also the latency issue that you have yet to address and until you do
> this satisfactorily
Latency is an interesting case, but in practice the results would probably
surprise you. Because on low-bandwidth nodes the bandwidth requirements
drop dramatically when they act as a client rather than a node in the
direct link conference, latency will actually improve in a lot of cases!
So you can have the situation where a node in a direct-link
conference with 3 persons talking is barely able to keep up, with horrible
latency. While a client with the exact same quality connection is enjoying
a conference where 6 people are talking with lower latency! (it wouldn't
even be able to participate when 6 people are talking in a direct link
Now let's talk about out-of-sync mixing. With direct-link based conferences
every client will produce a different "mix" based on the latency /
bandwidth of their connections, and that of the other nodes. This means
when we're in a meeting, for me it can sound like 3 people were talking at
once, while for you it can sound like they didn't at all. (that means I
didn't hear what they said and I'll ask them to repeat, while you'll be
annoyed with me (even more ;) cause for you it sounded like I could have
Of course there is a solution for this, syncing the mixes between nodes.
But then you lose all latency advantages, you'll be as slow as the
"weakest link". (And the weakest link will be a lot more stressed than it
would be in a c/s model.) Of course compromises are possible..
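
A small sketch of both effects, with made-up latencies in milliseconds.
It assumes each utterance simply starts playing at a listener after the
one-way latency of that particular link, and that "syncing" means every
node delays playback until the slowest link could have delivered.

```python
def arrival_times(start_ms, latency_ms, listener):
    """When each speaker's utterance starts playing at `listener` (unsynced)."""
    return {spk: t + latency_ms[spk][listener] for spk, t in start_ms.items()}


def synced_delay(latency_ms):
    """Common playout delay if every node waits for the slowest link."""
    return max(d for row in latency_ms.values() for d in row.values())


# Hypothetical one-way latencies: latency_ms[speaker][listener]
latency_ms = {
    "a": {"x": 50, "y": 400},
    "b": {"x": 300, "y": 60},
}
start_ms = {"a": 0, "b": 200}  # b starts talking 200 ms after a

print(arrival_times(start_ms, latency_ms, "x"))
print(arrival_times(start_ms, latency_ms, "y"))
print(synced_delay(latency_ms))
```

Listener x hears a well before b, listener y hears b before a, so the two
end up with different mixes of the same conversation. Syncing removes that
difference but makes everyone wait for the 400 ms link.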
You'll always have problems with out of sync mixes if you don't do
something about it, but there are cases where it's less likely to occur or
just not so important. For example when it's only about a game anyway ;)
and all clients have about the same bandwidth and CPU available.. :)
> I will not be convinced by your strange need to make this
> client server mode only, for which you still haven't provided sufficient
I've presented many reasons for you. Maybe you don't agree with them (then
I wonder what you think of Jabber and its client/server architecture),
but I'd appreciate it if you do not refer to them as "strange".
Especially considering I didn't quite just make 'em up either, they are
well known issues with audio conferencing (hardly "strange" issues), and if
you'd have looked into it a little yourself you'd know that. (For example,
I'm not on the speex list, but way at the beginning of the discussion
someone already mentioned these things have been discussed to death there,
I guess he didn't take the bait and I did ;)