[JDEV] Videoconferencing with jabber / Re:[speex-dev]Videoconferencing with speex and jabber

Tijl Houtbeckers thoutbeckers at splendo.com
Sun Nov 30 18:28:15 CST 2003

On Sun, 30 Nov 2003 22:36:38 -0000, Richard Dobson <richard at dobson-i.net> wrote:

>> I had a larger reply to this, but somewhere it got lost.
>> Using a client/server model has no heavier requirements than a p2p based
>> model. Nor is it any more complex, but it will make it much easier to
>> participate in client/server based conferencing.
> I still dispute your opinion on this, a client having to act as a server
> can create more problems than it solves IMO (as already discussed), and
> IMO it's not really any easier to participate in.

Having one user assume the role of server, and the other that of client, 
is really no harder than a model in which you assume both are equal peers. 
It's simply a matter of different roles. If you can think of any reason 
why this is not true, please share it with the rest of us!

However, using a client/server model will allow you to participate in a 
conference on a server with more people *with no extra effort at all*. Yet 
you still state you don't believe it will be easier?

>> I also think building p2p based conferencing into the protocol from day
>> one is unnecessarily complex; in any case that should belong in an
>> extension.
> Have you changed your mind on the complexity? You say at the top of this
> email that it's no more complex to implement p2p than it is client server,
> seems a bit of a contradiction.

No contradiction at all. I've never disputed that for 2 persons talking 
over a direct link p2p is just as easy as a client/server model. In fact 
I've done quite the opposite: I've stated over and over it's not quite the 
same thing, but neither one is more complex than the other. Read back my 
previous posts and you'll see.

What I *am* saying is that an entirely p2p based conferencing model (with 
more than 2 persons involved) is a lot more complex than a client/server 
model. Even more so if you only have to implement the client portion. 
That's why this allows "thin" clients to still participate. It was you 
yourself who argued against mixing and bandwidth requirements on thin 
clients such as a Pocket PC.

I think from the discussion it's pretty obvious what's needed/wanted most 
are 2 things:
- person to person over a direct link
- conferencing with multiple persons on a server

Both can be handled, without overlap, with a simple JEP based on a 
c/s model. P2P won't cover this, nor will it be any simpler.

Conferencing over individual direct links between persons is interesting 
too, but too complex to be included in the basic JEP if you ask me.
Conferencing over direct links doesn't have to be p2p either. You can base 
it on the c/s JEP with every individual participant acting as a server. 
That is not much more complex than doing it on a p2p based model.

>> It's odd though, that you completely put aside the arguments you tried to
>> make about 1 on 1 chat, when you talk about conferencing. Namely, limited
>> bandwidth (in the case of 20 people talking this would be almost 10 times
>> as much as c/s based) and lack of mixing capabilities (even if your
>> pocketpc does have that much bandwidth it'll have to mix 20 channels!).
> It's just as bad for the client acting as the server (in bandwidth terms)
> as it is to go p2p,

With conferencing, the requirement of a (fast enough) server is way more 
reasonable than for a person to person conversation (I completely agree 
with you there that a direct link should be used when possible!). However, 
by going with a c2s model you'll still provide a fallback method for when a 
direct link fails, by using a component that hosts a conference.

The total amount of bandwidth used in a c/s conference is always smaller 
than in a conference based on direct links between all participants, for 
obvious reasons that I don't need to explain here.
Another difference is that with c/s you'll need very little bandwidth on 
all machines except for the server.
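
To make the counting behind this concrete, here is a rough sketch in 
Python. It assumes every participant transmits continuously at the same 
bitrate (no silence detection) and that the server sends each client a 
single mixed stream; the function names are made up for illustration. 
Under those assumptions the direct-link count is strictly larger from 3 
participants on when one of them hosts, and from 4 on with a dedicated 
server.

```python
def mesh_streams(n: int) -> int:
    """Directed audio streams when every participant sends to every other one."""
    return n * (n - 1)


def cs_streams(n: int, dedicated_server: bool = False) -> int:
    """Directed audio streams when everyone talks to one mixing server.

    Each client sends one stream up and receives one mixed stream down.
    If one of the participants hosts, only n - 1 of them are clients.
    """
    clients = n if dedicated_server else n - 1
    return 2 * clients


if __name__ == "__main__":
    # Columns: participants, direct links, c/s (participant hosts), c/s (dedicated)
    for n in (2, 3, 4, 20):
        print(n, mesh_streams(n), cs_streams(n), cs_streams(n, dedicated_server=True))
```

For 20 participants this gives 380 streams against 38 (or 40 with a 
dedicated server), in line with the "almost 10 times" figure quoted 
earlier.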

The server will require the same amount of bandwidth as the peak 
requirement of a single p2p node. This is assuming you use silence 
detection, else we're not talking about the peak requirement but just the 
normal requirement. Of course, silence detection can also be used in c/s, 
but for the server it will be only half as effective as in p2p (the server 
will only benefit on incoming connections).
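
The asymmetry can be sketched by counting streams per machine. This is a 
simplified model, not a measurement: all streams are assumed to have the 
same bitrate, the host is one of the participants, a silent participant 
sends nothing, and the host sends the mix to every client whenever anyone 
is talking. The function names are illustrative only.

```python
def mesh_node_load(n: int, talking: bool, other_speakers: int) -> tuple:
    """(streams sent, streams received) for one node in a direct-link conference."""
    sent = (n - 1) if talking else 0   # a silent node sends nothing
    received = other_speakers          # only active speakers send to us
    return sent, received


def host_load(n: int, active_clients: int, host_talking: bool) -> tuple:
    """(streams sent, streams received) for a participant hosting a c/s conference."""
    anyone_talking = host_talking or active_clients > 0
    sent = (n - 1) if anyone_talking else 0  # the mix goes out to every client
    received = active_clients                # silence detection only helps here
    return sent, received


if __name__ == "__main__":
    n = 6
    print("peak, mesh node:", mesh_node_load(n, True, n - 1))
    print("peak, host:     ", host_load(n, n - 1, True))
    print("one remote speaker, silent mesh node:", mesh_node_load(n, False, 1))
    print("one remote speaker, host:            ", host_load(n, 1, False))
```

At peak both machines handle the same number of streams, but with a 
single remote speaker the silent mesh node only receives one stream while 
the host still sends the mix to all five clients.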

Basically this means that a "weak" client with limited bandwidth and 
limited CPU will not be able to participate in a direct-link based 
conference (or will have a bad user experience). By contrast, if you want 
to do c/s conferencing, you'll need *1* server with the same *peak* 
requirements as a single node in a direct-link based conference (where 
it's the same for all nodes), and generally around 50% more bandwidth 
usage on *average*.

So let's apply this to some real world situations. In how many cases do 
all the clients have about the same available bandwidth, CPU, etc.? With 
Joe Consumer this is unlikely: it's a mix of dialup and broadband users. 
If I'd want to talk to my mother, sister and brother at the same time, I 
have a 1 mbit link, 1 will have a cheap DSL account, and the other 2 will 
most likely be on dialup.

In a corporate environment having a dedicated component for conferences is 
much more likely than in consumerland (for benefits mentioned already 
here), and even if not, bandwidth availability would generally be high 
enough for most users to host.

So what's a situation where all users have about the same specs concerning 
bandwidth and CPU, and there is no 1 machine that stands out? Well, XBox 
Live! of course. All machines are identical, and broadband is required to 
participate I think.

Again I don't think direct-link style conferencing is uninteresting or 
unneeded, but it's a much more specific application than c/s conferencing. 
And *again*, a c/s style approach will not prevent this from being an 

> also I would dispute that it would be 10 times as much
> bandwidth for the rest, adding silence detection (which you seem to have
> oddly put aside and ignored) reduces the p2p bandwidth use massively,

Hopefully I have addressed this now to your liking.

> also as I have shown previously the mixing requirements are less on p2p
> clients than on the "server client".

And how's that? When 4 people talk at once, *all* clients will have to mix 
4 streams in the case of direct links. In the case of c/s only the server 
will have to mix 4 streams. Explain..
(The only thing I could think of is if you want to create a separate mix 
for each client, without their own channel in it, to prevent echo. Rather 
than mixing new streams for each client you should just suppress echo for 
each client. Admittedly it increases demands on the server if you want 
this, but not as badly as having to mix a new stream for each client.)
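
One common way to give each client a mix without its own channel, without 
mixing from scratch per client, is to sum all channels once and subtract 
each client's own samples (often called "mix-minus"). This is a swapped-in 
illustration, not the echo suppression suggested above, and it assumes 
already decoded, time-aligned 16-bit PCM frames of equal length; with a 
codec like speex the server would still have to encode each result 
separately. All names are hypothetical.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def clamp(sample: int) -> int:
    """Keep a mixed sample inside the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, sample))


def mix_minus(frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Return, per client, the sum of everyone else's samples.

    The full sum is computed once; each client's own samples are then
    subtracted from it, so the work grows linearly with the number of
    clients instead of quadratically.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    total = [sum(frame[i] for frame in frames.values()) for i in range(length)]
    return {
        name: [clamp(t - s) for t, s in zip(total, frame)]
        for name, frame in frames.items()
    }


if __name__ == "__main__":
    frames = {"a": [100, -200], "b": [10, 20], "c": [1, 2]}
    print(mix_minus(frames))
```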

> Also having a server client creates a single
> point of failure which you also seem to have completely put aside,

Yes, when the server quits the conference the others will get booted. If 
this is a big issue for you, you could devise a fallback system to another 
server (one of the clients for example) and still have a massively less 
complex system than direct-link based conferencing. Since servers are most 
likely to be the best machines with the best connections this isn't such a 
big problem, but it's still easily solved if you want.

When there are a few clients with bad connections in the conversation, 
reliability will probably improve a bit too. Bad connection <-> good 
connection <-> bad connection is generally more reliable than bad 
connection <-> bad connection. Especially when you consider that bandwidth 
usage drops too.

> there is also the latency issue that you have yet to address and until
> you do address this satisfactorily

Latency is an interesting case, but in practice the results would probably 
surprise you. Because on low-bandwidth nodes the bandwidth requirements 
drop dramatically when they act as a client rather than a node in the 
direct link conference, latency will actually improve in a lot of cases! 
So you can have the situation where a node in a direct-link conference 
with 3 persons talking is barely able to keep up, with horrible latency, 
while a client with the exact same quality connection is enjoying a 
conference where 6 people are talking with lower latency! (it wouldn't 
even be able to participate when 6 people are talking in a direct link 

Now let's talk about out-of-sync mixing. With direct-link based conferences 
every client will produce a different "mix" based on the latency / 
bandwidth of their connections, and that of the other nodes. This means 
when we're in a meeting, for me it can sound like 3 people were talking at 
once, while for you it can sound like they didn't at all. (that means I 
didn't hear what they said and I'll ask them to repeat, while you'll be 
annoyed with me (even more ;) cause for you it sounded like I could have 
heard perfectly).

Of course there is a solution for this: syncing the mixes between nodes. 
But then you lose all latency advantages, and you'll be as slow as the 
"weakest link" (and the weakest link will be a lot more stressed than it 
would be in a c/s model). Of course compromises are possible..

You'll always have problems with out of sync mixes if you don't do 
something about it, but there are cases where it's less likely to occur or 
just not so important. For example when it's only about a game anyway ;) 
and all clients have about the same bandwidth and CPU available.. :)

> I will not be convinced by your strange need to make this
> client server mode only, for which you still haven't provided sufficient
> reason.

I've presented many reasons to you. Maybe you don't agree with them (then 
I wonder what you think of Jabber and its client/server architecture), 
but I'd appreciate it if you did not refer to them as "strange".

Especially considering I didn't just make them up either: they are well 
known issues with audio conferencing (hardly "strange" issues), and if 
you'd have looked into it a little yourself you'd know that. (For example, 
I'm not on the speex list, but way at the beginning of the discussion 
someone already mentioned these things have been discussed to death there, 
I guess he didn't take the bait and I did ;)
