[jdev] voicechat again

Peter Saint-Andre stpeter at jabber.org
Tue Mar 2 15:39:18 CST 2004

OK, I have to apologize for re-starting this thread but I'm still
catching up on mail from November and December! So...


There is a long, long thread starting there. Lots of talk about Speex,
H.323, p2p vs. client-server, and so on. As far as I can see, no
consensus ever emerged. It seems that people want some kind of voice
integration (maybe video too, but I think that's farther out). They want
to do 1-to-1 voice chat and maybe even multi-user voice-conferencing.
They want to be able to negotiate that over Jabber and then go out of
band to do the voice stuff. They want this to work from behind NATs and
firewalls. They don't want to open crazy ports in the firewall (or turn
off the firewall entirely!) in order to get this done. The only message
I posted in that thread pointed out that research indicates people don't
actually upgrade from IM to voice or video all that often (by "upgrade"
I mean something as simple as picking up the phone or meeting f2f, not
necessarily switching from IM to VoIP or whatever). So I still have my
doubts about how necessary or important this really is, but I do hear
the question more and more: "When is Jabber going to support voice?" 

It seems to me that first of all we need to get clear on the use cases 
and requirements. Do we want the ability to negotiate telephone-quality 
voice chat between two IM users? That seems to be the base case (after
all we treat chat and groupchat differently in Jabber, why not treat
voicechat and voice-conference differently?). [Of course maybe it is
stupid to treat chat and groupchat differently, but we burned that
bridge a long, long time ago! :-)] So how do we negotiate one-to-one
voicechat via Jabber? Is it just a stream initiation profile (see
JEP-0095)? Can we treat this in a similar fashion to file transfer
and send data through a SOCKS5 Bytestreams (JEP-0065) proxy as a 
fallback if p2p won't work? Can SOCKS5 Bytestreams handle something like
Speex? I notice in draft-herlein-speex-rtp-profile-02.txt that the
author mentions sending Speex data over TCP:

   This transport type signifies that the content is to be 
   interpreted according to this document if the contents are 
   transmitted over RTP.  Should this transport type appear 
   over a lossless streaming protocol such as TCP, the content 
   encapsulation should be interpreted as an Ogg Stream in 
   accordance with RFC 3534, with the exception that the content 
   of the Ogg Stream may be assumed to be Speex audio and Speex 
   audio only.  
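
For what it's worth, here is a rough Python sketch of how a receiver might sanity-check that bytes arriving over TCP begin like an Ogg stream carrying Speex. It assumes the Ogg page layout described in RFC 3533 (a 27-byte fixed header followed by a segment table) and that the first packet of a Speex stream starts with the 8-byte string "Speex   ". The function name is made up for illustration, and the page checksum is not verified, so treat it as a sketch rather than a real parser.

```python
OGG_CAPTURE = b"OggS"      # capture pattern that starts every Ogg page
SPEEX_MAGIC = b"Speex   "  # assumed first 8 bytes of a Speex header packet
OGG_FIXED_HEADER = 27      # bytes before the segment table


def looks_like_ogg_speex(data: bytes) -> bool:
    """Return True if data starts with an Ogg page whose payload begins
    with a Speex header. Hypothetical helper; the CRC is not checked."""
    if len(data) < OGG_FIXED_HEADER or data[:4] != OGG_CAPTURE:
        return False
    if data[4] != 0:  # stream structure version, 0 in RFC 3533
        return False
    n_segments = data[26]  # length of the segment (lacing) table
    payload_start = OGG_FIXED_HEADER + n_segments
    return data[payload_start:payload_start + 8] == SPEEX_MAGIC
```

A client could run a check like this on the first bytes read from the bytestream and drop the connection if it fails.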

So could we potentially do Speex over TCP using a JEP-0065 proxy (or p2p
as defined in that JEP) for voicechat? I realize that it would not work
for voice-conference and might not be perfect, but is it possible? Just
curious. Again, I'm sorry if we've hashed all this out already -- that
was a long thread to catch up on and I am not deeply knowledgeable about
this voice/video stuff.
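
To make the "p2p first, proxy as fallback" idea a bit more concrete, here is a minimal Python sketch of the connection-ordering part only. It is not the JEP-0065 protocol: the exchange of candidate hosts over Jabber and the SOCKS5 handshake are left out, and the function name and the candidate list are my own assumptions. It just tries a list of (host, port) pairs in order, direct peer address first and proxy last, and returns the first TCP connection that succeeds.

```python
import socket


def connect_with_fallback(candidates, timeout=3.0):
    """Try each (host, port) in order and return (sock, (host, port))
    for the first one that accepts a TCP connection.

    Hypothetical helper: a real client would get the candidates from
    the negotiation and would still have to do the SOCKS5 handshake.
    """
    last_error = None
    for host, port in candidates:
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            return sock, (host, port)
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
    raise ConnectionError(f"no candidate reachable: {last_error}")
```

Putting the peer's own address first and the proxy last would mean the proxy is only used when the direct connection fails, e.g. when both sides are behind NATs. Whether TCP through a proxy gives acceptable latency for voice is a separate question that this sketch does not answer.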

