[standards-jig] Pub/Sub for JNG?

David Waite mass at akuma.org
Wed May 1 18:54:35 UTC 2002


Iain Shigeoka wrote:

>Wow, this thread sure has taken off.  :)  
>
>1) compressed streams
>
>I'm mixed on this.  As mentioned earlier, it was shown
>that you can compress Jabber streams to lower bandwidth.
>IM servers are typically network bound so the computation
>overhead of compression should not slow the overall
>performance of the server.
>
>I believe the last time this came up, it was eventually dropped
>because the benefits just didn't justify the added complexity.
>
>Ideally, I suppose we should try to make this a negotiable
>transport feature.  It seems to come up enough that no
>matter what the group decides, there will be enough people
>in the other camp that will want to do the opposite.
>
>I believe we can design a transport system that can accommodate
>this.
>
I'll add a few more comments.

Running something like gzip over a 100-200 byte text block (a typical 
message) will not give any significant bandwidth savings. Running gzip 
over multiple messages might, but batching messages into a block for 
transmission will increase latency. Transmitting a continuous gzip 
stream will not work as-is, because you cannot guarantee that the 
compressed form of a message ends on a byte boundary; a message might 
not be transmitted until additional data is sent down the pipe. Finally, 
I believe the gzip/bzip dictionaries have a minimum in-memory size of 
about a megabyte; AFAIK this is one reason why a lot of Linux 
installation systems used to require more memory to install than to 
actually run Linux.
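
A minimal sketch for checking the small-message intuition, assuming 
Python's zlib module (the same deflate algorithm gzip uses) as a 
stand-in. The sample stanza and the make_message helper are 
hypothetical. The sketch compares compressing each message on its own 
against one long-lived stream that is flushed after every message with 
zlib's Z_SYNC_FLUSH, which pads the output to a byte boundary so each 
chunk can be sent right away, at some cost in compression ratio.

```python
import zlib

# Hypothetical sample stanza, roughly the size of a typical chat message.
def make_message(i: int) -> bytes:
    return (
        f"<message to='juliet@example.com' from='romeo@example.net' "
        f"type='chat'><body>Message number {i}</body></message>"
    ).encode("utf-8")

messages = [make_message(i) for i in range(20)]

# Case 1: every message compressed independently, no shared state.
independent = [zlib.compress(m) for m in messages]

# Case 2: one long-lived stream, flushed after every message so that
# each chunk can be transmitted and decoded immediately.
compressor = zlib.compressobj()
decompressor = zlib.decompressobj()
streamed = []
decoded = []
for m in messages:
    chunk = compressor.compress(m) + compressor.flush(zlib.Z_SYNC_FLUSH)
    streamed.append(chunk)
    decoded.append(decompressor.decompress(chunk))

raw_total = sum(len(m) for m in messages)
independent_total = sum(len(c) for c in independent)
streamed_total = sum(len(c) for c in streamed)
print("raw:", raw_total)
print("per-message:", independent_total)
print("streamed with sync flush:", streamed_total)
```

The per-message figure stays close to the raw size, while the streamed 
figure benefits from earlier messages still being in the compressor's 
window. The sketch says nothing about per-connection memory use.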

For something like server to server, this might be justified. You would 
still want to make sure to send traffic in blocks, but the idea is that 
you have fewer server connections than client connections, so the 
memory usage will be less noticeable, and the higher CPU usage may 
balance itself out.

I still think the best solution to the bandwidth problem (assuming for 
the moment that it is a problem) is a Jabber-specific compression 
system, with a default (negotiated) dictionary which both sides extend 
as traffic continues. Ideally this compression system would also be a 
binary representation of the XML, but this becomes significantly more 
difficult with things like prefixed namespaces :-)
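
The default-dictionary idea can be sketched with zlib's preset 
dictionary support (the zdict argument in Python). This is only an 
illustration under assumptions: DEFAULT_DICT and the two helper 
functions are hypothetical, no such dictionary has been agreed by 
anyone, and the binary XML representation is not attempted here.

```python
import zlib

# Hypothetical default dictionary that both sides would agree on during
# negotiation. It holds byte strings expected to be common in stanzas.
DEFAULT_DICT = (
    b"<presence/><iq type='get' id=''><iq type='result' id=''>"
    b"<message to='' from='' type='chat'><body></body></message>"
)

def compress_with_dict(stanza: bytes) -> bytes:
    """Compress one stanza, priming the compressor with DEFAULT_DICT."""
    c = zlib.compressobj(zdict=DEFAULT_DICT)
    return c.compress(stanza) + c.flush()

def decompress_with_dict(data: bytes) -> bytes:
    """Inverse of compress_with_dict; needs the same dictionary."""
    d = zlib.decompressobj(zdict=DEFAULT_DICT)
    return d.decompress(data) + d.flush()

stanza = (
    b"<message to='juliet@example.com' from='romeo@example.net' "
    b"type='chat'><body>Hello</body></message>"
)

plain = zlib.compress(stanza)        # no dictionary
primed = compress_with_dict(stanza)  # with the preset dictionary

print("raw:", len(stanza))
print("no dictionary:", len(plain))
print("with dictionary:", len(primed))
```

With a long-lived compression stream, the window grows with the traffic 
itself, which is one way to approximate a dictionary that both sides 
extend as they go.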

In the end, it's a really cool idea. I would be really interested in 
seeing it done as a project and would probably contribute a bit of 
brainpower to it. It might never be an actual standard recognised by 
the JSF though.

>2) UDP
>
>Technically, I think UDP is a good idea.  Especially in situations
>where we could exploit multicast.
>
>I imagine that the main reason people avoid it though is difficulty
>in making a good implementation based on UDP, and firewall
>issues.  The latter would seem to me to be the largest issue.
>For all its technical advantages, I think UDP is simply a non-starter
>for us if we want to get inside enterprises.
>
UDP issues include:
- Difficulty (impossibility) getting through firewalls in either direction
- Datagram-based sending requires more logic
- Datagrams are recognised as being disposable during periods of congestion
- Things like retransmission, sequencing, congestion notification and 
other flow control are all lost, and all are needed by Jabber. 
Basically, for something which _is_ stream-based like the Jabber 
protocol, you wind up trying to reimplement a tweaked, non-conformant 
TCP implementation within a client. I might be crazy, but I think that 
the Linux kernel group, the respective *BSD kernel groups, and even 
Microsoft can design a better TCP implementation than I can, especially 
when I work in languages other than C or C++.
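
To give a feel for the extra logic, here is a minimal sketch of just 
the receiver-side sequencing piece, assuming every datagram carries a 
sequence number starting at 0. The ReorderBuffer class is hypothetical, 
and retransmission timers, acknowledgements and congestion control 
(the harder parts) are left out entirely.

```python
from typing import Dict, List


class ReorderBuffer:
    """Delivers numbered datagrams in order (hypothetical helper)."""

    def __init__(self) -> None:
        self.next_seq = 0                    # next number to deliver
        self.pending: Dict[int, bytes] = {}  # datagrams that arrived early

    def receive(self, seq: int, payload: bytes) -> List[bytes]:
        """Accept one datagram; return the payloads now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate, e.g. a retransmission: drop it
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self) -> List[int]:
        """Sequence numbers the sender would have to retransmit."""
        if not self.pending:
            return []
        highest = max(self.pending)
        return [s for s in range(self.next_seq, highest)
                if s not in self.pending]


buf = ReorderBuffer()
early = buf.receive(1, b"<presence/>")    # arrives before datagram 0
gaps = buf.missing()                      # datagram 0 is still outstanding
in_order = buf.receive(0, b"<stream>")    # fills the gap, releases both
repeat = buf.receive(1, b"<presence/>")   # late duplicate is ignored
print(early, gaps, in_order, repeat)
```

Even this small piece needs per-connection state and a policy for how 
long to wait on a gap, which TCP already handles inside the kernel.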

Again, it would be cool to see, but you aren't going to get a lot of 
people to jump at it until there is an actual implementation. Even then, 
it is not suitable for a lot of applications - the fact that every 
commercial instant messaging system now uses TCP illustrates this quite 
well.

-David Waite



