[standards-jig] Improving network integrity

Justin Karneges justin-keyword-jabber.093179 at affinix.com
Wed Dec 31 14:09:45 UTC 2003

After being involved with Jabber all this time, I feel we should make some 
improvements towards better network 'integrity'.

I'm sure all of us by now have experienced a situation where we send a message 
to a contact we think is available, only to have the message trashed.  This 
can happen for a variety of reasons: the recipient is no longer connected to 
Jabber, there are server-to-server issues, or the sender doesn't yet realize 
his own connection is lost!  The lack of proper bounce messages further keeps 
the sender in the dark.  I call this the 'black hole' effect.

Here are 3 ways to improve accountability in Jabber:

1) Keep track of data successfully exchanged over TCP connections.  This goes 
for both c2s and s2s.  If a packet does not safely make it across the 
network, then the sending implementation should be aware, so that it can 
bounce if needed.  TCP is actually quite aware of what data has been 
successfully transmitted.  Consider Linux 2.4, in which you can do this:


  #include <sys/ioctl.h>
  #include <linux/sockios.h>  /* defines SIOCOUTQ */

  /* get TCP send-queue size, in bytes */
  int size;
  ioctl(sockfd, SIOCOUTQ, &size);

If this were easily possible everywhere, and implemented by all servers and 
clients, we could nearly eliminate the 'black hole' effect in Jabber.  
Despite the fact that this is not portable, it is interesting to note that 
keeping track of the TCP send queue doesn't change the XMPP protocol.  This 
means that servers lucky enough to run on Linux (like jabberd) could be 
easily made more accountable using this ioctl without bothering anyone.  
Really, it shouldn't hurt.  This goes for clients too.
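
A rough sketch of how a server or client might put that ioctl to use, assuming Linux and assuming the application keeps its own running count of bytes handed to write().  The helper names (outq_bytes, stanza_left_queue) are made up for illustration and are not from any existing Jabber code.  Exactly what the kernel counts as "queued" is worth checking in tcp(7) for your kernel version.

```c
#include <sys/ioctl.h>
#include <linux/sockios.h>   /* SIOCOUTQ (Linux only) */

/* Hypothetical helper: bytes the kernel still holds in the socket's send
   queue, or -1 if the ioctl fails (bad descriptor, unsupported socket). */
static long outq_bytes(int sockfd)
{
    int size = 0;
    if (ioctl(sockfd, SIOCOUTQ, &size) < 0)
        return -1;
    return size;
}

/* Hypothetical helper: total_written is the number of bytes the application
   has written to the stream so far, queued is the current send-queue size,
   and stanza_end is the stream offset just past the stanza's last byte.
   Returns 1 if the whole stanza has left the send queue, 0 otherwise. */
static int stanza_left_queue(unsigned long total_written,
                             unsigned long queued,
                             unsigned long stanza_end)
{
    unsigned long drained;

    if (queued > total_written)
        return 0;                      /* inconsistent input, assume nothing */
    drained = total_written - queued;  /* bytes no longer in the queue */
    return stanza_end <= drained;
}
```

When a connection dies, any stanza for which stanza_left_queue() still returns 0 is a candidate for a bounce rather than silent loss.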

However, if we want a general solution for all platforms, we would need an 
application-level 'ack' of some sort.  This does not have to be complex; it 
could simply be a newline in response to every stanza.  However, this would 
involve changing the XMPP spec, which is a discussion for another day ...
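
Purely as an illustration of how little bookkeeping the newline idea would need, here is a sketch of the sender's side.  Nothing here is part of XMPP; the struct and function names are invented, and it assumes the XML parser hands over only the whitespace it sees between top-level stanzas (newlines inside a stanza must not be counted).

```c
#include <stddef.h>

/* Hypothetical per-stream bookkeeping: stanzas sent vs. acks received. */
struct ack_state {
    unsigned long sent;   /* stanzas written to the stream */
    unsigned long acked;  /* newline acks read back from the peer */
};

/* Call once for every stanza written to the stream. */
static void on_stanza_sent(struct ack_state *s)
{
    s->sent++;
}

/* Call with whitespace received between top-level stanzas.  Each newline
   acknowledges the oldest outstanding stanza; surplus newlines are ignored
   so that acked never exceeds sent. */
static void on_interstanza_whitespace(struct ack_state *s,
                                      const char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        if (buf[i] == '\n' && s->acked < s->sent)
            s->acked++;
}

/* Number of stanzas sent but not yet acknowledged. */
static unsigned long unacked(const struct ack_state *s)
{
    return s->sent - s->acked;
}
```

If the stream drops while unacked() is non-zero, the sender knows exactly how many of its most recent stanzas to bounce or retry.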

2) Server-to-server presence probe 'pings'.  While a disconnected c2s session 
indicates that a resource is no longer available, a disconnected s2s session 
doesn't imply anything (as I understand it).  When an s2s link breaks, it is 
entirely possible for one of the servers to later 'fall off the network'.  This 
leaves the other servers thinking the resources of the missing server are 
still available when they are not.  To solve this, we need the servers to 
connect to each other on a regular basis, so that dead servers can be 
accounted for.
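
One way the periodic check could be sketched, assuming each server remembers when it last heard anything from a given remote server.  The names and the two thresholds (probe_after, dead_after) are hypothetical tunables, not values from any spec.

```c
#include <time.h>

/* What the local server should do about a remote server right now. */
enum peer_action { PEER_OK, PEER_PROBE, PEER_DEAD };

/* Hypothetical record kept per remote server. */
struct peer {
    time_t last_heard;  /* last time any s2s traffic arrived from it */
};

/* probe_after and dead_after are in seconds, with probe_after < dead_after.
   A peer that has been silent for probe_after seconds gets a probe (which
   means reconnecting if the s2s link is down); one silent for dead_after
   seconds is treated as gone. */
static enum peer_action check_peer(const struct peer *p, time_t now,
                                   long probe_after, long dead_after)
{
    long idle = (long)(now - p->last_heard);

    if (idle >= dead_after)
        return PEER_DEAD;
    if (idle >= probe_after)
        return PEER_PROBE;
    return PEER_OK;
}
```

On PEER_DEAD, the local server would mark that server's resources as unavailable to its own users instead of leaving stale presence around.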

3) Fully end-to-end message acking.  This currently exists in the form of 
JEP-0022 (Message Events), and it is the only true way to ensure a message makes it to 
the recipient.  Note that #3 is not a replacement for #1.  Session-level 
reliability is still useful for offline messages and an overall sense of 
network solidity.  Ideally, we want all of these things implemented in our 
clients and servers.

Feedback welcome,
