[Standards-JIG] Dead participants in MU-conf, JEP-0045

JD Conley jconley at winfessor.com
Wed Dec 15 21:27:20 UTC 2004


> It is very simple to test: Setup C <-> S <-> MUC, enter a room, cut the S
> <-> MUC cable, terminate C. S will notice, but is unable to connect to
> MUC. Reboot S. MUC keeps the participant forever.

But if MUC ever attempted to send a message or presence to C, it would
get an error back from its hosting server saying that C is unreachable,
and it should then remove C from any rooms C is in.  Obviously this isn't
happening today in the MUC service and server you are testing with, but
it is how it should be done.
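
To make that concrete, here is a rough sketch of the cleanup I have in
mind.  The MucService class and its method names are made up for
illustration and are not taken from any existing MUC implementation; it
only assumes the service is told which JID a bounced stanza was
addressed to.

```python
class MucService:
    """Hypothetical, minimal room bookkeeping for a MUC component."""

    def __init__(self):
        self.rooms = {}  # room name -> {nick: occupant JID}

    def join(self, room, nick, jid):
        self.rooms.setdefault(room, {})[nick] = jid

    def on_delivery_error(self, jid):
        """Called when a stanza sent to `jid` bounces with an error.

        Removes `jid` from every room it occupies and returns the
        sorted names of the rooms it was removed from.
        """
        left = []
        for room, occupants in self.rooms.items():
            stale = [nick for nick, j in occupants.items() if j == jid]
            for nick in stale:
                del occupants[nick]
            if stale:
                left.append(room)
        return sorted(left)
```

A real service would also broadcast unavailable presence for the removed
occupant to the remaining occupants; that part is left out here.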

If the room is completely idle then this situation will never be
resolved.  If we take a step back from MUC we can see that this problem
is actually much more general.  I often have orphaned users in my roster
due to s2s connection problems.  This has been discussed many times.

The solution I find the most appealing from a protocol and
implementation standpoint is to use presence probes.  However, the
server hosting MUC should do the probing, not the MUC service itself.  A
server should track the sender and recipient of all presence packets it
receives.  It should have a "watchdog" system which runs occasionally
(every 5 or 10 minutes) and probes any nodes on behalf of the original
recipient of the presence packet.  Of course, the probe would only occur
if the original sender hasn't been active for a set period of time.  If
any of the probes going to remote servers through S2S can't be delivered,
the MUC service will be notified with a presence error from the node on
the other server.  Any delivery-related errors received should be
treated as unavailable presence by the server.  If any nodes have not
responded to probes within a certain timeframe, they should be treated as
unavailable as well.  If the timeout were the same as the watchdog
interval, this would happen with every other run of the watchdog.
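
The watchdog described above could be sketched roughly like this.  This
is only an illustration under my own assumptions: the class name, the
method names and the 600-second defaults are invented, time is passed in
explicitly as seconds, and actually sending the probes and the
unavailable presence is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Watch:
    last_active: float                 # last time the sender was heard from
    probed_at: Optional[float] = None  # when an unanswered probe was sent


class PresenceWatchdog:
    """Hypothetical tracker of (sender, recipient) presence pairs."""

    def __init__(self, idle_after=600.0, probe_timeout=600.0):
        self.idle_after = idle_after
        self.probe_timeout = probe_timeout
        self.watches = {}  # (sender, recipient) -> Watch

    def on_presence(self, sender, recipient, now):
        """Record presence from sender; this also clears a pending probe."""
        self.watches[(sender, recipient)] = Watch(last_active=now)

    def on_delivery_error(self, sender, recipient):
        """Treat a delivery error as unavailable presence.

        Returns True if the pair was being tracked.
        """
        return self.watches.pop((sender, recipient), None) is not None

    def run(self, now):
        """One watchdog pass.

        Returns (pairs to probe, pairs to treat as unavailable).
        """
        probes, unavailable = [], []
        for key, watch in list(self.watches.items()):
            if watch.probed_at is not None:
                # A probe is outstanding: give up once it has timed out.
                if now - watch.probed_at >= self.probe_timeout:
                    del self.watches[key]
                    unavailable.append(key)
            elif now - watch.last_active >= self.idle_after:
                # Sender has been quiet long enough: probe it.
                watch.probed_at = now
                probes.append(key)
        return probes, unavailable
```

With the timeout equal to the watchdog interval, a sender that goes
quiet is probed on one run and dropped on the following run unless
presence arrives in between.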

JD
