[Standards] Distributed chatrooms in PSYC

Carlo v. Loesch CvL at mail.symlynX.com
Fri Feb 19 13:41:00 UTC 2010

Since both stpeter and Dave Cridland are asking us for an opinion,
here it is. As it is about the distributedmuc proposed XEP, I'd
rather discuss it here than in the MUC forum.

IRC channels are kind-of cool, but they are built on a shaky
foundation of having just one multicast tree over a set of trusted
servers with a huge shared database. The three enabling factors
I just mentioned are the worst of IRC that shouldn't be brought
to XMPP. They make IRC channels possible, but they come with a
plethora of problems.

In http://mail.jabber.org/pipermail/muc/2010-February/000144.html
Dave Cridland says "[..] the key thing is that Peter's idea is
  to have an IRC-like structure, where a room is conceptually shared
  between trusted servers by an agreement, whereas in our model,
  routing is delegated whereas control is not."

Bingo. That's a very important point here.

 "Another distinction between the two approaches is what the core aims
  are - in PSA-style, it's to provide resilience between servers,
  whereas in KD-style, it's largely to reduce redundant message traffic
  from being repeated redundantly repeated."

I'm glad great minds think alike. So the master/slave approach you
describe here is how PSYC has been solving this problem since 2000.
The only difference is that in PSYC it is a fundamental routing
function of the protocol and is therefore used for presence,
newscasting, chatrooms and whatever else needs to go to more than
one person. As a side effect, multicast combined with the
subscription mechanism creates a trust situation that is very
effective for weeding out spam.
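The spam-weeding effect of subscription-based multicast can be sketched
roughly like this (Python, with invented names - not actual PSYC code):

```python
# Sketch of subscription-gated delivery: a message only reaches a user
# through a context she has explicitly subscribed to, so unsolicited
# senders simply have no route to her. Illustrative, not the PSYC API.

class Router:
    def __init__(self):
        self.subscriptions = {}   # user -> set of context addresses

    def subscribe(self, user, context):
        self.subscriptions.setdefault(user, set()).add(context)

    def deliver(self, context, user, message):
        # Drop anything arriving via a context the user never joined.
        if context not in self.subscriptions.get(user, set()):
            return False          # weeded out: no subscription, no delivery
        print(f"[{context}] -> {user}: {message}")
        return True

r = Router()
r.subscribe("alice", "psyc://example.net/@chatroom")
r.deliver("psyc://example.net/@chatroom", "alice", "hello")   # delivered
r.deliver("psyc://spam.example/@junk", "alice", "buy stuff")  # dropped
```

The point is that the trust decision is a plain routing decision:
whatever was never subscribed to is never routed.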

In the original designs for PSYC there was a greater emphasis on
resilience, by allowing for back-up masters and for more logic to be
handled in the slaves - but experience led us the opposite way. Until
2003 we
had a master/junction/slave system with pretty intelligent slaves
acting like chatrooms themselves and junctions to implement
multicast trees, but we found this whole set-up to be more complex
than necessary and too administration intensive.

In http://mail.jabber.org/pipermail/muc/2010-February/000143.html
Kevin Smith said on the MUC list:
 "There's another approach to this, discussed by Dave Cridland and
  myself offlist, which is interesting. The basic premise being that
  instead of pre-arranging with conference.jabber.org to shadow mucs
  there, it could do it on the fly"

That's one step we took in 2003 - instead of configuring slaves for
each chatroom on each server, hand-knitting multicast trees, we
realized that every server that has users should act as a slave in
distribution trees. What we call the "context slave" is this
transparent automatic thing that implements a slave for any chatroom
on the PSYC network on the local server.
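A rough sketch of what such a context slave amounts to (Python,
invented names - not how PSYC implements it on the wire): the local
server becomes a slave for a remote chatroom the moment the first
local user joins it, with no per-room configuration.

```python
# A "context slave" provisioned on the fly: one copy of each message
# arrives from the master, and the slave multiplies it for however
# many local users have joined that room. Names are illustrative.

class ContextSlave:
    def __init__(self, room):
        self.room = room
        self.local_members = set()

    def fan_out(self, message):
        # Duplicate the single incoming copy for all local members.
        return {(user, message) for user in self.local_members}

class LocalServer:
    def __init__(self):
        self.slaves = {}   # room address -> ContextSlave

    def join(self, user, room):
        # Create the slave transparently on the first local join --
        # no pre-arranged shadowing, no hand-knit trees.
        slave = self.slaves.setdefault(room, ContextSlave(room))
        slave.local_members.add(user)
        return slave

server = LocalServer()
server.join("alice", "psyc://remote.example/@chatroom")
server.join("bob", "psyc://remote.example/@chatroom")
```

However many local users join, the master still sends this server
exactly one copy of each message.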

 "Then you've got the master/slave split, where some messages are
  broadcast locally (message/presence) as well as broadcast up the
  chain, while others are passed straight up (config changes) and don't
  take effect when you're in a netsplit."

That's one of the complexities we found to be unnecessary hassle.
We defined that a "multipeer multicast" scenario *is* possible, where
each slave has a way to send to all other slaves in a most efficient
way, but that we would not use it in current PSYC deployment. In
current PSYC, all traffic is submitted to the master, then distributed
down the tree. Gone are all the issues of getting those things
sorted out.
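The current-PSYC routing just described can be sketched in a few lines
(Python, hypothetical structure - not the wire protocol): everything
goes up to the single master, which sends one copy per slave server,
and each slave fans out locally.

```python
# Minimal sketch of master-rooted distribution: all traffic converges
# on the master, which distributes exactly one copy down each branch
# of the tree; slaves duplicate for their local users.

class Master:
    def __init__(self):
        self.slaves = []          # slave fan-out callables, one per server

    def submit(self, sender, message):
        deliveries = []
        for slave in self.slaves:
            deliveries.extend(slave(sender, message))
        return deliveries

def make_slave(local_users):
    def fan_out(sender, message):
        return [(user, sender, message) for user in local_users]
    return fan_out

m = Master()
m.slaves.append(make_slave(["alice", "bob"]))   # slave at server A
m.slaves.append(make_slave(["carol"]))          # slave at server B
# m.submit("alice", "hi") yields 3 deliveries from 2 inter-server copies
```

Compare that with full-mesh MUC mirroring, where every message crosses
the federation once per remote occupant rather than once per server.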

What about resilience, you say? Well, first of all, in our large
scale deployments we never ran into any problems. The key thing we
learned here was to keep the master out of trouble.

If the master only deals with slaves, no users, and is on a healthy
server infrastructure, it will operate reliably. If you are afraid of
DDoS attacks, you can still hide the master from view, a bit like a
"hidden primary" in DNS. For the end user, if a slave goes down, she
hits the reload button and finds herself on some other slave. Or the
client does so automatically, without her noticing.
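That reload-button failover is trivial client logic - something like
this sketch (Python, with made-up slave addresses and a stand-in
reachability check):

```python
# Sketch of client-side failover: try the known slaves in order and
# take the first one that answers. Addresses and is_up() are invented
# for illustration; a real client would probe actual connections.

def connect(known_slaves, is_up):
    """Return the first reachable slave, or None if all are down."""
    for slave in known_slaves:
        if is_up(slave):
            return slave
    return None

slaves = ["slaveA.example", "slaveB.example", "slaveC.example"]
down = {"slaveA.example"}                 # pretend slave A just died
picked = connect(slaves, lambda s: s not in down)
```

No shared state needs repairing afterwards: the new slave already sits
in the same distribution tree.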

But you could also keep a spare master ready to take over - something
we had planned, but then never really needed. In open federation it is
however important to avoid heavy centralization of chatroom masters.
If all cool chats are on conference.jabber.org, then someone may
target that for a DDoS just because of one chatroom where she got
kicked out. If chatroom masters are spread all over the net, peaceful
masters will live a peaceful life.

If instead you empower the slaves to take administrative actions you
may run into all kinds of IRC horrors like chatroom takeovers by
rogue slave servers.

The XEP says:
  "However, this assumption introduces a single point of failure for
   the conference room, since if occupants at a using domain lose
   connectivity to the hosting domain then they also lose connectivity
   to the room."

Have you indeed run into failures because of this, or was it rather
because there are too many chatrooms hosted on a single host? The
Internet is full of single points of failure: our home pages are
usually not decentralized, our chat identities are hosted on a single
server in many cases. We even rely on Twitter.com to get "the news".
If we are to tackle this, why just for chatrooms?

In 2006 we wrote up a XEP for decentralized chatrooms based on the
experience gained with IRC and PSYC. It was very simple and only
addressed the traffic overhead that MUC had and still has today. It
avoids delegating any administrative powers and does not require any
shared state.
Here it is:
