[Standards] [Operators] Future of XMPP Re: The Google issue

Dave Cridland dave at cridland.net
Wed Dec 4 21:55:33 UTC 2013

On Wed, Dec 4, 2013 at 4:18 PM, Ivan Vučica <ivan at vucica.net> wrote:

> There aren't many servers with thousands of concurrent users behind a
> single domain, and it seems to me there is a good reason: it's not
> something supported by XMPP itself. Large installations have separate
> backend infrastructure which often uses different payload delivery
> mechanisms, unrelated to XMPP itself. I'm not familiar with what ejabberd
> does, but Facebook quite openly states that their internal servers don't
> speak XMPP.
Let's skip over the fact that there are plenty of servers in the thousands;
there are few in the millions, and I'm assuming that's what you meant.

I've seen that particular quote from Facebook get plastered about a lot. It
does *not* mean what you think it means. As far as I can tell, it
originates with the suggestion that they were using a stock XMPP server -
they're not, and never were, though it's possible they ripped apart an XMPP
server in order to build their C2S handling.

But in practice, no XMPP service anywhere speaks XMPP internally, for the
very simple reason that XMPP is spoken *between* servers.

So the problem then is that XMPP is fundamentally an end-to-end protocol -
that is, for many of our XEPs, a server should be unaware of them and just
pass the data through. That causes problems when you have an existing
messaging infrastructure that's based around, say, text/plain only, and you
want to glue XMPP onto it. It doesn't make any difference if you're
building services around a gateway model, or if you're building an XMPP
service from scratch.

Revisiting that design would essentially involve throwing away XMPP as a
whole - that is, assuming that a service had to fully mediate, with full
knowledge, between client and remote service. That's not a wrong design,
but it leads you down a very different path, and removes the ability for
clients to develop their own extensions, and so on - I think it'd remove a
lot that's good about XMPP.

> How to approach this? I don't know. Here's a thought. Since XMPP does make
> significant use of DNS, does have s2s and components, how about some type
> of "connection-handling slaves"? Have the initial c2s connection redirected
> to a slave specific to a JID. Have the possibility of a received s2s stanza
> being distributed to appropriate c2s, but have the receiving server also
> respond with "in future, for this JID, talk to this slave".
This is a largely unrelated problem, but it's interesting. Some servers
address this by having a proxy which terminates the XMPP session (with TLS
etc), handles authentication, and then passes the remainder through to a
specific cluster node best suited to the jid that's just authenticated.
It's not a bad design; it gives you locality of reference through the
cluster, which is what you're driving toward I think.
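
To make the proxy idea concrete, here is a minimal sketch of the routing step
only. It assumes the proxy picks a node by hashing the bare JID; that choice,
the node names, and both function names are illustrative assumptions, not
something any particular server is known to do. Lowercasing stands in for real
JID normalization.

```python
import hashlib


def bare_jid(jid: str) -> str:
    """Drop the resource part and lowercase the rest.

    Assumption: lowercasing is a stand-in for proper JID normalization.
    """
    return jid.split("/", 1)[0].lower()


def node_for_jid(jid: str, nodes: list) -> str:
    """Pick a cluster node for an authenticated JID (hypothetical helper).

    Every resource of the same bare JID lands on the same node, which is
    where the locality of reference comes from.
    """
    digest = hashlib.sha256(bare_jid(jid).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c"]  # hypothetical node names
    for jid in ("juliet@example.com/balcony", "juliet@example.com/chamber"):
        print(jid, "->", node_for_jid(jid, cluster))
```

A plain modulo reshuffles most JIDs whenever the node list changes, so a real
cluster would more likely want consistent hashing or an explicit lookup table.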

> Another issue that should somehow be elegantly solved in the core protocol
> is reducing chattiness through presence filtering and stanza multicasting.
> In s2s connections, why not introduce the concept of "Here's a list of all
> targets for the following iq stanza" or "Please deliver this presence
> stanza to everyone in the 'from' list".
Ah, that design. That one doesn't make any change in octets sent after
compression, and while it does reduce the amount of XML elements parsed, I
don't think that's a big enough issue. I think you want to look at the
design Philipp Hancke et al proposed about 5 years ago, actually, but that
one breaks down with privacy lists and things. I think with strong identity
being considerably more prevalent than it was, it might be worth
revisiting, but it's littered with corner cases that don't quite work. But
yeah, we could reopen that whole can of worms, I love me a good argument
with Philipp. ;-)
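
The point about octets after compression is easy to measure rather than argue
about. The sketch below compares per-recipient fanout with a single multicast
stanza, before and after zlib. The <targets/> element is made up purely for
illustration (it is not a real XEP), and the JIDs and recipient count are
arbitrary assumptions.

```python
import zlib

SENDER = "romeo@example.net/orchard"
RECIPIENTS = [f"user{i}@example.com" for i in range(100)]


def fanout_stream(recipients):
    """One presence stanza per recipient, as s2s fanout does today."""
    return "".join(
        f"<presence from='{SENDER}' to='{jid}'><show>away</show></presence>"
        for jid in recipients
    ).encode("utf-8")


def multicast_stream(recipients):
    """A single stanza carrying a hypothetical list of targets."""
    targets = "".join(f"<target jid='{jid}'/>" for jid in recipients)
    return (
        f"<presence from='{SENDER}'><targets>{targets}</targets>"
        "<show>away</show></presence>"
    ).encode("utf-8")


def sizes(payload):
    """Return (raw octets, octets after zlib compression)."""
    return len(payload), len(zlib.compress(payload))


if __name__ == "__main__":
    print("fanout   ", sizes(fanout_stream(RECIPIENTS)))
    print("multicast", sizes(multicast_stream(RECIPIENTS)))
```

The repeated stanzas are nearly identical, so the compressor absorbs most of
the gap that the raw sizes show. Real traffic is interleaved with other
stanzas, so treat the numbers as a rough indication only.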

For MUC, on the other hand, life is much simpler, and the fundamental
design (remote fanout) works fine for some chatrooms - those which are
public, and the bulk of room occupants are either visitors or ordinary

Luckily, these chatrooms are by far the common case. It still requires a
degree of delegation of policy enactment (made up phrase) but as I say, I
think in a world where we have generally stronger identity, we can do more
useful things with reputation services to make this work for lower-grade
things - of which public chatrooms are one such case I think.

> Finally, when implementing a client, we have XML namespaces and their
> inconsistent (or with some parsers hard-to-do) implementation. Namespaces
> are an awesome solution, but since they are not implemented completely
> consistently, XMPP 2.0 would have to ensure that their value is more
> obvious and better tested with a compliance suite.
I'm not entirely clear what you mean by inconsistent here. My experience
has been that bar XMLNS-well-formedness, actors on the XMPP network
generally do standard XMLNS, and have done for many years.

The XMLNS well-formedness rules that are sometimes broken are that:

 - Some servers (fewer than there used to be) will pass through unbound
prefixes. These are well-formed XML, but not well-formed under XMLNS rules.
 - With XMLNS, an element can end up carrying the same attribute twice, by
using two different prefixes bound to the same namespace. XMLNS forbids
this, but the rule is also not always enforced on pass-through.
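
Both cases can be reproduced with Python's bundled expat parser, run once
without and once with namespace processing. This is only a sketch: the two
sample stanzas and the helper name are made up for illustration, and it says
nothing about how any given XMPP server parses its streams.

```python
import xml.parsers.expat as expat

# Prefix "x" is never declared: fine as plain XML, not under XMLNS.
UNBOUND_PREFIX = "<message><x:body>hi</x:body></message>"

# Prefixes "a" and "b" are bound to the same namespace, so a:id and b:id
# are the same attribute once namespaces are applied.
DUPLICATE_ATTRIBUTE = (
    '<message xmlns:a="urn:example:ns" xmlns:b="urn:example:ns" '
    'a:id="1" b:id="2"/>'
)


def parses(document: str, namespace_aware: bool) -> bool:
    """Return True if expat accepts the document in the given mode."""
    if namespace_aware:
        parser = expat.ParserCreate(namespace_separator=" ")
    else:
        parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
        return True
    except expat.ExpatError:
        return False


if __name__ == "__main__":
    for name, doc in (("unbound prefix", UNBOUND_PREFIX),
                      ("duplicate attribute", DUPLICATE_ATTRIBUTE)):
        print(name, "| plain XML:", parses(doc, False),
              "| XMLNS:", parses(doc, True))
```

A server that parses without namespace processing, or re-serializes without
checking, would let both documents through, which is the pass-through
behaviour described above.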

These are bugs, of course, but do cause a certain degree of expense in

Otherwise, everything I've ever bothered looking into has just worked (or
broken the server I worked on, in which case I just fixed it). Could be I
missed some cases, but I've seen some pretty weird stuff.

> If XMPP didn't depend on some XMLisms like namespaces, it'd be easy to
> switch to a different transport mechanism, if someone prefers it. JSON?
> Plists? Protobufs? Custom binary? Doesn't matter. Whether XML is used is, I
> think, far from the most troublesome problem with deploying large XMPP
> installations and federating. If you want to scale, you have to use
> non-standardized solutions that are not supported by a lot of otherwise
> interesting server software.
I'm not entirely sure that's true, or at least, not axiomatically so. That
is, I hold that it would be possible to build a highly scalable service
that communicated externally by XMPP and also stuck to the RFCs and XEPs.
The hardest problem is actually that the web world has generally moved
toward an eventually consistent model for scalable data storage, whereas we
have a number of cases where that's quite painful to do - I think Google's
implementation shows it's possible, but there's some slightly odd behaviour
even there - but that's nothing that can't be solved with a DHT or two. :-)
