[Standards] Council on Stanza Repeaters without Multicast

CvL at mail.symlynX.com
Wed Apr 2 22:22:12 UTC 2008


I was pointed to your council meeting log at
http://logs.jabber.org/council@conference.jabber.org/2008-04-02.html

Some thoughts and numbers about
http://www.xmpp.org/extensions/inbox/repeaters.html
you discussed between 14:40 and 14:55.

<Kev> this is an odd spec - it loses us the 'no spoofing' which we
      currently enjoy

I see what you mean: repeaters carry two 'from' addresses, the sending
server and the originating sender. In a simple no-trust situation
these SHOULD be on the same host or domain; my server is then merely
sending stanzas in my name in a more efficient way. That's something
to add to "10. Security Considerations".

Also, <relaying-denied/> needs to go into "4. Creating a Repeater",
and the paragraph in "10. Security Considerations" that begins "A
repeater service SHOULD restrict the JIDs that are included in a
repeater list to local entities ..." should mention that a
<relaying-denied/> SHOULD be generated if that condition is not met.
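
A minimal sketch of that rule, assuming the repeater service knows the
set of domains it considers local. The function names and the return
shape are hypothetical and not taken from the protoXEP; only the
<relaying-denied/> condition comes from the text above.

```python
def domain_of(jid: str) -> str:
    """Return the lowercased domain part of a JID (local@domain/resource)."""
    bare = jid.split("/", 1)[0]            # drop the resource, if any
    return bare.split("@", 1)[-1].lower()  # drop the local part, if any


def check_repeater_list(local_domains: set, jids: list) -> tuple:
    """Accept a repeater list only if every JID is a local entity.

    Returns ("ok", []) or ("relaying-denied", offending_jids).
    """
    offending = [jid for jid in jids if domain_of(jid) not in local_domains]
    if offending:
        return ("relaying-denied", offending)
    return ("ok", [])


# Example domains are made up for illustration.
print(check_repeater_list({"example.org"}, ["a@example.org", "b@example.org/x"]))
print(check_repeater_list({"example.org"}, ["a@example.org", "c@elsewhere.example"]))
```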

<Kev> on list I said I'd be happy with this if we included proof that
      it's a valid forwarded stanza

This should be a separate XEP on top of repeaters that defines a form
of multicast repeaters with digital proof. Repeaters without multicast
do fine without proof if the security considerations are properly
implemented. Repeaters in closed trust systems also do fine without proof.

<Dave> I'm not wholly convinced that, in the current design, it will
       actually result in "better" network usage.
<Dave> The single thing that worries me most on this is simply that
       nobody seems to have done any figures on whether this genuinely saves
       bandwidth.

I know Fippo's postings aren't the easiest read. He expects everyone to
be as deep into it as he is himself. He used stpeter's model from
http://www.xmpp.org/internet-drafts/draft-saintandre-xmpp-presence-analysis-03.html,
found a bug in the formulas, and with the corrected formula worked out
the following numbers for scenario 5.1:

	580000000 for the "standard" behaviour of xmpp (also dubbed "rfc")
	251200000 for repeaters XEP
	197200000 for smart presence XEP from 2004
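
To put the three figures side by side, here is a small sketch that
expresses each one as a share of the "rfc" figure. The numbers are the
ones quoted above; the dictionary keys and the function name are only
labels chosen for this illustration.

```python
# Figures for scenario 5.1 as quoted above (units as in the original posting).
totals = {
    "rfc": 580_000_000,
    "repeaters": 251_200_000,
    "smart presence": 197_200_000,
}


def share_of_baseline(values: dict, baseline: str = "rfc") -> dict:
    """Express every figure as a fraction of the baseline figure."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


shares = share_of_baseline(totals)
for name, share in shares.items():
    print(f"{name:15s} {share:6.1%} of the rfc figure")
```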

For completeness, smart presence is here:
http://www.xmpp.org/extensions/inbox/smartpresence.html
It basically does the same as repeaters, but is focused on presence only.

He summarizes:
| Regardless of protocol details, the concept of remote fan-out using
| lists stored on the remote side is superior to the "current" way of
| doing things.

That's essentially what's in
http://mail.jabber.org/pipermail/standards/2008-March/018276.html

He adds some more numbers in the following posting at
http://mail.jabber.org/pipermail/standards/2008-March/018320.html

| B6 for scenario 5:
| rfc:       199444444 bytes/second
| bis:        45000000 bytes/second
| smart/rep:   4544444 bytes/second

This means the optimizations suggested in RFC "bis" will reduce
presence overhead in scenario 5 to roughly a fourth (about 23%) of
the current RFC style. When applying repeaters or smart unicast we
end up at less than a 40th of the current network traffic.
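
The ratios behind "a fourth" and "a 40th" can be checked with a few
lines. The byte rates are the ones quoted from Fippo's posting; the
function name is just a label for this sketch.

```python
# B6 for scenario 5, in bytes/second, as quoted above.
rates = {
    "rfc": 199_444_444,
    "bis": 45_000_000,
    "smart/rep": 4_544_444,
}


def reduction_factor(values: dict, name: str, baseline: str = "rfc") -> float:
    """How many times smaller the named rate is than the baseline rate."""
    return values[baseline] / values[name]


for name in ("bis", "smart/rep"):
    factor = reduction_factor(rates, name)
    print(f"{name:10s} 1/{factor:.1f} of rfc ({1 / factor:.1%})")
```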

This doesn't even take into account the huge gain repeaters
represent when applied to MUCs. Large pubsub services also start
to have a chance of actually scaling.

All of this gets even better when you apply real multicast to it,
but you don't need to worry about that now. You are already
improving a lot by being more strategic in the way you unicast.
That is what the repeaters XEP and similar earlier protoXEPs propose.

<Kev> right, we don't have, still, afaik a decent model for testing
      the effect of such things on 'real' usage

I hope stpeter's draft isn't the worst of choices.

<Kev> if people want to use this in their closed ecosystems, it
      doesn't seem a bad method really
<ralphm> Kev: if that's the only usage model, I'm not sure if we
         should publish a XEP on it

I'm afraid you are confusing my Multicast Repeaters extension idea
with the actual Stanza Repeaters protoXEP as it stands. The normal
usage model is to optimize traffic on any given S2S link. By thinking
ahead I blurred the picture of what the protoXEP actually proposes;
sorry for that. I thought it was sufficiently clear.

<Kev> at least a decent number of people think we need multicast
      though, and I don't have much reason to doubt them

Yes, but we aren't even discussing real multicast yet. We need this
building block first. This one, or one of the older ones from 2004.

<Dave> In closed ecosystems, there are better solutions, since you can
       use trust relationships between servers much easier.

You are right. You can throw XMPP-S2S into the dumpster and use an
optimized proprietary binary protocol or something like that. But I
don't think it's the XSF's job to come up with something to replace XMPP.

<ralphm> I /would/ like to see work done on pubsub repeaters, though.

Shouldn't be hard to apply repeaters to pubsub.

<Kev> I do believe them - I have quite some doubt on us being able to
      scale indefinitely, server deployment-wise

My experience with multicast tells me you are || <- this close to
taking a major leap into free scalability land. You are holding the
keys in your hands.

<Kev> this model works for email, but nothing persists there

Not at all. My sendmail is busy until the next day distributing to a
few thousand recipients. OK, sendmail is stupid, but I wouldn't want it
to try to connect to a thousand hosts at the same time. IRC can handle
that much better, and so does PSYC. How? It just forwards to some thirty
hosts, which themselves forward to some thirty hosts. Stuff arrives in
near real-time, even with thousands of recipients. That's why multicast
is the only way to go. And persisting recipient lists is fundamental to
having a ghost of a chance of being scalable. I've done this stuff for
a decade now; you can trust that I know what I'm talking about.
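
A back-of-the-envelope sketch of why that forwarding tree stays shallow.
It assumes an idealized tree in which every host forwards to exactly
`fan_out` further hosts and only the hosts on the last level count as
recipients; the fan-out of thirty is the figure from the paragraph above,
everything else is a simplification for illustration.

```python
def levels_needed(recipients: int, fan_out: int = 30) -> int:
    """Smallest number of forwarding levels so that fan_out**levels >= recipients."""
    if recipients < 1 or fan_out < 2:
        raise ValueError("need at least one recipient and a fan-out of two or more")
    levels = 0
    reachable = 1  # the origin alone reaches one host: itself
    while reachable < recipients:
        reachable *= fan_out
        levels += 1
    return levels


for n in (30, 900, 3000, 27000):
    # Direct unicast would need n connections from the origin;
    # in the tree no host opens more than 30.
    print(n, "recipients:", levels_needed(n), "level(s)")
```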

<Dave> I'm pretty sure it'd be easy to write a repeater that did allow
       spoofing, and moreover, bypassing of privacy lists, etc.
       But I don't think it's a fundamental property of repeaters.

With the security set-up in place, you can only mess up your local
data, as always. Repeaters, if properly defined, are just an optimization.
Why shouldn't the packets inside be subjected to the same scrutiny as
any other packet?

<Kev> it can't verify the original sender, or even the original sending
      server, so just has to blind trust the message

Same mistake as before, I presume. We aren't talking about multicast yet.
It does not need to trust any message blindly.

<stpeter> Do you think that the repeater concept should be specific to a
	  particular application type, like pubsub or muc?

Repeaters implement an essential one-to-many routing mechanism that
Jabber has been missing for a decade now. It needs to work for any
existing and future one-to-many application: presence, MUC and pubsub
to start with. Thank god I no longer have to argue that presence is
a one-to-many operation.

<Kev> Dave: and if I want to fake a message from you, I just create a
      repeater, send an insulting message to peter from the repeater with
      your jid, and peter will believe that you've called him a squashed turnip

Hehe. If you simply check the inner stanza as if it came over the wire
by itself, there is no such problem. That's the change I suggested above.
I would even consider moving the inner 'from' to the outer stanza.
Why should the outer source be a server if it can be the actual originator?
Without an inner 'from' attribute, there is no need to do any checks at
all. Merely copy the 'from' over from the wrapper.
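
A minimal sketch of "check the inner stanza as if it came over the wire
by itself". It assumes the receiving server knows the authenticated
domain of the S2S peer that delivered the wrapper, and it uses exact
domain equality as the test; both the function names and that matching
rule are assumptions made for this illustration, not protoXEP text.

```python
def domain_of(jid: str) -> str:
    """Return the lowercased domain part of a JID (local@domain/resource)."""
    bare = jid.split("/", 1)[0]            # drop the resource, if any
    return bare.split("@", 1)[-1].lower()  # drop the local part, if any


def accept_inner_stanza(s2s_peer_domain: str, inner_from: str) -> bool:
    """Apply the usual anti-spoofing check to the wrapped stanza:
    its 'from' must belong to the domain of the authenticated S2S peer."""
    return domain_of(inner_from) == s2s_peer_domain.lower()


# Kev's scenario with made-up domains: a repeater hosted at kev.example
# wraps a stanza claiming to come from dave@dave.example.
print(accept_inner_stanza("kev.example", "dave@dave.example/home"))   # False
print(accept_inner_stanza("dave.example", "dave@dave.example/home"))  # True
```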

<stpeter> Dave: it saves bandwidth compared to XEP-0033
<stpeter> (which may not be saying much)
<Dave> stpeter, No, I mean, does it save bandwidth compared no not having it.

No comment.

<Kev> ok, I propose we voice our concerns on list, and move on

Your voice was heard.

Meow.



