[Standards-JIG] proto-JEP: Smart Presence Distribution

Richard Dobson richard at dobson-i.net
Wed May 31 22:00:35 UTC 2006


Carlo v. Loesch wrote:
> Richard Dobson typeth:
> | Yes but we also cannot have protocols that assume something that is not 
> | true, i.e. that the network is reliable when it isn't, this is just 
> | something that you are going to have to accept.
> 
> You either have reliability, or an error situation.

Yes, but you don't always get TCP errors instantly.

> What I suspect is that Jabber is having an error situation
> each time an idle connection is closed. This obviously makes
> it impossible to operate along traditional network application
> design lines.
> 
> Luckily the Wildfires and I have found a solution to that problem,
> so if we fix all the servers, then the proto-JEP can continue
> operating in a sane network protocol design fashion.

Please read later on in this email about why fixing that bug is unlikely 
to make a significant impact.

> | > As long as there wasn't an error in transmissions, there is no reason to
> | > presume any data is lost. It's like presuming the sun could rise from the
> | > west tomorrow. 
> | The reverse is also true, if you don't know if data is lost or not you 
> | cannot presume that it was received either, when designing network 
> 
> This logic is adverse to the definition of TCP. TCP doesn't guarantee
> you, that everything will always be fine, but it will always tell you
> that something went wrong, unless of course you don't know how to close
> a socket properly. Then it's your own mistake, but the TCP spec surely
> told you that. Apparently several Jabber servers aren't using TCP according
> to its spec.

Yes, TCP will tell you when something went wrong, but it will not 
necessarily do so instantly. You cannot rely on people always closing 
sockets correctly; you also have to account for sockets that simply 
time out. Routers fail, DSL connections drop, leased lines drop, 
firewalls drop sockets without properly closing them, someone pulls out 
a network cable. I could go on, as there are many situations where you 
can end up with broken sockets. For more information, read the 
discussion about jep-ack.
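
As a rough illustration of why a successful send tells you very little, 
here is a minimal Python sketch. It assumes a local socketpair() as a 
stand-in for a real TCP connection (it does not reproduce a broken 
network path), and the stanza text is just a placeholder:

```python
import socket

# A connected pair of local stream sockets stands in for a TCP connection.
sender, receiver = socket.socketpair()

stanza = b"<presence/>"

# send() returns once the local kernel has accepted the bytes.
# The receiving application has not read anything at this point,
# so a successful return says nothing about delivery to the peer.
accepted = sender.send(stanza)

sender.close()
receiver.close()
```

The return value only counts bytes handed to the local stack, which is 
why the sender cannot treat it as a delivery receipt.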

> | As many people stated in response to your previous proposal the roster 
> | is certainly something that can get out of sync for various reasons, 
> 
> The primary reason being that the receiving server slashed down on your
> socket while you were sending your presence. As I found out, jabber.com
> and formerly jabber.org killed connections after only a few minutes -
> which makes loss of messages and presence etc. very likely. So if we
> fix these implementations soon, Jabber will discover true TCP reliability.

Unfortunately, servers slashing down sockets uncleanly is not the primary 
cause of stanza loss; the primary cause is broken TCP sockets that have 
not yet timed out. Please read the jep-ack discussion.

> | even if you manage to detect that there was a problem and subsequently 
> | reset the list, don't you think it would be better to have a protocol 
> | that can recover from errors just like TCP can (i.e. retransmits the 
> | lost packets) without having to start again from scratch?
> 
> No, because TCP already does that part for me. I have to deal with the
> case when TCP fails, and it's not a solution to put another TCP on top.
> 
> In the case of Jabber it may instead be a solution, should the developers
> of those faulty 4 implementations decide to keep it that way. Since they
> make XMPP unreliable, you have to re-invent TCP to obtain reliability.  ;-)

You have to add something on top of TCP, as TCP alone does not ensure 
reliability. Here's a link with more information that recently cropped 
up in the jep-ack discussion:

http://iang.org/ssl/reliable_connections_are_not.html

Some particularly relevant sections are the following:

"The guarantee is not complete. If I open a connection and write X data 
and then the connection drops, I do not have a guarantee that it got 
there - any, all or none! Consider that for TCP/IP if the amount of data 
is less than a full window length, and the connection closes without a 
positive acknowledgement then the sender is SOL ("somewhat out of luck")."

"The sender has no reliable way at the application layer to know how 
much data has been sent, partly because any failure may or may not 
overtake any acknowledgements, but also because fundamentally, there is 
no way for all those layers to actually pass back the reliability 
information of how many bytes have been sent, except in the narrow case 
of "I want this again." That is, the implementation of the guaranteed 
delivery is all oriented to the layer's needs and ultimately the 
receiver's needs but not to the sender's needs."

"For reliable applications you have to do it yourself. That means 
unfortunately layering a TCP-like protocol across the top of TCP. Boring 
and stupid but that's the price of reliability."

"But if you really truly need reliability (like we do in financial 
cryptography) you will probably find yourself adding extra reliability 
in at the higher layers. So consider how that effects the entire 
application - and make the app's needs drive your use of network 
protocols, not popular myths on reliability."

> | but even so something should be in place to account for this so that you 
> | can detect that something has been lost, you cannot rely on every point 
> | in the network doing the right thing and being bug free. What if there 
> | is a bug in a server that is causing stanzas to get randomly dropped? 
> 
> Then you get an error back, that something went wrong.
> At least you get to send back the queue of outgoing things, which
> makes it likely that involved people will find out something went wrong.

But the point is that you won't necessarily get an error back; read 
further down about broken sockets and jep-ack.

> | You really need something that works reliably even in those kind of 
> | situations, i.e. can detect if the list has gotten out of sync.
> 
> You cannot generalize that every application needs a TCP on top of TCP
> to be safer. And certainly not a light-weight temporary context list.

You can if it's important that both sides remain in sync, which in this 
case it is, as this protocol does not provide any mechanism to recover 
from problems without starting entirely from scratch. It also need not 
use much bandwidth at all: with an IQ-based protocol you can easily 
bunch all of your additions and deletions into a single stanza, which 
requires only a single tiny IQ result as an ack. This could actually end 
up using less bandwidth overall (or at worst no more) than your 
proposal. Here is an example:

request
---
<iq type="set" id="3433" to="multicast.receiverdomain" from="senderdomain">
	<list xmlns="http://jabber.org/protocol/multicast">
		<add>jid1@receiverdomain</add>
		<add>jid2@receiverdomain</add>
		<add>jid3@receiverdomain</add>
		<add>jid4@receiverdomain</add>
		<add>jid5@receiverdomain</add>
		<add>jid6@receiverdomain</add>
		<del>jid6@receiverdomain</del>
	</list>
</iq>

ack
---
<iq type="result" id="3433" to="senderdomain" 
from="multicast.receiverdomain" />
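
To make the bookkeeping behind the request and ack above concrete, here 
is a minimal Python sketch. The element names and namespace come from 
the example; the helper build_list_iq and the class PendingIqs are 
hypothetical names of mine, not part of any proposal or library:

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example stanza above.
NS = "http://jabber.org/protocol/multicast"


def build_list_iq(iq_id, sender, receiver, adds, dels):
    """Bundle all additions and deletions into one IQ set (hypothetical helper)."""
    iq = ET.Element("iq", {"type": "set", "id": iq_id,
                           "to": receiver, "from": sender})
    lst = ET.SubElement(iq, "list", {"xmlns": NS})
    for jid in adds:
        ET.SubElement(lst, "add").text = jid
    for jid in dels:
        ET.SubElement(lst, "del").text = jid
    return iq


class PendingIqs:
    """Keep each sent IQ until its result arrives, so it can be resent."""

    def __init__(self):
        self._pending = {}  # IQ id -> serialized stanza

    def sent(self, iq):
        self._pending[iq.get("id")] = ET.tostring(iq, encoding="unicode")

    def handle(self, xml_text):
        """Return True if xml_text is an IQ result acking a pending stanza."""
        reply = ET.fromstring(xml_text)
        if reply.tag == "iq" and reply.get("type") == "result":
            return self._pending.pop(reply.get("id"), None) is not None
        return False

    def unacked(self):
        return list(self._pending.values())


pending = PendingIqs()
request = build_list_iq("3433", "senderdomain", "multicast.receiverdomain",
                        ["jid1@receiverdomain", "jid6@receiverdomain"],
                        ["jid6@receiverdomain"])
pending.sent(request)
before_ack = len(pending.unacked())
acked = pending.handle('<iq type="result" id="3433" to="senderdomain" '
                       'from="multicast.receiverdomain" />')
after_ack = len(pending.unacked())
```

Anything still returned by unacked() after a reconnect is a candidate 
for retransmission, so the list never has to be rebuilt from scratch.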

> | Its not reinventing TCP as TCP only guarantees the ordering the data 
> | will be in once it reaches the other site, it doesn't guarantee the 
> | delivery of everything sent to the socket, and plus when dealing with 
> 
> It does, or it returns an error.

It will return an error, but it won't necessarily return it immediately. 
If a socket gets broken without being cleanly closed by the TCP stack, 
you can end up sending data to it for possibly 20 minutes or more (read 
the jep-ack discussion) before the TCP stack times out the socket and 
hands you an error. All the data that you sent to the socket in the 
meantime will then be completely lost.
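
An application-level deadline on acks can flag the problem long before 
the TCP stack does. The following Python sketch assumes a 30 second ack 
timeout picked purely for illustration; the class AckDeadlines and its 
method names are hypothetical, and timestamps are passed in explicitly 
to keep the example deterministic:

```python
class AckDeadlines:
    """Track when each stanza was sent and report those not acked in time."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.sent_at = {}  # stanza id -> time of sending, in seconds

    def record_send(self, stanza_id, now):
        self.sent_at[stanza_id] = now

    def record_ack(self, stanza_id):
        self.sent_at.pop(stanza_id, None)

    def overdue(self, now):
        """Ids of stanzas that have waited at least `timeout` seconds for an ack."""
        return sorted(i for i, t in self.sent_at.items()
                      if now - t >= self.timeout)


deadlines = AckDeadlines(timeout_seconds=30)  # assumed value, not from any spec
deadlines.record_send("a", now=0)
deadlines.record_send("b", now=10)
deadlines.record_ack("a")
early = deadlines.overdue(now=20)
late = deadlines.overdue(now=45)
```

Once a stanza shows up as overdue, the sender can treat the connection 
as suspect and queue the stanza for resending rather than waiting for 
the socket to time out.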

> | this kind of thing it goes outside of the boundary of the TCP connection 
> | and into the server and you have no control over what might happen 
> | there, this is how the jep-ack and related proposals work, they don't 
> | just rely on TCP as that's the whole root of the problem in the first 
> | place.
> 
> Don't put the blame on TCP. It is your style of slashing down sockets
> which is causing most of the reliability problems of Jabber.

No, it's not at all. The unclean closing of TCP connections when you had 
the opportunity to close them cleanly is just a bug, and is only a tiny 
cause of stanzas being lost. Please read the ack discussions in more 
depth, as what you are saying seems to indicate that you haven't read 
and understood them fully. In particular, read the material about broken 
TCP connections, which are by far the main cause of stanza loss.

> It is a fundamental design failure that XMPP doesn't clearly suggest a
> sane way how to close a socket.

It does though, just possibly not as explicitly as it should. When 
implementing, I managed to find the part about the connection-timeout 
stream error; it's just a bug if other servers aren't doing this as they 
should.

> | that? And what's wrong with having something de-coupled if you can? 
> | Please explain.
> 
> Yeah sure go ahead, de-couple one-to-many routing from the server core.
> I just doubt you can, or it will be useful that way, but that's just me.
> Only IM developers think one-to-many messaging can be an add-on feature.
> Anyway, this is of far lesser relevance than the TCP handling bug.

Well, JEP-0033 proves that you can easily de-couple one-to-many routing 
from the server core, but this isn't really an explanation of why we 
can't or shouldn't de-couple it if we want to. Could you actually 
explain your reasoning rather than just insulting us all?

(FYI, I would suggest for your own sake that in future, if you don't 
have an answer to something, you not go into a rant that ends up 
insulting this whole community; it's not exactly something that is 
likely to get people to take you very seriously.)

> | That's not a very good solution as it is potentially against the RFC as 
> | there is the possibility that in between a presence broadcast could 
> | sneak its way through before you deleted the directed presences, they 
> | need to not get there in the first place.
> 
> Oh, you mean when writing the two stanzas in one write() operation to
> the socket, there is a realistic chance something else may come between.
> I can see you are really familiar with TCP technology.  :-)

Yes, I am familiar with TCP technology, but I am also familiar with the 
architecture of XMPP implementations. The majority of XMPP 
implementations I've come across de-couple the session management code 
(i.e. the processing of the stanzas) from the sockets, which increases 
scalability by allowing the session manager to run on a separate 
physical machine from the one the socket is connected to. So it is very 
possible that stanzas could sneak in in between.
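
A toy Python model of that architecture shows the interleaving. It 
assumes a session manager that takes one stanza at a time from each 
connection in round-robin order; that scheduling policy and the function 
name session_manager_order are my own assumptions for illustration, and 
real servers schedule differently:

```python
from collections import deque


def session_manager_order(connections):
    """Merge per-connection stanza queues one stanza at a time, round-robin."""
    queues = [deque(stanzas) for stanzas in connections]
    order = []
    while any(queues):
        for queue in queues:
            if queue:
                order.append(queue.popleft())
    return order


# Connection 0 wrote two stanzas in a single write(); connection 1 wrote one.
order = session_manager_order([
    ["directed-presence", "delete-directed-presence"],
    ["broadcast-presence"],
])
```

Even though the first two stanzas left the sender back to back, the 
broadcast lands between them once processing is separated from the 
socket.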

> | No, you can't use the fact that the presences are going to a resource or 
> | a bare JID to determine if something is directed presence, that would 
> | break the specification as I haven't seen any restrictions of that sort 
> | in there meaning directed presence can be to a base JID or a resource.
> 
> Ok, so it has to be something else. Now that with the TCP fix rosters
> are going to be a lot more reliable, we can use the recipients roster
> to figure out if he's only getting directed presence. I mean, it's a
> borderline case that someone would hack his roster in order to receive
> follow-up presence on a server with other people getting it *and* talked
> the sender into sending him directed presence first.

Your suggestion to close sockets cleanly where possible is good, but it 
is very unlikely to make any difference to the reliability of rosters. 
The primary reason they get out of sync is that servers do not retry 
sending stanzas that don't get through the first time. Stanza loss due 
to connections not being closed cleanly when they could have been is 
very unlikely; the majority of stanza loss happens on connections that 
have been lost and only eventually time out (which can take quite a long 
time in some TCP implementations). When this happens, all of the stanzas 
sent to the socket after it was lost and before it actually timed out 
disappear into a black hole (for more information, have a further look 
at the jep-ack discussion).

> | Only the addressing information, that's very different from re-writing 
> | the content of a message potentially changing what it means once it 
> | reaches the other side
> 
> I could understand that argumentation if you were using a protocol capable
> of framing. Then you would route the message contents without looking at
> it. But you are routing XML, and you are forced to parse every packet
> anyway. Looking into the DOM at that point is trivial, there is no efficiency
> gain in forbidding applications from doing that. PSYC is a framing capable
> protocol, so PSYC routers indeed aren't permitted to change the content of
> the packet, because they aren't even parsing it, but hey, that's just PSYC.
> In Jabber this kind of logic is futile.

It's not a matter of how easy it is to change the content of a stanza, 
it's a matter of whether doing so is desirable. Allowing the content of 
a stanza to be changed could alter its original meaning by the time it 
reaches the destination, causing it to be processed differently than it 
would have been had it not been altered. If you are only changing the 
addressing information, the meaning of the stanza does not change at 
all.

> | enough that you can honestly and truly say the jabber network now has 
> | reliable delivery at every point as it requires every point in the 
> | network to have implemented this enhancement. So until then whatever 
> 
> Yes, any routing improvement for Jabber will have to wait until
> this absurd TCP bug is fixed. This makes all of our discussion on
> reliability irrelevant, as there is no data or experience on how Jabber
> operates once the bug is fixed.

As detailed above, stopping the loss of stanzas is far more difficult 
than simply ensuring sockets are closed cleanly when possible. Please 
read the jep-ack discussion from a few weeks ago for more details.

> Luckily our proto-JEP requires negotiation, so we can add another
> requirement, that the TCP handling needs to be fixed before this
> JEP can be used. Simple as that.   ;)

You will also need to require a jep-ack type protocol in addition to the 
fixes for closing sockets cleanly. But yes, although I am not sure what 
you will think of jep-ack and the related proposals, as they all require 
acking the traffic you send over the TCP socket, which you seem to hate 
the idea of for some reason.

Richard



