[Standards] s2s and gracelessly broken streams

Dave Cridland dave at cridland.net
Wed Apr 4 08:24:07 CDT 2007


Bear with me, this goes back on topic toward the end. :-)

On Tue Apr  3 17:50:16 2007, Chris Mullins wrote:
> Dave Cridland Wrote:
> > I'm curious as to where you get these figures from. Aside from
> > anything else, your 100k figure appears to contradict your 65k
> > figure above.
> 
> Hours and hours of network testing to figure out the practical limits
> of "how many connections can a server handle". We're lucky enough to
> have partners (HP, AMD) who give us access to labs with very big
> machines to test on. We've been able to test on a wide variety of
> hardware (x86, x64, IA64) and Windows versions. Some of our tests were
> pure socket tests, others were using our Server.
> The 100k figure requires additional IP Addresses.
> 
> 
Hmmm... I think I see. It certainly requires additional addresses in 
a testing environment. Extra TCP addresses (ie, port numbers) or extra 
IP addresses at either end would be fine for this. There's a clear 
theoretical limit on the number of TCP connections from a fixed IP 
address to a fixed service on a single remote IP address, because the 
only variable part of the TCP connection identifier is then the 
source port. That's the situation you'd typically get in a test 
environment.

For many operating systems, the limit is substantially lower than 
65536 - on Linux, it seems to be 28227.
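
To make the arithmetic concrete, here is a minimal sketch that counts 
the source ports available in an ephemeral port range. The range 
"32768 61000" is an assumed example, not a measurement from this 
thread. On Linux the configured range can be read from 
/proc/sys/net/ipv4/ip_local_port_range, and ports already in use 
reduce the count further.

```python
def ephemeral_port_count(port_range: str) -> int:
    """Count the ports in an inclusive 'low high' range string."""
    low, high = (int(part) for part in port_range.split())
    return high - low + 1


# Assumed example range; substitute the contents of
# /proc/sys/net/ipv4/ip_local_port_range on a real Linux host.
print(ephemeral_port_count("32768 61000"))
```

With one source IP, one destination IP and one destination port, 
that count is the ceiling on simultaneous connections.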

In the real world, this situation simply doesn't arise. In the lab, 
yes, you'd have to arrange for multiple addresses somewhere to avoid 
it.

FWIW, I did some testing under Linux, and found that I could open 
99997 non-blocking sockets very quickly onto a dual Pent-III with 1G 
of RAM - hardly a powerhouse - using multiple destination ports. The 
limit was the compiled-in limit on the number of open file 
descriptors, and there was no other apparent impending exhaustion, as 
far as I could tell. (I wasn't pointing these at an XMPP server, but 
an MSA, an IMAP server, a POP3 server, and an LDAP DSA, by the way, 
in roughly equal measure. All the services survived fine, but that 
doesn't tell me much of interest.)
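
A rough sketch of that kind of test, for anyone wanting to repeat it. 
This is not the code I ran; the function names are made up for 
illustration, it assumes a Unix-like system (the resource module) and 
IPv4 literal addresses, and it stops as soon as socket creation fails, 
which is typically the file descriptor limit.

```python
import resource
import socket


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_nonblocking(targets):
    """Start a non-blocking connect to each (ip, port); return the sockets."""
    socks = []
    for ip, port in targets:
        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        except OSError:
            break  # typically EMFILE: descriptor limit reached
        s.setblocking(False)
        # Returns immediately with an errno (usually EINPROGRESS)
        # instead of waiting for the handshake to finish.
        s.connect_ex((ip, port))
        socks.append(s)
    return socks
```

Spreading the targets over several destination ports, as above, keeps 
the source port range from being the limiting factor.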

> It depends on a number of factors. For example, in Windows (x86), the
> limitations are imposed by the size of the Non-Paged Memory Pool deep
> in the Windows Kernel. Each async socket that's opened reserves a
> little bit of this memory, and when this memory is exhausted, you're
> out of luck. You can get around this by NOT doing async sockets, but
> this has other limitations and drawbacks.
> 
And again, I see - Windows is somewhat notorious for having a very 
costly async socket. I'd have to say that I don't consider running 
large-scale services on Windows to be terribly practical, but that's 
a whole other religious war. :-)

> > It's quite practical to have every outgoing connection use the
> > same source port number on the same source IP address - some
> > protocols, like FTP, even mandate this.
> 
> I don't know as I've ever done this, but you're right - it's
> certainly possible.
> 
Of course, for incoming connections, you're using the same port 
number all the time at your end.


> In all the network programming I've been a part of, letting the O/S
> choose the outgoing port has been what we've done. I sometimes specify
> the local address from which to send, but even then I try to let the
> O/S worry about this, as it has a lot more knowledge (route tables,
> etc) than I've got.
> 
Absolutely, I agree with everything you've written here. I didn't 
pick my source address or port number at all in my testing. The only 
occasion where you do is when the protocol mandates it. This is 
exceedingly rare - I think it's only FTP that does this in the IETF 
protocols now.


> Many high level programming Interfaces won't let you do this, so I'm
> not sure how practical it is.

No, I disagree, most will. You need to set the SO_REUSEADDR socket 
option and then bind() to the source port before the connect() call, 
that's all.

For C, that's just setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, ...), 
for Java it's sock.setReuseAddress(true), for Python it's 
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1), etc. In 
other words, it varies, but it's almost always there in any language 
where the intent is to write networking code, if for no other reason 
than it's pretty much vital to do this for listening sockets.
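
Put together, a minimal Python sketch of an outgoing connection from 
a fixed source port looks something like this. The function name and 
parameters are mine, purely for illustration, and how far the 
operating system lets several sockets share one local port varies by 
platform.

```python
import socket


def connect_from_port(source_port, dest, source_ip=""):
    """Connect to dest from a fixed local port ("" means any address)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the option before bind(); afterwards it is too late to
    # affect how the local address is claimed.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((source_ip, source_port))
    s.connect(dest)
    return s
```

The connection is then identified by a source port you chose rather 
than one the O/S picked, which is what FTP needs for its data 
connections.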

So in summary, there is no inherent limit on the number of TCP 
connections worth worrying about - there's only a limit on the number 
of async connections you can have on Windows (because it's broken), 
or a limit on the number of connections the application (in this case 
a server) is allowed to make.

Moreover, as Tony Finch said, I doubt that even the operating system 
limits are likely to be an issue for s2s connections in most servers. 
I don't doubt that for popular public servers, including Jabber.org, 
things might be very different, but if Cambridge University only 
contacts around 5,000 unique external ADMDs in a given 24 hour 
period, I'd be inclined to assume that even Jabber.org's server is 
likely to be in the same order of magnitude.

Now for s2s connections, I'd question the wisdom of closing them when 
idle. The primary cost of keeping such a connection open is going to 
be twofold:

1) The memory required to support the connection data.

2) The lookup time in internal stanza routing tables.

In the case of the first, there shouldn't be too much memory used. In 
the case of the second, I'd be very curious as to how many 
connections you save by killing off idle connections. This being a 
(hopefully) log(n) lookup, you'd have to drop the vast majority of 
your connections to halve the lookup time. I don't see this as being 
a good trade-off, but it depends on the figures of course.
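
To put a number on "the majority": if the lookup cost is taken to be 
exactly proportional to log(n), which is an idealisation that ignores 
constant overheads, then halving it means shrinking the table from n 
entries to sqrt(n). A small sketch, with function names invented for 
the purpose:

```python
import math


def connections_for_half_lookup(n):
    """Table size m with log(m) == log(n) / 2, that is m == sqrt(n)."""
    return math.sqrt(n)


def fraction_to_drop(n):
    """Fraction of n connections to drop to halve a log(n) lookup."""
    return 1 - math.sqrt(n) / n
```

For 5,000 connections that leaves about 71 of them, so you'd be 
dropping well over 98% to get the lookup down to half.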

Finally, I'd be wanting to see what kinds of behaviour you actually 
get - do the remote servers reconnect quite quickly, for instance? If 
so, then as JD Conley touched on, the saving you make by dropping the 
connection might well be vastly outweighed by the increased 
connection cost. It depends on the cost of maintaining the connection 
over time as compared to the cost of re-establishing it, and on the 
frequency with which it's re-established.
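
The comparison itself is trivial once you have measurements. As a 
sketch only, with an invented function name and costs in whatever 
single unit you can actually measure (CPU time, say), none of which 
come from real data:

```python
def keep_open_is_cheaper(hold_cost_per_sec, idle_secs, reconnect_cost):
    """True if holding an idle connection costs less than re-opening it."""
    return hold_cost_per_sec * idle_secs < reconnect_cost
```

The hard part is filling in the three arguments from a live server, 
not the inequality.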

The best reason for dropping idle connections is when you have reason 
to suspect that the network layer has been severed. In this case, I'm 
of the opinion that the better idea is to periodically probe the 
connection using something like XEP-0198 or XEP-0199. Again, this is 
a cost/benefit trade-off, and figures rather than speculation are 
needed.
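
For illustration, a probe along the lines of XEP-0199 is just an 
empty IQ-get. The sketch below builds one as a string. It assumes the 
urn:xmpp:ping namespace from the published XEP, and the function name 
and the way the id is supplied are my own choices rather than 
anything the spec mandates.

```python
from xml.sax.saxutils import quoteattr


def ping_stanza(from_domain, to_domain, stanza_id):
    """Build an XEP-0199 style ping request as a string."""
    return (
        "<iq from=%s to=%s id=%s type='get'>"
        "<ping xmlns='urn:xmpp:ping'/></iq>"
        % (quoteattr(from_domain), quoteattr(to_domain), quoteattr(stanza_id))
    )
```

Any reply with a matching id, whether a result or an error, shows the 
connection is still alive. No reply within some timeout is the reason 
to suspect it has been severed.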

By the way, as far as I can determine, the "idle connection timeouts" 
in other long-running protocols such as IMAP mostly date from a time 
when one extra TCP connection actually had a substantial cost. These 
days they're partly there to prevent blackholing by particularly 
unpleasant NATs, but more because it's too difficult to remove them 
now.

You'll note that I'm not offering much in the way of answers and 
solutions here - I don't believe that we're in a situation to provide 
those yet.

Dave.
-- 
Dave Cridland - mailto:dave at cridland.net - xmpp:dwd at jabber.org
  - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
  - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade

