[standards-jig] JNG Ramblings.
mikelin at MIT.EDU
Tue Aug 13 02:37:54 UTC 2002
> The seven least significant bits of the first byte in the header define the
> total message length EXCEPT if the most significant bit is set. In that
> case, the seven least significant bits indicate the size (in octets) of the
> integer that will describe the packet size. This gives you up to a 127-byte
> integer to describe the packet size (I don't see that _ever_ being
> practically exceeded).
It might not be a terrible idea (high praise from me :-), although it
would approximately double the number of states in a protocol framer
state machine, and complicate the way the headers are decoded. I'll
consider it, but I personally am very fond of having all the information
you need sitting in a CPU register.
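
For concreteness, here is a minimal sketch of how the length scheme
quoted above might be encoded and decoded. The function names and the
big-endian byte order are assumptions; neither is specified anywhere in
this thread.

```python
def encode_length(length: int) -> bytes:
    """Encode a message length using the quoted scheme (big-endian assumed)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 0x80:
        # Short form: MSB clear, low seven bits carry the length itself.
        return bytes([length])
    # Long form: MSB set, low seven bits give the size in octets of the
    # integer that follows and holds the actual length.
    size = (length.bit_length() + 7) // 8
    if size > 0x7F:
        raise ValueError("length does not fit in a 127-octet integer")
    return bytes([0x80 | size]) + length.to_bytes(size, "big")


def decode_length(buf: bytes):
    """Return (length, header_size), or None if more bytes are needed."""
    if not buf:
        return None
    first = buf[0]
    if first & 0x80 == 0:
        # Short form: everything needed is in the first byte.
        return first, 1
    size = first & 0x7F
    if len(buf) < 1 + size:
        # Long form, but the length integer has not fully arrived yet.
        return None
    return int.from_bytes(buf[1:1 + size], "big"), 1 + size
```

The `None` return in the long form is the extra "still waiting for
length bytes" condition a framer would have to track, which is where the
additional state-machine complexity comes from.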
> One concern I have is that there is no way to interrupt or interleave
> parallel data with a particularly large packet. I guess multiple
> connections could be used (although I'm not entirely sure which is a worse
> resource drain on the server, more connections or having to track parallel
> packet streams on the same connection). With Jabber's persistent connection
> design, it would seem that connections would be the bottleneck over server
> packet processing...
Well, one thing that Joe very helpfully pointed out to me is that
connection management and some XML processing are things that can be
done at the edge (re: multiplexed jpolld's), and thus have the load
distributed to multiple machines, whereas some tasks can only be done in
the singleton server core. But in any case I don't think it is fair for
either of us (lacking the firsthand experience) to say what causes the
first scalability bottleneck in the server.
The alternative, doing asynchrony in-band the way BEEP does, strikes me
as too much work. TCP is designed to provide these services, so I'm in
favor of using it.
> PS - partly unrelated, but if we're going binary, I wonder if it wouldn't be
> prudent to integrate addressing into the headers. Perhaps some address
> dictionary so that you could establish a per session mapping of routing ID
> numbers to particular Jabber nodes/resources. That way routing and
> processing are completely separate activities and the router doesn't have to
> understand XML.
Yes, this is something we've been toying with: where we now have only
"Unspecified" and "XML" payload types, we could roll in a few specific
ones, such as "Destination Address GUID", thus greatly reducing the need
for the router to parse XML. There are a lot of open questions about
doing this properly, however. For the purposes of addressing, it seems
like it would be useful to keep XML extensibility, particularly if we
are going to be doing things like store-and-forward. So while this is an
idea that may be worth looking into, it is not something I'm going to
take on in the proof-of-concept phase.
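
As a rough illustration of the per-session address dictionary suggested
in the quoted PS, something like the following could map small routing
IDs to full addresses. The class name, the method names, and the use of
sequential integer IDs are all hypothetical; nothing here is part of any
proposed wire format.

```python
class SessionRouteTable:
    """Hypothetical per-session map between routing IDs and addresses."""

    def __init__(self):
        self._addr_by_id = {}
        self._id_by_addr = {}
        self._next_id = 1  # assumption: IDs are handed out sequentially

    def register(self, address: str) -> int:
        """Return the routing ID for an address, allocating one if new."""
        if address in self._id_by_addr:
            return self._id_by_addr[address]
        routing_id = self._next_id
        self._next_id += 1
        self._addr_by_id[routing_id] = address
        self._id_by_addr[address] = routing_id
        return routing_id

    def resolve(self, routing_id: int) -> str:
        """Look up the full address for a routing ID (KeyError if unknown)."""
        return self._addr_by_id[routing_id]
```

With a table like this, a router could forward on the integer ID alone
and never look inside the payload. It does not address the extensibility
and store-and-forward questions raised above.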
Lastly, as a more general comment, although I've been coming on strong,
and have been testy at points, I'm really glad that we're finally having
this argument - it's a discussion we've needed to have for a long time.