[standards-jig] JNG Ramblings.

Iain Shigeoka iain.shigeoka at messaginglogic.com
Mon Aug 12 18:05:07 UTC 2002


If you're going binary, why not provide an extensible binary format?  How's
this for a minor header format change for your suggested structure:

The seven least significant bits of the first byte in the header define the
total message length, EXCEPT when the most significant bit is set.  In that
case, the seven least significant bits give the size (in octets) of the
integer that follows and describes the packet size.  This allows an integer
of up to 127 bytes to describe the packet size (I don't see that _ever_ being
practically exceeded).

If you want, you can always hard-code an escape that selects a 32-bit or
64-bit integer for the payload size, if that tickles your fancy and helps to
simplify packet formation.  In the normal case, I think many individual Jabber
packets will fit within 127 bytes, so their length takes a single header
byte.  Certainly the majority of IM and presence traffic will have lengths
that fit in 16 bits, which can be headered in 3 bytes (one escape byte plus a
two-byte length), still saving data over the fixed 4 bytes.
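
To make the header scheme above concrete, here is a minimal sketch in Python.
The function names are hypothetical, and two details are assumptions because
the proposal leaves them open: the multi-byte length is written big-endian
(network order), and the length counts only the bytes that follow the header.

```python
from typing import Tuple


def encode_length(n: int) -> bytes:
    """Encode a packet length using the escape-bit header described above."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        # High bit clear: the low seven bits are the length itself.
        return bytes([n])
    size = (n.bit_length() + 7) // 8  # octets needed to hold n
    if size > 0x7F:
        raise ValueError("length needs more than 127 octets")
    # High bit set: the low seven bits give the size of the integer that follows.
    # Big-endian byte order is an assumption, not part of the proposal.
    return bytes([0x80 | size]) + n.to_bytes(size, "big")


def decode_length(buf: bytes) -> Tuple[int, int]:
    """Return (length, header_bytes_consumed) for a header at the start of buf."""
    if not buf:
        raise ValueError("empty buffer")
    first = buf[0]
    if first & 0x80 == 0:
        return first, 1
    size = first & 0x7F
    if len(buf) < 1 + size:
        raise ValueError("truncated length field")
    return int.from_bytes(buf[1:1 + size], "big"), 1 + size
```

With this sketch, a 100-byte packet gets a one-byte header and a 300-byte
packet gets a three-byte header, against four bytes for a fixed 32-bit length.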

A similar scheme could be used to escape out the part type and part length
header fields.  You'll have variable-length headers, but it lets you read in
a byte and then know exactly what you need to read in next, so it's
FSM-friendly.
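
As a rough illustration of the FSM-friendly property, the following Python
sketch consumes a length header one byte at a time, and after each byte it
knows how many more bytes to expect.  The class name, the state names, and the
big-endian byte order are assumptions made for the example.

```python
class LengthHeaderReader:
    """Byte-at-a-time reader for an escape-bit length header (illustrative)."""

    def __init__(self) -> None:
        self.state = "FIRST"   # FIRST -> (LENGTH ->) DONE
        self.remaining = 0     # length octets still expected
        self.value = 0         # decoded length so far

    def feed(self, byte: int) -> bool:
        """Consume one byte; return True once the length is fully decoded."""
        if self.state == "DONE":
            raise ValueError("header already complete")
        if self.state == "FIRST":
            if byte & 0x80:
                # Escape: low seven bits say how many length octets follow.
                self.remaining = byte & 0x7F
                self.state = "LENGTH" if self.remaining else "DONE"
            else:
                # Short form: the byte is the length.
                self.value = byte
                self.state = "DONE"
        else:  # state == "LENGTH"
            # Big-endian accumulation (an assumption about byte order).
            self.value = (self.value << 8) | byte
            self.remaining -= 1
            if self.remaining == 0:
                self.state = "DONE"
        return self.state == "DONE"
```

The same three-state shape would apply to the part type and part length
fields if they used the same escape bit.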

I think the added extensibility would be well worth the added complexity and
won't affect performance much (although it will make assembly language
register tricks harder).

One concern I have is that there is no way to interrupt or interleave
parallel data with a particularly large packet.  I guess multiple
connections could be used (although I'm not entirely sure which is a worse
resource drain on the server, more connections or having to track parallel
packet streams on the same connection).  With Jabber's persistent connection
design, it would seem that connections would be the bottleneck over server
packet processing...


PS - partly unrelated, but if we're going binary, I wonder if it wouldn't be
prudent to integrate addressing into the headers.  Perhaps some address
dictionary so that you could establish a per session mapping of routing ID
numbers to particular Jabber nodes/resources.  That way routing and
processing are completely separate activities and the router doesn't have to
understand XML.
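
One way the address dictionary idea might look, as a minimal Python sketch:
each session keeps a table that assigns a small integer routing ID to every
address it sees, so that a header could carry the ID rather than the full
address.  The class, its methods, and the example addresses are all
hypothetical, and how the two ends of a session would agree on the table is
not covered here.

```python
class SessionAddressTable:
    """Per-session mapping between routing IDs and Jabber addresses (illustrative)."""

    def __init__(self) -> None:
        self._id_by_address = {}   # address string -> routing ID
        self._address_by_id = []   # routing ID -> address string

    def register(self, address: str) -> int:
        """Return the routing ID for an address, assigning a new one if needed."""
        existing = self._id_by_address.get(address)
        if existing is not None:
            return existing
        routing_id = len(self._address_by_id)
        self._address_by_id.append(address)
        self._id_by_address[address] = routing_id
        return routing_id

    def resolve(self, routing_id: int) -> str:
        """Look up the address for a routing ID taken from a packet header."""
        if not 0 <= routing_id < len(self._address_by_id):
            raise KeyError(routing_id)
        return self._address_by_id[routing_id]
```

A router using such a table would only need the integer from the header to
pick a destination, leaving the XML payload unparsed.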
