[Standards-JIG] XMPP bandwidth compression

Bob Wyman bob at wyman.us
Thu Jul 1 15:13:16 UTC 2004


Fabrice Desré wrote:
> You cant have a dynamic table-based compression 
> scheme for XML that is not schema aware.
	I think there is a typo here or something lost in the translation or
double-negative... I believe this should say: "You CAN have a dynamic
table-based compression scheme for XML that is NOT schema aware." 
	Fast Infoset provides precisely such a system. While it is
"XML-aware" it is not "aware" of any application-specific schemas. (Note: I
realize that some purists will probably say that Fast Infoset is "aware" of
the one, single base schema for all XML objects -- i.e. the Infoset.
However, that isn't what people usually mean when they say "schema aware".)
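
	To make the idea concrete, here is a toy sketch in Python of a dynamic
table built purely from the names seen so far in the stream, with no schema
supplied in advance. This is NOT the actual Fast Infoset wire format; the
function names and the ("lit"/"ref") token shapes are hypothetical and exist
only to illustrate the principle.

```python
def encode_names(names):
    """Toy dynamic-table encoder (illustrative only, not Fast Infoset)."""
    table = {}   # name -> index, built up while encoding; no schema needed
    tokens = []
    for name in names:
        if name in table:
            # Name seen before: emit a small index instead of the string.
            tokens.append(("ref", table[name]))
        else:
            # First occurrence: emit the literal and remember it.
            table[name] = len(table)
            tokens.append(("lit", name))
    return tokens


def decode_names(tokens):
    """Rebuild the same table on the receiving side, in the same order."""
    table = []
    names = []
    for kind, value in tokens:
        if kind == "lit":
            table.append(value)
            names.append(value)
        else:
            names.append(table[value])
    return names


# Element names as they might appear in a stream of stanzas.
stream = ["message", "body", "message", "body", "message", "thread"]
tokens = encode_names(stream)
roundtrip = decode_names(tokens)
```

	Both sides build the table from the stream itself, so repeated names
shrink to indexes without either side knowing any application schema.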

Ralph Heijer wrote:
> So, essentially just like gzip.
	No. It isn't "like" gzip. Gzip works at the character level while
Fast Infoset is aware of XML and uses its knowledge of XML to provide both
good compression and fast encoding/decoding. There are significant
advantages to the Fast Infoset approach -- for instance, it should allow
people to do "streaming" (i.e., you can start decoding the front of a message
before you have read or buffered the whole thing), which gzip won't support.
This means that it will be friendlier on space-constrained systems.
 
	One of the interesting applications for Fast Infoset will actually
be in a case where it is *not* doing compression! For instance, if you are
trying to pass binary data over XML (like voice, images, etc.) you are
forced to do base64 conversions or some such thing to turn the binary into
text. However, base64 turns every 3 bytes into 4 characters, so it increases
the size of the binary data by about a third. Well, if you were encoding the
same data using Fast Infoset, what you would do is insert the binary data
directly rather than converting it to text. The result is, of course, a
smaller message than the base64 version as well as faster encoding/decoding
times on the data.
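
	As a rough illustration of the base64 cost, the following Python
sketch measures the size of a binary payload before and after encoding. The
3000-byte random payload is just a stand-in for voice or image data, and the
variable names are my own.

```python
import base64
import os

# Stand-in for binary content such as voice or image data.
payload = os.urandom(3000)

# What you are forced to do to carry the bytes as XML text.
encoded = base64.b64encode(payload)

raw_size = len(payload)
b64_size = len(encoded)

# Fractional growth caused by the text encoding (4 chars per 3 bytes).
overhead = b64_size / raw_size - 1

print(f"raw: {raw_size} bytes, base64: {b64_size} bytes, "
      f"overhead: {overhead:.1%}")
```

	A binary-capable encoding that carries the payload as-is avoids that
growth, and also skips the encode and decode passes.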

		bob wyman




