[standards-jig] JNG Ramblings.

David Waite mass at akuma.org
Sat Aug 10 05:50:41 UTC 2002


Mike Lin wrote:

>>It wasn't the protocol.  It was some lameness in the implementation of
>>the XML stuff in .Net.  Basically, even though you are pull parsing,
>>you can't ask the parser if there is anything left pending to parse.
>>    
>>
>
>It wouldn't have mattered if you knew ahead of time how many bytes to
>parse. We should not be bound by the lameness of our XML stuff. Any XML
>parser is perfectly capable of parsing a fixed number of bytes.
>
>It's the protocol's fault because it forces us to use XML processing
>suites in ways they were never designed to be used. And this isn't Bob
>Schmuck's XML parser that had to be hacked around. This is the Microsoft
>.NET CLR, which uses XML pervasively.
>
XML is a _document_ markup language. The way in which jabber uses it is 
_extremely_ convoluted, because rather than considering a message as a 
document, it considers the entire session as a document. Nothing 
requires an XML parser to inform the application of _anything_ until the 
document has been completely parsed, or a formatting error has occurred 
(non-wellformedness, schema violation).  As long as our messages are a 
subset of an XML document, only a subset of tools will support the 
protocol.
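
To make the distinction concrete, here is a minimal sketch in Python (not 
the C# under discussion; the names `stanzas`, `whole_document_ok` and 
`session` are made up for illustration). A whole-document parse cannot 
succeed while the session is still open, because the root element is never 
closed. An incremental parser can hand over each child of the root as soon 
as it closes, but only because that particular parser chooses to report 
events early.

```python
import xml.etree.ElementTree as ET


def whole_document_ok(text):
    """Return True only if text parses as one complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def stanzas(chunks):
    """Yield each direct child of the stream root as soon as it closes."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # a child of the root just closed
                    yield elem


# Hypothetical session data: the root <stream> is still open, and the
# network has split the bytes in the middle of a tag.
session = ["<stream><message><bo", "dy>hi</body></message>", "<presence/>"]

if __name__ == "__main__":
    print(whole_document_ok("".join(session)))
    print([elem.tag for elem in stanzas(session)])
```

A tool that only offers the first style of API has nothing to work with 
until the session ends.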

>>In case anyone is interested, the work-around I used was to break up
>>the input on each '>'.  I didn't need to do angle-bracket counting, I
>>didn't need a framing protocol, and it didn't add much overhead at
>>all.
>>    
>>
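
For readers following along, a minimal sketch of the work-around quoted 
above, in Python rather than C#. It is an assumption about the general 
idea, not the actual Jabber-Net code, and `split_on_gt` is a made-up name. 
Incoming data is buffered and released to the parser only in pieces that 
end on a '>', so the parser is never handed a half-finished tag.

```python
def split_on_gt(chunks):
    """Re-chunk incoming text so that every piece ends with '>'."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        while ">" in buf:
            piece, buf = buf.split(">", 1)
            yield piece + ">"
    # Anything left in buf has no '>' yet and stays unparsed.


if __name__ == "__main__":
    for piece in split_on_gt(["<a><b", ">x</b", "></a>"]):
        print(piece)
```

Character data with no '>' after it stays in the buffer until the next 
tag arrives, which is acceptable when every message ends with a closing 
tag.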
>What about the part where you disassembled a class from the CLR and
>translated it back to C#? That's a particularly impressive technical
>feat, of course, but it should not have been necessary.
>
That actually was done for two reasons:

1. When the initial code for jabber was written, there was a bug in a 
private class in the Framework, which required that modification
2. Jabber[.-]Net returns subclasses of various XmlElement types in order 
to include additional accessors, such as message.Body

It could however be argued that the bug wouldn't have shown up if the 
protocol used full documents, since it was apparently a caching issue.

Just as an aside, it isn't uncommon to have to disassemble classes from 
the framework. There are sections of the framework which pretty much 
require reverse-engineering due to lack of documentation (prime examples 
being System.Xml.XPath and System.Xml.Xsl). Luckily, I have been able to 
escape reimplementing classes in the above two up to this point.

-David Waite



