[Standards-JIG] Pre-Proto XEP - Karma

Justin Karneges justin-keyword-jabber.093179 at affinix.com
Fri Jan 5 19:48:48 UTC 2007


On Friday 05 January 2007 5:59 am, Pedro Melo wrote:
> On top of the other two I've already sent, there are also limits at
> the XML parser that should be considered:
>
>   - max node name size: sending <screeeeeeeeeee(insert enormous
> amounts of e's here)eeam> is probably going to kill your XML parser;
>   - max number of node attributes;
>   - max attribute name and attribute value sizes;
>   - max size for char sequences between elements.
>
> these should make sure that you at least receive a SAX event
> before exhausting your memory.
>
> FYI, I don't know of any XML parser that implements this.

I'm not aware of any such parser either.  It is hard enough finding parsers
that are tolerant of byte-by-byte input for network use. :)

However, it should be relatively easy to get the effect you want without 
modifying the parser.  You probably already have code that reads from a 
network socket and passes this data to a SAX parser.  Simply count the bytes 
you read, and reset the counter whenever you receive a SAX event from your 
parser.  If the counter gets really large, you kill the connection.
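
As a rough illustration of the counting idea, here is a minimal Python sketch.
It assumes the standard-library expat bindings stand in for "your SAX parser";
the names SaxByteGuard, GapExceeded, and max_gap are made up for this example.
The count is approximate: bytes that arrive in the same chunk after the last
event are not counted, so the real gap can exceed the limit by up to one chunk.

```python
import xml.parsers.expat


class GapExceeded(Exception):
    """Raised when too many bytes arrive without any parser event."""


class SaxByteGuard:
    """Feeds network data to an incremental parser and counts the
    bytes received since the parser last reported an event."""

    def __init__(self, max_gap=65536):
        self.max_gap = max_gap  # hypothetical limit, tune to taste
        self.pending = 0        # bytes fed since the last event
        self.events = []        # recorded only for demonstration
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self._on_start
        self._parser.EndElementHandler = self._on_end
        self._parser.CharacterDataHandler = self._on_text

    def _on_start(self, name, attrs):
        self.pending = 0
        self.events.append(("start", name))

    def _on_end(self, name):
        self.pending = 0
        self.events.append(("end", name))

    def _on_text(self, data):
        self.pending = 0
        self.events.append(("text", data))

    def feed(self, chunk):
        # Count first; any handler that fires during Parse() resets it.
        self.pending += len(chunk)
        self._parser.Parse(chunk, False)
        if self.pending > self.max_gap:
            raise GapExceeded(
                "%d bytes without a parser event" % self.pending)
```

The socket loop would call feed() with each chunk it reads and close the
connection when GapExceeded is raised.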

A very advanced parser might be able to start ignoring data in an attempt 
to "skip over" a stanza.  For example, if "screeeeeeeeeam" gets too long, 
then the parser goes into ignore-mode and only cares about finding the next 
whitespace character.  However, you'll have trouble reading the closing tag 
and matching it...

Since an "ignoring" parser is not practical (and perhaps not possible either), 
I don't think there's a need to be so granular in the sizes of various XML 
bits.  A simple max byte size between SAX events is plenty, and you 
disconnect if a violation is detected.

-Justin


