[Standards] Addressing Security Concerns in XEP-0115 Entity Capabilities

Waqas Hussain waqas20 at gmail.com
Wed Sep 14 19:29:21 UTC 2011

On Tue, Sep 13, 2011 at 2:22 AM, Peter Saint-Andre <stpeter at stpeter.im> wrote:
> On 9/7/11 8:51 PM, Peter Saint-Andre wrote:
>> On 9/7/11 2:33 PM, Joe Hildebrand wrote:
>>> On 9/5/11 6:39 AM, "Dave Cridland" <dave at cridland.net> wrote:
>>>> Of course, it may be simplest just to bite the bullet and switch hash
>>>> algorithm - or even change the 'hash' attribute name - because then
>>>> it'll get treated as a pre-1.4 caps by the vast majority of entities
>>>> and everything will happen right (or at least, no worse than it often
>>>> does today anyway).
>>> A bunch of our software already assumes that if you're doing old caps, you
>>> don't have any caps we care about.
>>>> I'm gradually leaning toward this, because although it's *quite*
>>>> violent, the downside is not impossible.
>>>> BTW, anyone any idea what happens if you include more than one <c/>
>>>> in a presence, in practical terms?
>>> I imagine you'd break enough stuff that my vote would be to use a different
>>> namespace.  And then all of the people who complain to me about the *VAST*
>>> number of octets that caps takes will redouble their bitching and moaning.
>> That's one reason I'd prefer to patch up XEP-0115. Including both caps
>> and son-of-caps in presence broadcast strikes me as a bad idea.
> Joe Hildebrand, Matt Miller, and I talked about caps off-list last week
> over a high-bandwidth connection (i.e., in person). We came up with one
> possible approach for discussion on the list.
> One of the major problems with the current approach is that there's no
> hard border between identities and features, and between features and
> extensions. As a result, malicious software can define certain clever
> identities and features and extensions that bleed over into adjacent
> parts of the input string.
> One way to solve this problem is to define a new algorithm, either in a
> revision to XEP-0115 or in a new spec with a new namespace. To conserve
> space in presence, we'd prefer to avoid a new namespace. So that it is
> possible to continue using the vast majority of existing caps hashes,
> we'd prefer to keep the algorithm the same, if that can be accomplished
> in a secure way.
> We thought of another approach, which has two parts...
> First, we need a way to mark the border between identities and features.
> We can do so by defining a new feature that is *always* sorted first
> according to i;octet collation (RFC 4790). Now, the 'var' attribute in
> disco#info is defined as xs:string [1] so any Char [2] from the XML spec
> is allowed. The first-sorted Char would be U+0009 but that's not
> printable. The first printable Char would be U+0020 but that might be
> difficult to debug (var=" ") and certainly would be difficult to list in
> the relevant XMPP registry [3]. Thus I would propose U+0021:
> <feature var='!'/>
> (Nit: yes, you'd have to check for the existence of features that start
> with U+0009, U+000A, U+000D, and U+0020 and treat such features as an
> attack; if people don't like that special-casing we can just go with
> U+0009 instead of U+0021.)
> Any software supporting this approach would need to advertise support
> for the first-sorted disco feature. If an application processes a caps
> input string that does *not* contain the first-sorted feature, based on local
> service policy it could either reject it as an attack or accept it as
> using the old-style caps algorithm. (Eventually everyone would regard it
> as an attack, but that might not happen for a few years.)
> Second, we need a way to mark the border between features and
> extensions, or more generally to make sure that extensions can't leak
> into the feature space. Because there is no last-sorted disco feature
> (since the Unicode Consortium might always add new code points), we
> thought of defining a new FORM_TYPE "urn:xmpp:clarkform" or somesuch. In
> forms of this type, the field names MUST use Clark notation [4], like so:
>    <x xmlns='jabber:x:data' type='result'>
>      <field var='FORM_TYPE' type='hidden'>
>        <value>urn:xmpp:clarkform</value>
>      </field>
>      <field var='{http://www.example.com/foo}bar'>
>        <value>baz</value>
>        <value>qux</value>
>      </field>
>    </x>
> We then legislate that XEP-0128 extensions MUST use this FORM_TYPE (this
> is not a major imposition since there is very little use of XEP-0128 in
> the wild). Here again, if an application processes an input string that
> contains a FORM_TYPE other than "urn:xmpp:clarkform", based on local
> service policy it could either reject it as an attack or accept it as
> using the old-style caps algorithm.
> Joe and Matt, do correct me if I've misrepresented our discussion.
> Peter
> [1] http://www.w3.org/TR/xmlschema-2/#string
> [2] http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char
> [3] http://xmpp.org/registrar/disco-features.html
> [4] http://www.jclark.com/xml/xmlns.htm

Okay, let's see. That was a nice try, but it still doesn't work.

First, are you mandating that every client include that disco form in
their caps, even if they don't use any dataforms? If yes, ugly. If no,
they are still open to attack.

Second, I'm assuming you want to keep backwards compatibility, i.e.,
clients with the current exploitable caps still get their caps cached.
If so, then I can simply bypass your protection:

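Assume you have:

feature: "!"
feature: "f1"
feature: "f2"
feature: "f3"

Then I can replace all of that with a single old-style form, which
serializes to the same hash input ("!<f1<f2<f3<"):

<x xmlns='jabber:x:data' type='result'>
  <field var='FORM_TYPE' type='hidden'><value>!</value></field>
  <field var='f1'><value>f2</value><value>f3</value></field>
</x>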
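
To spell out why the swap above works, here is a rough Python sketch of
the XEP-0115 verification string as I understand it (identities, then
features, then forms, every item terminated by "<"). The function names,
the example identity, and the choice of FORM_TYPE value "!" with values
"f2" and "f3" on field "f1" are my assumptions for illustration, and
Python's default string sort stands in for i;octet collation.

```python
import base64
import hashlib


def caps_string(identities, features, forms):
    """Build a caps verification string (simplified sketch of XEP-0115).

    identities: list of (category, type, lang, name) tuples
    features:   list of feature 'var' strings
    forms:      list of (form_type, {field_var: [values]}) tuples
    """
    parts = []
    for category, type_, lang, name in sorted(identities):
        parts.append(f"{category}/{type_}/{lang}/{name}<")
    for var in sorted(features):
        parts.append(var + "<")
    for form_type, fields in sorted(forms, key=lambda form: form[0]):
        parts.append(form_type + "<")
        for var in sorted(fields):
            parts.append(var + "<")
            for value in sorted(fields[var]):
                parts.append(value + "<")
    return "".join(parts)


def caps_hash(verification_string):
    """SHA-1 the string and base64-encode the digest, as caps does."""
    digest = hashlib.sha1(verification_string.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical identity, used only for the demonstration.
identity = ("client", "pc", "", "Example")

# Honest client: marker feature "!" followed by three real features.
honest = caps_string([identity], ["!", "f1", "f2", "f3"], [])

# Attacker: no features at all, just one old-style form whose FORM_TYPE
# and field values reproduce the same "<"-separated items.
forged = caps_string([identity], [], [("!", {"f1": ["f2", "f3"]})])

print(honest == forged)
print(caps_hash(honest) == caps_hash(forged))
```

Both comparisons come out true, because nothing in the string records
whether an item came from a feature or from a form. A receiver that still
accepts old-style forms cannot tell the two disco#info results apart by
hash alone.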

I suspect I may have missed something, because this was too easy.

Waqas Hussain
