[Standards] Addressing Security Concerns in XEP-0115 Entity Capabilities

Peter Saint-Andre stpeter at stpeter.im
Mon Sep 12 21:22:01 UTC 2011

On 9/7/11 8:51 PM, Peter Saint-Andre wrote:
> On 9/7/11 2:33 PM, Joe Hildebrand wrote:
>> On 9/5/11 6:39 AM, "Dave Cridland" <dave at cridland.net> wrote:
>>> Of course, it may be simplest just to bite the bullet and switch hash
>>> algorithm - or even change the 'hash' attribute name - because then
>>> it'll get treated as a pre-1.4 caps by the vast majority of entities
>>> and everything will happen right (or at least, no worse than it often
>>> does today anyway).
>> A bunch of our software already assumes that if you're doing old caps, you
>> don't have any caps we care about.
>>> I'm gradually leaning toward this, because although it's *quite*
>>> violent, the downside is not impossible.
>>> BTW, anyone any idea what happens if you include more than one <c/>
>>> in a presence, in practical terms?
>> I imagine you'd break enough stuff that my vote would be to use a different
>> namespace.  And then all of the people who complain to me about the *VAST*
>> number of octets that caps takes will redouble their bitching and moaning.
> That's one reason I'd prefer to patch up XEP-0115. Including both caps
> and son-of-caps in presence broadcast strikes me as a bad idea.

Joe Hildebrand, Matt Miller, and I talked about caps off-list last week
over a high-bandwidth connection (i.e., in person). We came up with one
possible approach for discussion on the list.

One of the major problems with the current approach is that there's no
hard border between identities and features, and between features and
extensions. As a result, malicious software can define certain clever
identities and features and extensions that bleed over into adjacent
parts of the input string.
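
To illustrate the bleed-over, here is a minimal Python sketch. It assumes a simplified form of the XEP-0115 input string (identities serialized as category/type/lang/name, then feature vars, each followed by '<', with data forms omitted). The helper names caps_string and caps_ver are made up for this example and are not part of any spec or library.

```python
import base64
import hashlib


def caps_string(identities, features):
    """Simplified caps input string: identities, then features (forms omitted)."""
    parts = []
    # Each identity is serialized as category/type/lang/name followed by '<'.
    for category, type_, lang, name in sorted(identities):
        parts.append(f"{category}/{type_}/{lang}/{name}<")
    # Each feature 'var' is appended followed by '<'.
    for var in sorted(features):
        parts.append(f"{var}<")
    return "".join(parts)


def caps_ver(input_string):
    """SHA-1 over the UTF-8 input string, base64-encoded."""
    digest = hashlib.sha1(input_string.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Two identities and one feature...
honest = caps_string(
    [("client", "pc", "", "A"), ("client", "pc", "", "B")],
    ["x"],
)
# ...versus one identity plus a "feature" crafted to look like an identity.
forged = caps_string(
    [("client", "pc", "", "A")],
    ["client/pc//B", "x"],
)

# Different disco#info results, identical input string, identical hash.
assert honest == forged
assert caps_ver(honest) == caps_ver(forged)
```

Nothing in the string says where the identities stop and the features start, so both disco#info results hash to the same value.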

One way to solve this problem is to define a new algorithm, either in a
revision to XEP-0115 or in a new spec with a new namespace. To conserve
space in presence, we'd prefer to avoid a new namespace. So that it is
possible to continue using the vast majority of existing caps hashes,
we'd prefer to keep the algorithm the same, if that can be accomplished
in a secure way.

We thought of another approach, which has two parts...

First, we need a way to mark the border between identities and features.
We can do so by defining a new feature that is *always* sorted first
according to i;octet collation (RFC 4790). Now, the 'var' attribute in
disco#info is defined as xs:string [1] so any Char [2] from the XML spec
is allowed. The first-sorted Char would be U+0009 but that's not
printable. The first printable Char would be U+0020 but that might be
difficult to debug (var=" ") and certainly would be difficult to list in
the relevant XMPP registry [3]. Thus I would propose U+0021:

<feature var='!'/>

(Nit: yes, you'd have to check for the existence of features that start
with U+0009, U+000A, U+000D, and U+0020 and treat such features as an
attack; if people don't like that special-casing we can just go with
U+0009 instead of U+0021.)

Any software supporting this approach would need to advertise support
for the first-sorted disco feature. If an application processes a caps
input string that does *not* contain the first-sorted feature, based on local
service policy it could either reject it as an attack or accept it as
using the old-style caps algorithm. (Eventually everyone would regard it
as an attack, but that might not happen for a few years.)
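
A rough Python sketch of that check follows. The function name classify_features, the accept_legacy flag standing in for "local service policy", and the three return labels are all assumptions made for illustration. Treating an empty 'var' as an attack is also my assumption, since an empty string would sort ahead of the marker.

```python
BORDER_FEATURE = "!"  # U+0021, the marker feature proposed above
# XML Chars that would sort ahead of U+0021 under i;octet.
FORBIDDEN_LEADING = ("\u0009", "\u000a", "\u000d", "\u0020")


def classify_features(features, accept_legacy=True):
    """Return 'new', 'legacy', or 'attack' for a list of feature 'var' values."""
    # Anything that could sort ahead of the marker is treated as an attack.
    # (Rejecting an empty var is an assumption, not something stated above.)
    if any(var == "" or var.startswith(FORBIDDEN_LEADING) for var in features):
        return "attack"
    # i;octet compares raw octets, so sort on the UTF-8 encoding.
    ordered = sorted(features, key=lambda var: var.encode("utf-8"))
    if ordered and ordered[0] == BORDER_FEATURE:
        return "new"
    # No marker: local policy decides between old-style caps and an attack.
    return "legacy" if accept_legacy else "attack"


disco = "http://jabber.org/protocol/disco#info"
assert classify_features([disco, "!"]) == "new"
assert classify_features([disco]) == "legacy"
assert classify_features([disco], accept_legacy=False) == "attack"
assert classify_features(["!", " looks-like-a-border"]) == "attack"
```

Flipping accept_legacy to False would correspond to the eventual state where everyone regards marker-less input as an attack.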

Second, we need a way to mark the border between features and
extensions, or more generally to make sure that extensions can't leak
into the feature space. Because there is no last-sorted disco feature
(since the Unicode Consortium might always add new code points), we
thought of defining a new FORM_TYPE "urn:xmpp:clarkform" or somesuch. In
forms of this type, the field names MUST use Clark notation [4], like so:

    <x xmlns='jabber:x:data' type='result'>
      <field var='FORM_TYPE' type='hidden'>
        <value>urn:xmpp:clarkform</value>
      </field>
      <field var='{http://www.example.com/foo}bar'/>
    </x>

We then legislate that XEP-0128 extensions MUST use this FORM_TYPE (this
is not a major imposition since there is very little use of XEP-0128 in
the wild). Here again, if an application processes an input string that
contains a FORM_TYPE other than "urn:xmpp:clarkform", based on local
service policy it could either reject it as an attack or accept it as
using the old-style caps algorithm.
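
A small Python sketch of the form check, under stated assumptions: "urn:xmpp:clarkform" is only the placeholder name used above, classify_form and accept_legacy are hypothetical names, and treating a non-Clark field name inside a clarkform as an attack is my reading of the MUST rather than something we spelled out.

```python
import re

CLARK_FORM_TYPE = "urn:xmpp:clarkform"  # placeholder name from the text above
# Clark notation: '{' namespace-URI '}' local-name.
CLARK_NAME = re.compile(r"\{[^{}]+\}[^{}]+")


def classify_form(form_type, field_vars, accept_legacy=True):
    """Return 'new', 'legacy', or 'attack' for one XEP-0128 extension form."""
    if form_type != CLARK_FORM_TYPE:
        # Some other FORM_TYPE: local policy decides.
        return "legacy" if accept_legacy else "attack"
    for var in field_vars:
        # FORM_TYPE itself is the one field not in Clark notation.
        if var != "FORM_TYPE" and not CLARK_NAME.fullmatch(var):
            return "attack"
    return "new"


assert classify_form(
    CLARK_FORM_TYPE, ["FORM_TYPE", "{http://www.example.com/foo}bar"]
) == "new"
assert classify_form(CLARK_FORM_TYPE, ["FORM_TYPE", "bar"]) == "attack"
assert classify_form("urn:xmpp:dataforms:softwareinfo", ["FORM_TYPE", "os"]) == "legacy"
```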

Joe and Matt, do correct me if I've misrepresented our discussion.


[1] http://www.w3.org/TR/xmlschema-2/#string
[2] http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char
[3] http://xmpp.org/registrar/disco-features.html
[4] http://www.jclark.com/xml/xmlns.htm
