[Standards] LAST CALL: XEP-0332 (HTTP over XMPP transport)

Kevin Smith kevin at kismith.co.uk
Tue Oct 21 12:09:44 UTC 2014

On Wed, Oct 8, 2014 at 5:33 PM, XMPP Extensions Editor <editor at xmpp.org> wrote:
> This message constitutes notice of a Last Call for comments on XEP-0332 (HTTP over XMPP transport).
> Abstract: This specification defines how XMPP can be used to transport HTTP communication over peer-to-peer networks.
> URL: http://xmpp.org/extensions/xep-0332.html
> This Last Call begins today and shall end at the close of business on 2014-10-21.
> Please consider the following questions during this Last Call and send your feedback to the standards at xmpp.org discussion list:
> 1. Is this specification needed to fill gaps in the XMPP protocol stack or to clarify an existing protocol?

I do not know.

> 2. Does the specification solve the problem stated in the introduction and requirements?

Possibly, subject to comments below.

> 3. Do you plan to implement this specification in your code? If not, why not?

Not immediately - I don't have a need for this.

> 4. Do you have any security concerns related to this specification?

Several, mostly around out-of-band unencrypted transfers,
possibilities for resource exhaustion attacks, how identity
verification for HTTPS is established, what happens if you try to use

> 5. Is the specification accurate and clearly written?

I don't think so yet. Some comments follow:

Throughout, should HTTP references be to RFC 7230?

The requirements say that the HTTP and XMPP servers must be
collocated, but this doesn't need to be true, does it?

As the glossary is entirely web terms, can it not be replaced by a
reference to the Terminology in RFC 2616/7230? No-one's going to be
implementing 332 without also reading HTTP.

"telegram" is introduced as a term of art without explanation. Use
Cases needs some reworking for consistency of terms and ensuring those
used are introduced or referenced.

"friendship", similarly, is not generally a term of art in XMPP
(applies particularly later on to etc.).

I note that requiring encoding of > isn't usually necessary in XML (we
once had such a requirement for XMPP, but I'm fairly sure we got rid
of it).
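
To illustrate the point about '>', here is a minimal sketch using Python's standard-library XML parser (nothing here is taken from 332; the element name and text are made up for the example). A conforming parser accepts a literal '>' in character data, while '&' and '<' still have to be escaped:

```python
import xml.etree.ElementTree as ET

# A literal '>' in character data is well-formed XML. Only '<' and '&'
# must always be escaped (the one exception for '>' is the sequence ']]>').
# The element name and content are hypothetical, for illustration only.
stanza = "<body>1 > 0 &amp; 2 &lt; 3</body>"

element = ET.fromstring(stanza)
decoded_text = element.text
print(decoded_text)
```

Escaping '>' anyway is harmless, so this only shows that a blanket requirement is stricter than XML itself needs.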

The 'xml' encoding method needs somewhat more thought to prevent
illegal XMPP being encoded (e.g. sending of data in existing
namespaces, using features disallowed by XMPP). It is mentioned in
passing, but doesn't provide much guidance. Considering that the
amount of processing required to encode here is greater than to encode
as 'text', does this encoding add value? Is EXI here that much of a

The motivations for chunkedBase64 are unclear - the introductory
paragraph seems to say to never use it - streams shouldn't use it, it
shouldn't transfer files. It then talks about moderate sizes without a
guide as to what this means.

(SI) "This transfer mechanism is of course the logical choice, if the
content is already stored in a file on the server" Is it? Isn't Jingle
for file transfer the more obvious choice?

IBB - Is having this in addition to Jingle a little redundant? IBB can
naturally be included in Jingle proposals and the appropriate stream
method selected.

jingle=false - requiring optional parts to be disabled rather than
enabled seems poorly extensible.

"Note: Content encoded using chunkedBase64 encoding method can be
terminated, either by the receptor going off-line, or by sending a
close command to the sender." - does this mean that presence
subscriptions (or presence decloaking) are required before 332 can be
used, or just this mechanism?

Example 1:
"<iq type='set'
       from='httpclient@clayster.com/browser'
       to='httpserver@clayster.com'

Does this mean that it must be the server that supports this, rather
than a client? It's unusual for server-provided services to be on JIDs
with a localpart other than the user's, so some explanation here is

In all the examples some text is needed to explain what the behaviour
must be - examples are not normative, but seem to be used as the
definition of behaviour here.

What versions of HTTP are supported? Does it matter?

"The XMPP/HTTP bridge at the server only transmits headers literally
as they are reported, as if it was normal HTTP over TCP that was used.
In the HTTP over XMPP case, connections are not handled in the same
way" What is the distinction here?

Example 4.1.4 - what does an ellipsis for content-length mean? It
seems to be illegal in the 2616 BNF.
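
For reference, the HTTP grammar defines Content-Length as 1*DIGIT (RFC 2616 section 14.13, RFC 7230 section 3.3.2). A minimal sketch of that check follows; the helper name is hypothetical and this is not how 332 says the header should be validated:

```python
import re

# Content-Length = 1*DIGIT: one or more ASCII digits and nothing else.
_CONTENT_LENGTH = re.compile(r"[0-9]+")


def is_valid_content_length(value: str) -> bool:
    """Return True if the header value matches 1*DIGIT.

    Hypothetical helper for illustration; surrounding optional
    whitespace (space or tab) is stripped before matching.
    """
    return _CONTENT_LENGTH.fullmatch(value.strip(" \t")) is not None


print(is_valid_content_length("1234"))
print(is_valid_content_length("..."))
```

An ellipsis fails this check, so if the example means "value elided" it would help for the text to say so.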

Is statusMessage properly defined anywhere? Or the mapping from HTTP
onto the req/resp (why not request/response?) elements?

Throughout, the urn:xmpp:http namespace should be versioned.

EXI is mentioned extensively, but other compression methods also
reduce the b64 overhead and similar.
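
A rough sketch of that effect, assuming zlib as a stand-in for whatever generic compression a stream might use. The payload is made-up pseudo-random data, chosen so that any saving comes from the Base64 alphabet rather than from the content itself:

```python
import base64
import random
import zlib

# Deterministic pseudo-random payload standing in for an incompressible
# binary HTTP body (hypothetical data, for illustration only).
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(3000))

# Base64 emits 4 output bytes for every 3 input bytes.
encoded = base64.b64encode(payload)

# Generic compression applied to the already-encoded text.
compressed = zlib.compress(encoded, 9)

print(len(payload), len(encoded), len(compressed))
```

Because Base64 uses only 64 of 256 possible byte values, a general-purpose compressor can win back a good part of the expansion without any knowledge of the XML structure.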

Doing chunking twice (once at the HTTP layer and once at the XMPP
layer) seems quite inefficient - is this a sensible model?

Can you receive HTTP chunked data over XMPP as type text?

4.2.5 - this text seems imprecise - if the content 'can be represented
as a file' then you have to use sipub (although the language isn't
2119ish); can't most content be represented as a file?

I'm not sure that the claim that all streaming content is infinite is
necessarily correct. Many audio/video/etc. streams are finite (in
fact, I suspect there are few truly infinite streams).

"Such content must use the ibb transfer mechanism, if used " - makes
little sense to me.

I can't work out the intent of "The first candidate should however
correspond to the same stream that would have been returned if the
request had been made using normal HTTP over TCP." properly.

- is trying to cover HTTP URI encoding here worthwhile, when
just leaving the reference to 2616/3986 etc. would likely suffice?

I think the term used in XMPP is 'bare JID' rather than 'resourceless JID'.
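
For anyone unfamiliar with the term, a bare JID is the localpart@domainpart form with the resourcepart removed. A minimal sketch, using the JIDs from Example 1 (the function name is hypothetical, and it does no validation or normalisation of the address):

```python
def bare_jid(jid: str) -> str:
    """Return the bare JID by dropping the resourcepart.

    The resourcepart is everything after the first '/'. Neither the
    localpart nor the domainpart may contain '/', but the resourcepart
    may, so the split happens only once.
    """
    return jid.split("/", 1)[0]


print(bare_jid("httpclient@clayster.com/browser"))
print(bare_jid("httpserver@clayster.com"))
```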

Is this the best URI scheme to use? I can see the appeal of them
looking like HTTP URIs, but there are already XMPP URIs too, and these
HTTPX URIs seem to lose a lot of the precision of HTTP URIs. They also
seem to require that the HTTP servers internally serve XMPP,
disallowing gateways - was this intentional? This, combined with
needing it to be on the XMPP server directly, together with the
examples showing the XMPP server for serving HTTP and the user's XMPP
server being the same entity seems to present a confusing
architecture. Regardless, is the URI scheme necessarily tied to the
rest of the document?

In general, it's not clear to me that this is a better design than
going with a more traditional proxy model, with the client asking a
proxy to fetch the resources - then no URI schemes etc. are necessary.

I'm not sure what 4.3.2 is adding over the previous examples? 4.3.3 similarly.

The discovery section seems to need a little work - particularly, are
requesters, responders or both expected to advertise support, and how
are optional features discovered (or are all entities required to
implement all the mechanisms defined in 332)?

Is automatically granting subscription requests not a massive privacy
hole? Directed presence seems more appropriate for most instances.

7.2.4 - I'm not entirely comfortable referencing out to 324 here.

8 seems to define the URI scheme again. Previous comments apply here,
together with a question of who the appropriate contact would be.

