[Standards] XEP-0313: why it is *really* not a good idea to use MAM with Pubsub

Goffi goffi at goffi.org
Wed Jan 27 09:56:47 UTC 2016

Hi Kev,

Thanks for your answer. I am putting a few notes here so we can talk about 
them tomorrow if needed.

On Sunday 24 January 2016, 17:25:44, Kevin Smith wrote:
> On 6 Jan 2016, at 11:08, Goffi <goffi at goffi.org> wrote:
> > - All items are returned in separate <message> stanzas, each wrapped in a
> > <forwarded> element, one item per stanza. This is both a waste of
> > bandwidth and makes the task more difficult for the client, as it must
> > track each <message> and the <iq> result to know when a page has been
> > received. A simple <iq> query, as for a PubSub items retrieval, would be
> > much better.
> Aren’t you going to have huge troubles with stanza sizes in that case? It
> seems like once you start wrapping multiple pubsub items together you’re
> going to start exceeding stanza sizes and needing to deal with the code for
> merging them anyway.

That's actually what PubSub itself does, so if we have an issue with stanza 
size, we should start worrying about XEP-0060 too.
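
To make the bookkeeping concrete, here is a minimal sketch of what the client 
has to do in each case. It is only an illustration: the function names are 
hypothetical, the MAM namespace is an assumption (it changes between MAM 
versions), and stanzas are given as standalone strings without the stream's 
default namespace.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"  # assumption: the namespace depends on the MAM version
FWD_NS = "urn:xmpp:forward:0"
PUBSUB_NS = "http://jabber.org/protocol/pubsub"


def collect_mam_page(stanzas, query_id, iq_id):
    """Gather forwarded payloads until the matching <iq type='result'> arrives."""
    page = []
    for raw in stanzas:
        stanza = ET.fromstring(raw)
        if stanza.tag == "message":
            # Only keep results that belong to our query.
            result = stanza.find(f"{{{MAM_NS}}}result")
            if result is not None and result.get("queryid") == query_id:
                forwarded = result.find(f"{{{FWD_NS}}}forwarded")
                if forwarded is not None:
                    page.append(forwarded)
        elif (stanza.tag == "iq" and stanza.get("id") == iq_id
              and stanza.get("type") == "result"):
            return page  # the <iq> result marks the end of the page
    raise RuntimeError("stream ended before the <iq> result")


def collect_pubsub_page(raw_iq):
    """With a plain PubSub retrieval, the single <iq> result holds every item."""
    iq = ET.fromstring(raw_iq)
    path = f"{{{PUBSUB_NS}}}pubsub/{{{PUBSUB_NS}}}items/{{{PUBSUB_NS}}}item"
    return iq.findall(path)
```

In the first case the client keeps state across several stanzas. In the 
second, one stanza is enough.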

> > - Requests are made on one node. But it is desirable to be able to do
> > requests on several nodes, or on nodes which match a pattern. For
> > instance, in XEP-0277 comments nodes are in the form
> > "urn:xmpp:microblog:0:comments/dd88c9bc58886fce0049ed050df0c5f2" and it
> > would be useful to request all items from nodes starting with
> > "urn:xmpp:microblog:0:comments". With MAM I can't request all comments
> > published by Romeo.
> I think that’s a fairly simple extension for someone to spec, isn’t it?

A MAM request is detected as a PubSub request by checking the node attribute.
A wildcard could be used for the use case I have given. But what if I want to 
look at several nodes? Or ignore the node? We can always write XEPs to 
work around this, but it can quickly complicate the request.
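
As a rough idea of the selection I have in mind, here is a small sketch of 
matching node names against several patterns. Nothing like this exists in MAM 
today: the `select_nodes` helper and the shell-style wildcard syntax are 
assumptions made only for illustration.

```python
from fnmatch import fnmatchcase


def select_nodes(available, patterns):
    """Return the nodes matching at least one pattern, in their original order."""
    return [
        node for node in available
        if any(fnmatchcase(node, pattern) for pattern in patterns)
    ]


nodes = [
    "urn:xmpp:microblog:0",
    "urn:xmpp:microblog:0:comments/dd88c9bc58886fce0049ed050df0c5f2",
    "urn:xmpp:microblog:0:comments/0123456789abcdef",
]
# Hypothetical wildcard: every comments node, whatever its identifier.
comments = select_nodes(nodes, ["urn:xmpp:microblog:0:comments/*"])
```

Passing several patterns covers the "several nodes" case, and a lone "*" 
covers "ignore the node".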

> > - When a service offers MAM for both messages and PubSub (e.g. a MUC
> > component with PubSub abilities (MUC 2?), or the server itself when it
> > offers PEP), there is no way to know whether the filtering fields apply
> > to messages, to PubSub, or to both.
> > Look at section 4.1.5 "Retrieving form fields": how can I know whether
> > "urn:example:xmpp:free-text-search" can be used for PubSub or not?
> I imagine you request the form for the node you’re interested in querying.
> If that’s not clear, we should make it so.

But then we are back to our problem with querying multiple nodes at once, or 
nodes starting with a given namespace.

> > - Section 4.2 says that "The archive results MUST be sorted in
> > chronological order". That totally makes sense for message archives, but
> > in the case of PubSub it is inconsistent with the classic items retrieval
> > ordering (most recent item first), and we may want to sort on fields
> > other than publication date: for instance item update date vs
> > publication date, or size of files tracked with PubSub.
> > Of course we can easily reverse the order with RSM, but it's not
> > natural, and we can't sort on other fields.
> This doesn’t seem insurmountable. We have data forms for the queries if we
> want to change behaviour.

If the MUST disappears, this one is indeed easily fixable.
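
To show the kind of ordering I mean, here is a small sketch. The item fields 
("published", "updated") and the `sort_items` helper are hypothetical, and 
the timestamps are assumed to share one ISO 8601 format so that they compare 
correctly as strings.

```python
from operator import itemgetter


def sort_items(items, key="published", newest_first=False):
    """Sort items on any field, oldest first unless newest_first is set."""
    return sorted(items, key=itemgetter(key), reverse=newest_first)


items = [
    {"id": "a", "published": "2016-01-05T10:00:00Z", "updated": "2016-01-20T08:00:00Z"},
    {"id": "b", "published": "2016-01-10T10:00:00Z", "updated": "2016-01-11T08:00:00Z"},
    {"id": "c", "published": "2016-01-15T10:00:00Z", "updated": "2016-01-25T08:00:00Z"},
]

mam_order = sort_items(items)                                  # chronological
pubsub_order = sort_items(items, newest_first=True)            # most recent first
by_update = sort_items(items, key="updated", newest_first=True)
```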

> > - Overall, PubSub already manages archives by design, but it is lacking a
> > good searching tool. Even if it is tempting to use MAM with PubSub
> > because we can have filtering "for free", I really think it is not
> > suited to it at all, and PubSub deserves a real dedicated
> > searching/filtering tool.
> I would be very keen to move towards one method for doing history queries
> and not having the current plethora (offline messages, MUC context, PubSub,
> …).
> > If other people are interested, I would like to work on a "PubSub
> > searching" protoXEP. PubSub will probably be the core of many major
> > features in XMPP in the future, so we need a good, generic, and
> > extendable way to search/filter items.
> I think the effort would be much better spent adding MAM extensions as
> necessary.

I'm also thinking about ways to do complex queries (with AND/OR filtering), 
and I don't have the feeling that this is a goal for MAM. But again, this 
could be fixed by another XEP.
My two main grievances are the items being returned in <message/> stanzas and 
the impossibility of querying multiple nodes or nodes matching a wildcard. If 
these two are fixed, I guess MAM can start to be a better option.
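
By complex queries I mean something along these lines. This is only a sketch 
of the semantics, not a proposed wire format: the combinators (`field`, 
`all_of`, `any_of`) and the item fields are hypothetical.

```python
def field(name, expected):
    """Match items whose field `name` equals `expected`."""
    return lambda item: item.get(name) == expected


def all_of(*predicates):
    """AND: every predicate must match."""
    return lambda item: all(p(item) for p in predicates)


def any_of(*predicates):
    """OR: at least one predicate must match."""
    return lambda item: any(p(item) for p in predicates)


items = [
    {"author": "romeo", "tag": "xmpp"},
    {"author": "romeo", "tag": "cats"},
    {"author": "juliet", "tag": "pubsub"},
]

# author = romeo AND (tag = xmpp OR tag = pubsub)
query = all_of(
    field("author", "romeo"),
    any_of(field("tag", "xmpp"), field("tag", "pubsub")),
)
matches = [item for item in items if query(item)]
```

A flat data form can express the AND of its fields, but nesting like the 
above would need something more.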

> /K

