[Standards] NEW: XEP-0313 (Message Archive Management)

Matthew Wild mwild1 at gmail.com
Thu Apr 19 17:01:26 UTC 2012


On 19 April 2012 02:12, XMPP Extensions Editor <editor at xmpp.org> wrote:
> Version 0.1 of XEP-0313 (Message Archive Management) has been released.
>
> Abstract: This document defines a protocol to query and control an archive of messages stored on a server.
>
> Changelog: Initial version, to much rejoicing. (mw)
>
> Diff: N/A
>
> URL: http://xmpp.org/extensions/xep-0313.html

Some sections are still missing, and some things need specifying
further; I have begun work on both. I should be able to submit
an updated version by [REDACTED].

One of the substantial changes would be better specifying the use of
Result Set Management. Currently only <limit> is required, but I think
full RSM support should be a MUST to allow for accurate paging and
queries based on message UIDs.
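
As a rough illustration of paging by UID, here is a minimal Python
sketch that builds an archive query carrying an RSM <set/>. The <max/>
and <after/> elements and their namespace come from Result Set
Management (XEP-0059). The MAM namespace string, the function name
build_page_query and its parameters are assumptions made for this
example only, not something the XEP text above confirms.

```python
import xml.etree.ElementTree as ET

# Assumption: a provisional namespace for the archive query element.
MAM_NS = "urn:xmpp:mam:tmp"
# Namespace defined by XEP-0059 (Result Set Management).
RSM_NS = "http://jabber.org/protocol/rsm"


def build_page_query(query_id, page_size, after_uid=None):
    """Build an <iq/> asking for one page of archived messages.

    after_uid is the UID of the last message of the previous page;
    leave it as None to request the first page.
    """
    iq = ET.Element("iq", {"type": "get", "id": query_id})
    query = ET.SubElement(iq, "{%s}query" % MAM_NS)
    rsm = ET.SubElement(query, "{%s}set" % RSM_NS)
    ET.SubElement(rsm, "{%s}max" % RSM_NS).text = str(page_size)
    if after_uid is not None:
        ET.SubElement(rsm, "{%s}after" % RSM_NS).text = after_uid
    return iq


if __name__ == "__main__":
    # Hypothetical usage: next page of 20 messages after UID 1234-5678.
    stanza = build_page_query("page2", 20, after_uid="1234-5678")
    print(ET.tostring(stanza, encoding="unicode"))
```

Paging on <after/> rather than on a timestamp is what would make the
pages line up exactly, which is the motivation for making full RSM
support a MUST.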

I also have an open question, that perhaps warrants some discussion
here... (warning: brain dump ahead)

Lots of clients already store local history - and it is expected they
will continue to use that, as a cache. MAM allows these clients to
fetch history from the archive that happened while they were offline,
or messages from other resources (though these can be caught while
online with Carbons).

The difficult part is how to identify the exact messages that the
client doesn't yet have cached. Timestamps are not unique identifiers,
as we all know. The problem here is that the client doesn't know the
ID of the last message it has in its history; if it did, it could ask
MAM for all messages since that ID. Using the timestamp could end up
with duplicates, even with accurate clocks (which don't exist).
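
The duplicate problem can be shown with a toy in-memory archive. This
is only a sketch: the Msg type, the sample data and the helper names
are all made up for illustration, and timestamps are whole seconds.
Two messages share a timestamp, and the client went offline between
them.

```python
from typing import NamedTuple


class Msg(NamedTuple):
    uid: str   # archive-assigned ID (hypothetical format)
    ts: int    # timestamp in whole seconds
    body: str


ARCHIVE = [
    Msg("a1", 100, "hello"),
    Msg("a2", 101, "one"),
    Msg("a3", 101, "two"),    # same second as a2
    Msg("a4", 102, "three"),
]


def since_timestamp(archive, ts, inclusive):
    """Select by timestamp, including or excluding the boundary second."""
    if inclusive:
        return [m for m in archive if m.ts >= ts]
    return [m for m in archive if m.ts > ts]


def after_uid(archive, uid):
    """Select everything stored after the message with the given UID."""
    for i, m in enumerate(archive):
        if m.uid == uid:
            return archive[i + 1:]
    raise KeyError(uid)


# The client cached a1 and a2, then went offline before a3 arrived.
last = ARCHIVE[1]

print([m.uid for m in since_timestamp(ARCHIVE, last.ts, inclusive=True)])
print([m.uid for m in since_timestamp(ARCHIVE, last.ts, inclusive=False)])
print([m.uid for m in after_uid(ARCHIVE, last.uid)])
```

The inclusive timestamp query returns a2 again (a duplicate), the
exclusive one silently drops a3, and only the UID-based query returns
exactly a3 and a4. That last query is the one the client cannot make
today, because it never learns the UID.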

One solution I came up with was for an entity that relays and archives
messages to stamp the message with: <archived by="capulet.lit"
id="1234-5678"/> or <archived by="conference.jabber.org"
id="8765-4321"/>. I'd be interested in feedback on this idea.
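
On the client side, using such a stamp could look roughly like the
Python sketch below: read any <archived/> children of an incoming
message and remember the latest ID per archiving entity. The element
is parsed without a namespace because none has been proposed yet; the
function name record_archive_ids and the last_seen dict are
hypothetical.

```python
import xml.etree.ElementTree as ET


def record_archive_ids(stanza_xml, last_seen):
    """Update last_seen (archive JID -> last ID) from one message stanza."""
    msg = ET.fromstring(stanza_xml)
    # Only direct children are considered; a message relayed through
    # several archiving entities may carry more than one stamp.
    for el in msg.findall("archived"):
        by, uid = el.get("by"), el.get("id")
        if by and uid:
            last_seen[by] = uid
    return last_seen


if __name__ == "__main__":
    stanza = (
        '<message from="romeo@montague.lit/orchard" to="juliet@capulet.lit">'
        "<body>hi</body>"
        '<archived by="capulet.lit" id="1234-5678"/>'
        "</message>"
    )
    print(record_archive_ids(stanza, {}))
```

With the last ID stored per archive, the client could later ask that
archive for everything after it, instead of guessing from timestamps.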

However, even <archived/> doesn't cover the case of the client knowing
the id of its *outgoing* messages. The server could echo them back
with <archived/>... but then things start to get a bit muddy.

The alternative is to not solve this, and have clients treat the MAM
archive as the canonical source of history (and therefore fetch
messages from the archive that they have already sent or received).
A waste of bandwidth, if nothing else.

I'll also mention here that in my mind archiving and Carbons are very
related. They are both about replicating history across clients, except
that Carbons only works while online. Originally MAM was meant to allow
'subscribing' to an archive, as a way to receive messages
sent/received by other resources while online, and even allow
following a MUC room in realtime without joining it. This would be a
separate XEP if I submitted it, but now that we have Carbons there
would be more than a little overlap there. Thoughts welcomed.

Regards,
Matthew


