[Standards] LAST CALL: XEP-0313 (Message Archive Management)

Dave Cridland dave at cridland.net
Wed Sep 8 08:49:07 UTC 2021

On Tue, 7 Sept 2021 at 16:23, Georg Lukas <georg at op-co.de> wrote:

> Hello everyone,
> given that XEP-0313 is up for recall in tomorrow's Council meeting, I'd
> like to sort my long list of issues into blocking and non-blocking.
> Fixing the blocking ones is a requirement for me to change my vote to
> something different than -1.
> Two blocking (non-Editorial) issues:
> * Georg Lukas <georg at op-co.de> [2021-03-31 18:50]:
> > Time and again, specifications that use Message Forwarding have
> > fallen victim to impersonation attacks (there are a number of CVEs
> > around, like CVE-2017-5589, CVE-2019-16235, and CVE-2020-26547).
> >
> > The XEP urgently needs a respective section in the Security
> > Considerations, and ideally also a negative example like
> > https://xmpp.org/extensions/xep-0280.html#example-11
> I'd also copy&paste a bunch of <cve/> elements from XEP-0280, in
> addition to the above text.

I can agree with this in principle, though the query ID in MAM
queries dramatically narrows the scope for an attack here.
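
To illustrate why the query ID narrows the scope, here is a minimal
Python sketch of the checks a client might apply before trusting a
forwarded message. The function name accept_mam_result and its
parameters are made up for illustration; the namespaces are those of
MAM (urn:xmpp:mam:2) and Message Forwarding (urn:xmpp:forward:0).

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
FWD_NS = "urn:xmpp:forward:0"


def accept_mam_result(stanza_xml, archive_jid, pending_query_ids,
                      allow_missing_from=True):
    """Return the <forwarded/> element if the result looks legitimate.

    Hypothetical helper, not taken from any library. Returns None when
    the stanza should be ignored.
    """
    msg = ET.fromstring(stanza_xml)
    result = msg.find(f"{{{MAM_NS}}}result")
    if result is None:
        return None  # not a MAM result at all

    # The wrapper must come from the archive we queried. A missing
    # 'from' is only tolerated when querying our own account's archive.
    sender = msg.get("from")
    if sender is None:
        if not allow_missing_from:
            return None
    elif sender != archive_jid:
        return None

    # An attacker would also have to guess the ID of a query in flight.
    if result.get("queryid") not in pending_query_ids:
        return None

    return result.find(f"{{{FWD_NS}}}forwarded")
```

A result is only accepted while its query ID is in the set of queries
the client itself has outstanding, so a forged stanza from a third
party fails on either the sender check or the query ID check.
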

> > §6.1.1: the business rules for user archives are inadequate:
> >
> > MUC messages in user archive: I think that implementation practice has
> > clearly shown that storing MUC messages in the user archive is a Bad
> > Idea, and nobody is doing it anyway. Also the server is probably not in
> > a position to track a user's MUC activity and query all MUCs for whether
> > they implement some sort of message storage. This part should be
> > converted into "SHOULD NOT" or "MUST NOT".
> After some discussion it was suggested to change the scope of this from
> "should not store" to "should not return by default", which is also fine
> with me.

I'm unconvinced that trying to "fix" XEP-0045's various deficiencies is a
good use of what energy we have.

For persistent groupchat protocols (MUC-SUB, Muclight, MIX) storing
groupchat messages in the personal archive is very useful indeed. This is
particularly the case for client implementations that operate in a "low
cache" mode, including (but not limited to) web clients.

For XEP-0045, it's complicated: sometimes entirely undesirable,
sometimes very useful, and usually weird.

How would an "if you didn't specify, we won't either" model go with you?
That is, we add a flag to the query that specifies whether or not
groupchat messages are included, but we explicitly do not define a default.
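
As a rough Python sketch of what such a three-state flag could look
like on the client side: the field name below is purely hypothetical
(no XEP defines it), and build_query is an illustrative helper. When
the caller passes None, the field is left out entirely and the result
is whatever the server happens to do.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
FORM_NS = "jabber:x:data"
# Hypothetical field name, invented for this example only.
GROUPCHAT_FIELD = "{urn:example:mam}include-groupchat"


def _add_field(form, var, value, field_type=None):
    """Append a single-valued data form field to the form element."""
    attrs = {"var": var}
    if field_type is not None:
        attrs["type"] = field_type
    field = ET.SubElement(form, f"{{{FORM_NS}}}field", attrs)
    ET.SubElement(field, f"{{{FORM_NS}}}value").text = value


def build_query(query_id, include_groupchat=None):
    """Build a MAM <query/> with an optional groupchat flag.

    include_groupchat: True, False, or None (flag omitted, no default).
    """
    query = ET.Element(f"{{{MAM_NS}}}query", {"queryid": query_id})
    form = ET.SubElement(query, f"{{{FORM_NS}}}x", {"type": "submit"})
    _add_field(form, "FORM_TYPE", MAM_NS, field_type="hidden")
    if include_groupchat is not None:
        _add_field(form, GROUPCHAT_FIELD,
                   "true" if include_groupchat else "false")
    return query
```

Calling build_query("q1") yields a form with only FORM_TYPE, while
build_query("q1", include_groupchat=False) states the client's wish
explicitly.
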

> One probably-non-blocking can-of-worms issue:
> > Furthermore, I'm not sure if messages received by a client from offline
> > history are supposed to contain the respective MAM-ID, so deduplicating
> > here might be very adventurous, as the same message might arrive from
> > offline history without a MAM ID and from MAM with a MAM ID, and the @id
> > attribute might not be unique.
> I'm not sure how implementations other than prosody handle this, but I'd
> love to see MAM servers also inject MAM-IDs into offline messages,
> and have that explicitly written in 0313.
> Non-blocking issues:
> > Storage rules: those look very much like the original ones from the
> > initial specification, and I think we have learned much since then.
> >
> > Prosody will store "normal" messages with a body, or "chat"
> > messages that are not empty after being stripped. By default, it will
> > strip chat states, but it will count origin-id or <x muc> as elements
> > that are worth storing.
> >
> > Part of the problem is an implementation issue with storing the stripped
> > message and not the original <https://issues.prosody.im/1423> but the
> > general problem of clearly defined storage rules remains.
> >
> > This XEP needs something like the 0280 Recommended Rules
> > <https://xmpp.org/extensions/xep-0280.html#recommended-rules> but it
> > should be part of the XEP and not a later addition guarded by a separate
> > namespace. Maybe.
> >
> > Also it would be great to persist message errors for sent messages. But
> > this is a separate can of worms.
> >
> >
> > My comment from the last 0313 LC about letting the client know if the
> > MAM preferences are "undefined" yet, so that the client can ask the user
> > once, now applies to XEP-0441, so I hope I'll think of bringing it up in
> > the respective Last Call again.
> >
> >
> > The Business Rules section needs clear guidance to client
> > implementations that want to do "full sync", i.e. obtain all messages
> > received by the account since the client was last online, without too
> > many duplicates.
> >
> > This is a complex problem because offline messages might contain
> > everything that is also in MAM, or might have been drained by another
> > client in the meantime, so that offline messages will only give you the
> > end of chat history.
> >
> >
> > There is a separate standards@ thread regarding how to treat offline
> > history for MAM-enabled clients, but it only solves part of the above
> > problem.
> >
> >
> > There is no "atomic" switch between fetching messages from MAM and
> > receiving live traffic, so a client needs to either remember the last
> > locally known MAM ID before sending initial presence, then request MAM
> > after that last-known-MAM-ID until it catches up with offline history,
> > or until it depletes the MAM archive, duplicating messages between
> > offline history and MAM fetching.
> >
> > The naive approach of first fetching MAM, then sending initial presence
> > causes a subtle race condition for messages that are delivered to your
> > account in the brief moment after you completed fetching from MAM.
> >
> > There is also a problem if a client crashes during this catch-up phase
> > (this is more common on mobile systems than you'd hope), as it needs to
> > either persist the last-known-MAM-ID or keep the incoming MAM fetch in
> > memory until everything is processed, as otherwise it would populate the
> > message database with new MAM-IDs that would be incorrectly considered
> > as the new last-known-MAM-ID after a client restart.
> >
> >
> > We are still missing a "MAM subscription" mechanism, where a client
> > would automatically receive the MAM-ID of all messages it sends, so that
> > it can properly de-duplicate them from a later MAM fetch. As it is now,
> > a client needs to exclude sent messages from the "obtain
> > last-known-MAM-ID" algorithm, and then assign the MAM-ID for sent
> > messages that are reflected to it on the next MAM fetch.
> >
> >
> > Not sure which parts of that belong in bind2 though.
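
The catch-up bookkeeping described in the quoted text can be sketched
roughly as follows, in Python. This is only an illustration under
stated assumptions: SyncState and fetch_page are hypothetical names,
fetch_page(after_id) stands in for one paged MAM query returning
(list of (mam_id, body), complete), and every message, live or
archived, is assumed to carry its archive ID. The point is that the
sync cursor is tracked separately from the stored messages, so live
traffic arriving during catch-up never moves it.

```python
class SyncState:
    """Hypothetical client-side state for MAM catch-up."""

    def __init__(self, cursor=None):
        self.cursor = cursor   # last archive ID confirmed by MAM paging
        self.messages = {}     # archive ID -> body

    def on_live_message(self, mam_id, body):
        # Store the message, but leave the cursor alone: a live message
        # says nothing about what is still missing before it.
        self.messages.setdefault(mam_id, body)

    def catch_up(self, fetch_page):
        # Page forward from the cursor until the archive reports that
        # it is complete (or returns an empty page).
        while True:
            page, complete = fetch_page(self.cursor)
            for mam_id, body in page:
                # setdefault skips messages already received live.
                self.messages.setdefault(mam_id, body)
            if page:
                # A real client would persist the cursor here, together
                # with the page, so a crash resumes from this point.
                self.cursor = page[-1][0]
            if complete or not page:
                return
```

Messages that arrive without an archive ID (for example, offline
messages on a server that does not inject one) cannot be de-duplicated
this way, which is exactly the gap raised above.
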