[Standards] LAST CALL: XEP-0313 (Message Archive Management)

Georg Lukas georg at op-co.de
Tue Sep 7 15:22:44 UTC 2021

Hello everyone,

given that XEP-0313 is up for recall in tomorrow's Council meeting, I'd
like to sort my long list of issues into blocking and non-blocking.
Fixing the blocking ones is a requirement for me to change my vote to
something other than -1.

Two blocking (non-Editorial) issues:

* Georg Lukas <georg at op-co.de> [2021-03-31 18:50]:
> Time and again, specifications that use Message Forwarding have
> fallen victim to impersonation attacks (there are a number of CVEs
> around, like CVE-2017-5589, CVE-2019-16235, and CVE-2020-26547).
> The XEP urgently needs a corresponding section in the Security
> Considerations, and ideally also a negative example like
> https://xmpp.org/extensions/xep-0280.html#example-11

I'd also copy&paste a bunch of <cve/> elements from XEP-0280, in
addition to the above text.
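
To illustrate the kind of check such a Security Considerations section would ask clients to make, here is a minimal sketch. It assumes the rule that a result from the user's own archive arrives with the account's bare JID as 'from', or with no 'from' at all, and that a result from any other archive (e.g. a MUC) carries exactly the JID that was queried. The function and parameter names are hypothetical, and the lowercase comparison is a simplification of real JID normalization.

```python
from typing import Optional


def bare_jid(jid: str) -> str:
    """Strip the resource part ("user@host/resource" -> "user@host")."""
    return jid.split("/", 1)[0].lower()


def is_trusted_mam_result(stanza_from: Optional[str], own_jid: str,
                          queried_archive: Optional[str] = None) -> bool:
    """Accept a forwarded MAM result only from the archive that was queried.

    stanza_from     -- 'from' of the outer <message/>, None if absent
    own_jid         -- full or bare JID of the local account
    queried_archive -- JID the query was sent to, None for the own archive
    """
    expected = bare_jid(queried_archive or own_jid)
    if stanza_from is None:
        # A missing 'from' stands for the user's own account, so it is only
        # acceptable when the own archive was queried.
        return expected == bare_jid(own_jid)
    # Compare the complete 'from' value: "me@host/evil" is a different
    # entity than the bare account JID and must not pass.
    return stanza_from.lower() == expected
```

A client would run this on the outer message before unwrapping the forwarded payload, and drop the stanza (or treat it as an ordinary message) when the check fails.
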

> §6.1.1: the business rules for user archives are inadequate:
> MUC messages in user archive: I think that implementation practice has
> clearly shown that storing MUC messages in the user archive is a Bad
> Idea, and nobody is doing it anyway. Also the server is probably not in
> a position to track a user's MUC activity and query all MUCs for whether
> they implement some sort of message storage. This part should be
> converted into "SHOULD NOT" or "MUST NOT".

After some discussion it was suggested to change the scope of this from
"should not store" to "should not return by default", which is also fine
with me.

One probably-non-blocking can-of-worms issue:
> Furthermore, I'm not sure if messages received by a client from offline
> history are supposed to contain the respective MAM-ID, so deduplicating
> here might be very adventurous, as the same message might arrive from
> offline history without a MAM ID and from MAM with a MAM ID, and the @id
> attribute might not be unique.

I'm not sure how implementations other than Prosody handle this, but I'd
love to see MAM servers also inject MAM-IDs into offline messages, and
to have that explicitly written in 0313.
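
A small sketch of why this matters for a client. The classes below are hypothetical; the only assumption is that a client de-duplicates on the MAM-ID and has nothing reliable to fall back on when it is missing.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Incoming:
    sender: str
    msg_id: Optional[str]   # the @id attribute, chosen by the sender
    mam_id: Optional[str]   # archive ID added by our server, if any
    body: str


class Deduplicator:
    """Tracks MAM-IDs that were already stored locally."""

    def __init__(self) -> None:
        self._seen_mam_ids: Set[str] = set()

    def accept(self, msg: Incoming) -> bool:
        """Return True if the message should be stored, False if duplicate."""
        if msg.mam_id is None:
            # No archive ID: the message cannot be matched against a later
            # MAM fetch, and @id is not guaranteed to be unique.
            return True
        if msg.mam_id in self._seen_mam_ids:
            return False
        self._seen_mam_ids.add(msg.mam_id)
        return True
```

If the offline copy carries the MAM-ID, the later copy from the archive is rejected. If it does not, both copies are accepted and the user sees the message twice.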

Non-blocking issues:
> Storage rules: those look very much like the original ones from the
> initial specification, and I think we have learned much since then.
> Prosody will store "normal" messages with a body, or "chat"
> messages that are not empty after stripping. By default, it will strip
> chat states, but it will count origin-id or <x muc> as elements that are
> worth storing.
> Part of the problem is an implementation issue with storing the stripped
> message and not the original <https://issues.prosody.im/1423>, but the
> general lack of clearly defined storage rules remains.
> This XEP needs something like the 0280 Recommended Rules
> <https://xmpp.org/extensions/xep-0280.html#recommended-rules> but it
> should be part of the XEP and not a later addition guarded by a separate
> namespace. Maybe.
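
For reference, the Prosody behaviour described above could be written down roughly like this. It is a sketch of the rule as summarized in this mail, not Prosody's actual code; the function name is made up, and refusing every other message type is an assumption added to keep the example complete.

```python
import xml.etree.ElementTree as ET

CLIENT_NS = "jabber:client"
CHATSTATES_NS = "http://jabber.org/protocol/chatstates"


def should_archive(message_xml: str) -> bool:
    """Decide whether a <message/> stanza would be stored in the archive."""
    msg = ET.fromstring(message_xml)
    msg_type = msg.get("type", "normal")  # a missing type means "normal"

    if msg_type == "normal":
        # "normal" messages are stored only if they have a <body/>.
        return msg.find(f"{{{CLIENT_NS}}}body") is not None

    if msg_type == "chat":
        # Strip chat states; anything left over (body, origin-id,
        # <x/> from MUC, ...) makes the message worth storing.
        remaining = [child for child in msg
                     if not child.tag.startswith(f"{{{CHATSTATES_NS}}}")]
        return len(remaining) > 0

    # Assumption for this sketch: other types are not archived.
    return False
```

Note that a chat message containing nothing but a chat state and an origin-id is stored under this rule, which is probably not what anybody wants.
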
> Also it would be great to persist message errors for sent messages. But
> this is a separate can of worms.
>
> My comment from the last 0313 LC about letting the client know if the
> MAM preferences are still "undefined", so that the client can ask the
> user once, now applies to XEP-0441, so I hope I'll remember to bring it
> up in the corresponding Last Call.
>
> The Business Rules section needs clear guidance to client
> implementations that want to do "full sync", i.e. obtain all messages
> received by the account since the client was last online, without too
> many duplicates.
> This is a complex problem because offline messages might contain
> everything that is also in MAM, or might have been drained by another
> client in the meantime, so that offline messages will only give you the
> end of chat history.
> There is a separate standards@ thread regarding how to treat offline
> history for MAM-enabled clients, but it only solves part of the above
> problem.
> There is no "atomic" switch between fetching messages from MAM and
> receiving live traffic, so a client needs to remember the last locally
> known MAM ID before sending initial presence, then request MAM after
> that last-known-MAM-ID until it either catches up with offline history
> or depletes the MAM archive, duplicating messages between offline
> history and MAM fetching.
> The naive approach of first fetching MAM, then sending initial presence
> causes a subtle race condition for messages that are delivered to your
> account in the brief moment after you completed fetching from MAM.
> There is also a problem if a client crashes during this catch-up phase
> (this is more common on mobile systems than you'd hope), as it needs to
> either persist the last-known-MAM-ID or keep the incoming MAM fetch in
> memory until everything is processed, as otherwise it would populate the
> message database with new MAM-IDs that would be incorrectly considered
> as the new last-known-MAM-ID after a client restart.
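
The catch-up procedure above can be sketched as follows. Everything here is hypothetical: FakeArchive stands in for a paged MAM query, the local database is a dict keyed by MAM-ID, and live messages are assumed to carry a MAM-ID so that they land in the same dict. The point is only the order of operations: read the checkpoint before initial presence, page forward from it, and do not overwrite the persisted checkpoint until the loop has finished.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Tuple[str, str]            # (mam_id, body)
Page = Tuple[List[Message], bool]    # (messages, archive exhausted?)


class FakeArchive:
    """Stand-in for a server archive that answers paged 'after' queries."""

    def __init__(self, messages: List[Message], page_size: int = 2) -> None:
        self.messages = list(messages)
        self.page_size = page_size

    def fetch_after(self, after_id: Optional[str]) -> Page:
        ids = [mam_id for mam_id, _ in self.messages]
        start = ids.index(after_id) + 1 if after_id is not None else 0
        page = self.messages[start:start + self.page_size]
        complete = start + self.page_size >= len(self.messages)
        return page, complete


def catch_up(fetch_after: Callable[[Optional[str]], Page],
             checkpoint: Optional[str],
             db: Dict[str, str]) -> Optional[str]:
    """Fetch everything after `checkpoint`; return the new checkpoint.

    The caller persists the return value only after this function has
    returned, so a crash in the middle leaves the old checkpoint in place.
    """
    cursor = checkpoint
    while True:
        messages, complete = fetch_after(cursor)
        for mam_id, body in messages:
            # Keyed by MAM-ID: copies that already arrived live are kept
            # as they are, and a repeated fetch after a crash is harmless.
            db.setdefault(mam_id, body)
            cursor = mam_id
        if complete or not messages:
            return cursor
```

Because the store is idempotent, restarting from the old checkpoint after a crash only costs bandwidth. Messages that arrive without a MAM-ID (the offline history case) are exactly the ones this scheme cannot de-duplicate.
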
> We are still missing a "MAM subscription" mechanism, where a client
> would automatically receive the MAM-ID of all messages it sends, so that
> it can properly de-duplicate them from a later MAM fetch. As it is now,
> a client needs to exclude sent messages from the "obtain
> last-known-MAM-ID" algorithm, and then assign the MAM-ID for sent
> messages that are reflected to it on the next MAM fetch.
> Not sure which parts of that belong in bind2 though.
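
Until such a mechanism exists, the workaround for sent messages looks roughly like the sketch below. The data model is hypothetical, and matching a reflected message to its local copy by origin-id is an assumption of this example, not something 0313 prescribes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoredMessage:
    origin_id: str                 # client-chosen ID, used here for matching
    outgoing: bool
    mam_id: Optional[str] = None   # unknown for sent messages until reflected


def last_known_mam_id(db: List[StoredMessage]) -> Optional[str]:
    """Newest MAM-ID among received messages; sent messages are excluded."""
    for msg in reversed(db):
        if not msg.outgoing and msg.mam_id is not None:
            return msg.mam_id
    return None


def apply_reflection(db: List[StoredMessage], origin_id: str,
                     mam_id: str) -> bool:
    """Attach the MAM-ID from a later fetch to the matching sent message.

    Returns False if no local copy matches, in which case the caller
    would store the fetched message as new.
    """
    for msg in db:
        if msg.outgoing and msg.origin_id == origin_id and msg.mam_id is None:
            msg.mam_id = mam_id
            return True
    return False
```

The list is assumed to be in local insertion order, so scanning it backwards yields the most recent received message first.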