[Standards] LAST CALL: XEP-0313 (Message Archive Management)

Kim Alvefur zash at zash.se
Mon Sep 13 14:15:22 UTC 2021


On Thu, Sep 09, 2021 at 04:29:37PM +0100, Kevin Smith wrote:
>To summarise what I’ve said before:
>MUC in MAM kinda sucks, but groupchat doesn’t (necessarily) mean MUC.
>Sometimes people want groupchat messages in their archive because they want their archive to represent all those messages they received on any client, either for online searching, or offline sync, and MAM is the only mechanism we have for this.
>If we were to disallow groupchat in MAM completely, it would make existing deployed workflows unavailable.
>There are countless pitfalls to having MUC (rather than groupchat) in MAM, but prohibiting it just isn’t much of an option. I don’t much like where we are more than anyone else, but it *is* where we are, and I don’t think anyone has the time machine yet to go back and fix MUC preemptively.
>After discussion with Georg yesterday, I’ve submitted a new version of MAM that uses Dave’s suggested approach.

I'm assuming it's this PR here:

As a counterproposal: what about a `list-multi` of requested message types?
It would default to [chat, normal], or to [chat, normal, groupchat] for
deployments that include groupchat by default. It could also be extended to
PEP notification headline messages, which are similarly awkward.
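
To make the idea concrete, here is a rough sketch of how such a field could look and behave. Nothing here is specified anywhere: the field name `message-types` is invented for illustration, and the archive is modelled as a plain list of dicts rather than real stanzas. Only the `jabber:x:data` form layout and the `urn:xmpp:mam:2` FORM_TYPE follow existing conventions.

```python
import xml.etree.ElementTree as ET

DATAFORM_NS = "jabber:x:data"
MAM_NS = "urn:xmpp:mam:2"
# Hypothetical field name, NOT defined in XEP-0313; used for illustration only.
TYPES_FIELD = "message-types"
# Assumed server default when the client does not send the field.
DEFAULT_TYPES = ["chat", "normal"]


def build_query_form(types=None):
    """Build a data form for a MAM query, optionally listing wanted types."""
    x = ET.Element(f"{{{DATAFORM_NS}}}x", {"type": "submit"})
    form_type = ET.SubElement(
        x, f"{{{DATAFORM_NS}}}field", {"var": "FORM_TYPE", "type": "hidden"}
    )
    ET.SubElement(form_type, f"{{{DATAFORM_NS}}}value").text = MAM_NS
    if types is not None:
        field = ET.SubElement(
            x, f"{{{DATAFORM_NS}}}field", {"var": TYPES_FIELD, "type": "list-multi"}
        )
        for message_type in types:
            ET.SubElement(field, f"{{{DATAFORM_NS}}}value").text = message_type
    return x


def requested_types(form, default=DEFAULT_TYPES):
    """Return the types listed in the form, or the default if the field is absent."""
    for field in form.findall(f"{{{DATAFORM_NS}}}field"):
        if field.get("var") == TYPES_FIELD:
            return [v.text for v in field.findall(f"{{{DATAFORM_NS}}}value")]
    return list(default)


def filter_archive(messages, form, default=DEFAULT_TYPES):
    """Keep only archived messages whose type the query asked for."""
    wanted = set(requested_types(form, default))
    return [m for m in messages if m["type"] in wanted]
```

A client that only wants direct messages would omit the field (or send chat and normal), while one doing full offline sync could add groupchat, and later headline, without the server having to guess.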

Mentioned in the xsf@ chat:


>> On 8 Sep 2021, at 09:00, Georg Lukas <georg at op-co.de> wrote:
>> * Kevin Smith <kevin.smith at isode.com> [2021-09-07 18:41]:
>>> At the risk of repeating myself, I think that storing groupchat messages in the user archives is helpful, and people do this in the wild.
>> Right, I remember hearing that before, and IIRC the reason for that was
>> to allow for server-side message search?
>> Now I have multiple practical questions regarding how this is supposed
>> to work.
>> First, how and when are groupchat messages from MAM delivered to a
>> client? I can understand that it will mostly work well when querying for
>> the room JID. But what happens on a query that's only filtered by
>> `start` or `after-id`? Will the server intermix all groupchat messages
>> with all direct messages? Will it only send groupchats from the rooms
>> that some client of that user is currently joined? Only the rooms that
>> the querying client is joined? ...was joined in the past? Groupchats
>> typically have a much higher signal-to-noise ratio and could
>> significantly delay the loading of the really important messages here ;)
>> Should there be a difference between "channels" (public semi-anon rooms)
>> and "group chats" (closed non-anon rooms)?
>> Second, how does the MAM service ensure that the MUC history is complete
>> and does not contain holes, e.g. because all of the user's clients left
>> the room at a certain time, or due to s2s outages? Or is there no such
>> guarantee, rendering the archive less than useful? Will the personal
>> archive re-populate MUC history when a client does a MAM query on the
>> MUC archive? Should the personal archive do MAM requests on its own?
>> Third, how is deduplication supposed to work? Will the user's archive
>> add its own <stanza-id> and only allow querying by that? How is a client
>> going to consolidate MUC messages based on their MUC-assigned stanza-ids
>> with ones from the personal archive - or is the client supposed to
>> ignore the MUC-assigned IDs?
>> Fourth, a personal MAM archive MAY exclude groupchat messages if these
>> are already archived on the MUC JID. There is no explicit signalling for
>> this, so I assume the most straight-forward implementation would be to
>> check all passing messages for the presence of a stanza-id field added
>> by the MUC JID, and to prevent storage of these. Let's ignore that a MUC
>> service or a room might change its archival preferences over time,
>> we are still lacking a mechanism for the client to decide which JID to
>> query to obtain a MUC history. Should it first query the personal
>> archive and only fall back to the MUC archive if it receives an error?
>> An empty result set?
>>> So if there *was* somehow agreement for forbidding it, it would need a
>>> namespace bump, because it used to be allowed (and, indeed,
>>> recommended).
>> Well, given that a server MAY exclude groupchat messages if history is
>> accessible through other means, and given that 0045 includes a mechanism
>> for fetching history, I would say that a namespace bump is not needed ;-)
>> Georg
>> _______________________________________________
>> Standards mailing list
>> Info: https://mail.jabber.org/mailman/listinfo/standards
>> Unsubscribe: Standards-unsubscribe at xmpp.org