[Standards] LAST CALL: XEP-0313 (Message Archive Management)

Georg Lukas georg at op-co.de
Wed Sep 8 08:00:47 UTC 2021

* Kevin Smith <kevin.smith at isode.com> [2021-09-07 18:41]:
> At the risk of repeating myself, I think that storing groupchat messages in the user archives is helpful, and people do this in the wild.

Right, I remember hearing that before, and IIRC the reason for that was
to allow for server-side message search?

Now I have multiple practical questions regarding how this is supposed
to work.

First, how and when are groupchat messages from MAM delivered to a
client? I can understand that it will mostly work well when querying for
the room JID. But what happens on a query that's only filtered by
`start` or `after-id`? Will the server intermix all groupchat messages
with all direct messages? Will it only send groupchats from the rooms
that some client of that user is currently joined to? Only the rooms
that the querying client is joined to? ...was joined to in the past?
Groupchats typically have a much lower signal-to-noise ratio and could
significantly delay the loading of the really important messages here ;)
Should there be a difference between "channels" (public semi-anon rooms)
and "group chats" (closed non-anon rooms)?
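To make the problematic case concrete, here is a minimal sketch of the
kind of query I mean, built with Python's standard ElementTree. The
helper name `build_mam_query` and its parameters are made up for
illustration; only the `urn:xmpp:mam:2` namespace and the `start`,
`after-id` and `with` form fields come from the XEP. Without a `with`
field, nothing in the request tells the server whether groupchat
messages are wanted.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
FORM_NS = "jabber:x:data"


def build_mam_query(query_id, start=None, after_id=None, with_jid=None):
    """Build a MAM query <iq/>; build_mam_query is a hypothetical helper."""
    iq = ET.Element("iq", {"type": "set", "id": query_id})
    query = ET.SubElement(iq, f"{{{MAM_NS}}}query", {"queryid": query_id})
    form = ET.SubElement(query, f"{{{FORM_NS}}}x", {"type": "submit"})

    def add_field(var, value, hidden=False):
        attrs = {"var": var}
        if hidden:
            attrs["type"] = "hidden"
        field = ET.SubElement(form, f"{{{FORM_NS}}}field", attrs)
        ET.SubElement(field, f"{{{FORM_NS}}}value").text = value

    add_field("FORM_TYPE", MAM_NS, hidden=True)
    # Only the filters the caller actually supplied end up in the form.
    if start is not None:
        add_field("start", start)
    if after_id is not None:
        add_field("after-id", after_id)
    if with_jid is not None:
        add_field("with", with_jid)
    return iq


# A query filtered only by time: no room JID, no hint about groupchats.
iq = build_mam_query("q1", start="2021-09-07T18:41:00Z")
print(ET.tostring(iq, encoding="unicode"))
```

Passing `with_jid="room@muc.example"` gives the well-defined case; the
open question is what the server should return when it is omitted.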

Second, how does the MAM service ensure that the MUC history is complete
and does not contain holes, e.g. because all of the user's clients left
the room at a certain time, or due to s2s outages? Or is there no such
guarantee, rendering the archive less than useful? Will the personal
archive re-populate MUC history when a client does a MAM query on the
MUC archive? Should the personal archive do MAM requests on its own?

Third, how is deduplication supposed to work? Will the user's archive
add its own <stanza-id> and only allow querying by that? How is a client
going to consolidate MUC messages based on their MUC-assigned stanza-ids
with ones from the personal archive - or is the client supposed to
ignore the MUC-assigned IDs?
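One way a client could consolidate the two sources, sketched in Python
under assumptions of my own: the messages have already been unwrapped
from their MAM <result/> and <forwarded/> wrappers, the MUC-assigned
<stanza-id/> (XEP-0359, `urn:xmpp:sid:0`) survives in the personal
archive's copy, and the room's bare JID is the `by` value to trust. The
function names are hypothetical. If the personal archive strips or
replaces the MUC's ID, this approach has nothing to key on, which is
exactly my concern.

```python
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"


def stanza_ids(message):
    """Map each 'by' JID to the 'id' of its <stanza-id/> on this message."""
    return {
        el.get("by"): el.get("id")
        for el in message.findall(f"{{{SID_NS}}}stanza-id")
    }


def dedup_key(message):
    """Return (room, muc_stanza_id) for groupchat messages, else None."""
    if message.get("type") != "groupchat":
        return None
    room = message.get("from", "").split("/", 1)[0]  # bare room JID
    ids = stanza_ids(message)
    return (room, ids[room]) if room in ids else None


def merge(messages):
    """Keep the first copy of each message that carries a MUC-assigned ID."""
    seen, merged = set(), []
    for msg in messages:
        key = dedup_key(msg)
        if key is not None:
            if key in seen:
                continue  # already have this one from the other archive
            seen.add(key)
        merged.append(msg)
    return merged


from_muc = ET.fromstring(
    "<message from='room@muc.example/nick' type='groupchat'>"
    "<stanza-id xmlns='urn:xmpp:sid:0' by='room@muc.example' id='m1'/>"
    "</message>"
)
from_personal = ET.fromstring(
    "<message from='room@muc.example/nick' type='groupchat'>"
    "<stanza-id xmlns='urn:xmpp:sid:0' by='room@muc.example' id='m1'/>"
    "<stanza-id xmlns='urn:xmpp:sid:0' by='user@example.org' id='p7'/>"
    "</message>"
)
print(len(merge([from_muc, from_personal])))
```

Messages without a MUC-assigned ID are passed through untouched, so
this only helps for rooms that stamp their own IDs.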

Fourth, a personal MAM archive MAY exclude groupchat messages if these
are already archived on the MUC JID. There is no explicit signalling for
this, so I assume the most straightforward implementation would be to
check all passing messages for the presence of a <stanza-id> element
added by the MUC JID, and to prevent storage of these. Even if we ignore
that a MUC service or a room might change its archival preferences over
time, we are still lacking a mechanism for the client to decide which JID to
query to obtain a MUC history. Should it first query the personal
archive and only fall back to the MUC archive if it receives an error?
An empty result set?
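The server-side check I have in mind would look roughly like the
following Python sketch. The function names are mine, and treating "has
a <stanza-id/> whose `by` is the room's bare JID" as proof that the room
archives the message is my assumption, not something the XEP spells out.

```python
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"


def archived_by_room(message):
    """True if a groupchat message carries a <stanza-id/> stamped by its room."""
    if message.get("type") != "groupchat":
        return False
    room = message.get("from", "").split("/", 1)[0]  # bare room JID
    return any(
        el.get("by") == room
        for el in message.findall(f"{{{SID_NS}}}stanza-id")
    )


def should_store(message):
    """Hypothetical personal-archive policy: skip what the room already keeps."""
    return not archived_by_room(message)


msg = ET.fromstring(
    "<message from='room@muc.example/nick' type='groupchat'>"
    "<stanza-id xmlns='urn:xmpp:sid:0' by='room@muc.example' id='m1'/>"
    "</message>"
)
print(should_store(msg))
```

The decision is made per message and is invisible to the client, which
is why the client cannot tell from the outside which archive holds a
given room's history.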

> So if there *was* somehow agreement for forbidding it, it would need a
> namespace bump, because it used to be allowed (and, indeed,
> recommended).

Well, given that a server MAY exclude groupchat messages if history is
accessible through other means, and given that 0045 includes a mechanism
for fetching history, I would say that a namespace bump is not needed ;-)
