[Standards] MAM Sync Strategies

JC Brand lists at opkode.com
Fri Aug 27 07:57:32 UTC 2021

Hi Sam et al

Webclients have restrictions that others don't, so while what you wrote 
makes sense, I do something a bit different with Converse.

First, depending on the type of storage used (sessionStorage vs 
localStorage vs IndexedDB), a webclient can easily run into a storage 
limit (around 5MB for localStorage). IndexedDB doesn't have such a low 
limit (its quota is browser-dependent and usually much larger), but you 
can't always assume that it's available (e.g. it's not in incognito mode 
in Safari) or that it's desirable to use it.

Additionally, the user can at any time navigate away from the tab or 
reload the tab, thereby interrupting the process of populating the archive.

In theory this second problem can be solved by using a shared worker, 
which can still populate the history in the background (I've added 
support to Strophe.js), but Safari doesn't support shared workers and 
they bring other complications and considerations.

So, instead of fetching the history for every contact in the roster, I 
only do so for open chats (i.e. ongoing 1:1 and MUC conversations).

If there are no messages in the history, I do a reverse order query 
using "before" set to an empty string together with a configurable limit 
(similarly to how you do it).
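
To make the shape of that first query concrete, here is a minimal sketch 
of the stanza for a 1:1 chat. It is not taken from Converse: the function 
names are hypothetical, and only the XML layout follows XEP-0313 
(urn:xmpp:mam:2) and XEP-0059, where an empty <before/> element asks for 
the last page of the result set.

```typescript
// Hypothetical helpers for illustration; not Converse's or Strophe's API.
const MAM_NS = "urn:xmpp:mam:2";
const RSM_NS = "http://jabber.org/protocol/rsm";

// Escape characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/'/g, "&apos;")
    .replace(/"/g, "&quot;");
}

// Build a MAM query for the most recent `limit` messages exchanged with
// `withJid`. The empty <before/> requests the last page of the archive.
function buildLatestPageQuery(withJid: string, limit: number, iqId: string): string {
  return [
    `<iq type='set' id='${escapeXml(iqId)}'>`,
    `<query xmlns='${MAM_NS}'>`,
    `<x xmlns='jabber:x:data' type='submit'>`,
    `<field var='FORM_TYPE' type='hidden'><value>${MAM_NS}</value></field>`,
    `<field var='with'><value>${escapeXml(withJid)}</value></field>`,
    `</x>`,
    `<set xmlns='${RSM_NS}'>`,
    `<max>${Math.floor(limit)}</max>`,
    `<before/>`,
    `</set>`,
    `</query>`,
    `</iq>`,
  ].join("");
}
```

For a MUC, the same query would be addressed to the room JID instead of 
carrying a "with" field.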

If there are messages in the history, I set "after" to the stanza-id of 
the most recent message. Of course, there might be more new messages than 
fit on a single returned MAM page. Whether Converse fetches the 
remaining pages is configurable, and depends on whether it has limited 
storage or not. Where storage is limited, Converse should be 
configured not to fetch all subsequent pages. This creates a gap in the 
history. In the soon-to-be-released Converse 8, I create a special gap 
indicator, which is shown to the user in the chat history, to indicate 
that there are missing messages. The user can click on this indicator 
to fill the gap with a new MAM query.
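
The decision made after each returned page can be sketched roughly as 
follows. This is an illustration under assumptions, not Converse's code: 
the type and function names are hypothetical, and "complete" stands for 
the attribute of the same name on the <fin/> element that XEP-0313 
returns with each page.

```typescript
// Hypothetical shapes for illustration only.
interface MamPage {
  ids: string[];      // stanza-ids of the messages on this page, oldest first
  complete: boolean;  // value of the `complete` attribute on <fin/>
}

type NextStep =
  | { action: "done" }
  | { action: "fetch-next"; afterId: string }
  | { action: "mark-gap"; afterId: string };

// Decide what to do once a page of an "after" query has been stored.
function afterPage(page: MamPage, fetchAllPages: boolean): NextStep {
  // Nothing left on the server: the local history is up to date.
  if (page.complete || page.ids.length === 0) {
    return { action: "done" };
  }
  const lastId = page.ids[page.ids.length - 1];
  // More messages exist. Either keep paging forwards, or stop here and
  // record a gap that starts right after the last message we stored.
  return fetchAllPages
    ? { action: "fetch-next", afterId: lastId }
    : { action: "mark-gap", afterId: lastId };
}
```

When the user later clicks the gap indicator, the stored afterId is 
enough to resume with another "after" query from that point.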

The potential presence of gaps and how to deal with them is something 
that I don't see mentioned in Sam's description. Probably because with a 
desktop client you can just fetch all messages and don't have to worry 
about gaps.

Then there's also an edge case, where there is a message history, but 
for some reason we don't have a stable stanza-id. In that case, I get 
the time of the most recent message (taking into account a possible 
<delay> element), and then I do a "before" query set to that date.
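
Putting the three cases together, the choice of query could be sketched 
like this. Again this is only an assumption-laden illustration, not the 
Converse implementation: all names are hypothetical, and how the 
timestamp in the fallback case is mapped onto the actual MAM query 
fields is deliberately left out.

```typescript
// Hypothetical message record for illustration only.
interface StoredMessage {
  stanzaId?: string;  // stable stanza-id, if the message has one
  received: string;   // ISO 8601 time at which the client received it
  delay?: string;     // ISO 8601 stamp from a <delay> element, if present
}

type SyncQuery =
  | { kind: "latest-page"; before: ""; max: number }
  | { kind: "after-id"; after: string; max: number }
  | { kind: "by-time"; time: string; max: number };

// Prefer the <delay> stamp, since it reflects when the message was sent.
function effectiveTime(msg: StoredMessage): string {
  return msg.delay ?? msg.received;
}

function chooseSyncQuery(history: StoredMessage[], max: number): SyncQuery {
  // Empty history: ask for the last page of the archive.
  if (history.length === 0) {
    return { kind: "latest-page", before: "", max };
  }
  // Find the most recent message by its effective timestamp.
  const latest = history.reduce((a, b) =>
    Date.parse(effectiveTime(b)) >= Date.parse(effectiveTime(a)) ? b : a
  );
  // Normal case: continue after the last known stanza-id.
  if (latest.stanzaId) {
    return { kind: "after-id", after: latest.stanzaId, max };
  }
  // Edge case: no stable stanza-id, so fall back to the timestamp.
  return { kind: "by-time", time: effectiveTime(latest), max };
}
```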

For those who might be interested, here's the code that implements the 
logic described above:


On 09.08.21 13:46, Sam Whited wrote:
> Hi all,
> I started a PR against modernxmpp to document MAM sync strategies after
> a discussion on jdev yesterday:
> https://github.com/modernxmpp/modernxmpp/pull/41
> I wondered if anyone would share what their sync strategy is (or even
> possibly add it to that PR) so that we can document a few clients and
> maybe move towards an XEP that outlines one or two ideal ones?
> I'll start with the one I described in the chat yesterday that's used
> (experimentally) by Mellium/Communiqué:
> On client start iterate through all items in the roster. If no messages
> exist in the local archive: Query in reverse order (in case the server
> breaks it up by page and we end up committing pages separately) with
> before: now && limit: X (where X is some configurable number, what we
> think will fit on the page with some margin, etc.). Otherwise query with
> after-id: <last message> (making sure that the last message was pulled
> from the DB before we send initial presence).
> If the user scrolls to the top of the history, query in reverse order
> with before-id: <first message>. Fetch the next page for as long as they
> continue to scroll up.
> Thanks,
> Sam

