[Standards] MAM Sync Strategies

Thilo Molitor thilo at eightysoft.de
Fri Aug 27 13:34:34 UTC 2021


Hi Sam et al,

for Monal we do something a bit different (a mixture of what you wrote and what 
JC wrote):

When the user first sets up a new account, we query the archive with 
end=<current datetime> and RSM {max=1, before=""}.
The stanza-id of the returned message is then used as the sync point the next 
time Monal has to do a catchup via MAM (e.g. when no XEP-0198 resume is 
possible; Monal is capable of resuming even if the device was restarted or the 
app killed etc.).
This sync-point will be updated to the newest stanza-id for every incoming 
message.
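
As a rough illustration of the initial query described above, here is a 
minimal sketch in Python that builds the IQ. It assumes the urn:xmpp:mam:2 
namespace and the usual data form and RSM namespaces; the function name 
build_initial_query and the id handling are hypothetical and not taken from 
Monal's code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

MAM_NS = "urn:xmpp:mam:2"            # assumed MAM version
FORM_NS = "jabber:x:data"
RSM_NS = "http://jabber.org/protocol/rsm"


def build_initial_query(query_id: str, now: datetime) -> ET.Element:
    """Build a MAM query for the single newest message archived before `now`."""
    iq = ET.Element("iq", {"type": "set", "id": query_id})
    query = ET.SubElement(iq, f"{{{MAM_NS}}}query", {"queryid": query_id})

    # Data form carrying the end=<current datetime> filter.
    form = ET.SubElement(query, f"{{{FORM_NS}}}x", {"type": "submit"})
    form_type = ET.SubElement(
        form, f"{{{FORM_NS}}}field", {"var": "FORM_TYPE", "type": "hidden"}
    )
    ET.SubElement(form_type, f"{{{FORM_NS}}}value").text = MAM_NS
    end = ET.SubElement(form, f"{{{FORM_NS}}}field", {"var": "end"})
    ET.SubElement(end, f"{{{FORM_NS}}}value").text = now.astimezone(
        timezone.utc
    ).strftime("%Y-%m-%dT%H:%M:%SZ")

    # RSM {max=1, before=""}: an empty <before/> asks for the last page.
    rsm = ET.SubElement(query, f"{{{RSM_NS}}}set")
    ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = "1"
    ET.SubElement(rsm, f"{{{RSM_NS}}}before")
    return iq
```

The stanza-id of the one message in the result would then be stored as the 
initial sync point.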

For the catchup we just query everything with RSM {after=<sync-point (e.g. last 
seen stanza-id)>, max=50} and page through all results.
We delay processing of all incoming 1:1 "live" message stanzas while doing the 
MAM catchup to make sure the message order is preserved for OMEMO etc.

Once the MAM catchup is done, all queued "live" message stanzas are 
processed.
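
The catchup and the delaying of "live" stanzas could be sketched roughly as 
follows. This is not Monal's implementation: the class CatchupSession, the 
fetch_page hook and the (stanza_id, body) tuples are hypothetical stand-ins, 
and deduplication of messages that arrive both live and via MAM is not 
handled here.

```python
from typing import Callable, List, Tuple

# (stanza_id, body) pairs stand in for full message stanzas.
Message = Tuple[str, str]
# Hypothetical transport hook: (after_id, page_size) -> (messages, complete)
FetchPage = Callable[[str, int], Tuple[List[Message], bool]]


class CatchupSession:
    def __init__(self, sync_point: str, fetch_page: FetchPage,
                 handle: Callable[[Message], None], page_size: int = 50):
        self.sync_point = sync_point
        self._fetch_page = fetch_page
        self._handle = handle
        self._page_size = page_size
        self._catching_up = False
        self._delayed: List[Message] = []

    def on_live_message(self, msg: Message) -> None:
        """Process a live message, or hold it back during a catchup."""
        if self._catching_up:
            self._delayed.append(msg)
        else:
            self._process(msg)

    def catch_up(self) -> None:
        """Page through the archive with RSM after=<sync-point>."""
        self._catching_up = True
        complete = False
        while not complete:
            page, complete = self._fetch_page(self.sync_point, self._page_size)
            if not page:
                break
            for msg in page:
                self._process(msg)
        self._catching_up = False
        # Archive is drained: now process the delayed live messages in order.
        delayed, self._delayed = self._delayed, []
        for msg in delayed:
            self._process(msg)

    def _process(self, msg: Message) -> None:
        # Every processed message moves the sync point forward.
        self.sync_point = msg[0]
        self._handle(msg)
```

With one such session per archive (the account itself plus each MUC), a 
catchup for one MUC would only delay live messages for that MUC.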

We do the same for every MUC catchup (every MUC has its own sync-point of 
course; every MUC catchup delays only "live" message stanzas for that MUC, not 
other MUCs or 1:1).

If we get item-not-found for a catchup query, we will query the whole archive.
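
A minimal sketch of this fallback, assuming a hypothetical run_query hook 
that raises ItemNotFound when the server no longer knows the sync point 
(both names are invented for illustration):

```python
from typing import Callable, List, Optional


class ItemNotFound(Exception):
    """Stands in for an <item-not-found/> error reply to the MAM query."""


def catch_up_with_fallback(
    run_query: Callable[[Optional[str]], List[str]],
    sync_point: Optional[str],
) -> List[str]:
    """Query after the sync point; on item-not-found, query the whole archive."""
    try:
        return run_query(sync_point)
    except ItemNotFound:
        # The sync point is unknown to the server (e.g. it expired),
        # so drop the `after` constraint and fetch everything.
        return run_query(None)
```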

Most of the time the XEP-0198 session will be resumed by Monal, so MAM catchups 
are really rare.

- tmolitor



On Friday, 27 August 2021, 09:57:32 CEST, JC Brand wrote:
> Hi Sam et al
> 
> Webclients have restrictions that others don't, so while what you wrote
> makes sense, I do something a bit different with Converse.
> 
> First, depending on the type of storage used (sessionStorage vs
> localStorage vs IndexedDB), a webclient can easily run into a storage
> limit (around 5MB for localStorage).
> IndexedDB doesn't have a storage limit, but you can't always assume that
> it's available (e.g. it's not in incognito mode in Safari) or desirable
> to use it.
> 
> Additionally, the user can at any time navigate away from the tab or
> reload the tab, thereby interrupting the process of populating the archive.
> 
> In theory this second problem can be solved by using a shared worker,
> which can still populate the history in the background (I've added
> support to Strophe.js), but Safari doesn't support shared workers and
> they bring other complications and considerations.
> 
> So, instead of fetching the history for every contact in the roster, I
> only do so for open chats (i.e. ongoing 1:1 and MUC conversations).
> 
> If there are no messages in the history, I do a reverse order query
> using "before" set to an empty string together with a configurable limit
> (similarly to how you do it).
> 
> If there are messages in the history, I set "after" to the stanza-id of
> the most recent message. Now of course there might be more messages than
> might fit on the returned MAM page. Whether Converse will fetch the
> other pages is configurable, and depends on whether it has limited
> storage or not. In the case where storage is limited, Converse should be
> configured to not fetch all subsequent pages. This creates a gap in the
> history. In the soon-to-be-released Converse 8, I create a special gap
> indicator, which is shown to the user in the chat history, to indicate
> that there are missing messages. The user can click on this indicator,
> to fill it with a new MAM query.
> 
> The potential presence of gaps and how to deal with them is something
> that I don't see mentioned in Sam's description. Probably because with a
> desktop client you can just fetch all messages and don't have to worry
> about gaps.
> 
> Then there's also an edge case, where there is a message history, but
> for some reason we don't have a stable stanza-id. In that case, I get
> the time of the most recent message (taking into account a possible
> <delay> element), and then I do a "before" query set to that date.
> 
> For those who might be interested, here's the code that implements the
> logic described above:
> https://github.com/conversejs/converse.js/blob/8d62c2b103f70e0a1e82787b5c464e8b2d53cc01/src/headless/plugins/mam/utils.js#L201
> 
> Regards
> JC
> 
> On 09.08.21 13:46, Sam Whited wrote:
> > Hi all,
> > 
> > I started a PR against modernxmpp to document MAM sync strategies after
> > a discussion on jdev yesterday:
> > 
> > https://github.com/modernxmpp/modernxmpp/pull/41
> > 
> > I wondered if anyone would share what their sync strategy is (or even
> > possibly add it to that PR) so that we can document a few clients and
> > maybe move towards an XEP that outlines one or two ideal ones?
> > 
> > I'll start with the one I described in the chat yesterday that's used
> > (experimentally) by Mellium/Communiqué:
> > 
> > On client start iterate through all items in the roster. If no messages
> > exist in the local archive: Query in reverse order (in case the server
> > breaks it up by page and we end up committing pages separately) with
> > before: now && limit: X (where X is some configurable number, what we
> > think will fit on the page with some margin, etc.). Otherwise query with
> > after-id: <last message> (making sure that the last message was pulled
> > from the DB before we send initial presence).
> > 
> > If the user scrolls to the top of the history, query in reverse order
> > with before-id: <first message>. Fetch the next page for as long as they
> > continue to scroll up.
> > 
> > Thanks,
> > Sam



