[Standards] XEP-0427: MAM Fastening Collation questions

Andrzej Wojcik andrzej.wojcik at tigase.net
Fri Jun 5 09:14:36 UTC 2020

Hi everyone,

I'm sorry but this will be a long email.

I've started the implementation of XEP-0427 with a goal to use the collation of fastenings (and pseudo-fastenings) to reduce traffic related to MAM history synchronization.

I was thinking about using `collate` summarizing to retrieve delivery receipts and chat markers. However, I'm not quite sure how it would really work.

In the section related to Pseudo-Fastening (https://xmpp.org/extensions/xep-0427.html#pseudo) there is the following note:

> Message Delivery Receipts: Message Delivery Receipts (XEP-0184) [5] "ack messages" - those containing a <received/> element - are considered to be equivalent to a fastening containing just the <received/> element, applying to the message given by the "id" attribute.

and that is quite clear. However, this means that the result from MAM (if I'm correct) would look like this:

<message id='aeb213' to='juliet@capulet.lit/chamber'>
  <result xmlns='urn:xmpp:mam:2' queryid='f27' id='28482-98726-73623'>
    <forwarded xmlns='urn:xmpp:forward:0'>
      <delay xmlns='urn:xmpp:delay' stamp='2010-07-10T23:08:25Z'/>
      <message xmlns='jabber:client'
        to='juliet@capulet.lit/balcony'
        from='romeo@montague.lit/orchard'>
        <body>Call me but love, and I'll be new baptized; Henceforth I never will be Romeo.</body>
      </message>
    </forwarded>
    <applied xmlns='urn:xmpp:mamfc:0'>
      <received xmlns='urn:xmpp:receipts'/>
    </applied>
  </result>
</message>

That looks OK for a 1-1 chat. But what about delivery confirmations forwarded by a MUC room or MIX channel? (Note: MIX messages are stored in the user's MAM archive.)

If I'm correct, we would get the same response as soon as any one of the recipients sent a delivery receipt. If many recipients sent delivery confirmations, we would still have one entry and no way to tell who actually sent each confirmation. So we only know that someone received this message, and it could even be our own client! The only extra information would be how many clients received the message (thanks to the 'count' attribute).
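To illustrate the concern, here is a minimal Python sketch of collation as I understand it. The sender addresses, the message id 'm1', and the collate() helper are hypothetical and only model my reading of the XEP, not any real server behaviour:

```python
from collections import Counter

# Hypothetical receipts relayed by a MUC room: (sender, id of the acknowledged message).
receipts = [
    ("room@muc.example/alice", "m1"),
    ("room@muc.example/bob", "m1"),
    ("room@muc.example/me", "m1"),  # our own other client
]

def collate(receipts):
    """Collapse receipts into one summary per target message, keeping only a count."""
    return dict(Counter(target for _sender, target in receipts))

summary = collate(receipts)
print(summary)  # {'m1': 3}
```

The summary says that three receipts exist for 'm1', but the sender of each receipt is gone, so the client cannot tell that one of them came from the user's own client.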

I think the same issue applies to Chat Markers. Moreover, XEP-0427 contains the following statement:
> Chat Markers: Chat Markers (XEP-0333) [6] A Chat Marker is similarly equivalent to a fastening containing the Chat Marker, but applying to all previous messages (since previous messages can be assumed to have been read and or displayed, etc).

So, should all messages preceding the message with a chat marker (e.g. <received/>) have a fastening in the summary? Each of them would then have the following element in its <result/>:

<applied xmlns='urn:xmpp:mamfc:0'>
    <received xmlns='urn:xmpp:chat-markers:0'/>
</applied>

If that is true, then we would have quite a large overhead. On top of that, counting confirmations separately for each type (received/displayed) and each message would add further overhead.
Moreover, in the case of a MUC room or MIX channel, it would be impossible to tell who actually confirmed what (who received? who read?). I'm not sure how aggregation should work in this case, and I think that XEP-0427 was actually targeting MUC/MIX, as there would be polls and reactions that would be nice to count.

I think that there is one more possible issue with XEP-0427, related to 'Last Archive ID'. The XEP states that while this value could be deduced, it suggests adding a <latest/> element that returns the id of the last element in the query (even if it is the id of a fastening message). That could work, but not when the client wants to use RSM for pagination (e.g. it was not connected for a long time and wants to sync in batches). Then it is possible that the id of the latest fastening would not even be in the original result set.

Example 1.

Let's say that the user has 200 archived messages since the 'start' date. The first message has stable id '1', and its delivery receipt sits at position 150 in this archive with stable id '150'. A client asking for the first 100 messages (assuming that all of them are messages and not fastenings) will receive 100 messages plus the fastening for the message with stable id '1'. In this case, <latest/> would be set to '150', as that was the latest id in the returned set. But when the client then asks (using RSM) for messages after <latest/>, it would receive the messages from positions 151 to 200. The messages from positions 101 to 149 would not be fetched and synced at all.
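Example 1 can be simulated with a short Python sketch. It assumes that stable ids equal archive positions, that a collated page contains the next messages plus every fastening attached to them, and that <latest/> is the highest id in the returned set. The collated_page() function is hypothetical and only models my reading of the XEP:

```python
# Assumed archive: ids equal positions 1..200; entry 150 is a receipt for message 1.
ARCHIVE = {i: ("receipt", 1) if i == 150 else ("message", None) for i in range(1, 201)}

def collated_page(after, limit):
    """Return (message ids, attached fastening ids, <latest/>) for one collated page."""
    messages = [i for i in sorted(ARCHIVE)
                if i > after and ARCHIVE[i][0] == "message"][:limit]
    fastenings = [i for i in sorted(ARCHIVE)
                  if ARCHIVE[i][0] == "receipt" and ARCHIVE[i][1] in messages]
    latest = max(messages + fastenings, default=after)
    return messages, fastenings, latest

# First batch, then a second batch fetched after the reported <latest/>.
page1, fast1, latest1 = collated_page(after=0, limit=100)
page2, fast2, latest2 = collated_page(after=latest1, limit=100)

all_messages = [i for i in sorted(ARCHIVE) if ARCHIVE[i][0] == "message"]
missed = [i for i in all_messages if i not in page1 and i not in page2]
print(latest1, missed[0], missed[-1], len(missed))  # 150 101 149 49
```

Under these assumptions the second query starts after id 150, so the 49 messages at positions 101 to 149 are never delivered.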

This issue causes another one. When an XMPP client uses RSM <after/> for fetching and pagination, then according to XEP-0427 it may receive the same message fastenings multiple times.

Example 2.

Let's say that the user has 200 archived messages since the 'start' date. The first message has stable id '1', and its delivery receipt sits at position 150 in this archive with stable id '150'. A client asking for the first 100 messages (assuming that all of them are messages and not fastenings) will receive 100 messages plus the fastening for the message with stable id '1'. Assuming that the client ignores the value of <latest/> and fetches again using <after/> set to the value of the <last/> element of the previous response, it would receive the messages in the range 101-200, and instead of the message at position 150 it would get that entry as a fastening (no <forwarded/>, just <applied/>).

Moreover, if the client always used the value of <last/> to fetch the next messages, it could end up in an infinite loop. This could happen if the archive held 300 messages, of which the first 100 were normal messages and the next 100 were just "fastenings", each pointing to a different message. That would give us 100 fastenings pointing to 100 different messages. A client asking for messages after stable id 120 (I assume that the stable id is equal to the position of the message in the set) would receive 100 fastenings (nothing more), and the <last/> id would actually match the id sent in the <after/> element, creating an infinite loop.

To sum up, I think that the idea of aggregating messages on the server side is good, but the XEP in its current state has some holes in it that make it unusable. I do not see how we could benefit from this XEP, even assuming that it would only be used for 'real' fastenings (removing aggregation of delivery receipts). Even if we knew that a message has a 'likes' count of 100, we would still need all the actual fastening messages to be able to update that count when a new message is received. It may happen that a user changed their reaction, so a different reaction should now be shown instead of the like, and if I recall correctly XEP-0422 (https://xmpp.org/extensions/xep-0422.html#replace) allows that.

Maybe I have not understood something in the XEP, or the XEP needs clarification, but I would prefer a XEP that would allow me to fetch messages from date to date (including after) and that would return the messages and the aggregated fastenings in that time period. That would allow XMPP clients to sync faster (knowing the actual state of a message (received/displayed), the client would update it in the local archive just once) and would reduce the load on the server (less data to aggregate). It would also be good to keep details about the sender of a fastening message (even if aggregated). That would allow us (in the case of MIX/MUC) to show who actually read a message and who just received it.
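As a rough illustration of the kind of aggregation I have in mind, here is a minimal Python sketch that groups fastenings per message and per type while keeping the senders. The sender names, the message id 'm1', and the aggregate_with_senders() helper are hypothetical and not taken from any XEP:

```python
from collections import defaultdict

# Hypothetical fastenings in the requested period: (sender, target message id, type).
fastenings = [
    ("alice", "m1", "received"),
    ("alice", "m1", "displayed"),
    ("bob", "m1", "received"),
]

def aggregate_with_senders(fastenings):
    """Group fastenings by target message and type, keeping the list of senders."""
    summary = defaultdict(lambda: defaultdict(list))
    for sender, target, kind in fastenings:
        summary[target][kind].append(sender)
    return {target: dict(kinds) for target, kinds in summary.items()}

aggregated = aggregate_with_senders(fastenings)
print(aggregated)  # {'m1': {'received': ['alice', 'bob'], 'displayed': ['alice']}}
```

With such a summary a client could still derive the counts (the length of each list) and also show that alice has read the message while bob has only received it.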

I hope that this is not too long and that someone will actually read it.

Dave, Kev: as you are the authors of XEP-0427, could you comment on these 'issues', or correct me if I have not understood the XEP correctly?

Andrzej Wójcik

XMPP: andrzej.wojcik at tigase.org
Email: andrzej.wojcik at tigase.net
