[Standards] xep-0313 missing features

Jonas Wielicki jonas at wielicki.name
Tue Mar 6 13:14:06 UTC 2018


On Tuesday, 6 March 2018 13:23:21 CET Lazar Otasevic wrote:
> Conclusion:
> 
> 1. it is not possible to iterate efficiently backwards by including both
> <before>&<after> in the query because once <after> is included in the query
> then it iterates forwards. that means when iterating backwards client has
> to omit <after> and fetch entire pages and then the last page will mostly
> overlap with some of the local messages, which is a waste.
> 2. it is not possible to determine "holes" in the archive reliably, because
> client can not know what is the last message archive-id, because our own
> sent messages have no feedback from the server once the message is archived
> what is its archive-id ... that means that client has to fetch ALL messages
> from MAM once again just to be sure that holes are filled, even though many
> message-bodies are already received/sent during live communication.
> 
> Basically In the current state all our own messages are "holes" in the
> local archive, not to mention all kind of "bad network" scenarios,
> multi-device and longer offline periods.
> 
> Making separate requests, one for archive ids and one for content would
> make:
> - much less waste in the sync because only ids would be wasted, and not the
> content
> - possible to fill multiple holes in one request by fetching content that
> is really needed
> - make possible for push payloads to contain only message ids (when clients
> want to handle encrypted messages locally by fetching them and only them)
> currently it is doable by giving to the push one id before the wanted
> message

So I might be wrong, but for the sake of it, I think there is a sane and not 
too complex way to do archive sync with MAM:

1. On startup, you put a "hole marker" at the end of your local archive. A 
hole marker is essentially just the stanza-id of the last message at the time 
the marker is created.
2. Iterate over all hole markers, from newest to oldest. Download everything 
between the last message *before* and the first message *after* the hole 
marker. During download, move the hole marker accordingly to deal with 
disconnects while downloading. When finished, remove the hole marker; move on 
to the next hole marker if there’s any.
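
To make the two steps above concrete, here is a rough Python sketch. It is only an illustration under assumptions: FakeArchive stands in for the server-side MAM archive, and LocalStore, page_after and fill_holes are hypothetical names, not part of any XMPP library. Paging forward from the marker mimics an RSM <after> query; stopping at the first already-known stanza-id is how the sketch detects the end of the hole.

```python
from dataclasses import dataclass, field


@dataclass
class FakeArchive:
    """Hypothetical in-memory stand-in for a server-side MAM archive."""
    messages: list  # (stanza_id, body) tuples, oldest first

    def page_after(self, after_id, limit):
        """Return up to `limit` messages strictly after `after_id`.

        `after_id=None` means "start at the beginning of the archive".
        Raises ValueError if `after_id` is not in the archive.
        """
        start = 0
        if after_id is not None:
            ids = [sid for sid, _ in self.messages]
            start = ids.index(after_id) + 1
        return self.messages[start:start + limit]


@dataclass
class LocalStore:
    """Hypothetical local archive: known messages plus hole markers."""
    known: dict = field(default_factory=dict)    # stanza_id -> body
    markers: list = field(default_factory=list)  # oldest marker first


def fill_holes(store, archive, page_size=2):
    """Fill every hole, newest marker first; return the number fetched."""
    fetched = 0
    for index in range(len(store.markers) - 1, -1, -1):
        while True:
            page = archive.page_after(store.markers[index], page_size)
            if not page:
                break  # reached the end of the archive
            reached_known = False
            for stanza_id, body in page:
                if stanza_id in store.known:
                    # First message behind the hole: the hole is closed.
                    reached_known = True
                    break
                store.known[stanza_id] = body
                fetched += 1
                # Move the marker so a disconnect resumes from here.
                store.markers[index] = stanza_id
            if reached_known:
                break
        del store.markers[index]  # hole is filled, drop its marker
    return fetched


archive = FakeArchive([(f"m{i}", f"body {i}") for i in range(1, 7)])
store = LocalStore(
    known={"m1": "body 1", "m2": "body 2", "m5": "body 5"},
    markers=["m2", "m5"],  # markers created at two earlier startups
)
fetched = fill_holes(store, archive)
```

In this run the client already had m1, m2 and m5; the second page of the older hole overlaps with m5, which is the (small) waste mentioned in the quoted mail.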

This should work and should also work with current semantics. I appreciate 
that there *might* be some overlap between the last page of a hole and already 
received messages. This is unfortunate, but can trivially be solved by 
comparing stanza-id (if available locally) or origin-id or message id in that 
order. (N.B.: once we get self-carbons, we’ll always have the stanza-id)
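
One way to read that comparison, as a small Python sketch: use the first identifier kind that both copies carry, in the order stanza-id, origin-id, message id. The Msg class and same_message function are hypothetical names for illustration only, and treating a mismatch on a higher-priority id as "different message" is my assumption, not something the XEPs mandate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Msg:
    """Hypothetical message record; any of the ids may be missing."""
    body: str
    stanza_id: Optional[str] = None   # assigned by the archive
    origin_id: Optional[str] = None   # assigned by the sending client
    message_id: Optional[str] = None  # the plain id attribute


def same_message(local: Msg, fetched: Msg) -> bool:
    """Compare on the first id kind that both messages have."""
    for attr in ("stanza_id", "origin_id", "message_id"):
        a, b = getattr(local, attr), getattr(fetched, attr)
        if a is not None and b is not None:
            return a == b
    return False  # no common identifier: cannot prove a duplicate


# Our own sent message has no stanza-id locally, but has an origin-id.
sent = Msg("hi", origin_id="o1", message_id="a")
from_mam = Msg("hi", stanza_id="s9", origin_id="o1", message_id="a")
other = Msg("hi", stanza_id="s10", origin_id="o2", message_id="a")

is_dup = same_message(sent, from_mam)
is_other_dup = same_message(sent, other)
```

Here the archived copy of our own message is recognised through origin-id, while a different message that merely reuses the same message id is not.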

I also wrote that down in more detail here: 
https://github.com/jabbercat/jabbercat/issues/26#issuecomment-370333729


I think it would be great to have a way to limit the MAM query to an end-ID 
indeed. Matt, Kevin, any chance we get that in?

kind regards,
Jonas