[Standards] Exact hint for Result Set Management
mwild1 at gmail.com
Thu Jul 12 10:23:53 UTC 2018
On 11 July 2018 at 18:25, Florian Schmaus <flo at geekplace.eu> wrote:
> On 11.07.2018 18:01, Matthew Wild wrote:
>> On 11 July 2018 at 16:33, Florian Schmaus <flo at geekplace.eu> wrote:
>>> I recently submitted PR #672 to the xeps repo
>>> to make users of RSM, like MAM, aware whether the result is exact or
>>> not. It received some scepticism from the council members in today's
>>> council meeting. I am to blame here as I thought the abstract motivation
>>> in the commit message was enough. It appears it wasn't.
>>> While I think multiple applications could exploit that information, my
>>> particular motivation was MAM. Consider the scenario where you have a
>>> master archive and a local archive. The local archive may have multiple
>>> holes at unknown locations. Now you want to sync your local archive from
>>> the master using MAM/RSM.
>> I'm not keen on this solution for the premise you've given.
>> I don't believe that when using MAM correctly you would ever end up
>> with "holes at unknown locations" in your local archive. I don't think
>> that encouraging people to use a "bisection algorithm" is the right
>> thing to do.
> So you don't want MAM users to be able to efficiently sync archives with
> multiple holes by a simple change because you do not want MAM to be used
> in scenarios where this could happen?
Just adding this flag will not make servers implement it, so it's
going to add code and still need a fallback.
I believe that ambiguity in how to implement protocols properly is a
big problem we have. XEPs tend to describe only the wire protocol, and
if the author of the XEP even had a data model in mind, it's rarely
(if ever) clear. For example it took me a long time to realise that
pubsub's model is an (optionally capped) ordered key->value store, and
that's made clear precisely nowhere in XEP-0060.
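To make that model concrete, here is a minimal sketch of a node as an optionally capped, ordered key->value store. The class and method names (PubsubNode, publish, retract) are hypothetical and not taken from XEP-0060, and treating a re-published id as the newest item is an assumption of this sketch.

```python
from collections import OrderedDict
from typing import List, Optional, Tuple


class PubsubNode:
    """Hypothetical in-memory model of a node: ordered item id -> payload."""

    def __init__(self, max_items: Optional[int] = None) -> None:
        self.max_items = max_items  # None means the node is uncapped
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def publish(self, item_id: str, payload: str) -> None:
        # Assumption: publishing an existing id replaces the old value
        # and the item then counts as the newest one.
        self._items.pop(item_id, None)
        self._items[item_id] = payload
        # Enforce the optional cap by dropping the oldest items first.
        while self.max_items is not None and len(self._items) > self.max_items:
            self._items.popitem(last=False)

    def retract(self, item_id: str) -> None:
        # Removing an unknown id is silently ignored in this sketch.
        self._items.pop(item_id, None)

    def items(self) -> List[Tuple[str, str]]:
        # Oldest first, newest last.
        return list(self._items.items())


node = PubsubNode(max_items=2)
node.publish("a", "first")
node.publish("b", "second")
node.publish("c", "third")    # cap reached, "a" is dropped
node.publish("b", "updated")  # same key, new value
```

After these calls the node holds "c" and then "b", with "b" carrying the updated payload.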
So for MAM I wanted to focus on the specific use-cases that the
protocol was designed for, and to present a clear way to correctly
implement them (I am well aware that the XEP does not reach this goal).
What you are describing - a local archive with multiple holes that the
implementation is unaware of - is not a state that I see any such
optimal correctly implemented MAM usage getting into.
Therefore it's not a problem I want to solve, because it will only add
to confusion about the best and easiest way to implement MAM.
> Even if we would live in a world where such MAM archives are never going
> to happen, adding the exact hint to RSM is worthwhile.
I've no objection to RSM taking its own course, and I won't object to such
an enhancement if it's the right thing for RSM. I would object only if it
were added solely because it is desirable for MAM.
> From a generic, non MAM-specific point-of-view, RSM is eventually used
> to sync data, and for that you often want to know if the RSM metadata is
> exact or not. My MAM example is just one illustration of that. It always
> appeared like an afterthought that RSM does not allow the RSM data
> originator to signal if the numbers are exact or not. The proposed
> change tries to fix that.
I think the intention of RSM's vague numbers is that they were used
for things like UI (progress bars, etc.) hints only.
One of the reasons for this is that RSM is designed to work with
dynamic result sets. For example you might request disco#items of a
MUC server, but rooms will be added/removed while paging through the
results. RSM is carefully designed so that you will never receive
duplicates, but an item that was present when you started paging will
not be included in the results if it was removed before you reached
its page. That's why <count> is not accurate and not meant to be used
for sync purposes.
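A small simulation of this behaviour, as a sketch only: the page() helper is hypothetical, and ordering items by comparing their ids as strings is an assumption made to keep the example short (a real server chooses its own ordering and identifiers).

```python
from typing import List, Optional, Tuple


def page(items: List[str], after: Optional[str], max_items: int) -> Tuple[List[str], int]:
    """Return up to max_items ids that sort after the cursor, plus the current count."""
    ordered = sorted(items)
    if after is not None:
        ordered = [i for i in ordered if i > after]
    return ordered[:max_items], len(items)


rooms = ["a", "b", "c", "d", "e"]
seen: List[str] = []

# The first page also reports the count as it is at that moment.
batch, count_at_start = page(rooms, None, 2)
seen.extend(batch)

# A room disappears before the client reaches its page.
rooms.remove("d")

# Keep paging from the last id received until nothing is left.
while batch:
    batch, _ = page(rooms, seen[-1], 2)
    seen.extend(batch)
```

The client ends up with four distinct ids and never sees "d", although the count reported with the first page was 5.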
MAM is a special case because normally[*] XEP-0313 explicitly forbids
adding or removing items in the middle of the result set. This is not
true of most other things that RSM would be used for (disco items,
pubsub, etc.), and therefore I think this flag would basically only
work for MAM, in which case all my reasoning above applies.
[*] the server setting stable='false' is an exception here - an aspect
of the XEP I'm not keen on, but it was deemed necessary for some
environments. Sync is simply impossible with such a server.
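As an illustration of why a stable archive makes sync straightforward, here is a minimal sketch of a client catching up from the last id it has stored. The fetch_after callable is a hypothetical stand-in for a real MAM query, the in-memory list stands in for the server, and stopping on an empty page is a simplification of this sketch.

```python
from typing import Callable, List, Optional


def catch_up(local: List[str],
             fetch_after: Callable[[Optional[str], int], List[str]],
             limit: int = 2) -> None:
    """Append everything newer than the last locally stored id, page by page."""
    while True:
        last_id = local[-1] if local else None
        batch = fetch_after(last_id, limit)
        if not batch:
            break  # nothing newer on the server
        local.extend(batch)


# Hypothetical server archive: items are only ever appended at the end.
server_archive = ["m1", "m2", "m3", "m4", "m5"]


def fetch_after(last_id: Optional[str], limit: int) -> List[str]:
    start = 0 if last_id is None else server_archive.index(last_id) + 1
    return server_archive[start:start + limit]


local_archive = ["m1", "m2"]
catch_up(local_archive, fetch_after)
```

Because nothing is inserted or removed in the middle, the local copy is always a prefix of the server archive and the loop only ever needs to extend it at the end.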