[Standards] Exact hint for Result Set Management
kevin.smith at isode.com
Thu Jul 12 10:39:33 UTC 2018
On 12 Jul 2018, at 11:23, Matthew Wild <mwild1 at gmail.com> wrote:
> On 11 July 2018 at 18:25, Florian Schmaus <flo at geekplace.eu> wrote:
>> On 11.07.2018 18:01, Matthew Wild wrote:
>>> On 11 July 2018 at 16:33, Florian Schmaus <flo at geekplace.eu> wrote:
>>>> I recently submitted PR #672 to the xeps repo
>>>> to make users of RSM, like MAM, aware whether the result is exact or
>>>> not. It received some scepticism from the council members in today's
>>>> council meeting. I am to blame here as I thought the abstract motivation
>>>> in the commit message was enough. It appears it wasn't.
>>>> While I think multiple applications could exploit that information, my
>>>> particular motivation was MAM. Consider the scenario where you have a
>>>> master archive and a local archive. The local archive may have multiple
>>>> holes at unknown locations. Now you want to sync your local archive from
>>>> the master using MAM/RSM.
>>> I'm not keen on this solution for the premise you've given.
>>> I don't believe that when using MAM correctly you would ever end up
>>> with "holes at unknown locations" in your local archive. I don't think
>>> that encouraging people to use a "bisection algorithm" is the right
>>> thing to do.
>> So you don't want MAM users to be able to efficiently sync archives with
>> multiple holes by a simple change because you do not want MAM to be used
>> in scenarios where this could happen?
> Just adding this flag will not make servers implement it, so it's
> going to add code and still need a fallback.
And, as specified (optional, but with no default and no defined meaning for a missing flag), it seems unhelpful. And as it adds a SHOULD to a Draft XEP, with no namespace bump or discovery, it’s adding ambiguity and confusion.
> What you are describing - a local archive with multiple holes that the
> implementation is unaware of - is not a state that I see any
> correctly implemented MAM usage getting into.
OTOH, a local archive with multiple *known* holes is easy to get into and we need to ensure this case is covered - but this doesn’t need this change to RSM.
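As a rough sketch of the known-holes case (not real XMPP code): assume each hole is recorded as the pair of archive IDs on either side of it, and that a hypothetical fetch_page(after_id, max_items) helper stands in for a MAM query paged with RSM's <after/> and <max/>. The MASTER list, the IDs and both function names are made up for illustration.

```python
# Toy master archive; in practice this lives on the server.
MASTER = [f"m{i}" for i in range(1, 10)]


def fetch_page(after_id, max_items):
    """Hypothetical stand-in for a MAM query using RSM <after/> and <max/>."""
    start = MASTER.index(after_id) + 1
    return MASTER[start:start + max_items]


def fill_hole(fetch, after_id, until_id, page_size=50):
    """Return the IDs strictly between after_id and until_id, paging forward."""
    missing = []
    cursor = after_id
    while True:
        page = fetch(cursor, page_size)
        if not page:
            return missing  # archive ended before until_id was seen
        for msg_id in page:
            if msg_id == until_id:
                return missing  # reached the far edge of the hole
            missing.append(msg_id)
        cursor = page[-1]  # continue after the last item received


# Local archive with two known holes: (m2..m5) and (m6..m9).
known_holes = [("m2", "m5"), ("m6", "m9")]
filled = [fill_hole(fetch_page, a, b, page_size=2) for a, b in known_holes]
print(filled)
```

Nothing here depends on <count> being exact; the client only needs to remember the IDs that bound each hole.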
> Therefore it's not a problem I want to solve, because it will only add
> to confusion about the best and easiest way to implement MAM.
>> From a generic, non MAM-specific point-of-view, RSM is eventually used
>> to sync data, and for that you often want to know if the RSM metadata is
>> exact or not. My MAM example is just one illustration of that. It always
>> appeared like an afterthought that RSM does not allow the RSM data
>> originator to signal if the numbers are exact or not. The proposed
>> change tries to fix that.
> I think the intention of RSM's vague numbers is that they were used
> for things like UI (progress bars, etc.) hints only.
> One of the reasons for this is that RSM is designed to work with
> dynamic result sets. For example you might request disco#items of a
> MUC server, but rooms will be added/removed while paging through the
> results. RSM is carefully designed so that you will never receive
> duplicates, but an item that was present when you started paging will
> not be included in the results if it was removed before you reached
> its page. That's why <count> is not accurate and not meant to be used
> for sync purposes.
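To make the dynamic-result-set behaviour concrete, here is a toy sketch (not real XMPP code). It assumes a sorted in-memory list stands in for the MUC service's room list and that item names double as RSM cursors; the DynamicResultSet class and the room names are made up for illustration.

```python
import bisect


class DynamicResultSet:
    """Toy stand-in for a service answering RSM-style page requests."""

    def __init__(self, items):
        self.items = sorted(items)

    def page(self, after=None, max_items=2):
        # Resume just past the cursor; the cursor item itself may be gone by now.
        start = 0 if after is None else bisect.bisect_right(self.items, after)
        chunk = self.items[start:start + max_items]
        # The count describes the set at the time of this request only.
        return chunk, len(self.items)


rooms = DynamicResultSet(["alpha", "bravo", "charlie", "delta", "echo"])

received = []
first_page, count_at_start = rooms.page(max_items=2)
received += first_page

# A room disappears while the client is still paging.
rooms.items.remove("delta")

cursor = first_page[-1]
while True:
    chunk, _ = rooms.page(after=cursor, max_items=2)
    if not chunk:
        break
    received += chunk
    cursor = chunk[-1]

print(received, count_at_start)
```

The client never sees a duplicate, but it ends up with four items after having been told there were five, which is why the count is only good as a hint.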
> MAM is a special case because normally[*] XEP-0313 explicitly forbids
> adding or removing items in the middle of the result set. This is not
> true of most other things that RSM would be used for (disco items,
> pubsub, etc.), and therefore I think this flag would basically only
> work for MAM. And then all my reasoning above therefore applies.
That's different, but even in MAM you probably don’t want to count results accurately, even though you could; implementations will probably return *very* approximate values here to avoid flooring the archive server (whatever form it takes).
> [*] the server setting stable='false' is an exception here - an aspect
> of the XEP I'm not keen on, but it was deemed necessary for some
> environments. Sync is simply impossible with such a server.
You can sync with such a server, but only those results that have become stable. The idea here is that if you have a clustered server doing something eventually-convergent-ish, you don’t want to refuse to answer MAM queries until the archive results are perfectly synced, so you can answer with some unstable results on the basis that that’s good enough for many use cases. Although this is getting somewhat off-topic.