[Standards] MAM: Conflicting storage prefs behaviour

Matthew Wild mwild1 at gmail.com
Sun Feb 19 21:49:05 UTC 2017


Hi Ruslan,

On 19 February 2017 at 13:48, Ruslan N. Marchenko <me at ruff.mobi> wrote:
> Good afternoon,
>
>
> I'm preparing implementation of the mam and since there're very few details
> in the XEP-0313 about actual archiving, mostly about querying - i believe
> the archiving process is then left at the discretion of the implementers.
>
>
> Now, to avoid storing multiple copies of a message on a given server, it
> makes sense to me to store the message while it is being routed: a kind of
> central server archive. Probably not blindly archive everything but rather as a
> _union all_ of all user prefs. And retrieval will just query that central db
> and wrap message to necessary xml layers. Since there's no archive
> modification in the MAM scope - there's no reason not to do it and it should
> be quite efficient.
>
> And here is where the _potential_ conflict comes in.
>
> Romeo@montague.lit informs server it should only archive messages to
> Juliet@capulet.lit because he doesn't want to miss a thing by losing it in
> tonnes of interactions, or perhaps he wants certain privacy in his archive,
> who knows:
>
> <iq type='set' id='romeo1'>
>   <prefs xmlns='urn:xmpp:mam:1' default='never'>
>     <always>
>       <jid>juliet@capulet.lit</jid>
>     </always>
>   </prefs>
> </iq>
>
> _Note: in the above I've missed <never/> because xep does not require it. It
> says server must return it in the "result" but nothing about client sending
> full prefs bucket in "set"._
>
> However, Mercutio@montague.lit tells the server to archive his roster
> contacts, along with some other prefs:
>
> <iq type='set' id='mercut1'>
>   <prefs xmlns='urn:xmpp:mam:1' default='roster'>
>     <always>
>       <jid>romeo@montague.lit</jid>
>     </always>
>     <never>
>       <jid>tybalt@capulet.lit</jid>
>     </never>
>   </prefs>
> </iq>
>
> Now, server will apparently store conversation between Romeo and Mercutio,
> the question is then - should server keep silent to Romeo that it has his
> other conversations?
> If Romeo later changes his preferences to include those of Mercutio - should
> server reveal it actually has some messages to consume?
> If yes - it exposes certain privacy risk: If I didn't ask to store message,
> and server stores them - then the other side requested them to be stored.
> If no - it would require tracking storage prefs windows and apply them all
> the way through the time, or add metadata listing who was eligible user of
> the stored message at the time of storing it. Which still is a bit
> cumbersome.
>
> Or the best practices here should be to never mix archives and keep a
> separate copy for each user according to his current preferences at any
> given time? Could user request to purge the archive?

As far as XEP-0313 is concerned, each user has a separate archive. If
you want to be clever, on the server side, it would be possible to
store a message between two contacts on the same server only once.
However this is purely an optimisation, and whether the server does
this or not should be totally invisible to the users. It's possible to
do this, but personally as an implementer I don't believe the benefits
outweigh the complexity of the implementation. Storage is cheap, and I
personally don't think that most servers should be storing archived
messages for a long time anyway.

> On the other hand, the whole XEP says it's up to the server what to store,
> hence it may return something quite different from what it was asked for -
> prefs are rather hints (may/should) not orders (must).

The prefs are at the JID level, to enable/disable archiving for that
JID. If archiving is enabled for a given JID, the server would look at
the stanza and decide whether to archive it (and yes, the XEP leaves
this choice largely up to the server). If archiving is disabled for a
given JID (e.g. through <never/>) then the server would not consider
any stanzas for archival.

In other words, a JID being in <always/> does not mean "archive every
single stanza", it just means archiving is enabled for that JID.
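
As a rough illustration of that JID-level check, here is a minimal Python
sketch. The names `Prefs` and `archiving_enabled` are hypothetical, the JID
normalisation is deliberately simplified, and the precedence used when a JID
appears in both lists is an assumption of this sketch rather than something
stated in this thread. A True result only means the server goes on to
consider the stanza; it does not promise that the stanza is archived.

```python
from dataclasses import dataclass, field


@dataclass
class Prefs:
    """One user's archiving preferences (hypothetical container)."""
    default: str = "always"  # 'always', 'never' or 'roster'
    always: set = field(default_factory=set)
    never: set = field(default_factory=set)


def archiving_enabled(prefs, contact, roster):
    """Return True if archiving is enabled for conversations with `contact`.

    This is only the JID-level gate. Whether a particular stanza is then
    archived is a separate decision that the server makes afterwards.
    """
    # Simplification: strip the resource and lowercase the rest. Real JID
    # comparison has stricter normalisation rules than this.
    bare = contact.split("/", 1)[0].lower()
    # Assumption of this sketch: <always/> is checked before <never/>.
    if bare in prefs.always:
        return True
    if bare in prefs.never:
        return False
    if prefs.default == "always":
        return True
    if prefs.default == "roster":
        return bare in roster
    return False  # default == 'never'
```

With Romeo's prefs from the quoted example (default 'never', Juliet in
<always/>), the check is true for Juliet and false for Mercutio, regardless
of what Mercutio's own prefs say.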

> Then perhaps in this particular implementation it would make sense to
> disable prefs and store everything instead to avoid the conflict/leak?

Some deployments may disable prefs, but I think most will want it
enabled. In any case, your prefs control your archive only. If you
disable archiving for Juliet, you should never see any of her messages
in your archive, even if she has archiving enabled for you.

As an example implementation of storage de-duplication, let's say you
have two users: userA and userB.

userA sends a message to userB, which the server archives, because
both of them have archiving enabled. When processing the message from
userA, the server gives the stanza a unique ID and stores it in a
global stanzaStore. It then adds a record to userA's archive, which
includes the ID of the stanza in the stanzaStore.

When delivering the stanza to userB, the server adds a record to
userB's archive, with the same ID in the record. Both users now have
the stanza "in" their archive, but it is only stored once, in the
server's central stanzaStore.

Now userB decides to disable archiving (with userA, or with all
contacts, it doesn't matter).

The same thing still happens as before, but this time no record is
added to userB's archive. If they perform a query, no messages will be
returned. They are still stored (because userA requested that) but
userB cannot see them.
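
The scheme above could be sketched in Python roughly as follows. This is an
in-memory toy under stated assumptions, not a real server: the class
`ArchiveServer`, its `route` and `query` methods, and the per-user on/off
flag (standing in for full prefs) are all made up for illustration.

```python
import itertools
from collections import defaultdict


class ArchiveServer:
    """Toy model: stanzas stored once, per-user archives hold only IDs."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.stanza_store = {}             # stanza ID -> stanza (single copy)
        self.archives = defaultdict(list)  # user -> list of stanza IDs
        # Hypothetical stand-in for real prefs: archiving on/off per user.
        self.enabled = defaultdict(lambda: True)

    def route(self, sender, recipient, stanza):
        """Archive a stanza for every party that has archiving enabled."""
        owners = [u for u in (sender, recipient) if self.enabled[u]]
        if not owners:
            return None  # nobody wants it, so nothing is stored at all
        stanza_id = next(self._ids)
        self.stanza_store[stanza_id] = stanza
        for user in owners:
            self.archives[user].append(stanza_id)
        return stanza_id

    def query(self, user):
        """Return only the stanzas referenced from this user's own archive."""
        return [self.stanza_store[i] for i in self.archives[user]]
```

A query never touches the central store directly; it only follows the IDs in
the caller's own archive, which is what keeps the de-duplication invisible to
users. In this sketch, records added before userB disabled archiving stay in
userB's archive; whether a server should keep or drop those is a policy choice
the sketch does not try to settle.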

Does this make sense?

XEP-0313 explicitly does *not* allow you to prevent your contacts
archiving your messages (this is impossible to do), so if your contact
has archiving enabled and you don't want that, there is nothing you
can do. The preferences in XEP-0313 are only about controlling *your*
archive.

Regards,
Matthew

