[Standards] XEP-136 and XEP-59 implementation comments

Alexander Tsvyashchenko lists at ndl.kiev.ua
Wed Nov 14 13:16:05 UTC 2007


Hello All,

While working on mod_archive_odbc (XEP-136 support for ejabberd) and
libwsw (a library for client-side XEP-136 support), I found several
issues with the XEP-136 standard that I'd like to present here, in the
hope that they can still be addressed before XEP-136 goes "gold".

I hope posting these issues here is the right way to start a discussion
about their resolution; if not, I apologize and would appreciate it if
someone pointed me to the right way of doing it.

For those issues that have proposed solutions, the solutions were
implemented and verified in mod_archive_odbc and libwsw, so they're
certainly feasible.

The HTML version of these comments is available at  
http://www.ndl.kiev.ua/typo/articles/2007/11/14/xep-136-and-xep-59-implementation-comments

Replication
===========

Replication in XEP-136 has at least two flaws which make its usage
somewhat problematic.

Duplicate items
---------------

XEP-136 states that the client should use a specially prepared <after>
RSM element, containing a datetime, to specify the point to start from.

IMHO this is a rather bad decision on its own, for at least two reasons:
  1. It violates XEP-59, item 2.2: "The requesting entity MUST treat all UIDs
     as opaque".
  2. It differs from all other XEP-136 commands, which use "start" / "end"
     attributes to specify the required range, thus creating an unwanted
     and confusing "special case".

On top of that, it creates yet another problem.

Consider several changes made at the same moment: for example, as part of
a "remove range" request, although it can also happen without one if
several messages arrive in different conversations at the same time while
auto-archiving is enabled. The change times of these conversations are
then equal.

Now suppose someone issues a <modified> query and, because of the RSM <max>
limit, the result page ends at one of the collections that share a change
time. The next query will either list again all collections with that time
that were already sent to the client, or skip all remaining collections
with that time, as the server has no way to determine where to resume its
answer when the <after> element holds only a "change time" value.

Both outcomes are quite bad. The first requires additional filtering on the
client side and in some cases may put the client into an infinite loop (if
<max> is smaller than the number of collections with the same change time).
The second means that some data will simply be missing, destroying
synchronization between client and server.
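
The problem can be reproduced with a small simulation. This is only a
sketch: the data, the two server variants and the client's "stop on a short
page" rule are assumptions made for illustration, not taken from XEP-136 or
from mod_archive_odbc.

```python
from typing import Callable, List, Tuple

# Hypothetical change log: (collection id, change time); three share a time.
CHANGES: List[Tuple[str, int]] = [
    ("a", 100), ("b", 200), ("c", 200), ("d", 200), ("e", 300),
]

def page_inclusive(after: int, limit: int) -> List[Tuple[str, int]]:
    """Server variant 1: resume at collections changed at or after `after`."""
    return [c for c in CHANGES if c[1] >= after][:limit]

def page_exclusive(after: int, limit: int) -> List[Tuple[str, int]]:
    """Server variant 2: resume strictly after `after`."""
    return [c for c in CHANGES if c[1] > after][:limit]

def replicate(page: Callable, limit: int, max_requests: int = 10):
    """Client loop that feeds the last change time back as the next <after>.

    Returns (ids seen, finished). `finished` is False when the request
    budget ran out, which stands in for the infinite loop.
    """
    seen: List[str] = []
    after = 0
    for _ in range(max_requests):
        items = page(after, limit)
        for cid, _changed in items:
            if cid not in seen:      # client-side filtering of duplicates
                seen.append(cid)
        if len(items) < limit:       # short page: nothing more to fetch
            return seen, True
        after = items[-1][1]
    return seen, False
```

With a page size of 2, the inclusive variant keeps returning "b" and "c" and
never reaches "d" or "e", while the exclusive variant terminates but silently
drops "c" and "d".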

Proposal: change section "10. Replication" by removing the references to
the <after> and <last> elements and stating that the replication start date
should be specified using the "start" attribute of the "modified" command,
with an additional note that collections whose change time is exactly equal
to "start" are NOT included in the result (thus, "start" will effectively
work as "after").

So, rephrasing the query from Example 57:

     <iq type='get' id='sync1'>
       <modified xmlns='http://www.xmpp.org/extensions/xep-0136.html#ns'
                 start='1469-07-21T01:14:47Z'>
         <set xmlns='http://jabber.org/protocol/rsm'>
           <max>50</max>
         </set>
       </modified>
     </iq>

It may make sense to name this attribute "after" rather than "start" in the
"modified" command, to highlight the difference from "start" elsewhere; I'm
not sure which solution is better.

The "modified" command then works just like any other command, and RSM is
used consistently to page through results without any ambiguity.

This change can be made with complete backward compatibility: if the server
finds an <after> RSM element that contains a datetime, it uses the old
mechanism; if <after> is absent or is not a datetime, it uses the new one.
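
A minimal sketch of the proposed behaviour, under stated assumptions: the
change log, the function names and the use of the collection id as the RSM
UID are all hypothetical, and the server is assumed to return items in a
stable order. "start" filters exclusively, and paging relies only on an
opaque UID that the client echoes back.

```python
from typing import List, Optional, Tuple

# Hypothetical change log, kept in a stable order: (uid, change time).
CHANGES: List[Tuple[str, int]] = [
    ("a", 100), ("b", 200), ("c", 200), ("d", 200), ("e", 300),
]

def modified(start: int, after_uid: Optional[str], limit: int):
    """One result page: items changed strictly after `start`,
    resumed after the item whose UID is `after_uid`."""
    items = [c for c in CHANGES if c[1] > start]
    if after_uid is not None:
        uids = [uid for uid, _ in items]
        # A UID the server does not know raises ValueError in this sketch.
        items = items[uids.index(after_uid) + 1:]
    return items[:limit]

def replicate(start: int, limit: int) -> List[str]:
    """Client loop: the UID is echoed back and never interpreted."""
    seen: List[str] = []
    after_uid: Optional[str] = None
    while True:
        page = modified(start, after_uid, limit)
        seen.extend(uid for uid, _ in page)
        if len(page) < limit:
            return seen
        after_uid = page[-1][0]
```

Because the resume point is an item rather than a timestamp, collections
with equal change times are neither repeated nor skipped.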

Who changed it?
---------------

Typically the client performs replication when it has a local cache of
collections / messages, in order to synchronize that cache with the
server's. It therefore makes sense for the client to also cache the
collections it uploads itself.

However, implementing this strictly according to XEP-136 means the client
has no way to determine whether the changes received in replication were
made by this client or not. It will have to re-fetch the entire collection
even if the <changed> item in the replication results was caused by its own
upload, thus downloading the same collection it has just uploaded to the
server and already holds in its local cache.

Proposal: extend the replication answer to include a "by" attribute that
specifies the full JID of the entity that made the change. A client that
receives replication results can then check whether a change was made by
itself and discard those changes that are already cached locally.

Example:

     <iq type='result' to='romeo@montague.net/orchard' id='sync1' >
       <modified xmlns='http://www.xmpp.org/extensions/xep-0136.html#ns'>
         <removed by='romeo@montague.net/pda'
                  with='balcony@house.capulet.com'
                  start='1469-07-21T03:16:37Z'/>
.....

This change can be made with complete backward compatibility, as it just
extends the answer format.
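
On the client side, using the proposed attribute could look roughly like
this. It is a sketch only: `foreign_changes` and `SAMPLE` are made-up names,
and treating an item without "by" as a foreign change is an assumption
chosen so that answers from servers without the extension still work.

```python
import xml.etree.ElementTree as ET

NS = "http://www.xmpp.org/extensions/xep-0136.html#ns"
CHANGE_TAGS = {f"{{{NS}}}changed", f"{{{NS}}}removed"}

def foreign_changes(modified_xml: str, own_jid: str):
    """Return (kind, with, start) for every change not made by `own_jid`."""
    result = []
    for child in ET.fromstring(modified_xml):
        if child.tag not in CHANGE_TAGS:
            continue                      # e.g. the RSM <set> element
        if child.get("by") == own_jid:
            continue                      # our own upload: already cached
        kind = child.tag.split("}", 1)[1]
        result.append((kind, child.get("with"), child.get("start")))
    return result

# Hypothetical replication answer using the proposed "by" attribute.
SAMPLE = """<modified xmlns='http://www.xmpp.org/extensions/xep-0136.html#ns'>
  <removed by='romeo@montague.net/pda'
           with='balcony@house.capulet.com'
           start='1469-07-21T03:16:37Z'/>
  <changed by='romeo@montague.net/orchard'
           with='juliet@capulet.com/chamber'
           start='1469-07-21T02:56:15Z'/>
</modified>"""
```

Called as `foreign_changes(SAMPLE, "romeo@montague.net/orchard")`, only the
removal made from the "pda" resource is left to process.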

XEP-59: detecting the change
============================

While implementing caching in the client I ran into the problem that it may
be dangerous to fetch collections when the client is not synchronized with
the server: if the client maintains some internal state based on received
results, and it receives results after some change was made without knowing
about it, it may corrupt its internal state.

Consider the following example: the client builds indices for the
collections it fetches by using the RSM "index" attribute, so that it can
answer indexed requests locally. However, these local indices are valid only
until a change is made on the server side; after that they have to be
rebuilt using replication.

If the client now fetches some collections after a change has happened on
the server, and it was not able to detect that, it will corrupt its indices
by inserting the fetched collections into the local cache, as the indices
may already be shifted. They will be shifted once more when replication is
performed, as the client cannot tell at replication time which collections
were fetched before replication and which after it.

Please note that this is just one possible scenario; there may be others.
All such scenarios require some way of determining whether fetched results
are "valid", which translates to "were they changed compared to some fixed
point in time?"

One possible solution is to perform replication before and after each query
to the server and to discard the results of the query if a change turns out
to have taken place. However, this seems to be an unacceptably high
overhead: instead of 1 query the client has to perform 3.

Proposal: add to the RSM result a "changed" tag which, when present,
indicates the datetime of the most recent change to the items affected by
the query. Computing this value typically shouldn't be problematic (it
certainly wasn't for the XEP-136 implementation), and the tag can be made
optional, as is done with "index", for cases where it is hard to calculate.

Example:

     <iq type='result' to='romeo@montague.net/orchard' id='sync1' >
       <modified xmlns='http://www.xmpp.org/extensions/xep-0136.html#ns'>
         <removed by='romeo@montague.net/pda'
                  with='juliet@capulet.com/chamber'
                  start='1469-07-21T02:56:15Z'/>
     ...

         <changed by='romeo@montague.net/orchard'
                  with='balcony@house.capulet.com'
                  start='1469-07-21T03:16:37Z'/>
         <set xmlns="http://jabber.org/protocol/rsm">
           <first index="0" >63362086582@1</first>
           <last>63362092915@51</last>
           <changed>1469-07-21T04:22:39Z</changed>
           <count>1372</count>
         </set>
       </modified>
     </iq>
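
A client could use the proposed tag along these lines. This is a sketch
under assumptions: `results_are_current` and `last_sync` (the time of the
client's last completed replication) are hypothetical names, and only the
UTC "Z" datetime form used in the examples is parsed.

```python
from datetime import datetime, timezone
from typing import Optional
import xml.etree.ElementTree as ET

RSM = "{http://jabber.org/protocol/rsm}"

def parse_utc(text: str) -> datetime:
    """Parse the 'YYYY-MM-DDThh:mm:ssZ' form used in the examples."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)

def results_are_current(rsm_set_xml: str, last_sync: datetime) -> Optional[bool]:
    """True if nothing in the result changed after `last_sync`.

    Returns None when the server sent no <changed> tag, in which case the
    caller has to fall back to replicating around the query.
    """
    changed = ET.fromstring(rsm_set_xml).find(RSM + "changed")
    if changed is None or not changed.text:
        return None
    return parse_utc(changed.text) <= last_sync
```

When the function returns False, the client discards the fetched results,
runs replication, and repeats the query, so the extra round trips are paid
only when a change actually happened.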

Inconsistencies or omissions
============================

Start attribute
---------------

Usage of the "start" attribute seems to be inconsistent:

* For "8.1 Retrieving a List of Collections" it is "If only 'start' is
specified then all collections on or after that date should be returned."
* For "8.3 Removing a Collection" it is "If the end date is in the future then
then all collections after the start date are removed."

I assume it's just a typo and for 8.3 it should be "on or after" instead of
"after", no?


Remove JID from prefs
---------------------

I'm not sure whether this is a problem, but there seems to be no way to
remove a JID from the user prefs once it is there. I'm not really
comfortable with this: even if some JIDs are removed from your roster and
all their collections are removed as well, you still have them in prefs
without any possibility of removing them, and this list will grow over time.

Wouldn't it make sense to specify that uploading an item with all tags
besides the JID empty removes this user from prefs, in the same way as it is
done for links and extra info in chats?

JIDs prefs: ambiguity
---------------------

Possibly related to the previous item: what should happen if, during a prefs
upload, some attributes for a JID are not specified although they were
present earlier? Should they be reset to their default values or remain as
they were before the update?

Taking the previous item into account, I see two possibilities:

  1. All missing attributes remain as they are unless none are specified, in
     which case the request is treated as a removal request.
  2. All missing attributes are reset to default values unless none are
     specified, in which case the request is treated as a removal request.

The second option seems more logical to me, as the removal behavior then
follows almost automatically from the general case.
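
The second option can be sketched in a few lines. The attribute names and
default values below are placeholders chosen for illustration, not the
normative ones, and `apply_item_update` is a hypothetical helper.

```python
from typing import Dict, Optional

# Placeholder defaults; a real server would take them from the user's
# default preferences.
DEFAULTS: Dict[str, Optional[str]] = {"save": "body", "otr": "concede", "expire": None}

def apply_item_update(prefs: dict, jid: str, attrs: dict) -> dict:
    """Option 2: unspecified attributes fall back to the defaults;
    an update carrying no attributes at all removes the JID."""
    if not attrs:
        prefs.pop(jid, None)
    else:
        prefs[jid] = {**DEFAULTS, **attrs}
    return prefs
```

Removal needs no special protocol element here: it is simply the case in
which there is nothing to store for the JID.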

Resource modification when auto archiving
-----------------------------------------

When performing auto-archiving, the initial message may not be enough to
determine the full JID of the recipient: if the conversation is initiated by
the client whose server performs auto-archiving, and the client does not
know which resource it should use, it will send the message to the bare JID,
thus starting auto-archiving into a collection with the bare JID.

However, once a reply is received we know the full JID, so it could make
sense to adjust the initial collection, changing its JID to the full JID.
Otherwise we will either start a new collection after the reply is received
or continue recording into the bare-JID collection, effectively eliminating
resource usage in the "with" attribute altogether.

Of course, one possibility would be to just drop all resources from JIDs
and store only bare JIDs, but that seems too limiting and inconvenient.

The proposal here is to specify the algorithm an implementation should use
to track conversations, making a best effort to determine and correct JIDs
when additional information becomes available.

In mod_archive_odbc I've implemented a tracking algorithm, but it's possible
I've missed some points due to my limited knowledge of XMPP standards or
lack of experience with XMPP-related development. I describe the algorithm
in the appendix; please feel free to comment on it or take it as the basis
for inclusion in XEP-136 if it appears useful.

Various small notes
===================

Duplicate messages times
------------------------

In "5.3 Uploading Messages to a Collection" it is specified that "If the
collection already exists then the server MUST append the messages to the
existing collection." However, it is not said what should be done if the
time of some of the messages is equal to the time of messages already
present in the collection.

I assume it follows from the "append the messages" clause that duplicate
entries should be created, but it would be good to state this explicitly to
avoid ambiguity.

Malformed XML in examples
-------------------------

"Example 21. Private chat linked to later groupchat" and
"Example 24. Private chat with attributes form" contain malformed XML:
the first message in each chat opens as "to" but closes as "from".

List collections for Bare JID / Domain
--------------------------------------

There seems to be no way to list collections solely for a service JID,
as according to XEP-136 it is treated as a domain JID request.

For example, when trying to list all collections for icq.example.com
you will instead get all collections of all users at icq.example.com, even
if you wanted to receive collections ONLY for icq.example.com itself.

I do not think this is a major problem, as the results can be filtered on
the client side; the only drawback is a large amount of extra traffic. It
can probably be left as it is, but adding a note on the subject to the
specification would be nice.

File format
-----------

From my experience, limiting one file to the conversation with just one
JID is too restrictive: dealing with a single file for all JIDs is in
many cases much more convenient than dealing with a bunch of files.

On the other hand, I can imagine cases where it is better to split the
archive into small files.

Therefore, perhaps the restriction could simply be removed by allowing a
"with" attribute on the "chat" items stored in the file and making the
"with" attribute of the "archive" tag optional? This doesn't seem like a
big change, but it would certainly make the file format more usable for
cases where one big file is preferred, such as backup or import / export.

Appendix: conversations tracking
================================
Here is the approach used in mod_archive_odbc.

It is assumed that information about all active collections currently
being recorded is stored in a dictionary.

The dictionary has two levels: the first-level key is the bare JID, the
second-level key is the thread. If no thread is present,
{no_thread, Resource} is used instead.

The algorithm for choosing the collection to use when a message is
received is as follows:

1. If a thread is specified in the message: use the keys of both levels
   normally, reusing an existing collection if there is a match or
   creating a new one if no matching collection is found.
2. If no first-level key exists for this JID: create a new collection and
   keys for both levels from the available information; the second-level
   key will be {no_thread, Resource}, with Resource either empty or
   non-empty.
3. Otherwise use the first-level key to get access to the second-level
   keys, then:
     * If the resource IS specified: search the second-level keys for a
       matching resource:
         - if found, just use the corresponding collection.
         - if not, search for a second-level key with an empty resource.
           If found, use its collection and rewrite the empty resource of
           both the key and the collection to the new one. If not, create
           a new collection and key.
     * If the resource IS NOT specified: use the most recent second-level
       key, or create a new collection if none exists.
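
The steps above can be sketched in Python as follows. This is an
illustration, not the mod_archive_odbc code (which is written for ejabberd):
the class and method names are made up, "most recent" is assumed to mean
"most recently used among all second-level keys of that bare JID", and
expiry of idle collections is left out.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Hashable, Optional

@dataclass
class Collection:
    with_jid: str        # value of the collection's "with" attribute
    last_used: int = 0   # logical clock value of the last message

class Tracker:
    def __init__(self) -> None:
        # bare JID -> (thread | ("no_thread", resource)) -> collection
        self.active: Dict[str, Dict[Hashable, Collection]] = {}
        self._clock = count(1)

    def _touch(self, coll: Collection) -> Collection:
        coll.last_used = next(self._clock)
        return coll

    def _new(self, bare: str, key: Hashable, resource: str) -> Collection:
        coll = Collection(f"{bare}/{resource}" if resource else bare)
        self.active.setdefault(bare, {})[key] = coll
        return self._touch(coll)

    def collection_for(self, bare: str, resource: str = "",
                       thread: Optional[str] = None) -> Collection:
        keys = self.active.get(bare)
        if thread is not None:                       # step 1
            if keys and thread in keys:
                return self._touch(keys[thread])
            return self._new(bare, thread, resource)
        if not keys:                                 # step 2
            return self._new(bare, ("no_thread", resource), resource)
        if resource:                                 # step 3, resource known
            key = ("no_thread", resource)
            if key in keys:
                return self._touch(keys[key])
            empty = ("no_thread", "")
            if empty in keys:
                # Upgrade the bare-JID collection to the full JID.
                coll = keys.pop(empty)
                coll.with_jid = f"{bare}/{resource}"
                keys[key] = coll
                return self._touch(coll)
            return self._new(bare, key, resource)
        # Step 3, resource unknown: `keys` is non-empty here, because the
        # "none exists" case was already handled by step 2.
        return self._touch(max(keys.values(), key=lambda c: c.last_used))
```

For example, a first message sent to the bare JID juliet@capulet.com
followed by a reply from the "chamber" resource ends up in one collection
whose "with" value has been rewritten to juliet@capulet.com/chamber.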

Good luck!                                     Alexander