[Standards] XEP-0231 (Data Element) - local caching

Pavel Simerda pavlix at pavlix.net
Tue Jul 22 20:52:50 CDT 2008


I have some suggestions for XEP-0231 (Data Element).

Right now, as the example shows:

<message from='ladymacbeth at shakespeare.lit/castle'
         to='macbeth at chat.shakespeare.lit'
  <body>Yet here's a spot.</body>
  <html xmlns='http://jabber.org/protocol/xhtml-im'>
    <body xmlns='http://www.w3.org/1999/xhtml'>
        Yet here's a spot.
        <img alt='A spot'
             src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6 at shakespeare.lit'/>
  <data xmlns='urn:xmpp:tmp:data-element' 
        alt='A spot'
        cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6 at shakespeare.lit'

Note: in this particular example the data is very short. This may not
be the case in the real world, where people tend to ignore the size of
the data they send.

We send the data once per session (and omit it from subsequent messages).

This has two important implications:

1) The other entity may or may not cache it for the session and reuse
it. That is good.

2) If an entity keeps the data for a longer time (e.g. for weeks
or even permanently), this cache will never be used, because the sending
entity always resends the data for a new session.

What I propose is:

 * By default the sending entity would not send the data. It would
   merely reference it by its cid url.
 * Let the receiving client follow "3.4 Retrieving Uncached Media Data"
   if the data is not cached (no real change, this is already being
 * Reserve the possibility of sending the data immediately with the
   message for the *specific* case that the sending client actually
   knows the receiving party cannot have the data cached (e.g. the
   data was never sent before). This behavior should be considered
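
The proposal above could be sketched roughly as follows. This is only an
illustration in Python, not anything XEP-0231 specifies: Sender,
build_payload and resolve are hypothetical names, the payload is a plain
dict rather than a real stanza, and fetch stands in for the retrieval
described in "3.4 Retrieving Uncached Media Data". A real client would
also have to persist the "already sent" bookkeeping across sessions for
this to help.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Tracks which cids this client has already sent to each contact."""
    sent: dict = field(default_factory=dict)  # contact JID -> set of cids

    def build_payload(self, contact: str, cid: str, data: bytes) -> dict:
        """Return what to attach to an outgoing message.

        Default: reference the cid only. Inline the data only when we know
        the contact cannot have it cached (we never sent it before).
        """
        known = self.sent.setdefault(contact, set())
        payload = {"cid": cid}
        if cid not in known:
            payload["data"] = data
            known.add(cid)
        return payload


def resolve(cache: dict, payload: dict, fetch):
    """Receiver side: use inline data, then the local cache, then fetch."""
    cid = payload["cid"]
    if "data" in payload:
        cache[cid] = payload["data"]
    elif cid not in cache:
        # Hypothetical callback standing in for the section 3.4 request.
        cache[cid] = fetch(cid)
    return cache[cid]
```

With this, the first message to a contact carries the data and later ones
carry only the reference, while a receiver with an empty cache falls back
to fetch.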

I further propose we add some informational section about generation
of CIDs. Although it's specified elsewhere, I believe this XEP will be
very useful and will be referenced from many future XEPs (and maybe
improved as well - possibly some server caching etc). I think the
informational section could suggest UUIDs generated by hashing the
actual content.
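
As a rough illustration of a content-derived CID, here is a minimal
Python sketch. It assumes a name-based (version 5) UUID computed over a
SHA-1 digest of the data; the function name content_cid, the choice of
namespace and the domain argument are my own assumptions, not something
the XEP defines.

```python
import hashlib
import uuid


def content_cid(data: bytes, domain: str) -> str:
    """Derive a cid URL from the content itself (illustrative only)."""
    digest = hashlib.sha1(data).hexdigest()
    # uuid5 builds a name-based (SHA-1) UUID; the namespace is an arbitrary
    # choice here, any fixed namespace would give stable results.
    content_uuid = uuid.uuid5(uuid.NAMESPACE_OID, digest)
    return f"cid:{content_uuid}@{domain}"
```

The point is that identical data always yields the same CID, so a cached
copy can be recognized no matter which session or sender referenced it.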

Another thing that could be considered is to add some sort of
caching hint attribute that would suggest how long it's reasonable to
cache a particular resource. Maybe we could borrow from HTTP cookies,
but allow (suggest) that clients have some mechanism for limiting the
time, size and number of cached objects.

There are many possibilities; I will just describe one of them.

 - no reason for caching; the file will not be used again
 - we suggest the receiving party only caches for this
   particular session
 - we suggest caching for twelve days from the last use of this cid (!)
 - for every use (received reference) the receiving client should reset
   the date we count from
 - we suggest the client picks the longest time it allows (it could
   possibly cache some small pieces of data permanently)

Of course, the client MAY ignore the caching hint. In this case it
SHOULD NOT cache at all.

If the cache attribute is not specified, we should decide on a
reasonable default value ('session' or '1' day both seem good to me).
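
To make the hint semantics concrete, here is a small Python sketch of a
receiving-side cache. The hint values 'session' and a number of days
come from the text above; 'none' and 'max' are labels I made up for the
"do not cache" and "longest time the client allows" cases, and
MediaCache, max_days and end_session are hypothetical names as well.

```python
from datetime import datetime, timedelta


class MediaCache:
    def __init__(self, max_days: int = 30):
        self.max_days = max_days  # the client's own upper limit (assumed)
        self.entries = {}         # cid -> (data, hint, last_used)

    def _lifetime(self, hint: str) -> timedelta:
        # 'max' means "as long as this client allows"; otherwise a day count,
        # capped by the client's own limit.
        if hint == "max":
            return timedelta(days=self.max_days)
        return timedelta(days=min(int(hint), self.max_days))

    def store(self, cid, data, hint="session", now=None):
        if hint == "none":
            return  # sender says the data will not be used again
        self.entries[cid] = (data, hint, now or datetime.now())

    def lookup(self, cid, now=None):
        now = now or datetime.now()
        entry = self.entries.get(cid)
        if entry is None:
            return None
        data, hint, last_used = entry
        if hint != "session" and now - last_used > self._lifetime(hint):
            del self.entries[cid]
            return None
        # Every use (received reference) resets the date we count from.
        self.entries[cid] = (data, hint, now)
        return data

    def end_session(self):
        # Drop everything that was only meant to live for this session.
        self.entries = {c: e for c, e in self.entries.items()
                        if e[1] != "session"}
```

A hint of '12' therefore keeps an entry alive as long as it is referenced
at least once every twelve days, which is the sliding behaviour described
in the list.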



Web: http://www.pavlix.net/
Jabber & Mail: pavlix(at)pavlix.net
OpenID: pavlix.net
