[Standards] XEP-0231 (Data Element) - local caching

Marcus Lundblad ml at update.uu.se
Fri Jul 25 05:46:06 CDT 2008


On Wed, 2008-07-23 at 03:52 +0200, Pavel Simerda wrote:
> Hello,
> 
> I have some suggestions for XEP-0231 (Data Element).
> 
> Right now, as the example shows:
> 
> <message from='ladymacbeth@shakespeare.lit/castle'
>          to='macbeth@chat.shakespeare.lit'
>          type='groupchat'>
>   <body>Yet here's a spot.</body>
>   <html xmlns='http://jabber.org/protocol/xhtml-im'>
>     <body xmlns='http://www.w3.org/1999/xhtml'>
>       <p>
>         Yet here's a spot.
>         <img alt='A spot'
>              src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'/>
>       </p>
>     </body>
>   </html>
>   <data xmlns='urn:xmpp:tmp:data-element' 
>         alt='A spot'
>         cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
>         type='image/png'>
>     iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
>     C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
>     AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
>     REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
>     ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
>     vr4MkhoXe0rZigAAAABJRU5ErkJggg==
>   </data>
> </message>
> 
> Note: in this particular example the data is very short. This may not
> be the case in the real world, where people tend to ignore the size of
> the data they send.
> 
> We send data once for every session (and omit for subsequent messages).
> 
> This has two important implications:
> 
> 1) The other entity may or may not cache it for the session and reuse
> it. That is good.
> 
> 2) If an entity keeps the data for a longer time (e.g. for weeks
> or even permanently), this cache will never be used, because the
> sending entity always resends the data for a new session.
> 
> What I propose is:
> 
>  * By default the sending entity would not send the data. It would
>    merely reference it by its cid url.
>  * Let the receiving client follow "3.4 Retrieving Uncached Media Data"
>    if the data is not cached (no real change, this is already being
>    done).
>  * Reserve the possibility of sending the data immediately with the
>    message for the *specific* case that the sending client actually
>    knows the receiving party cannot have the data cached (e.g. the
>    data was never sent before). This behavior should be considered
>    optional.
> 
> I further propose we add some informational section about generation
> of CIDs. Although it's specified elsewhere, I believe this XEP will be
> very useful and will be referenced from many future XEPs (and maybe
> improved as well - possibly some server caching etc). I think the
> informational section could suggest UUIDs generated by hashing the
> actual content.
> 
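
A minimal sketch of what content-derived CIDs could look like, in Python.
This is only an illustration: it uses a plain SHA-1 hex digest rather than
a UUID, and the "sha1+<digest>@<domain>" layout, the function name
content_cid and the domain "example.org" are assumptions made for this
example, not something the current XEP text specifies.

```python
import hashlib


def content_cid(data: bytes, domain: str = "example.org") -> str:
    """Derive a cid from the payload itself (hypothetical helper).

    Identical bytes always give the same cid, so a receiver that has
    kept the data from an earlier session can recognise it again.
    """
    digest = hashlib.sha1(data).hexdigest()
    # The "sha1+...@domain" layout is an assumption for illustration.
    return f"sha1+{digest}@{domain}"
```

With such a scheme two clients sending the same emoticon would produce
the same cid, which is what makes a long-lived cache useful.
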
> Another thing that could be considered is to add some sort of
> caching hint attribute that would suggest how long it is reasonable to
> cache a particular resource. Maybe we could borrow from HTTP Cookies
> but allow (suggest) the clients to have some mechanisms for limiting the
> time, size and number of cached objects.
> 
> There are many possibilities, I will just describe one of them.
> 
> cache="no"
>  - no reason for caching; the file will not be used again
> cache="session"
>  - we suggest the receiving party only caches for this
>    particular session
> cache="12"
>  - we suggest caching for twelve days from the last use of this cid (!)
>  - for every use (received reference) the receiving client should reset
>    the date we count from
> cache="unlimited"
>  - we suggest the client picks the longest time it allows (it could
>    possibly cache some small pieces of data permanently)
> 
> Of course, the client MAY ignore the caching hint. In this case it
> SHOULD NOT cache at all.
> 
> If the cache attribute is not specified, we should decide on a
> reasonable default value ('session' or '1' day both seem good to me).
> 
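
To make the quoted hint values concrete, here is a rough Python sketch of
how a receiving client might turn a hint into an expiry decision. The
function name cache_expiry, the SESSION marker, the 30-day client limit
and the choice of "session" as the default are all assumptions for this
example. Treating an unrecognised hint as "do not cache" is likewise just
one possible reading.

```python
from datetime import datetime, timedelta
from typing import Optional, Union

SESSION = "session"  # marker: keep only for the current session


def cache_expiry(
    hint: Optional[str],
    last_use: datetime,
    client_max_days: int = 30,   # assumed client-side limit
    default: str = "session",    # assumed default when the attribute is absent
) -> Union[None, str, datetime]:
    """Return None (do not cache), SESSION, or an expiry timestamp.

    The expiry is counted from last_use, so the caller should call this
    again with a fresh timestamp every time the cid is referenced.
    """
    if hint is None:
        hint = default
    if hint == "no":
        return None
    if hint == "session":
        return SESSION
    if hint == "unlimited":
        # "Unlimited" still means the longest time this client allows.
        return last_use + timedelta(days=client_max_days)
    if hint.isdigit() and int(hint) > 0:
        return last_use + timedelta(days=min(int(hint), client_max_days))
    # Unrecognised hint: assumed here to mean "do not cache".
    return None
```
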
I have written an implementation of the current XEP's use case 3.1
(in-band images) in libpurple (Pidgin).
Currently it always includes the data the first time it is sent in a
session. The implementation will also request the data using
use case 3.4 if it hasn't cached it. Currently, caching is only done
in memory within a session (really a chat conversation).
So this implementation would still work if another client used
something along the lines of the above proposal.
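
The receiving-side behaviour described above can be sketched roughly as
follows. This is Python for illustration only, not the actual libpurple
code (which is C). The class SessionCache and the request_data callback,
which stands in for the use-case 3.4 request, are hypothetical names.

```python
from typing import Callable, Dict, Optional


class SessionCache:
    """In-memory cache that lives as long as one conversation."""

    def __init__(self, request_data: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        # Hypothetical callback that asks the sender for the data
        # belonging to a cid (stands in for use case 3.4).
        self._request_data = request_data

    def resolve(self, cid: str, inline_data: Optional[bytes] = None) -> bytes:
        """Return the data for cid, fetching it only when needed."""
        if inline_data is not None:
            # The sender included the data with the message: remember it.
            self._store[cid] = inline_data
            return inline_data
        if cid not in self._store:
            # Only a reference was sent and nothing is cached: ask for it.
            self._store[cid] = self._request_data(cid)
        return self._store[cid]
```

Because resolve() falls back to a request whenever the data is missing,
it does not matter whether the sender included the data or only the
reference.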

Though I'm not sure if it's worth the extra complexity, since the
recommended max size is 8 kB. Also the emoticons I have used are
generally quite small, often around 1 kB.

Maybe we could add a clause that says the sender "MAY include the data"
the first time it is used. This way it could be optional to include the
data. In some special cases a client can choose to not include the data
if it knows the receiver might have it cached.
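
On the sending side, a "MAY include the data" rule could look something
like the Python sketch below. The function name, the per-session set of
already-sent cids and the receiver_may_have_cached flag are assumptions
for this example. How a client would actually know that the receiver has
the data cached is left open here, and leaving oversized data out of the
message is also just an assumed policy.

```python
from typing import Set

MAX_INLINE_BYTES = 8 * 1024  # recommended max size mentioned above


def should_include_data(
    cid: str,
    data: bytes,
    sent_in_session: Set[str],
    receiver_may_have_cached: bool = False,
) -> bool:
    """Decide whether to attach the data or only reference the cid.

    Records the cid in sent_in_session when the answer is True, so the
    next message in the same session only carries the reference.
    """
    if len(data) > MAX_INLINE_BYTES:
        return False  # too large; the receiver has to request it
    if cid in sent_in_session:
        return False  # already sent once in this session
    if receiver_may_have_cached:
        return False  # optional: rely on the receiver's cache
    sent_in_session.add(cid)
    return True
```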

In other cases it would probably not make sense to cache data, such as
when providing a preview for an image file transfer.

//Marcus


> Cheers,
> Pavel
> 



