[Standards] XEP-0231 (Data Element) - local caching

Peter Saint-Andre stpeter at stpeter.im
Wed Jul 30 01:49:01 UTC 2008

Ahoj Pavle!

Pavel Simerda wrote:
> Hello,
> I have some suggestions for XEP-0231 (Data Element).

Thanks for looking at this spec so thoroughly.

> Right now, as the example shows:
> <message from='ladymacbeth at shakespeare.lit/castle'
>          to='macbeth at chat.shakespeare.lit'
>          type='groupchat'>
>   <body>Yet here's a spot.</body>
>   <html xmlns='http://jabber.org/protocol/xhtml-im'>
>     <body xmlns='http://www.w3.org/1999/xhtml'>
>       <p>
>         Yet here's a spot.
>         <img alt='A spot'
>              src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6 at shakespeare.lit'/>
>       </p>
>     </body>
>   </html>
>   <data xmlns='urn:xmpp:tmp:data-element' 
>         alt='A spot'
>         cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6 at shakespeare.lit'
>         type='image/png'>
>     REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
>     ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
>     vr4MkhoXe0rZigAAAABJRU5ErkJggg==
>   </data>
> </message>
> Note: in this particular example the data is very short. This may not
> be the case in the real world, where people tend to ignore the size of
> the data they send.

Yes, that's just about the smallest image I could find. The spec says 
that the image should not be more than 8k (which is twice the suggested 
size of an IBB chunk) but we don't know if people will typically send 
images that are smaller or larger than 8k -- I think smaller but I don't 
know that yet.
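
For what it's worth, a client could check the 8k limit before sending or accepting a data element. A rough sketch in Python (the function names and the constant are mine, not from the spec), measuring the decoded size rather than the base64 text:

```python
import base64

# Assumption: "8k" is read as 8 * 1024 bytes of decoded data.
MAX_DATA_BYTES = 8 * 1024


def decoded_size(b64_text: str) -> int:
    """Return the size in bytes of the binary data carried in a <data/> element."""
    # Base64 inside a stanza is often wrapped and indented, so drop all whitespace first.
    compact = "".join(b64_text.split())
    return len(base64.b64decode(compact))


def within_limit(b64_text: str, limit: int = MAX_DATA_BYTES) -> bool:
    """True if the decoded payload is no larger than the limit."""
    return decoded_size(b64_text) <= limit
```

Whether a client rejects, truncates, or merely warns on oversized data is a separate policy question that this sketch does not answer.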

> We send data once for every session (and omit for subsequent messages).

In this case it's important to define "session" (see rfc3921bis). Is it a 
chat session, a presence session, or something else?

> This has two important implications:
> 1) The other entity may or may not cache it for the session and reuse
> it. That is good.
> 2) If an entity keeps the data for a longer time (e.g. for weeks
> or even permanently), this cache will never be used, as the sending
> entity always resends the data for a new session.
> What I propose is:
>  * By default the sending entity would not send the data. It would
>    merely reference it by its cid url.
>  * Let the receiving client follow "3.4 Retrieving Uncached Media Data"
>    if the data is not cached (no real change, this is already being
>    done).

I think I like that approach. It introduces a round trip for the IQ, 
which might introduce some latency. But it puts the burden for "storing" 
and "serving" the image on the sender, which might discourage abuse of 
in-band images.
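
On the receiving side, the flow would be roughly "look in the local cache, and only ask the sender if the cid is unknown". A minimal sketch in Python, assuming a hypothetical fetch_from_sender callable that performs the IQ round trip and returns the bytes (or None on failure); none of these names come from the spec:

```python
from typing import Callable, Dict, Optional


class CidCache:
    """Resolve cid references from a local cache, falling back to the sender."""

    def __init__(self, fetch_from_sender: Callable[[str, str], Optional[bytes]]):
        # Hypothetical callable: sends an IQ request to the sender's JID for
        # the given cid and returns the decoded data, or None if it failed.
        self._fetch = fetch_from_sender
        self._store: Dict[str, bytes] = {}

    def resolve(self, sender_jid: str, cid: str) -> Optional[bytes]:
        data = self._store.get(cid)
        if data is not None:
            return data  # cache hit: no round trip needed
        data = self._fetch(sender_jid, cid)
        if data is not None:
            self._store[cid] = data  # remember it for later references
        return data
```

The latency cost shows up only on the first reference to a given cid; later references are served locally.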

>  * Reserve the possibility of sending the data immediately with the
>    message for the *specific* case that the sending client actually
>    knows the receiving party cannot have the data cached (e.g. the
>    data was never sent before). This behavior should be considered
>    optional.

In that case the sender needs to keep a list of every JID to which it 
has ever sent the image. That seems suboptimal.

And I suppose the recipient might have received the image from another 
sender at some point, or might have received the image through other 
means (e.g., an emoticon "bundle").

> I further propose we add some informational section about generation
> of CIDs. Although it's specified elsewhere, I believe this XEP will be
> very useful and will be referenced from many future XEPs (and maybe
> improved as well - possibly some server caching etc). I think the
> informational section could suggest UUIDs generated by hashing the
> actual content.

Yes I think that would be helpful.
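
One possible way to do that, purely as an illustration: hash the content with SHA-1 and build a UUID from the first 16 bytes of the digest, so the same bytes always yield the same cid. This is a Python sketch with a hypothetical function name; the choice of SHA-1 and the "uuid@domain" layout are assumptions modeled on the example above, not something the spec mandates:

```python
import hashlib
import uuid


def content_cid(data: bytes, domain: str) -> str:
    """Derive a cid of the form '<uuid>@<domain>' from the content itself."""
    digest = hashlib.sha1(data).digest()
    # Take the first 16 bytes of the digest; uuid.UUID sets the version and
    # variant bits, so the result is a well-formed UUID marked as version 5.
    content_uuid = uuid.UUID(bytes=digest[:16], version=5)
    return f"{content_uuid}@{domain}"
```

Because the cid depends only on the bytes (and the domain part), a recipient that already holds the same image from another sender or from an emoticon bundle would recognize it, provided both sides use the same domain part.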

> Another thing that could be considered... is to add some sort of
> caching hint attribute that would suggest how long it's reasonable to
> cache a particular resource.

Do you think that would really be helpful? I'm still thinking about it...

> Maybe we could borrow from HTTP Cookies
> but allow (suggest) the clients to have some mechanisms for limiting the
> time, size and number of cached objects.
> There are many possibilities, I will just describe one of them.

Do you have examples of these?

> cache="no"
>  - no reason for caching; the file will not be used again

Perhaps a thumbnail related to file transfer or some other ephemeral image?

> cache="session"
>  - we suggest the receiving party only caches for this
>    particular session

Perhaps also a thumbnail, or an image related to a whiteboarding session?

> cache="12"
>  - we suggest caching for twelve days from the last use of this cid (!)
>  - for every use (received reference) the receiving client should reset
>    the date we count from

Perhaps images included in an XHTML notification from a blogging service 
or somesuch?

> cache="unlimited"
>  - we suggest the client picks the longest time it allows (it could
>    possibly cache some small pieces of data permanently)

Perhaps a commonly-used emoticon?
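
To make the four proposed values concrete, here is how a client might interpret them. This is only a sketch in Python of the proposal as quoted; CachePolicy, parse_cache_hint, expires_at, and the max_days ceiling are hypothetical names, and the 365-day default ceiling is an arbitrary stand-in for "the longest time the client allows":

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CachePolicy:
    keep: bool                      # False for cache="no"
    session_only: bool              # True for cache="session"
    lifetime: Optional[timedelta]   # sliding lifetime for day-based hints


def parse_cache_hint(hint: str, max_days: int = 365) -> CachePolicy:
    """Map a proposed cache attribute value to a local policy."""
    if hint == "no":
        return CachePolicy(keep=False, session_only=False, lifetime=None)
    if hint == "session":
        return CachePolicy(keep=True, session_only=True, lifetime=None)
    if hint == "unlimited":
        # The client picks the longest lifetime it allows.
        return CachePolicy(keep=True, session_only=False,
                           lifetime=timedelta(days=max_days))
    days = int(hint)  # raises ValueError for values outside the proposal
    return CachePolicy(keep=True, session_only=False,
                       lifetime=timedelta(days=min(days, max_days)))


def expires_at(policy: CachePolicy, last_use: datetime) -> Optional[datetime]:
    """Expiry counted from the last use of the cid, so each use resets it."""
    if policy.lifetime is None:
        return None
    return last_use + policy.lifetime
```

With cache="12" and a last use on 30 July, expires_at gives 11 August; a new reference to the same cid on 5 August would push that to 17 August.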

> Of course, the client MAY ignore the caching hint. In this case it
> SHOULD NOT cache at all.

Why not? My client could ignore caching hints because it has its own 
local policy (e.g. cache images only from people in my "Friends" group, 
but cache those forever because I want to keep them in message history). 
Or my client could ignore caching hints because it simply can't cache 
images (no room on the device, web client, etc.).

> If the cache attribute is not specified, we should decide on a
> reasonable default value ('session' or '1' day both seem good to me).

I think that's up to the client.

> Cheers,
> Pavel

