[Standards-JIG] NEW: Message Archiving

David Yitzchak Cohen lists+jabber_standards at bigfatdave.com
Mon Jun 7 03:17:03 UTC 2004

On Sun, Jun 06, 2004 at 05:16:54AM EDT, Justin Karneges wrote:
> On Saturday 05 June 2004 10:41 pm, David Yitzchak Cohen wrote:
> > On Sun, Jun 06, 2004 at 12:00:31AM EDT, Justin Karneges wrote:

> > > and for offline messages it is x:delay which is
> > > shown to the user.
> >
> > There, sadly, you have no way of preventing the server from having
> > fun with dinosaur era messages without control of your own server at
> > a minimum.
> Your server can do far worse things, like masquerade as any other user on 
> Jabber.  I don't think it is worth recording two timestamps in a feeble 
> attempt to compensate.

Okay, you're picking on me here.  A server can generate a bogus timestamp
because of a mis-synced NTPd (which, I agree, probably isn't a good
enough reason in and of itself to add complexity to our protocol:
troubleshooters can use their own tools, and don't need us to go out of
our way to help 'em out), but the cases where a server generates what
merely appear to be bogus timestamps are far less drastic.  The classic
case of logging on and getting bombarded with offline messages is one
such case.  Other cases involve temporary network issues.  (Imagine a
Jabber gateway with a whole bunch of PPP-connected Jabber servers, for
instance.)  In either case, it's useful to know both when the message was
sent and when it was received (just like SMTP).
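
To make the "both timestamps" idea concrete, here is a minimal sketch of
an archiver that keeps the sender-side stamp alongside the time of
receipt.  It assumes the legacy jabber:x:delay stamp format
(CCYYMMDDThh:mm:ss, in UTC); the function names and the shape of the
returned record are hypothetical and not taken from the JEP.

```python
from datetime import datetime, timezone

# Legacy jabber:x:delay stamps look like 20040606T05:16:54 and are in UTC.
LEGACY_STAMP_FORMAT = "%Y%m%dT%H:%M:%S"


def parse_delay_stamp(stamp):
    """Parse a legacy x:delay stamp into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(stamp, LEGACY_STAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def archive_times(delay_stamp, received_at):
    """Return both timestamps for one message (hypothetical record layout).

    delay_stamp is the x:delay stamp, or None for a message delivered live;
    received_at is the moment the archiving client actually got the stanza.
    """
    sent = parse_delay_stamp(delay_stamp) if delay_stamp else received_at
    return {"sent": sent, "received": received_at, "lag": received_at - sent}


if __name__ == "__main__":
    received = datetime(2004, 6, 7, 3, 17, 3, tzinfo=timezone.utc)
    record = archive_times("20040606T05:16:54", received)
    print(record["sent"], record["received"], record["lag"])
```

With both values stored, a reader can tell an offline-message flood (large
lag) from a live exchange (lag near zero) without trusting either clock
alone.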

> > > On a related note,
> > > should the client be able to specify a timestamp for a message
> > > collection?
> >
> > Shouldn't the timestamp range for a message collection be calculated by
> > the server based on its components?  Oh, you mean, to have a timestamp on
> > "when the collection was collected," or something like that ... um ... I
> > don't have need for such an extension, but I wouldn't call it useless.
> Well, if you receive an x:delay message, and upon login you receive it and 
> archive it, what would be the date of the collection?  I think I asked this 
> question because it might provide some sort of compromise with your x:delay 
> concern, as you'd wind up with two start times then, one essentially being 
> the time of receipt.

Well, that'd work fine, as long as you didn't decide to continue
the conversation after that initial exchange.  (If you continue the
conversation, your original timestamp is of course clobbered.)  I guess
it's a neat compromise, but it complicates the protocol without really
adding reliable information, so I'd hesitate to call it a viable solution
to my slight concern above.

> > > Regarding to/from, this is determined by the presence of a "to" or "from"
> > > attribute in the message (the JEP forbids logging both, although it could
> > > be better spelled out).
> >
> > In other words, a user can't log messages from third-party conversations
> > (say, another of his JIDs) without losing some info.  ("Your pick:
> > either you lose the sender, or the recipient; but you can't have both!")
> This is an interesting point.  Currently the protocol has a first-person angle 
> to it.  Maybe a collection shouldn't have just a 'jid', but instead an 
> 'a' (the logger) and 'b' (the peer), thus allowing you to archive any 
> conversation, not just your own, with no real distinction between who is 
> closer to the reader. :)

Here's my take: since the JID isn't a unique key for finding a collection
anyway, you might as well toss the whole idea of a collection "having"
a JID.  When you ask the server for a list of collections with a certain
JID, it can simply search for collections having messages to/from that
JID.  (In practice, the server can maintain a list of JIDs "involved" in
any given collection, or possibly a list of collections in which a given
JID is involved, in order to avoid a performance hit in the lookups;
which of the two is faster will probably depend on the density.)  Another
neat advantage here is that in MUC, you can archive entire groupchats,
and you'll be able to easily find conversations where XYZ participated.
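
A rough sketch of the second variant (a list of collections per JID),
purely as an illustration: the class name, method names, and the
in-memory storage are all assumptions, and a real server would keep this
in whatever store backs the archive.

```python
from collections import defaultdict


class CollectionIndex:
    """Hypothetical in-memory index: JID -> collections it is involved in."""

    def __init__(self):
        self._by_jid = defaultdict(set)

    def add_message(self, collection_id, sender, recipient):
        """Record that both ends of a message are involved in a collection."""
        for jid in (sender, recipient):
            self._by_jid[jid].add(collection_id)

    def collections_for(self, jid):
        """Return the sorted IDs of collections with messages to/from jid."""
        return sorted(self._by_jid.get(jid, set()))


if __name__ == "__main__":
    index = CollectionIndex()
    index.add_message("c1", "dave@example.org/work", "justin@example.org/home")
    index.add_message("c2", "room@conference.example.org/xyz",
                      "dave@example.org/work")
    print(index.collections_for("dave@example.org/work"))
    print(index.collections_for("room@conference.example.org/xyz"))
```

Because the index is keyed on whoever appears in a message, a groupchat
participant is found the same way as a one-to-one peer, with no separate
code path for MUC.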

> And actually, logging both the 'to' and 'from' in a message stanza doesn't 
> need to be forbidden (and as I reread the JEP, I see that it is not), just 
> that at least one needs to be present to derive the proper sender.

You might as well just have both required, but allow the server to supply
defaults for any that aren't supplied:

client supplies________|server assumes______________

This kinda simplifies all the rules from a client perspective, and makes
conformance testing easier for the server.

> A related issue is how to deal with JID resources.  Logging both the full 
> to/from JIDs in the message stanza would allow us to retain this information, 
> however it is also completely redundant, since the resources should never 
> change in a normal conversation.

Picture a situation where I'm chatting with you from /work, and then
I get off in the middle in order to get on /mobile, where I continue
chatting with you, and finally, arriving at /home, we finish up our
conversation there.  (Never say never, eh?)

> The only gray area is the start of a chat, 
> when you have sent the first message and have not received a reply yet.  
> Maybe once the resource is determined, the 'to' (or 'b', heh) of the 
> _collection_ should be updated to reflect it.  If you never get a reply, then 
> 'b' will remain the bare JID.

Well, the searching idea proposed above works nicely with the /work,
/mobile, /home problem, since it allows you to find any conversation
where you were at work by searching on your@jid/work :-)
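
As a sketch of how such a search might treat resources, assume a query
with a resource matches only that resource, while a bare-JID query
matches every resource.  That matching rule and the helper names are
assumptions made for illustration, and JID normalization (stringprep) is
left out.

```python
def split_jid(jid):
    """Split a JID into (bare JID, resource); resource is '' if absent."""
    bare, _, resource = jid.partition("/")
    return bare, resource


def jid_matches(query, candidate):
    """True if candidate satisfies query under the assumed matching rule.

    A bare query (no resource) matches any resource of the same bare JID;
    a full query matches only the identical resource.  Comparison is exact,
    with no normalization applied.
    """
    query_bare, query_resource = split_jid(query)
    candidate_bare, candidate_resource = split_jid(candidate)
    if query_bare != candidate_bare:
        return False
    return query_resource == "" or query_resource == candidate_resource


if __name__ == "__main__":
    senders = ["dave@example.org/work", "dave@example.org/mobile",
               "dave@example.org/home", "justin@example.org/home"]
    print([s for s in senders if jid_matches("dave@example.org/work", s)])
    print([s for s in senders if jid_matches("dave@example.org", s)])
```

Searching on the full JID then picks out just the /work leg of a
conversation that wandered across /work, /mobile and /home, while the
bare JID returns all three.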

> The one exception to all of this is groupchat, where the resource will 
> constantly differ, and so in that case we wouldn't want to change the 
> collection 'b' value, but we do want to record the nickname in the stanza.

Let's not make two different protocols, one for MUC and one for standard
messages. . .

> And for groupchats:
> <message xmlns='jabber:client' archive:nick='ahkitj'>
>   <body>/me waves</body>
> </message>

eww. . .

> Notice that there would then be no distinction between 'a' and 'b' in a 
> groupchat.  The person doing the actual logging is no different from any 
> other participant, aside from that his jid would be the 'a' in the collection 
> (maybe this could be an optional field then).

interesting ;-)

 - Dave

Uncle Cosmo, why do they call this a word processor?
It's simple, Skyler.  You've seen what food processors do to food, right?
