[standards-jig] FW: [Re: retracting JEP-0040?]
timbeau_hk at yahoo.co.uk
Mon Jun 9 17:33:57 UTC 2003
On 9/6/03 1:26 pm, "Ralph Meijer" <jabber.org at ralphm.ik.nu> wrote:
> On Tue, Apr 29, 2003 at 08:46:36PM -0500, Peter Saint-Andre wrote:
>> ----- Forwarded message from Timothy Carpenter <> -----
>> Subject: Re: retracting JEP-0040?
>> From: Timothy Carpenter
>> To: Peter Saint-Andre
>> I have a few thoughts, basically in the area of functionality or behaviour
>> in JEP0040 that are not covered by JEP0060.
>> 1) there seems to be no ability to request a snapshot of the full contents
>> of a node at the point of subscription.
>> 2) The area of ID is still undefined, yet this is important to define. I
>> added a separate element, publish type, to control if the event was
>> updating, contained a full list of all current items in the node, was
>> correcting it (as opposed to altering it) etc. An alteration due to an event
>> needs to be distinguished from an alteration because value(s) are wrong.
>> This may not matter in chat, but is very important when sending financial
>> 3) Sequence numbering (the main thrust of JEP0040) is not covered. Being
>> able to ask for 'n events' in the past is not comparable, and relying on a
>> free format ID to control order and gaps is also not suitable in all cases.
>> As such, I believe that JEP0040 does not cover the same areas as JEP0060,
>> but I would like to suggest that the purpose of JEP0040, a robust version of
>> pubsub, be converted into an extension built upon JEP0060 foundations.
> It's been a while since this came by, but I was thinking about Gap Filling
> and pub/sub and remembered this JEP.
> Most of the stuff in JEP-0040 could indeed be adjusted to fit JEP-0060 instead
> of JEP-0024:
> 1) JEP-0060 already defines a way to request a snapshot of the full
> contents of a node in section 7.1.8:
> <iq type="get" from="pgm@jabber.org" to="pubsub.jabber.org" id="items1">
>   <pubsub xmlns="http://jabber.org/protocol/pubsub">
>     <items node="generic/pgm-mp3-player"/>
>   </pubsub>
> </iq>
> This returns a list with, for each item in the node, the last publish
> to that item in that node. This of course assumes each publish is a
> complete replacement of a certain item. If you would want to have
> for example incremental updates, you could have some sort of hierarchy
> in the item id. For example: 'item/1' 'item/2', 'item/3', where item/* all
> belong to the same logical item.
Maybe I was not clear. It is important to be able to get the snapshot at the
moment of subscription, so that as you subscribe, the first data you get is a
complete snapshot of the node. If you do this as two steps (request the
subscription, get confirmation, then request a snapshot), there is a risk that
the first event is just a partial update, arriving before your snapshot
request has been satisfied.
This would not be as important if we had 'publish type' elements, with the
subscriber filtering out all 'update' events until the first 'snapshot'
arrives...but it is not as elegant as the ability to 'subscribe with
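To make the filtering alternative concrete, here is a minimal sketch of the subscriber side. It assumes every event carries a 'publish type'; the names Event, NodeView and the values "snapshot", "update" and "correction" are my own illustration and are not defined in JEP-0060.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical 'publish type': "snapshot", "update" or "correction".
    publish_type: str
    payload: dict


@dataclass
class NodeView:
    """Subscriber-side view of one node."""
    synced: bool = False
    state: dict = field(default_factory=dict)

    def handle(self, event: Event) -> bool:
        """Apply an event. Return True if applied, False if dropped."""
        if event.publish_type == "snapshot":
            # A snapshot replaces everything known about the node.
            self.state = dict(event.payload)
            self.synced = True
            return True
        if not self.synced:
            # Updates and corrections arriving before the first snapshot
            # have no base to apply to, so they are dropped.
            return False
        # In this sketch corrections are merged the same way as updates;
        # a real client would probably flag them differently.
        self.state.update(event.payload)
        return True


view = NodeView()
view.handle(Event("update", {"bid": 101}))                # dropped
view.handle(Event("snapshot", {"bid": 100, "ask": 102}))  # applied
view.handle(Event("update", {"bid": 101}))                # applied
```

The cost is visible here: every subscriber has to carry this logic and still make a second request for the snapshot.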
> 2) Doesn't this kind of information belong in the payload?
I am not sure it does. The 'publish type' really defines what kind of event
is occurring: is this event a partial update, a snapshot, a correction to a
previous event, etc.? So I believe it should reside outside
> 3) Sequence numbering
> As JEP-0060 does not allow for subscribing to a publisher and the original
> publisher of an item is not relayed in the notifications, we probably only
> need one sequence here. This would be a sequence per node, increasing upon
> each publish. There should be a way to request the items in the gap,
> by extending the <items/> element with 'start' and 'stop' attributes
> that have the sequence numbers as value.
> At one moment I thought about JEP-0059 (Limiting and Paging Extension),
> but I thought it was too verbose.
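As a rough sketch of what that proposal would mean for a subscriber, the following detects a gap in a per-node sequence and builds the matching request. The 'start' and 'stop' attributes are only the suggestion made above, not part of JEP-0060, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"


def find_gap(last_seen: int, received: int):
    """Return (start, stop) of the missing sequence numbers, or None."""
    if received <= last_seen + 1:
        return None
    return (last_seen + 1, received - 1)


def build_gap_request(node: str, start: int, stop: int,
                      service: str, req_id: str) -> str:
    """Build an <items/> request carrying the proposed start/stop attributes."""
    iq = ET.Element("iq", {"type": "get", "to": service, "id": req_id})
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    ET.SubElement(pubsub, "items",
                  {"node": node, "start": str(start), "stop": str(stop)})
    return ET.tostring(iq, encoding="unicode")


# The subscriber last saw sequence 4 and now receives sequence 8.
gap = find_gap(4, 8)
if gap is not None:
    print(build_gap_request("generic/pgm-mp3-player", gap[0], gap[1],
                            "pubsub.jabber.org", "gap1"))
```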
With sequencing, it is important to consider what impact having one sequence
puts upon the pubsub component and what impact it has on QoS.
With one sequence number we, in effect, sequence the channel/circuit only.
The pubsub component could journal each pubsub-subscriber circuit traffic
separately in case there are any gaps. This could be done and I think it
SHOULD be done for efficiently handling short, relatively recent outages,
but it is not the full answer.
To journal ALL traffic for ALL subscribers is not necessarily efficient in
many scenarios that have rapid updates to large populations of subscribers.
So, we limit the 'history' in the circuit journal. What happens if the gap
falls off the end of the circuit journal? This is where the publisher
sequence number is useful.
If we only have a circuit sequence number and we fall off the end of the
journal, all the pubsub component can do is resend images to "verify and
synchronize". Of course, this is NOT suitable for, say, someone tracking the
movements in a location-based scenario, as all the subscriber would get is
the last known position and NOT the missing events.
The use of the publisher sequence number allows the pubsub component to
refer back to the publisher stream to repair the gap. This stream can be
obtained either locally if the pubsub component decides to journal THAT
(more efficient than journalling all the duplicated and filtered traffic
across all subscribers), or to request it from the publisher.
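A minimal sketch of the two-level repair, under simplifying assumptions: one subscriber, no filtering (so circuit and publisher sequence numbers happen to line up), and a publisher journal kept locally by the component. The class and method names are hypothetical.

```python
from collections import deque


class Component:
    """Pubsub component with a bounded circuit journal and a publisher journal."""

    def __init__(self, circuit_history: int):
        # Publisher stream, journalled once: publisher_seq -> payload.
        self.publisher_journal = {}
        # Per-circuit journal with limited history: (circuit_seq, payload).
        self.circuit_journal = deque(maxlen=circuit_history)
        self.next_circuit_seq = 1

    def relay(self, publisher_seq: int, payload: str) -> int:
        """Record a published event and return its circuit sequence number."""
        self.publisher_journal[publisher_seq] = payload
        circuit_seq = self.next_circuit_seq
        self.next_circuit_seq += 1
        self.circuit_journal.append((circuit_seq, payload))
        return circuit_seq

    def repair_from_circuit(self, start: int, stop: int):
        """Replay a circuit gap, or return None if it fell off the journal."""
        if not self.circuit_journal or start < self.circuit_journal[0][0]:
            return None
        return [p for seq, p in self.circuit_journal if start <= seq <= stop]

    def repair_from_publisher(self, start: int, stop: int):
        """Repair by publisher sequence from the publisher journal."""
        return [self.publisher_journal[s]
                for s in range(start, stop + 1)
                if s in self.publisher_journal]


comp = Component(circuit_history=3)
for seq in range(1, 7):
    comp.relay(seq, f"pos-{seq}")

recent = comp.repair_from_circuit(5, 6)    # still in the circuit journal
old = comp.repair_from_circuit(2, 3)       # None: fell off the end
if old is None:
    old = comp.repair_from_publisher(2, 3)  # repaired from publisher stream
```

In a real deployment repair_from_publisher could equally forward the request to the publisher rather than read a local journal.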
This mechanism allows for a highly scalable, fractal model for pubsub
distribution, as when a pubsub component asks for a repair from the
publisher, it is behaving as a subscriber does to the pubsub component.
Put the case that the publisher is actually a pubsub repeater (presenting
services on behalf of other publishers), and you can see how this allows for
Of course, we could allow for some pubsub components to only provide circuit
gap filling (not referral to publisher) and have this discoverable, but I
think we should be aiming at robust, industrial-strength protocols here so
Jabber can be used in demanding situations.