[Standards] offering a sorted+updated top list of items via pubsub

Justin Karneges justin-keyword-jabber.093179 at affinix.com
Fri Jun 18 21:57:41 UTC 2010

Hi folks,

I want to be able to subscribe to a top list of items efficiently.  The
mechanism I've come up with so far is for the subscriber to specify a window
size, and then the publisher sends the subscriber a series of deltas that
include positions, namely "insert item at index N" and "remove item at
index N".  So, for example, a subscriber specifies a window size of 10 when
subscribing, and immediately receives 10 items and their position indexes.
From this point on, only the changes to this list need to be sent out, and
only if they are within the top 10.  The server always knows what the
client's "view" is, so if a new item gets inserted at position 3, the server
knows (and relies on the fact!) that this has caused 8 items to shift
positions on the client side, and that the 10th item has shifted into
position 11, which is outside of the window.  That item would need to be
re-sent if it were to enter the window area again.  The client can change
its window size at any time.
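
To make the delta scheme a bit more concrete, here is a rough Python sketch
of both sides.  It is only an illustration under some assumptions of mine:
the function names are made up, indexes are 0-based (the example above counts
from 1), and a removal inside the window is followed by an insert at the last
slot so the item that slides in gets sent.

```python
def deltas_for_insert(full, window, index, item):
    """Server side: insert into the full sorted list, return deltas to send."""
    full.insert(index, item)
    if index < window:
        # Only changes inside the subscriber's window are sent.
        return [("insert", index, item)]
    return []


def deltas_for_remove(full, window, index):
    """Server side: remove from the full sorted list, return deltas to send."""
    del full[index]
    if index >= window:
        return []
    deltas = [("remove", index, None)]
    # The item that was just outside the window slides into the last slot,
    # and the client has never seen it (or has dropped it), so send it.
    if len(full) >= window:
        deltas.append(("insert", window - 1, full[window - 1]))
    return deltas


def apply_delta(view, window, delta):
    """Client side: apply one delta to the local copy of the window."""
    op, index, item = delta
    if op == "insert":
        view.insert(index, item)
        # Anything pushed past the window edge is forgotten.
        del view[window:]
    else:
        del view[index]


if __name__ == "__main__":
    full = list("abcdefghijkl")   # server's complete sorted list
    window = 10
    view = full[:window]          # initial list sent on subscribe
    for delta in deltas_for_insert(full, window, 2, "X"):
        apply_delta(view, window, delta)
    for delta in deltas_for_remove(full, window, 0):
        apply_delta(view, window, delta)
    assert view == full[:window]  # client view still matches the server's
```

The point of the sketch is that the client never needs the full list: as
long as no delta is skipped, its view stays equal to the first `window`
entries on the server.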

For the window size, I'm using a custom pubsub x:data option.  So maybe we'd 
want something like this:

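<iq type="set" to="foo" id="1">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <subscribe node="mynode" jid="myjid"/>
    <options node="mynode" jid="myjid">
      <x xmlns="jabber:x:data" type="submit">
        <field var="FORM_TYPE" type="hidden"><value>http://jabber.org/protocol/pubsub#subscribe_options</value></field>
        <field var='pubsub#window-size'><value>10</value></field>
      </x>
    </options>
  </pubsub>
</iq>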

For the positions, I'm currently doing this by putting them in the 
application-specific pubsub payloads, but I'm starting to think this should 
actually be an extension of the standard pubsub event itself.

<message from="foo">
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items node="mynode">
      <item pos="3"> <!-- generic position, usable for any data type -->
      </item>
    </items>
  </event>
  <headers xmlns='http://jabber.org/protocol/shim'>
    <header name='SubID'>session1</header>
  </headers>
</message>

In our stream we have multiple kinds of appdata: some are actual items
that must be sorted, and others are more like metadata and do not have a
concept of position.  So 'pos' would only appear on the items where
position is relevant.
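
As a rough illustration of how a client might tell the two kinds apart, here
is a small Python sketch using the standard library's ElementTree.  The
'pos' attribute is the proposed extension, not part of pubsub today, and the
sample stanza, item ids and function name are invented for the example.

```python
import xml.etree.ElementTree as ET

EVENT_NS = "http://jabber.org/protocol/pubsub#event"

# Hypothetical notification: one positioned item and one metadata item.
SAMPLE = """
<message from="foo">
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items node="mynode">
      <item id="a" pos="3"/>
      <item id="meta1"/>
    </items>
  </event>
</message>
""".strip()


def split_items(stanza_xml):
    """Return (positioned, unpositioned) items from a pubsub event stanza.

    positioned is a list of (pos, item id) tuples; unpositioned is a list
    of item ids that carry no 'pos' attribute.
    """
    root = ET.fromstring(stanza_xml)
    positioned, unpositioned = [], []
    for item in root.iter("{%s}item" % EVENT_NS):
        pos = item.get("pos")
        if pos is None:
            unpositioned.append(item.get("id"))
        else:
            positioned.append((int(pos), item.get("id")))
    return positioned, unpositioned


if __name__ == "__main__":
    print(split_items(SAMPLE))
```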

Also of note is that subscription ids MUST be used, otherwise late stanzas 
from previous sessions could screw up the diffing.  Further, clients can't 
ever skip stanzas, so if they disconnect and come back later then they'll 
need to resubscribe to get the initial list again (barring server storage of 
all the deltas).  Temporary presence-based subscriptions are ideal here.
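
Here is a minimal Python sketch of the client-side check I have in mind,
assuming the SubID arrives in a SHIM header as in the example above.  The
helper names and the Session class are hypothetical, just to show the idea
of dropping stanzas that belong to an earlier subscription.

```python
import xml.etree.ElementTree as ET

SHIM_NS = "http://jabber.org/protocol/shim"


def make_message(subid):
    """Build a bare notification carrying only a SubID header (demo only)."""
    return (
        '<message from="foo">'
        '<headers xmlns="%s">'
        '<header name="SubID">%s</header>'
        '</headers></message>' % (SHIM_NS, subid)
    )


def subid_of(stanza_xml):
    """Return the SubID SHIM header value, or None if the stanza has none."""
    root = ET.fromstring(stanza_xml)
    for header in root.iter("{%s}header" % SHIM_NS):
        if header.get("name") == "SubID":
            return header.text
    return None


class Session:
    """Tracks the current subscription and ignores stanzas from older ones."""

    def __init__(self, subid):
        self.subid = subid
        self.accepted = []

    def handle(self, stanza_xml):
        if subid_of(stanza_xml) != self.subid:
            # Late stanza from a previous session: applying its deltas
            # would corrupt the current view, so drop it.
            return False
        self.accepted.append(stanza_xml)
        return True


if __name__ == "__main__":
    session = Session("session2")
    print(session.handle(make_message("session1")))
    print(session.handle(make_message("session2")))
```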

Looking for comments and critiques!

