[Standards] Comments on SIFT

Waqas Hussain waqas20 at gmail.com
Sat Mar 6 21:51:53 UTC 2010


While implementing mod_sift for Prosody, I noticed some possible
improvements and a few open issues. Some of them follow.


1. Remove disallowed child elements for filtered messages and presence.

Here's a typical identi.ca message:

  <message from="update at identi.ca/xmpp001daemon" to="waqas at jaim.at" type="chat">
  <body>evan: RT @sil doom. the Shuttle computer I'm setting up for
dad can't read the hard drive. Won't boot from USB, has no CD drive, I
have no USB ... [23931040]</body>
  <html xmlns="http://jabber.org/protocol/xhtml-im">
  <body xmlns="http://www.w3.org/1999/xhtml">
  : RT @doom. the Shuttle computer I'm setting up for dad can't read
the hard drive. Won't boot from USB, has no CD drive, I have no USB
...
  <a href="http://identi.ca/evan">evan</a>
  <span class="vcard">
  <a title="Stuart Langridge" class="url" href="http://identi.ca/user/279">
  <span class="fn nickname">sil</span>
  </a>
  </span>
  <a href="http://identi.ca/conversation/24011046#notice-23931040">[23931040]</a>
  </body>
  </html>
  <entry xmlns="http://www.w3.org/2005/Atom">
  <source>
  <title>evan - Identi.ca</title>
  <link href="http://identi.ca/evan" />
  <link rel="self" type="application/atom+xml" href="http://identi.ca/evan" />
  <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
  <icon>http://avatar.identi.ca/1-96-20090819204503.jpeg</icon>
  </source>
  <title>RT @sil doom. the Shuttle computer I'm setting up for dad
can't read the hard drive. Won't boot from USB, has no CD drive, I
have no USB ...</title>
  <author>
  <name>evan</name>
  <uri>http://identi.ca/user/1</uri>
  </author>
  <actor xmlns="http://activitystrea.ms/spec/1.0/">
  <object-type>http://activitystrea.ms/schema/1.0/person</object-type>
  <id xmlns="http://www.w3.org/2005/Atom">http://identi.ca/user/1</id>
  <title xmlns="http://www.w3.org/2005/Atom">Evan Prodromou</title>
  <link rel="alternate" type="text/html" href="http://identi.ca/evan"
xmlns="http://www.w3.org/2005/Atom" />
  <link rel="avatar" type="image/jpeg"
xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="353"
xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="353"
href="http://avatar.identi.ca/1-353-20090819204502.jpeg"
xmlns="http://www.w3.org/2005/Atom" />
  <link rel="avatar" type="image/jpeg"
xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="96"
xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="96"
href="http://avatar.identi.ca/1-96-20090819204503.jpeg"
xmlns="http://www.w3.org/2005/Atom" />
  <link rel="avatar" type="image/jpeg"
xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="48"
xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="48"
href="http://avatar.identi.ca/1-48-20090819204503.jpeg"
xmlns="http://www.w3.org/2005/Atom" />
  <link rel="avatar" type="image/jpeg"
xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="24"
xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="24"
href="http://avatar.identi.ca/1-24-20090819204503.jpeg"
xmlns="http://www.w3.org/2005/Atom" />
  <point xmlns="http://www.georss.org/georss">45.5088375 -73.587809</point>
  <preferredUsername
xmlns="http://portablecontacts.net/spec/1.0">evan</preferredUsername>
  <displayName xmlns="http://portablecontacts.net/spec/1.0">Evan
Prodromou</displayName>
  <note xmlns="http://portablecontacts.net/spec/1.0">Montreal hacker
and entrepreneur. Founder of identi.ca, lead developer of StatusNet,
CEO of StatusNet Inc.</note>
  <address xmlns="http://portablecontacts.net/spec/1.0">
  <formatted>Montreal, Quebec, Canada</formatted>
  </address>
  <urls xmlns="http://portablecontacts.net/spec/1.0">
  <type>homepage</type>
  <value>http://evan.prodromou.name/</value>
  <primary>true</primary>
  </urls>
  </actor>
  <link rel="alternate" type="text/html"
href="http://identi.ca/notice/23931040" />
  <id>http://identi.ca/notice/23931040</id>
  <published>2010-03-06T20:01:22+00:00</published>
  <updated>2010-03-06T20:01:22+00:00</updated>
  <link rel="ostatus:conversation"
href="http://identi.ca/conversation/24011046" />
  <forward ref="http://identi.ca/notice/23928915"
href="http://identi.ca/notice/23928915"
xmlns="http://ostatus.org/schema/1.0" />
  <content type="html">RT @<span class="vcard"><a
href="http://identi.ca/user/279" class="url" title="Stuart
Langridge"><span class="fn nickname">sil</span></a></span> doom. the
Shuttle computer I'm setting up for dad can't read the hard drive.
Won't boot from USB, has no CD drive, I have no USB ...</content>
  </entry>
  </message>

Look at the size of that. Should I laugh or cry?  This should be reduced to:

  <message from="update at identi.ca/xmpp001daemon" to="waqas at jaim.at" type="chat">
  <body>evan: RT @sil doom. the Shuttle computer I'm setting up for
dad can't read the hard drive. Won't boot from USB, has no CD drive, I
have no USB ... [23931040]</body>
  </message>

for mobile clients. That's roughly 6% of the original (~4,257 bytes
reduced to ~262 bytes). I think without this behavior, message
filtering is pretty useless.
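
A minimal sketch of the stripping step, in Python rather than
mod_sift's Lua, and not taken from any existing implementation. The
allow-list of (namespace, name) pairs, the helper names and the
example.org addresses are all assumptions made for illustration. On a
real client stream the body would be in the jabber:client namespace;
the sample stanza here declares none, so the namespace is empty.

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list of (namespace, local name) pairs that survive filtering.
ALLOWED_CHILDREN = {("", "body")}


def split_tag(tag):
    """Split ElementTree's "{namespace}name" spelling into (namespace, name)."""
    if tag.startswith("{"):
        ns, name = tag[1:].split("}", 1)
        return ns, name
    return "", tag


def strip_disallowed(stanza_xml, allowed=ALLOWED_CHILDREN):
    """Return the stanza with every direct child not in `allowed` removed."""
    stanza = ET.fromstring(stanza_xml)
    for child in list(stanza):  # copy the list, since we remove while iterating
        if split_tag(child.tag) not in allowed:
            stanza.remove(child)
    return ET.tostring(stanza, encoding="unicode")


# Shortened stand-in for the identi.ca message; addresses are placeholders.
original = (
    '<message from="sender@example.org/daemon" to="user@example.org" type="chat">'
    "<body>evan: RT @sil doom. [...]</body>"
    '<html xmlns="http://jabber.org/protocol/xhtml-im">'
    '<body xmlns="http://www.w3.org/1999/xhtml">[...]</body></html>'
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>[...]</title></entry>'
    "</message>"
)
filtered = strip_disallowed(original)
```

Matching on the namespace as well as the name matters here: the XHTML
body nested inside <html/> must not be mistaken for the plain message
body.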

Useless fact: Watching offline messages from identi.ca using up
bandwidth in slow motion (slow, expensive GPRS with payment based on
bandwidth usage) is what got mod_sift for Prosody started.


2. Offline messages.

A SIFT message filter which has some <allow/> elements doesn't scale
well for large numbers of offline messages. Currently a server with an
SQL backend may do something like this:

  1. resource becomes available
  2. SELECT * FROM offline_messages WHERE JID = ${account_jid}
  3. loop over the resultset and send all messages to the newly
available resource
  4. DELETE FROM offline_messages WHERE JID = ${account_jid}

With per-message filtering this changes to something like this:

  3. loop over the resultset and send all _allowed_ messages to the
newly available resource
  4. for each sent message, DELETE FROM offline_messages WHERE JID =
${account_jid} AND MESSAGEID = ${unique_message_id}

This could be optimized somewhat, but would still be relatively complex.
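
One possible optimization, sketched in Python with sqlite3: collect
the ids of the messages that were actually sent and remove them with a
single DELETE instead of one per message. The table layout, the column
names and the is_allowed/send callbacks are assumptions for
illustration, not how any particular server stores offline messages.

```python
import sqlite3


def deliver_offline(db, account_jid, is_allowed, send):
    """Send the allowed offline messages, then delete exactly those rows."""
    rows = db.execute(
        "SELECT message_id, stanza FROM offline_messages WHERE jid = ?",
        (account_jid,),
    ).fetchall()
    sent_ids = []
    for message_id, stanza in rows:
        if is_allowed(stanza):
            send(stanza)
            sent_ids.append(message_id)
    if sent_ids:
        # One statement for the whole batch; a very large batch would need
        # to be chunked to stay under the database's bound-parameter limit.
        placeholders = ",".join("?" * len(sent_ids))
        db.execute(
            "DELETE FROM offline_messages "
            f"WHERE jid = ? AND message_id IN ({placeholders})",
            [account_jid, *sent_ids],
        )
        db.commit()
    return len(sent_ids)


# Hypothetical schema and data for the demonstration.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE offline_messages "
    "(message_id INTEGER PRIMARY KEY, jid TEXT, stanza TEXT)"
)
db.executemany(
    "INSERT INTO offline_messages (jid, stanza) VALUES (?, ?)",
    [
        ("user@example.org", "<message><body>hi</body></message>"),
        ("user@example.org", "<message><event/></message>"),
        ("other@example.org", "<message><body>yo</body></message>"),
    ],
)
delivered = []
count = deliver_offline(
    db, "user@example.org", lambda stanza: "<body>" in stanza, delivered.append
)
remaining = db.execute(
    "SELECT jid, stanza FROM offline_messages ORDER BY message_id"
).fetchall()
```

Messages that the filter rejects stay in the table, so a later resource
without a filter can still receive them.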

I've added this here to get comments from other server developers. How
significant is this overhead in your opinion?


3. Automatic IQ responses.

Currently SIFT allows blocking IQs based on payload. The server
auto-responds with an error. It would be interesting if the server
could be made to reply with an IQ result preset by the client.

Maybe something along these lines:

  <sift xmlns='urn:xmpp:sift:1'>
    <iq>
      <reply name='query' ns='http://jabber.org/protocol/disco#info'>
        [...] service discovery reply payload here [...]
      </reply>
    </iq>
  </sift>

The above example has some issues (think service discovery nodes), but
the approach is worth considering regardless. It would be a good fit
for software version replies, for example.
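
To make the idea concrete, here is a rough Python sketch of the
server-side lookup: presets are keyed by the payload's namespace and
element name, and a matching IQ get is answered directly. The
PresetReplies class, its method names, the addresses and the client
name and version are all hypothetical. It deliberately ignores the
node problem mentioned above.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split ElementTree's "{namespace}name" spelling into (namespace, name)."""
    if tag.startswith("{"):
        ns, name = tag[1:].split("}", 1)
        return ns, name
    return "", tag


class PresetReplies:
    """Hypothetical per-session store of IQ results preset by the client."""

    def __init__(self):
        self._replies = {}  # (namespace, name) -> payload XML string

    def set_reply(self, ns, name, payload_xml):
        self._replies[(ns, name)] = payload_xml

    def handle_iq(self, iq_xml):
        """Return a result stanza if a preset matches, else None."""
        iq = ET.fromstring(iq_xml)
        if iq.get("type") != "get" or len(iq) != 1:
            return None
        payload = self._replies.get(split_tag(iq[0].tag))
        if payload is None:
            return None  # caller falls back to normal SIFT handling
        reply = ET.Element("iq", {"type": "result", "id": iq.get("id", "")})
        # Swap the addresses so the result goes back to the requester.
        if iq.get("from"):
            reply.set("to", iq.get("from"))
        if iq.get("to"):
            reply.set("from", iq.get("to"))
        reply.append(ET.fromstring(payload))
        return ET.tostring(reply, encoding="unicode")


presets = PresetReplies()
presets.set_reply(
    "jabber:iq:version",
    "query",
    '<query xmlns="jabber:iq:version">'
    "<name>ExampleClient</name><version>0.1</version></query>",
)
reply_xml = presets.handle_iq(
    '<iq type="get" id="v1" from="peer@example.org/desktop" '
    'to="user@example.org/mobile"><query xmlns="jabber:iq:version"/></iq>'
)
```

The mobile client never sees the request, which is the point: it pays
for no bandwidth and the requester still gets a useful answer.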


4. mod_sift for Prosody

Our implementation is a work in progress, but it does the basics.
Hopefully we'll have some implementation experience soon. Now if only
those client developers would hurry up.

Docs: http://code.google.com/p/prosody-modules/wiki/mod_sift
Source: http://code.google.com/p/prosody-modules/source/browse/mod_sift/mod_sift.lua


--
Waqas Hussain


