[Standards-JIG] JEP-0060 should be read with "Atom over XMPP"

Bob Wyman bob at wyman.us
Fri May 26 14:58:25 UTC 2006


Jean-Louis Seguineau wrote:
> Interesting working group you have on FeedMesh, Bob :)
> http://groups.yahoo.com/group/feedmesh/
> More seriously, where do I look for summarized info on FeedMesh.
	Unfortunately, there isn't a good "summary" of FeedMesh available at
present. I'll try to provide the high points...
	The basic idea behind FeedMesh is to provide a way for the large
consumers of "web updates" (focused now on the blogosphere) to share the
information they collect. Currently, FeedMesh nodes are being run by
PubSub.com and Yahoo! (via their blo.gs service). Google, Verisign
(weblogs.com) and others are expected to begin running FeedMesh nodes once
we define the next revision of our protocol.
	Consumers of blog search or matching services expect those services
to provide access to 100% of the data produced in the blogosphere. Thus, the
best any of us can do is simply meet expectations. In such an environment,
the data collection problem is a pure cost to those involved and not
something that can really provide competitive advantage. Competitive
advantage must come from the services that are provided -- not how much data
we collect. Thus, it makes sense for us to cooperate in collecting data
while we compete in providing differentiated services based on that data. 
	What a FeedMesh node does is broadcast to its partners a continuous
stream of information concerning updates to the content found in the
blogosphere. This amounts to millions of messages per day and is currently
accomplished by streaming XML over open sockets -- telnet style. Currently,
the FeedMesh only distributes what we call "thin pings" which are the
equivalent of the pings generated by blogs -- or, notifications under
JEP-0060. Each FeedMesh listener then fetches the pinging blog feed and
reads the updates. In the future, we'll be implementing "fat pings" which
carry the actual full content of the updated feed entries. As a result,
FeedMesh listeners will no longer need to fetch the source feeds -- they
will get their data via the FeedMesh's syndication protocol.
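	As a rough illustration of the difference between the two kinds of
ping, here is a minimal Python sketch. It is not the FeedMesh wire format or
anyone's production code: the Ping class, its fields, and the fetch_feed
callback are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Ping:
    """Hypothetical in-memory form of one FeedMesh message (not the real format)."""
    feed_url: str
    # A thin ping carries no entries; a fat ping carries the updated entries inline.
    entries: Optional[List[str]] = None


def handle_ping(ping: Ping, fetch_feed: Callable[[str], List[str]]) -> List[str]:
    """Return the updated entries announced by a ping.

    Thin ping: the listener has to fetch the source feed itself.
    Fat ping: the entries arrived with the ping, so no fetch is made.
    """
    if ping.entries is None:
        return fetch_feed(ping.feed_url)
    return ping.entries
```

	With thin pings every listener calls fetch_feed once per update;
with fat pings that call disappears, which is where the savings come from.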
	What we're doing here is trying to overcome some of the
inefficiencies of the web "crawling" and discovery models that have built up
over the years and we're trying to eliminate unnecessary costs while freeing
funds for all of us to invest more in the features and services that are
really important to actual users of the search/discovery/notification
services. 
	My hope is that the "fat ping" version of FeedMesh will be based on
the "Atom over XMPP" specification, which has proved useful through more
than two years of implementation experience with an earlier version at
PubSub.com.
	Although we're being very open about the design and evolution of
FeedMesh, it should be noted that there will never be a very large number of
"core" FeedMesh nodes. The reason for this is that the volume of information
that is exchanged on the FeedMesh is massive. We're talking about a real
firehose here... The result is that unless you're running a full-bore
service and need to know about the millions of updates to the blogosphere on
a continuous, real-time basis, the FeedMesh is probably overkill. Consumers
who need less than *everything* will get filtered streams of data (via "Atom
over XMPP" if they connect to PubSub.com) that focus just on the data they
are interested in. Thus, you might ask some FeedMesh node like PubSub to
provide you with only the updates to a particular set of sites or you might
create a JEP-0060 content-based subscription that asks only for posts that
contain the keyword "Seguineau"... We'll watch the firehose and then
provide you with the trickle of information that you really care about.
Other FeedMesh providers will do the same.
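	To make the "trickle from the firehose" idea concrete, here is a
small Python sketch of the two kinds of filter mentioned above: a fixed set
of sites and a keyword match. It only illustrates the idea; it is not how
PubSub.com or JEP-0060 implements content-based subscriptions, and the Entry
class and filter_stream function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass
class Entry:
    """Hypothetical updated feed entry as seen on the full stream."""
    feed_url: str
    text: str


def filter_stream(
    entries: Iterable[Entry],
    sites: Optional[Set[str]] = None,
    keyword: Optional[str] = None,
) -> Iterator[Entry]:
    """Yield only the entries a subscriber asked for.

    sites:   if given, keep only entries whose feed_url is in this set.
    keyword: if given, keep only entries whose text contains it (case-insensitive).
    """
    for entry in entries:
        if sites is not None and entry.feed_url not in sites:
            continue
        if keyword is not None and keyword.lower() not in entry.text.lower():
            continue
        yield entry
```

	A subscriber interested only in posts mentioning "Seguineau" would
call filter_stream(stream, keyword="Seguineau") and never see the rest.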
	I hope this gives you some idea of what we're up to...

	bob wyman




