[Standards] XEP-0277 "Microblogging over XMPP" and the Atom data format
bear42 at gmail.com
Sun May 16 18:44:25 UTC 2010
On Sun, May 16, 2010 at 09:03, Guus der Kinderen
<guus.der.kinderen at gmail.com> wrote:
> Hi all,
> Recently, I have been working on an XMPP gateway (XEP-0100 "Gateway
> Interaction" style) that exposes Twitter functionality in a way compliant
> with XEP-0277 "Microblogging over XMPP". While coding, a number of questions
> and remarks related to this last XEP popped up.
> The XEP specifies that pubsub is used to publish and receive microblog posts
> (the XEP does indicate that for posting, an alternative interface can be used).
> The pubsub items used in the examples are using an Atom-based data format.
> My first question: the XEP does not specify explicitly that the Atom data
> format MUST/SHOULD be used. Can other formats be used as well? I feel that
> there is room for interpretation here. This can lead to implementations that
> are XEP compliant, but are not compatible with other implementations. Should
> the Microblogging XEP specify more exactly what data format should be used?
I think that would be a good change as Atom has become the default
canonical format for this realm.
> Why was the Atom-based data format chosen? In my opinion, there are a number
> of characteristics that do not make it "fit" to the purpose:
Atom has been chosen, from what I can gather and also from my own
opinion, because it is now the format used for ActivityStreams,
PubSubHubbub, OStatus and the majority of the large consumers and
providers of feed data.
It is also a "proper" XML format which gives it a lot of advantages in
the XMPP world, but that's secondary to the prior reasons IMO.
> Atom requires a title for each entry. In the context of a microblog, this
> requirement doesn't make much sense to me. The examples in the XEP use the
> atom:title element to hold the text of the post. I would argue that this is
> done more appropriately in an atom:content element instead.
In the case of a post or update that does not have a Title per se, the
Atom spec says that the content of the post should be placed in the
Title element and the Content element should be empty. This rule is
also listed as a MUST in the ActivityStreams spec.
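
A minimal sketch of the title-only shape described above, using Python's
standard xml.etree.ElementTree. The post text, the id and the timestamp are
made-up placeholders, and build_entry is a hypothetical helper, not something
taken from the XEP. No content element is emitted here; whether it should be
absent or present but empty is not settled by the text above.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def build_entry(text, entry_id, published):
    """Build an Atom entry whose title carries the microblog text."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    # The post text goes into atom:title; no atom:content is added.
    ET.SubElement(entry, f"{{{ATOM_NS}}}title", {"type": "text"}).text = text
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}published").text = published
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = published
    return entry


entry = build_entry(
    "hello from the gateway",            # placeholder post text
    "tag:gateway.example,2010-05-16:1",  # placeholder id
    "2010-05-16T18:44:25Z",              # placeholder timestamp
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

The resulting entry element would then be wrapped in a pubsub item before
publishing; that part is left out of the sketch.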
> Atom requires a unique identifier (atom:id) for each entry. Is this
> appropriate in a use case where content is being created by a client (as
> opposed to created content being distributed by the service provider)? In my
> gateway implementation, I can't think of a unique identifier that I can use
> when a client is generating a microblog post on the legacy service. Instead,
> the unique identifier is generated by the legacy domain. I feel that this
> argument holds true, even in a more generic context than my Legacy Gateway
> implementation: it is not uncommon for service providers to generate and add
> a unique identifier to an object created as a result of a user request. By
> using the Atom data format, the XEP is less flexible.
The unique identifier in your example would be something based on date
and time of receipt or generation of the post *and* you should then
include a Source element that outlines what the legacy system is using
to identify the item, including a URL to the item if possible.
In general, anything that flows thru a Gateway should do this; see the
Salmon Protocol and also the Atom Threading protocol.
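
One way a gateway could follow that suggestion, sketched in Python. The
tag: URI layout, the sequence counter, the example domains and both helper
names are assumptions made for illustration. Note also that in the Atom spec
atom:source normally carries metadata of the feed an entry came from; using
it to hold the legacy item's own identifier follows the suggestion above and
is a loose reading, not an established convention.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def make_entry_id(gateway_domain, received, sequence):
    """Derive an id from the receipt time (hypothetical tag: URI scheme)."""
    stamp = received.strftime("%Y%m%dT%H%M%SZ")
    return f"tag:{gateway_domain},{received:%Y-%m-%d}:{stamp}-{sequence}"


def add_source(entry, legacy_id, legacy_url):
    """Record how the legacy service identifies the item, plus its URL."""
    source = ET.SubElement(entry, atom("source"))
    ET.SubElement(source, atom("id")).text = legacy_id
    ET.SubElement(source, atom("link"), {"rel": "alternate", "href": legacy_url})
    return source


received = datetime(2010, 5, 16, 18, 44, 25, tzinfo=timezone.utc)
entry_id = make_entry_id("twitter.gateway.example", received, 1)

entry = ET.Element(atom("entry"))
ET.SubElement(entry, atom("id")).text = entry_id
add_source(
    entry,
    "tag:legacy.example,2010:status/12345",  # placeholder legacy identifier
    "http://legacy.example/status/12345",    # placeholder legacy URL
)
```

The sequence number is only there so that two posts received within the same
second do not end up with the same id.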
> Atom requires an author for each entry. This appears redundant to me - the
> pep service itself is related to the author, and posts on a microblog are
> not likely to be authored by someone other than the owner of the blog.
> Nonetheless, Atom spec requires this element to exist.
The Author element is present to allow for downstream consumers of the
Atom item to be able to have a URL that points to the author without
having to discover thru web crawling who that author is. At the
minimum you just need to provide the identity URL of whoever
generated the post.
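
A rough sketch of such a minimal author element in Python. The name and the
identity URL are placeholders and add_author is a hypothetical helper. The
name child is included because the Atom spec requires one inside
atom:author; the uri child carries the identity URL.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def add_author(entry, name, identity_url):
    """Attach an atom:author with a display name and an identity URL."""
    author = ET.SubElement(entry, atom("author"))
    ET.SubElement(author, atom("name")).text = name
    ET.SubElement(author, atom("uri")).text = identity_url
    return author


entry = ET.Element(atom("entry"))
author = add_author(
    entry,
    "Romeo",                             # placeholder display name
    "http://example.net/people/romeo",   # placeholder identity URL
)
```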
> That's what I ran into so far. I'd be happy to receive your insights,
> comments and remarks.
IMO the reason to use Atom boils down to the fact that a *lot* of
activity by some very bright and active people has focused on Atom.
Some of them are extending Atom, but that in itself is another reason
to use it. They are now generating content that could easily flow
thru your gateway with minimal processing and still retain a lot of
the metadata from the source while allowing you to add your own.
p.s. my background in all of this is very behind the scenes and I was
working on quite a few Atom based services at Seesmic before they
pulled the plug on the project :(
I'm hoping some of the others in the XSF who are Atom gurus will step
in and correct or add to my thoughts above.
bear at xmpp.org (email)
bear42 at gmail.com (xmpp, email)
bear at code-bear.com (xmpp, email)
PGP Fingerprint = 9996 719F 973D B11B E111 D770 9331 E822 40B3 CD29