[Standards] pubsub/pep auto-creation

Kevin Smith kevin at kismith.co.uk
Tue Mar 20 14:45:22 UTC 2007

On 20 Mar 2007, at 13:40, Ralph Meijer wrote:
> In general publishing to a node may fail for any number of reasons. One
> of them is the node not existing. Sure you would like to always be sure
> to have the item published after your action, but you always must check
> if it was successful anyway. If the node does not yet exist, you get an
> error back with a node-not-found condition.

It may fail for other reasons, but in those cases you're not going to  
simply retry it as you will with a create.

> The point I was trying to make here is that creating a node is a one-time
> event and thus a special case that should be handled differently from
> the modus operandi: publishing items.

Well, I agree it's not a frequent operation, but I don't really  
understand why this means it needs to have extra steps added for it.  
It's not a peripheral operation, but a requirement for the  modus  

>>> 1. Remove auto-create on publish. If the client publishes and the
>>> node doesn't exist, it needs to create the node (once). This is
>>> cleaner.
>> [..] the server knows what needs to be done.
> The tricky bit comes in from the configuration of nodes. If you need a
> configuration before the node can be created, the server does not know
> what needs to be done.

If you need a non-default configuration, you pass it and do
publish+configure (autocreate+configure).
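
As a rough illustration of publish+configure, here is a toy in-memory model in Python. The class ToyPubSub, its method names, and the access_model key are all made up for this sketch; it is not the pubsub wire protocol, only the semantics being argued for: one request that creates the node if needed, applies any supplied configuration, and stores the item.

```python
# Toy model only: class, method and key names are hypothetical.
DEFAULT_CONFIG = {"access_model": "open"}


class ToyPubSub:
    def __init__(self):
        self.nodes = {}  # node id -> {"config": dict, "items": list}

    def publish(self, node, item, config=None):
        """Publish an item, auto-creating the node if it is missing.

        If a configuration is supplied it is applied in the same step,
        whether or not the node already existed.
        """
        entry = self.nodes.setdefault(
            node, {"config": dict(DEFAULT_CONFIG), "items": []}
        )
        if config is not None:
            entry["config"] = dict(config)
        entry["items"].append(item)
        return "ok"


server = ToyPubSub()
# First publish of the session: payload and desired configuration together.
server.publish("notes", "secret", config={"access_model": "whitelist"})
# Later publishes in the same session can omit the configuration.
server.publish("notes", "more")
```

After both calls the node holds two items and keeps the whitelist configuration from the first request.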

>>> This may mean that clients will always do publish-and-configure,
>>> which is messy but easier for clients.
>>   In fact, if the concern is that, although saving number of stanzas
>> sent (every session), the stanzas will be larger, it's perfectly
>> possible for a client to do publish+configure the first time it
>> accesses a node, and to only do publish thereafter (and this scores
>> on both approaches: the number of stanzas is low and the stanza size
>> is low).
> Here I say, how does the client know if the node is already there?

Because when it sent the initial publish, it didn't get an error  
about creation failing.

> Have the first request send a configuration?
Yes, this is necessary anyway (see later)

> What if another client
> (different machine, for example) already created the node and configured
> it? Do you now change the configuration?
Absolutely, bugs happen, people use broken clients, and even straight  
user error can come into play.

> My angle is that I find it weird to have the complete desired state sent
> in one request, particularly when some of the information is likely to
> not have changed.

Assuming we're only talking about the initial publish by a client in  
a session (as this isn't needed later, as previously noted):
With autocreate, the client sends the value/payload and the
configuration in a single stanza. I can see how this seems weird if
that is redundant - but it's not. It is no more redundant than what
non-autocreate does, because non-autocreate forces you to check
several things (since creation/configuration/publishing is no longer
atomic).

If you might be dealing with whitelist-of-one (iq:private) nodes, or
any other more-restrictive-than-default node, here is what might
happen using the previously described methods. (You have similar
problems if anything else goes wrong: buggy client, changing
specification, user error, whatever.) I'm assuming create and then
configure as separate steps, rather than create-and-configure in one,
since given the preference for explicit steps this is probably what
would be done:

<client> publish
<server> no node
<client> create
<s> ok
<c> configure
<s> ok
<c> publish

I think that's overly verbose, but if it's a one-time cost, the  
argument is that it's not so bad.
One problem comes about because this isn't a one-time cost (although  
it does look like it).
What happens if the stream drops, or the client crashes, or the
user shuts down the machine, or whatever, in the middle? Now we have:

<c> pub
<s> 404
<c> create
<s> ok

Next session, we come along and try to publish, and deal with
create/configure if it's not there:

<c> pub
<s> ok

But wait! That's not what we meant, because now we've published our  
private data to a node with default config, which isn't what we  
wanted. So to be certain that you're publishing to a node with the  
configuration you want, without autocreate, you need to do

<c> Check for node (you can't publish to check, since you may be  
sending out private data)
<s> found
<c> Check configuration
<s> open
<c> Configure whitelist, me
<s> ok
<c> publish
<s> ok

Now, this ends up doing the same as the autocreate-config, only in  
more steps.
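
The interrupted-session problem can be replayed with a small Python toy model. Everything here (ToyServer, its methods, the "open"/"whitelist" strings) is invented for the illustration and only mirrors the exchanges sketched above; it is not how any real server behaves.

```python
# Toy model only: all names here are hypothetical.
class ToyServer:
    def __init__(self):
        self.nodes = {}  # node id -> {"access": str, "items": list}

    def create(self, node):
        self.nodes[node] = {"access": "open", "items": []}  # default config

    def configure(self, node, access):
        self.nodes[node]["access"] = access

    def publish(self, node, item):
        if node not in self.nodes:
            return "404"
        self.nodes[node]["items"].append(item)
        return "ok"

    def publish_configured(self, node, item, access):
        """Autocreate variant: create if needed, configure, publish in one step."""
        self.nodes.setdefault(node, {"access": "open", "items": []})
        self.nodes[node]["access"] = access
        self.nodes[node]["items"].append(item)
        return "ok"


# Session 1, explicit steps: the stream drops after create, before configure.
manual = ToyServer()
first_reply = manual.publish("private", "secret")  # node is missing
manual.create("private")
# ... connection lost here; configure never happens ...

# Session 2: the client simply publishes, and the server accepts.
second_reply = manual.publish("private", "secret")
leaked = manual.nodes["private"]["access"] == "open"

# Same intent with the single-request variant: no half-built node is possible.
auto = ToyServer()
auto.publish_configured("private", "secret", "whitelist")
safe = auto.nodes["private"]["access"] == "whitelist"
```

In this model the second session's publish succeeds against a node that still has the default open configuration, which is exactly the case the check-then-configure sequence is meant to guard against.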

>>> Furthermore, the client will need to keep a record of the desired
>>> node configuration for each payload type or NodeID,
>> There's no getting around this (well, there is, with ugly registries
>> of defaults for different nodes, but we really don't want to go
>> there) - when a client is publishing and wants a non-default
>> configuration, it needs to know what that configuration is, whether
>> it's create+configure or publish+configure.
> If you are not creating the node, you don't need to keep stuff around.

This discussion is all about creating the node though, isn't it?  
Whether you use autocreate+config, or manualcreate+config, the client  
needs to know what config to send.

>> * With autocreate, the client sends "Node X, value Y" and afterwards
>> Node X has Value Y. Nothing implicit in the outcome, the server does
>> what it's told.
> No, the client sends either "Publish Y to Node X, and create node X if
> it doesn't exist", or "Configure Node X, then publish Item Y and create
> node X before if it does not exist yet". Both in one exchange. In both
> cases the node creation is implicit.

Ok, it's implicit in the action, rather than implicit in the request.

> Exactly once during the node's lifetime, the server
> will respond with "Oops, you tried publishing to node X, but it isn't
> there". At this point you can create the node like in 'regular' pubsub.

Maybe I'm really missing the point here (I do seem to be) but I  
really don't understand why it's a good thing to add this step. It  
really seems like we're discussing adding an extra 4 stanzas to the  
exchange of first publishing, without increasing the utility.
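
The stanza counts can be tallied from the exchanges sketched earlier in this message. The lists below are those exchanges written out in Python, with two assumptions: a final server "ok" is added where the sketch stops at the last publish, and a create-and-configure-in-one variant is included even though it is not spelled out above.

```python
# Each entry is one stanza; the labels are informal, not protocol syntax.
explicit_steps = [
    "c: publish", "s: no node",
    "c: create", "s: ok",
    "c: configure", "s: ok",
    "c: publish", "s: ok",  # the final "ok" is assumed
]
create_with_config = [  # assumed variant: create and configure in one request
    "c: publish", "s: no node",
    "c: create+configure", "s: ok",
    "c: publish", "s: ok",
]
autocreate = ["c: publish+configure", "s: ok"]

extra_explicit = len(explicit_steps) - len(autocreate)
extra_combined = len(create_with_config) - len(autocreate)
print(extra_explicit, extra_combined)
```

Under these assumptions the "extra 4 stanzas" figure matches the create-and-configure-in-one flow; with fully separate create and configure steps the difference comes out at 6.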

> I like to compare this with exception handling in e.g. Python. You
> generally code for the common case, and handle exceptions (hence their
> name) separately.

Well, the use of exceptions can be something of a religious matter.  
I'm in the camp that exceptions should only be used for things which  
are truly exceptional and unexpected, rather than run-of-the-mill  
occurrences (like something needing to be created before it's used).
This is a different issue though, because exceptions amount to a GOTO  
in code which has its own set of debates and opinions.
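
To make the comparison concrete, this is roughly what the "code for the common case, handle the exception separately" style looks like on the client side, as a Python sketch. NodeNotFound, FakeServer and publish_with_fallback are invented for the example and do not correspond to any real XMPP library.

```python
# Hypothetical names throughout; this models the pattern, not a real API.
class NodeNotFound(Exception):
    pass


class FakeServer:
    def __init__(self):
        self.nodes = {}
        self.requests = 0  # counts client requests, to show the extra round trips

    def create(self, node, config):
        self.requests += 1
        self.nodes[node] = {"config": dict(config), "items": []}

    def publish(self, node, item):
        self.requests += 1
        if node not in self.nodes:
            raise NodeNotFound(node)
        self.nodes[node]["items"].append(item)


def publish_with_fallback(server, node, item, config):
    """Try the common case first; create the node only if publishing fails."""
    try:
        server.publish(node, item)
    except NodeNotFound:
        server.create(node, config)
        server.publish(node, item)


server = FakeServer()
publish_with_fallback(server, "notes", "first", {"access_model": "whitelist"})
requests_first = server.requests  # publish (fails), create, publish
publish_with_fallback(server, "notes", "second", {"access_model": "whitelist"})
requests_second = server.requests - requests_first  # just publish
```

The first call costs three requests and every later one costs a single request, which is the "exactly once during the node's lifetime" overhead being debated.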

> Note that this can all be done transparently to the user.

Yep - I think it's nicer if it's transparent to the stream / client  
too, but there's no doubt that the user won't be involved.

>> And here's the issue I have at the moment, I see no cons to automagic
>> node creation and I see a whole range of pros.
> I'm sure it is a matter of opinion. I hope I pointed out the cons as I
> see them.

Well, maybe it really does come down to opinion. The way I see  
things, the automagic does bring material improvement (cleaning up  
the stream). Is the disadvantage a matter of principle (things  
shouldn't be done transparently)?

> Again, it is in the eye of the beholder. Implementation of some of the
> bits just move to the server side.

Right, the 'does it exist? Make it so' has to be done by someone, be  
it the compiler/interpreter doing dynamic things, the server, or the  

> I don't think the protocol, with the
> implicits or additional usually redundant information, really is simpler.

Well, while the check still has to be done somewhere, the automagic
does take the checking out of the xmpp stream. That's where the
saving comes in, I think: it makes the overall chatter less, which is
a Good Thing™, to my mind.


Kevin Smith
Psi XMPP Client Project Leader (http://psi-im.org)
