[Standards-JIG] JEP-0060 PubSub: Modifying Node/Collection Associations and other issues.

Bob Wyman bob at wyman.us
Sun May 28 20:35:21 UTC 2006

I find the discussion of collections in JEP-0060 to be very confusing and at
times contradictory. I think this may be simply because this is probably one
of the least frequently implemented parts of the specification. Some
thoughts follow:

Bernhard has recently been asking about "moving" a node from one collection
to another. These questions arise, I think, since JEP-0060 does not directly
address the "move node" use case. The subject of associating nodes with
collections is only dealt with in any depth in Section 9.5 "Creating Nodes
Associated with a Collection." As one can see from the section title,
Section 9.5 is primarily focused on associating newly created nodes with
collections at the time of the node's creation. The subject of
"Disassociating" nodes from collections or of associating existing nodes
with collections is only lightly touched upon at the end of Section 9.5. I
believe we need to flesh out the discussion of nodes and collections in
order to build a more useful specification.

            I propose that we expand (yes, expand.) JEP-0060 to include some
new sections and modify the existing Section 9.5:


            9.X  Discovering the Nodes in a collection

            9.Y Associating Nodes with Collections

            9.5 Modify to deal better with "default" attributes.


Discovering the Nodes in a Collection:

            It appears that the pubsub#leaf_nodes field of a collection
contains: "The leaf nodes associated with a collection". However, other
than an indirect reference at the end of Section 9.5 and a brief appearance
in Section 16.4.3, this very important attribute is not discussed in the
document and no examples demonstrate its use. I believe it would be
appropriate to provide an example of "Discovering the Nodes in a Collection"
which would demonstrate querying for and reading the pubsub#leaf_nodes
field. Also, it would appear that even though the field is called
"leaf_nodes," it can contain either leaf-nodes or collection-nodes.
(Collections contain *either* leaf-nodes or collection-nodes.) Thus, the
field should probably have been called: "child_nodes," "contained_nodes," or
simply "nodes." 
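
As a rough illustration of "reading the pubsub#leaf_nodes field," the sketch below parses a node configuration data form and returns the NodeIDs listed in it. The sample form, the NodeIDs in it, and the helper name child_nodes are assumptions made up for this example; JEP-0060 itself provides no such example, and the exact stanza a service returns may differ.

```python
import xml.etree.ElementTree as ET

XDATA_NS = "jabber:x:data"

# Hypothetical configuration form for a collection node; the NodeIDs are
# invented for illustration and are not taken from JEP-0060.
SAMPLE_FORM = """
<x xmlns='jabber:x:data' type='form'>
  <field var='FORM_TYPE' type='hidden'>
    <value>http://jabber.org/protocol/pubsub#node_config</value>
  </field>
  <field var='pubsub#leaf_nodes' type='text-multi'>
    <value>alice_blog</value>
    <value>bob_blog</value>
  </field>
</x>
"""


def child_nodes(form_xml):
    """Return the NodeIDs listed in the pubsub#leaf_nodes field, if any."""
    form = ET.fromstring(form_xml.strip())
    for field in form.findall(f"{{{XDATA_NS}}}field"):
        if field.get("var") == "pubsub#leaf_nodes":
            values = field.findall(f"{{{XDATA_NS}}}value")
            return [v.text or "" for v in values]
    return []  # field absent: no children advertised


print(child_nodes(SAMPLE_FORM))
```

Nothing in the returned list says whether an entry is a leaf node or a collection node, which is the naming problem described above.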


Associating Nodes with Collections:

            There are three interesting use-cases which are not already
covered well by the existing Section 9.5:

1.       Associating an existing node with a collection.

2.       Disassociating an existing node from a collection.

3.       Moving a node from one collection to another (i.e. 1 and 2)

            Other than the discussion in Section 9.5 of associating newly
created nodes with Collections and some inferences in other places, JEP-0060
is largely silent on the subject of associating nodes with collections. This
should be fleshed out.

            It appears that one can associate a node with a collection by
modifying the "leaf_nodes" field of the collection. Alternatively, one can
modify the "pubsub#collection" field of a node. However, these mechanisms
are only mentioned in terms of disassociating a node from a collection (see
end of Section 9.5). We should have some discussion of the "positive" use of
these mechanisms to associate a node with a collection and we should
probably have at least one example (yes, more text.).

            Given these two mechanisms for modifying the association of a
node with a collection, it would appear that the answer to Bernhard's
questions concerning "moving" a node is that you must use one of these
mechanisms (modify pubsub#leaf_nodes of the appropriate collections or
pubsub#collection of the node being moved).
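
To make the two mechanisms concrete, here is a toy in-memory model, not a real pubsub service API. The class ToyService and its method names are hypothetical, and the model assumes each node belongs to at most one collection. It shows that a "move" is a single edit from the node side (pubsub#collection) but two edits from the collection side (pubsub#leaf_nodes of both collections).

```python
class ToyService:
    """Toy model: one association table, editable from either end."""

    def __init__(self):
        # node NodeID -> collection NodeID (single parent assumed)
        self.parent = {}

    def leaf_nodes(self, collection):
        """Collection-side view: what pubsub#leaf_nodes would list."""
        return sorted(n for n, c in self.parent.items() if c == collection)

    def set_collection(self, node, collection):
        """Node-side edit: change the node's pubsub#collection value."""
        self.parent[node] = collection

    def set_leaf_nodes(self, collection, nodes):
        """Collection-side edit: replace the collection's pubsub#leaf_nodes."""
        current = [n for n, c in self.parent.items() if c == collection]
        for n in current:
            if n not in nodes:
                del self.parent[n]  # disassociate nodes dropped from the list
        for n in nodes:
            self.parent[n] = collection  # associate nodes added to the list


# Move "bar" from "foo" to "mumble" via the node side: one edit.
a = ToyService()
a.set_collection("bar", "foo")
a.set_collection("bar", "mumble")

# The same move via the collection side: two edits.
b = ToyService()
b.set_leaf_nodes("foo", ["bar"])
b.set_leaf_nodes("foo", [])
b.set_leaf_nodes("mumble", ["bar"])

print(a.parent == b.parent)
```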

            However, there are some serious problems created here due to the
support for implementation-specific "semantic meaning" in node ids. The
problem is that modifications to collections would require the regeneration
(changing) of NodeIDs if those NodeIDs have semantic meaning. For instance,
if an implementation held that "/" in a NodeID indicates hierarchy, then if
I moved the node with the NodeID "foo/bar" into the "mumble" collection, it
would be necessary to generate a new NodeID (perhaps "mumble/bar") and
delete the old NodeID. This is very problematic since it means that the act
of "moving" the node would kill any existing subscriptions to that NodeID.
(Unless the subscriptions were all "re-written" and some means was found of
communicating the rewrites to all appropriate clients. Also, all items in
the collection would have to be re-written to indicate that they had been
published to the new NodeID not the old one. This would be cumbersome and

It would appear that implementations that supported "semantics in NodeIDs"
would, by definition, be incapable of permitting a single node to appear in
more than one collection since a node can only have a single NodeID. (Unless
some complex syntax for NodeIDs was defined to indicate membership in more
than one collection; "(foo|mumble)/bar" might work for the example above.)
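
The subscription problem can be sketched in a few lines. This assumes a hypothetical implementation policy in which "/" in a NodeID indicates hierarchy; the function move_semantic and the subscriber addresses are invented for illustration.

```python
def move_semantic(node_id, target_collection):
    """Under a '/'-means-hierarchy policy, moving a node renames it."""
    leaf = node_id.rsplit("/", 1)[-1]
    return f"{target_collection}/{leaf}"


# Subscriptions are keyed by NodeID (hypothetical subscribers).
subscriptions = {"foo/bar": ["alice@example.com", "carol@example.com"]}

new_id = move_semantic("foo/bar", "mumble")

# Nobody is subscribed to the regenerated NodeID unless every
# subscription is rewritten and every client is told about it.
still_subscribed = subscriptions.get(new_id, [])

print(new_id, still_subscribed)
```

With opaque NodeIDs the move would change only the collection association, and the subscription table would not need to be touched at all.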


Issues with 9.5 (concerning default configurations)

            As written, it would appear that Section 9.5 creates a conflict
with the normal means of creating a node.

1.       Because an implementation can "generate" NodeIDs if it attributes
semantic meaning to NodeIDs, and since it is not possible to discover if an
implementation has such a policy (there is no IQ value that describes the
policy), it would appear that one should never provide a NodeID in a request
to create a node within a collection. One must always use "Instant Nodes"
(see: Example 107) in this case and thus a system that supports NodeID
semantics MUST always support Instant Nodes. Also, in order to support
clients that are expecting implementations that support Semantics in
NodeIDs, all implementations are essentially forced to support Instant
Nodes.

2.       It appears impossible to create a node in a collection if that node
is to have a "default" configuration. This is because the mechanism for
creating a node in a collection requires that a non-empty <configure/>
element be used, yet Section 8.1.2 says that to create a node with a "default
configuration," one must use an empty <configure/> element. It would seem
that we would have to either
expand Section 8.1.2 to say that you must either use an empty <configure/>
element or, in the case that you are creating a node in a collection, you
must use a <configure/> element that only contains a value for
"pubsub#collection". Alternatively, the discussion in 9.5 could be modified
to state that nodes created in collections must be fully configured on
creation. This would suggest that Example 192 should be expanded to show
full configuration upon creation.
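
For reference, the first alternative (a <configure/> element that carries only "pubsub#collection") might produce a request payload like the one built below. This is a sketch under assumptions: the element layout follows my reading of the create-and-configure pattern, the helper name create_in_collection is hypothetical, and the enclosing <iq/> is omitted. Leaving node_id unset models the "Instant Nodes" case from point 1.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"
XDATA_NS = "jabber:x:data"


def create_in_collection(collection, node_id=None):
    """Build a <pubsub/> payload that creates a node inside a collection.

    Only pubsub#collection is configured; everything else is left to
    the service's defaults (the behaviour proposed for Section 8.1.2).
    """
    pubsub = ET.Element(f"{{{PUBSUB_NS}}}pubsub")
    create = ET.SubElement(pubsub, f"{{{PUBSUB_NS}}}create")
    if node_id is not None:
        create.set("node", node_id)  # omitted for an instant node
    configure = ET.SubElement(pubsub, f"{{{PUBSUB_NS}}}configure")
    form = ET.SubElement(configure, f"{{{XDATA_NS}}}x", {"type": "submit"})
    field = ET.SubElement(
        form, f"{{{XDATA_NS}}}field", {"var": "pubsub#collection"}
    )
    ET.SubElement(field, f"{{{XDATA_NS}}}value").text = collection
    return pubsub


request = create_in_collection("mumble")
print(ET.tostring(request, encoding="unicode"))
```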


General issues:

            It should be clear to anyone reading the discussion of
collections that the support for "semantics" in NodeIDs (described in 12.12)
makes things much more complex than they might otherwise be. This is
particularly the case since a client has no means to discover what semantics
might be associated with the NodeID patterns. NodeID semantics makes
it hard for us to:

1.       Support Nodes that are members of more than one collection

2.       Move nodes from one collection to another

3.       Disassociate a node from a collection

4.       Name nodes consistently across implementations.

5.       Achieve consistency of the protocol across implementations and thus
achieve interoperability.


As a publisher, I'm particularly interested in being able to have NodeIDs be
consistent across implementations. I would, for instance, like to be able to
call for a "standard" that says that we would use the conventions defined
for "TagURIs" in defining nodes. (See: http://www.ietf.org/rfc/rfc4151.txt)
For instance, I would like to say that any server that provides access to a
stream of updates to my blog should name the appropriate node as
"tag:wyman.us,2005:blog". Of course, individual servers might put "my node"
in different places in a hierarchy as defined by their collection structure
- but, anyone looking for "tag:wyman.us,2005:blog" would always find my blog
feed no matter what the hierarchy was. One site might have my blog under
"CGM.blogs" while another might have it in the collection
"Blogs.PubSub_employees." I shouldn't care where it appears as long as the
NodeID is the same in all places.
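
A small sketch of that idea: two hypothetical servers file the same opaque NodeID under different collections, and a lookup by NodeID succeeds on both. The dictionaries and the find_node helper are invented for illustration; only the NodeID and the collection names come from the paragraph above.

```python
BLOG_NODE = "tag:wyman.us,2005:blog"

# collection NodeID -> set of child NodeIDs, one table per server
server_a = {"CGM.blogs": {BLOG_NODE}}
server_b = {"Blogs.PubSub_employees": {BLOG_NODE}}


def find_node(collections, node_id):
    """Return the collections containing node_id; the NodeID never changes."""
    return sorted(c for c, kids in collections.items() if node_id in kids)


for server in (server_a, server_b):
    print(BLOG_NODE, "found in", find_node(server, BLOG_NODE))
```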


Please consider expanding the documentation of collections to cover all the
appropriate use cases and PLEASE remove the "semantics" for NodeIDs that is
mentioned in Sections 9.5 and 12.12. Section 12.12 should say "NodeIDs are
opaque." Any hierarchy is defined only by the inclusion relationships within
hierarchies (graphs) of collections.


bob wyman



