[Standards] Best approach for Shared XML editing

Joonas Govenius joonas.govenius at gmail.com
Thu Sep 11 20:16:21 CDT 2008


On Thu, Sep 11, 2008 at 7:18 PM, Dave Cridland <dave at cridland.net> wrote:
> On Thu Sep 11 22:01:35 2008, Joonas Govenius wrote:
>>
>> RFC 5261 looks like a handy diff format but it doesn't solve the core
>> problem of dealing with simultaneous modifications. The problem with
>> XPath based approaches in general is that the node the XPath selector
>> referred to in the sender's copy might not be the same node that it
>> refers to in the receiver's copy if other modifications have taken
>> place in the meantime. There are basically three ways to resolve
>> this issue:
>
> Worth noting it's not intended to solve the problem of simultaneous
> modifications, it's designed to express the modifications. Processing rules
> to handle them are a different thing.
>

Right.

>> 3) You can give each node a unique id that can be used to address it
>> instead of the XPath selector. This is the approach that sxe uses
>> because it is very simple and flexible. The client simply needs to
>> maintain an _unordered_ set of records (each describing an XML node)
>> that can be mapped to a unique XML document based on the contents, not
>> the order, of those records.
>>
>>
> Of course, you can explicitly give things in an XML document xml:id
> attributes. You can even do it on the fly with RFC 5261 - add a stable
> identifier with XPath (unstable refs), then do lots of work with the element.
>

Ok. Doing it on the fly could work if a client asked the server (or
some master user) for a stable ref that only it would be allowed to
refer to. That would allow some simultaneous modifications, although
it would still not be ideal for whiteboarding, where several users
would typically just append their elements to the end of the <svg/>
element or some similar place.
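
The concurrent-append case is where addressing nodes by unique id
helps. Below is a rough Python sketch, under the assumption that each
record carries a unique id, a parent id and a sort weight, and that
ties between equal weights are broken by the id. The field names
(rid, parent, name, weight) are hypothetical and are not the sxe wire
format:

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical record layout: rid (unique id), parent (rid or None),
# name (element name), weight (sort hint among siblings).
records = [
    {"rid": "r1", "parent": None, "name": "svg", "weight": 0},
    {"rid": "r2", "parent": "r1", "name": "rect", "weight": 0},
    # Two users append at the same time and pick the same weight.
    {"rid": "r3-alice", "parent": "r1", "name": "circle", "weight": 1},
    {"rid": "r4-bob", "parent": "r1", "name": "line", "weight": 1},
]

def build(records):
    """Map a set of records to one document, whatever their arrival order."""
    children = {}
    for rec in records:
        children.setdefault(rec["parent"], []).append(rec)

    def to_element(rec):
        elem = ET.Element(rec["name"])
        # Siblings are ordered by weight; the unique rid breaks ties, so
        # every peer resolves a simultaneous append the same way.
        siblings = sorted(children.get(rec["rid"], []),
                          key=lambda r: (r["weight"], r["rid"]))
        for child in siblings:
            elem.append(to_element(child))
        return elem

    (root,) = children[None]  # exactly one record without a parent
    return ET.tostring(to_element(root), encoding="unicode")

canonical = build(records)
shuffled = records[:]
random.shuffle(shuffled)
same_document = build(shuffled) == canonical
```

Because the result depends only on the contents of the records, both
appends survive and every peer ends up with the same sibling order, no
matter which record arrived first.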

>> Furthermore, the namespace issues described in RFC 5261 are avoided by
>> letting the protocol send "records" rather than inline XML.
>
> Sure, but then again, look at example 13 of SXE.
>
> Here you have record GUID1 - so far, so good. It's
> <{http://www.w3.org/1999/xhtml}html/>, and that's a known, qualified
> element.
>
> Next, an attribute {xml}:lang is added with a value of 'fi', and then the
> value is subsequently changed to 'en'. Great compact representation. Only
> the namespace is incorrect, of course, it should be the one specified in
> XML-NAMES, since you need the namespace itself there, not the prefix.
>
> But then it goes really wrong. A local element, <head/> is added below the
> qualified <html/> element. In XML, at this point, we have:
>
> <ns0:html xmlns:ns0='http://www.w3.org/1999/xhtml' ns1:lang='en'
> xmlns:ns1='xml'>
> <head/>
> </ns0:html>
>
> Which is not what you wanted at all. To have the <head/> element qualified,
> this needs spelling out in SXE, which really isn't very efficient. Or else
> the ns defaults to the parent - but that can't be, because there'd be no way
> of specifying a default element, then.
>

Ok, I agree that the example has several flaws, probably because it
was first written by hand by me and later modified by Peter, who was
probably misled by the 'ns' attribute. The 'ns' attribute really
should be renamed 'prefix' or removed altogether.

Really, the example should look something like this:

<add type='element' name='html' ... />    (no 'ns'/'prefix')
<add type='attr' name='xmlns' chdata='http://www.w3.org/1999/xhtml'
... />    (no 'ns'/'prefix')
<add type='attr' prefix='xml' name='lang' chdata='fi' ... /> or just
<add type='attr' name='xml:lang' chdata='fi' ... />

<add type='element' name='head' ... />    (no 'ns'/'prefix')
...

So I'm abusing notation and treating the namespace declarations as
attributes. I agree that it's not elegant but I think it does transmit
exactly the information that I want.
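
As a sanity check on treating namespace declarations as attributes,
here is a small Python sketch. The 'type', 'name' and 'chdata' keys
follow the example above; 'rid' and 'parent' are hypothetical
stand-ins for the attributes elided there with '...', and the
serializer is only an illustration, not how any sxe client is known to
work:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

XHTML = "http://www.w3.org/1999/xhtml"

# 'rid' and 'parent' are assumed fields; the rest mirrors the example.
records = [
    {"type": "element", "rid": "1", "parent": None, "name": "html"},
    {"type": "attr", "rid": "2", "parent": "1", "name": "xmlns",
     "chdata": XHTML},
    {"type": "attr", "rid": "3", "parent": "1", "name": "xml:lang",
     "chdata": "fi"},
    {"type": "element", "rid": "4", "parent": "1", "name": "head"},
]

def serialize(rid):
    """Write the element with this rid, treating xmlns like any attribute."""
    elem = next(r for r in records if r["rid"] == rid)
    kids = [r for r in records if r["parent"] == rid]
    attrs = "".join(f" {r['name']}={quoteattr(r['chdata'])}"
                    for r in kids if r["type"] == "attr")
    inner = "".join(serialize(r["rid"])
                    for r in kids if r["type"] == "element")
    return f"<{elem['name']}{attrs}>{inner}</{elem['name']}>"

text = serialize("1")

# A namespace-aware parser resolves both elements into the XHTML
# namespace, although no record carried an 'ns' or 'prefix' for <head/>.
root = ET.fromstring(text)
head = root[0]
```

The <head/> element picks up the default namespace from the 'xmlns'
record on its parent once the text is parsed, so it does not have to
be spelled out per element.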

> In other words, Example 13 is significantly smaller than it really has to be
> according to the specification, and that's really not good given how huge it
> is to begin with.

Yeah, it is missing the <add type='attr' name='xmlns'
chdata='http://www.w3.org/1999/xhtml' ... /> element but I don't think
the situation is quite as bad as you thought.

> And remember, this is to ultimately generate:
>
> <html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'>
> <head>
> <title><!--The title of the document goes here.-->Royal Musings</title>
> </head>
> <body/>
> </html>
>
> (You'll note that I've corrected the actual output given later in Example
> 16, since that's wrong even discounting the namespace issue).
>
> So I'll grant you that by forcing the actors to exchange vast swathes of
> XML-encapsulated XML, with no possibility for reducing redundancy, you're
> removing the need to deal with the complexities of namespace issues.
>
> But I'm also intrigued - I wonder how big a document one must be working on,
> and for how long, before it becomes cheaper to use SXE over simply
> retransmitting the entire document each time it's changed?
>

Well... say sxe uses n times as many bytes as the XML you're adding.
Resending the whole document costs D + A bytes (existing document D
plus addition A), while sxe costs n*A, so sxe is cheaper as soon as
your existing document is more than n-1 times as long as the part
you're adding. So not very big.

Sure, n right now seems to be somewhere around 7, but that just hasn't
been a concern for me so far, as I've only aimed for correctness. I do
think the factor can be reduced significantly by some tweaks that make
many of the GUIDs implicit, in <new/>s in particular, and the rest is
simply highly compressible, so I don't think it's such a huge problem
in terms of bandwidth.
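
The break-even argument above can be written down in a few lines of
Python. This is only a sketch: it ignores compression and any fixed
per-stanza overhead, and the 200-byte addition is an arbitrary
illustrative figure, not a measurement:

```python
def sxe_is_cheaper(document_bytes, added_bytes, overhead_factor):
    """Compare sending only the change via sxe with resending everything.

    sxe cost:     overhead_factor * added_bytes
    resend cost:  document_bytes + added_bytes
    """
    sxe_cost = overhead_factor * added_bytes
    resend_cost = document_bytes + added_bytes
    return sxe_cost < resend_cost

def break_even_document_bytes(added_bytes, overhead_factor):
    """Document size above which sxe wins: (n - 1) * added_bytes."""
    return (overhead_factor - 1) * added_bytes

n = 7        # rough overhead factor mentioned above
added = 200  # illustrative size of the new XML, in bytes
threshold = break_even_document_bytes(added, n)
```

With these numbers the threshold is 1200 bytes: any existing document
larger than that is cheaper to update through sxe than to resend.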


Joonas

