[standards-jig] UPDATED: JEP-0004
ralphs at blueairnetworks.com
Fri Sep 13 13:47:07 CDT 2002
Peter Saint-Andre wrote:
> JEP-0004 (jabber:x:data) has been updated to incorporate feedback from
> experience gained while this specification was in Draft status. Once
> the new Council is installed, version 1.x of this JEP will be voted on
> for advancement to Final status. If you have comments on the changes,
> please make them on this list.
Couple of points, maybe they've been discussed to death already, in
which case feel free to either ignore me or simply tally my vote...
These comments are meant as constructive criticisms - apologies if my
tone makes it sound otherwise. As x:data is billed as one of the next
"big things" for Jabber, I think it's important to raise these issues
now. Or raise them again as the case may be.
2.4 - Portability
XHTML forms are rejected because they lack a <required/> tag. I don't
follow this reasoning at all - surely if that was the only problem you
could simply add the tag. I'll have more to say on this at the end.
No formal mention is made of how <required/> is to be used. Some of the
examples show it, and I can guess about it, but there definitely would
need to be more detail given. How are clients to respond when they get
a required field, but they do not understand or implement that
particular field type? Or is it purely a visual thing, to add a little
asterisk next to the input widget?
3.1 - Field types.
The given values are rather limited, and they do not form an orthogonal
set either. For example, no hidden multiline text input (say I wanted
to paste in my RSA keys). No way to specify suggested field widths for
graphical browsers. These things could be added by means of <x> tags,
but that makes it even harder to validate --- the <x> tag is already an
abomination in this regard.
A more flexible way for text fields would be to have a base type with
attributes like "hidden", "width", "height". When height==1 (default)
then it's a single line. Clients can ignore these attributes if they
wish, and still achieve the required functionality.
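
A minimal sketch of how a client might interpret such a base type. This
is Python for illustration only; the 'text' type name and the "hidden",
"width" and "height" attributes are the hypothetical ones suggested
above, not anything defined in JEP-0004.

```python
import xml.etree.ElementTree as ET

def describe_text_field(field_xml):
    """Map a hypothetical base text field onto a widget description."""
    field = ET.fromstring(field_xml)
    height = int(field.get("height", "1"))  # default is a single line
    width = field.get("width")              # optional rendering hint
    return {
        "var": field.get("var"),
        "multiline": height > 1,
        "masked": field.get("hidden") == "true",
        "width": int(width) if width else None,
    }

# A hidden multiline field, e.g. for pasting in a private key.
rsa_key = describe_text_field(
    "<field type='text' var='key' hidden='true' width='64' height='10'/>"
)
# A client that ignores every hint still gets a usable single-line input.
plain = describe_text_field("<field type='text' var='nick'/>")
```

A client that does not understand the hints simply never reads them and
falls back to a plain text box.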
It seems a little egotistical to include the special-case "jid" field,
which is highly Jabber-centric, while all the other fields are generic.
However, it would make a lot of sense to include more specific field
types, like "integer", "float", "date", "time", "currency value". Then
GUI clients could render them into something more interesting/useful
than a bunch of text boxes -- a big selling point for end users.
Of course, non-GUI clients must not be forgotten, so the "complex"
field types should be expressed using optional attributes on top of the
simple base types.
And no, I don't think using <x> tags all over the place is the answer to
this problem. Having data types that correspond to real-world needs
makes life simpler for clients and service providers.
The JEP needs to recognize that real-world needs drive what actually
gets deployed. I'll make this point repeatedly: if something is not
specified in the JEP then ad-hoc implementations will spring up, and
inevitably this will lead to incompatibility between clients. Fragmenting
an already small market space is clearly a Bad Thing(tm) for Jabber.
3.2 - Context
Seems to be a new way to express an idea we already had: the "type"
attribute as used in IQ requests. Why change the terminology? That
only raises the bar for writing compliant clients/services. Put that
another way: it gives more opportunities to screw up.
One reason for changing the terminology would be to offer better support
for multi-part forms, but the proposal does not address this any better
than the IQ method. The <thread> object is equivalent to the 'id'
attribute in an <iq>, except it can be used in <message> and <presence>.
However this isn't reason enough, as I will explain in my comments on
section 5.3 (Workflow). Forms data should live in <iq>'s and there
needs to be a way to pass links to it in other packet types.
Recommendation: stick with type=get/set/result as it is today.
Optionally, make it possible to specify a different JID that will handle
the next stage of form processing (much like the ACTION attribute in an
HTML FORM). This way, a multipart form could propagate through a series
of different JIDs/resources (for example as a purchase order walks
through different levels of approval, each adding their stamp of
approval/comments to it). Of course the default action, if none is
specified, would be to return the completed form to its sender.
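
As a rough sketch of that routing rule (Python; the <form> element, its
"action" attribute and the example.com JIDs are hypothetical names used
only for illustration):

```python
import xml.etree.ElementTree as ET

def next_recipient(form_xml, sender_jid):
    """Return the JID that should receive the completed form.

    Uses the hypothetical 'action' attribute when present, otherwise
    falls back to the entity that sent the form.
    """
    form = ET.fromstring(form_xml)
    return form.get("action") or sender_jid

# A purchase order that should move on to an approver.
routed = next_recipient(
    "<form action='approvals@example.com/level2'/>",
    "orders@example.com",
)
# No action given, so the form goes back to its sender.
returned = next_recipient("<form/>", "orders@example.com")
```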
3.3 - Errors
This is actually a much bigger deal in real world scenarios. It is
critically important to decide the "who, what, when, where" of error
checking. For the end-user experience, it is important to do as many
up-front checks as possible (eg. invalid date, string too long, etc.)
where feedback is immediate. This means that sufficiently rich data
types must be used (hence my comments in section 3.1)
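
A sketch of the kind of up-front check a client could run before
sending anything, assuming the form told it the data type and a maximum
length. The type names, the max_len parameter and the YYYY-MM-DD date
format are assumptions for illustration, not part of the JEP.

```python
from datetime import datetime

def check_value(kind, value, max_len=None):
    """Return an error string, or None if the value passes local checks."""
    if max_len is not None and len(value) > max_len:
        return "string too long"
    if kind == "integer":
        try:
            int(value)
        except ValueError:
            return "not an integer"
    elif kind == "date":
        try:
            # Assumed wire format: YYYY-MM-DD
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            return "invalid date"
    return None  # unknown kinds are treated as plain text
```

With only "text-single" to go on, none of these checks can be made
until the server replies.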
From a security standpoint, the server accepting the data must screen
out dangerous/invalid input, prevent buffer overruns, and protect
confidential data like passwords and credit cards. This is of course
very application dependent so it falls outside the scope of this JEP.
However, getting feedback to the user is again very important, so the
interface for doing so must be well defined (ie. included in the JEP).
3.4 - Rendering
Example2 repeatedly breaks the rule about <field type='fixed'> not being
used to clarify a question. Actually if the words were changed from
active voice to 'sectional headings' instead, then you'd be alright.
For example change "First please give us some information about yourself"
to "Personal information" instead.
The use of the <value> tag all over the place is visually annoying and
wasteful of resources. I see no need for it, other than to be able to
split apart multiple values, but there's a better way to do that... for
multivalued fields, make ALL the options use <option> tags, with the
'label' attribute being optional. Do not include the first value in the
field element, that is illogical (to me anyways). The <field> tag
should be viewed as a container.
When the reply is sent back, instead of sending the selected value(s),
send a comma delimited list of the selected option _numbers_. Eg if the
first and third options were selected, send back "0,2". This makes it
easier and faster to do range checks on the server, since multivalued
data is almost always stored in an array of some form anyways. It also
helps prevent a favourite script-kiddie attack on websites: sending
back a form with options set to illegal values.
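
A minimal sketch of the server side of this idea (Python; the function
name and the empty-string convention for "nothing selected" are
assumptions):

```python
def selected_options(reply, options):
    """Decode a reply such as "0,2" against the list of offered options.

    Raises ValueError for non-numeric or out-of-range entries, so a
    forged value can never reach the application.
    """
    if reply == "":
        return []  # assumed convention: empty reply means no selection
    chosen = []
    for token in reply.split(","):
        index = int(token)  # raises ValueError on non-numeric input
        if not 0 <= index < len(options):
            raise ValueError("option index out of range: %d" % index)
        chosen.append(options[index])
    return chosen
```

The range check is a single comparison per entry, and the server never
has to compare strings supplied by the client.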
All other cases (non-multivalued) don't need <value> tag at all, it
seems silly to force it upon them only for sake of symmetry, esp. since
you can do multivalued case without <value> tag as I just described.
3.5 - Search results
Now you're mixing client rendering into the picture. This competes with
the notion of "hidden" element type defined in section 3.1, since you're
trying to restrict which fields get shown by the client.
Actually, it comes back down to the fact that the data types are too
limited. Don't think of the search results listing as being a terminal
point, eg. something that the client displays after it has completed
the form transaction. Instead, the results page is just another page in
the set of search dialog pages. The set doesn't necessarily end there
either: the user could select from the listing, which then would move
into the add user transaction. All of this can be done seamlessly via
What you need is a data type to represent the search results listing.
Something like an HTML table with rows and columns. Attributes like
"read-only" apply in this case. The user can scroll the list and select
a row and/or column. Perhaps it would look like this:
<field type='table' readonly='true' select='true'>
(This needs further consideration; I'm not proposing it as-is. Perhaps
it would be better to have nested <field> elements, for example. And
<tr> is not exactly logical, HTML used it because they didn't have
namespaces at the time, we could use <row> and make it clear).
This result (row/column numbers) could then forward into the add-user
dialog, which really should also be driven via the x:data model, instead
of every client implementing their own variation...
The <reported> idea isn't very flexible, and seems like a kludge to
provide necessary functionality. Asking clients to extract fields and
sort them into tables is too much work; the server should be doing this.
The idea of allowing multiple reply packets, and merging their results,
is again too much work for a client - think sorting order, update
frequency, etc -- and it opens a can of worms, beginning with "how do I
know when it is done?". For the most part, data is coming from one
source - so send one packet. To handle multiple directory searches, let
the user select (via drop down?) which directory to search, or search
them all and return multiple <form> elements in the <query> result.
If you *truly* want to do distributed searches, trust me on this one,
you'll need much more infrastructure than this approach allows.
4.5 - Whitespace
This one bites us because mac, unix, and windows cannot agree on line
terminators. Behaviour must be rigidly defined, or else you'll get
double lines, missing lines, and more fun. The "accept liberally"
policy causes trouble here... does "\r\n" mean one line or two? What
about "\r\n\r" or "\n\r"? And beware of leading whitespace issues also,
eg. Example20 shows the data nicely indented, but I'll bet the user
didn't type it with leading spaces/tabs. So the example is inaccurate.
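
To make the ambiguity concrete, here is one possible rigid rule sketched
in Python: "\r\n", a lone "\r" and a lone "\n" each end exactly one
line. This rule is an assumption chosen for illustration; the JEP would
have to pick one and say so.

```python
import re

def split_lines(text):
    """Split on CRLF, lone CR or lone LF; CRLF is tried first."""
    return re.split(r"\r\n|\r|\n", text)

one_break = split_lines("a\r\nb")      # CRLF counts as a single break
two_breaks = split_lines("a\n\rb")     # LF then CR counts as two
also_two = split_lines("a\r\n\rb")     # CRLF then CR counts as two
```

Under this rule "\r\n" is one line break while "\n\r" and "\r\n\r" each
leave an empty line in the middle. A client applying a different rule
would render the same packet differently.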
I'd suggest ignoring/banning all control characters. Use <br> to mark
linebreaks. This would mean that client authors cannot just copy the
string from their input widget over into the jabber packet... is that
too much to ask? Of course that is exactly why we have a problem in the
4.7 - Labels
Reading between the lines here, labels are allowed to contain underscore
"_", presumably to indicate keyboard accelerator shortcuts. This is an
important feature for end users who use GUIs. As such it really should
be explicitly documented in the JEP.
It would be clearer to use an optional "accelerator" attribute, instead
of piggybacking the underscore inside the label. Then there is no
guesswork. Furthermore, it allows more creative expression of
accelerator keys (including shift-, control- and alt-/meta- bindings).
Yes, this becomes client/platform specific, but at least if we put it in
the JEP then we've got a _defined_ way to make use of this feature. So
we might have accel="a" or accel="C-x" etc.
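
A sketch of how a client could read such a value, assuming an
Emacs-style notation where "C", "S" and "M" stand for control, shift
and alt/meta. The notation and the modifier letters are assumptions,
not anything in the JEP.

```python
MODIFIERS = {"C": "control", "S": "shift", "M": "meta"}

def parse_accel(accel):
    """Split a value like "C-x" into (list of modifiers, key)."""
    *prefixes, key = accel.split("-")
    # An unknown modifier letter raises KeyError, so the client can
    # ignore the accelerator rather than guess.
    return [MODIFIERS[p] for p in prefixes], key
```

Here parse_accel("a") yields no modifiers and the key "a", while
parse_accel("C-x") yields control plus "x".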
Also, tabbing order is an often requested feature. Most end-users only
notice when this feature is absent, or doesn't work the way they would
expect. Maybe an optional "taborder=N" attribute can be added. Again,
the JEP should take a stance on this, so as to prevent many incompatible
implementations from appearing.
5.3 - Workflow
This is perhaps the most important application for all of the complexity
that x:data is trying to provide. Simple one-page registration forms
can be handled through the existing namespaces reasonably well. But
being able to generate forms dynamically, for virtually any client, that
is a truly useful thing.
The <thread> tag doesn't really buy much here. It does not record the
status of a multi-page form, it merely gives all the requests a common
name. The ID tag in an IQ provides this already. Really you want
things like links to the next/previous stage, cancel, clear, etc.
I'm not sure that I see the value in using anything but IQ's for this
sort of transaction. Message and presence tags can be used to initiate
a transaction, eg. to alert you that there is something to do. The
message would include a link to the form source - another JID/namespace.
Similar to the x:oob concept, but link to a jabber entity which handles
the form generation/submission.
Think of forms as something like modal dialogs in typical applications.
They appear in direct response to user action (menu selection,
hotkey, clicking on a link). The user works their way through, or
cancels, before returning to the application. It makes no sense to have
such requests get spooled offline (eg <message>), or even worse, get
redirected to your pager/cellphone (unless of course it too speaks
Jabber, in which case the resource priority feature that already exists
will take care of delivering the event to you).
This is a general comment about using the <x> tag to augment the
existing search and register namespaces. Doing so makes parsing much
more difficult. It is conceptually incorrect because it mixes new and
old methods for doing the same thing into the same code base. It would
be much better to make a clean separation right up front. That way the
old code can be left alone, and new stuff added alongside (rather than
So I propose that clients that support x:data actually indicate it when
they make requests. Instead of doing a query in jabber:iq:register to
get the parameters, they should first try jabber:iq:xdata:register. If
that fails they can fall back on the old method. In a few years when
every client and server supports xdata, the old namespace can be
retired, and the fallback can then be removed.
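
A sketch of that negotiation from the client's point of view (Python).
The send_query callable is hypothetical: it stands for whatever the
client uses to send an <iq type='get'> in a given namespace, returning
the reply or None if the server answered with an error.

```python
def fetch_registration_form(send_query):
    """Try the proposed new namespace first, then the existing one."""
    for namespace in ("jabber:iq:xdata:register", "jabber:iq:register"):
        reply = send_query(namespace)
        if reply is not None:
            return namespace, reply
    return None, None  # neither namespace is supported

# Simulated server that only knows the old namespace.
def legacy_server(namespace):
    return "<query/>" if namespace == "jabber:iq:register" else None

used, reply = fetch_registration_form(legacy_server)
```

Retiring the old namespace later means deleting one entry from the
tuple; no parsing code has to change.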
Instead of an <x> tag inside the <query>, we can then use something like
<form>, which better describes the <field> elements within. The form
tag can have attributes like a name, a unique ID for tracking, and more.
It can specify an "action" field which will handle the completed form.
And we can have multiple <form> elements in a single request/reply, useful for
example to represent results of multiple directory searches.
Finally you might say, there is nothing "x" about this "xdata" model.
Well that's right, so perhaps instead of xdata we should call it what it
is: a form. That would give us namespaces like jabber:iq:form:register
and jabber:iq:form:search, etc.
To me, JEP-0004 does not look like a major improvement over the existing
jabber:iq:register and jabber:iq:search namespaces. It is certainly a
step in the right direction, but needs at the very least to have some
room for growth options down the road. Otherwise, in two years' time
when everybody realises they wanted a "date" field type, we'll be forced
to abandon jabber:x:data and move to a new namespace-du-jour. Or come
up with another, even more complicated method for being backwards
It seems that the jabber community is just scratching the surface of the
forms data issue. Meanwhile, the W3C has been dealing with HTML forms
for well over a decade. We'd do well to learn from their mistakes
(starting with <input> tag then later adding different types, then
eventually coming up with all sorts of non-specific "next generation"
models). I definitely think KISS principle should apply, and we can
safely get rid of a lot of historical cruft - for example we can make
the "type" attribute mandatory - but the basic ideas learned in HTML
forms are good starting blocks for the Jabber model.
Using <x> tags to add the x:data into existing namespaces may seem like
a good way to introduce the concept right now, but down the road it will
only be a hindrance to development - old versions cannot be retired
because they are the container for the new. And we can't just "forget"
the old protocol without breaking the established schema for the older
namespace. Better to introduce a new namespace now - it is simpler and
cleaner this way.
If you've read this far.. thanks for hearing me out,
PS. I'd be willing to rewrite the JEP (or submit a new one) to reflect
my idea of how this x:data idea should be done. I really didn't write
this whole rant^H^H^H^Hessay only to stir up trouble, rather I'd like to
see a genuinely useful standard emerge.