[standards-jig] UPDATED: JEP-0004
reatmon at jabber.org
Wed Sep 18 01:26:18 UTC 2002
This has taken a little bit of time to reply to since it's so large a
Ralph Siemsen wrote:
> Peter Saint-Andre wrote:
> > JEP-0004 (jabber:x:data) has been updated to incorporate feedback from
> > experience gained while this specification was in Draft status. Once
> > the new Council is installed, version 1.x of this JEP will be voted on
> > for advancement to Final status. If you have comments on the changes,
> > please make them on this list.
> Couple of points, maybe they've been discussed to death already, in
> which case feel free to either ignore me or simply tally my vote...
> These comments are meant as constructive criticisms - apologies if my
> tone makes it sound otherwise. As x:data is billed as one of the next
> "big things" for Jabber, i think it's important to raise these issues
> now. Or raise them again as the case may be.
> 2.4 - Portability
> XHTML forms are rejected because they lack a <required/> tag. I don't
> follow this reasoning at all - surely if that was the only problem you
> could simply add the tag. I'll have more to say on this at the end.
> No formal mention is made of how <required/> is to be used. Some of the
> examples show it, and I can guess about it, but there definitely would
> need to be more detail given. How are clients to respond when they get
> a required field, but they do not understand or implement that
> particular field type? Or is it purely a visual thing, to add a little
> asterisk next to the input widget?
XHTML wasn't used for a lot more reasons. Most of them have been lost
and forgotten since they took place in person or on the phone. That's
our fault for not writing them down. Here are a few that we have
dredged up from memory:
1) XHTML is for building forms. But x:data isn't just about the form.
It's about the data. It's about representing the data in all aspects of
the journey. Form, submission, reporting, etc...
2) XHTML is so-so for defining forms. It has three different tag sets
to handle different cases. Our goal was to write a simple namespace,
one that didn't have three different tags to create the fields.
3) To use the XHTML spec, we would have to pull out a sub-section of the
spec (which is messy in itself) and rework it to fit our needs. Why not
just start from scratch and do it right the first time?
The required field is a holdover from a previous version of the JEP that
included validation, but people wanted it left in. There is a follow up
JEP coming after this one is finalized that will explain how validation
and error handling works with x:data. The JEP was already getting way
too long for my tastes, and I want people to use it and give feedback
rather than wait two more years for everything to be worked out.
Finally, not everyone has an XHTML renderer available to them. Tk
doesn't, CLI doesn't, text to speech doesn't. So locking into XHTML is
pointless for everyone except the Win32 and Gnome crowds, because the
rest of us have to write renderers anyway. Plus, to make the HTML form
look nice (since it's outside of GUI control at that point) you would
have to include CSS and other tricks. All of which makes this much more
painful overall than just defining a representation of the data and
letting each client author do what they want. That includes using XSLT
to convert it to an HTML form and displaying it if they want to.
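To make that concrete, here is a rough sketch (not anything from the JEP)
of one form driving two different renderers. The sample form, the var
names and the functions to_cli and to_html are all made up for
illustration, and Python's ElementTree stands in for XSLT here.

```python
import xml.etree.ElementTree as ET

NS = {"xd": "jabber:x:data"}

# Hypothetical form; the var names and labels are invented for this sketch.
FORM = """<x xmlns='jabber:x:data'>
  <field var='first' type='text-single' label='First name'/>
  <field var='secret' type='text-private' label='Password'/>
</x>"""


def to_cli(form_xml):
    """Render each field as a plain-text prompt for a console client."""
    root = ET.fromstring(form_xml)
    return [f"{f.get('label', f.get('var'))}: "
            for f in root.findall("xd:field", NS)]


def to_html(form_xml):
    """Render the same fields as a bare HTML form (no CSS, no layout)."""
    root = ET.fromstring(form_xml)
    form = ET.Element("form")
    for f in root.findall("xd:field", NS):
        label = ET.SubElement(form, "label")
        label.text = f.get("label", f.get("var"))
        # Assumption: text-private maps to a password box, the rest to text.
        kind = "password" if f.get("type") == "text-private" else "text"
        ET.SubElement(label, "input", {"type": kind, "name": f.get("var")})
    return ET.tostring(form, encoding="unicode")


print(to_cli(FORM))
print(to_html(FORM))
```

Neither renderer needs anything the other one produced. Both work from
the same description of the data.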
> 3.1 - Field types.
> The given values are rather limited, and they do not form an orthogonal
> set either. For example, no hidden multiline text input (say I wanted
> to paste in my RSA keys). No way to specify suggested field widths for
> graphical browsers. These things could be added by means of <x> tags,
> but that makes it even harder to validate --- the <x> tag is already an
> abomination in this regard.
You don't need a hidden multi-line field... It's hidden. Just put as
much data in the value tag as you want to.
> A more flexible way for text fields would be to have a base type with
> attributes like "hidden", "width", "height". When height==1 (default)
> then it's a single line. Clients can ignore these attributes if they
> wish, and still achieve the required functionality.
Why would the asker care how big the box should be? That's up to the
Client to decide. You're asking for hints that would only apply in
certain situations. And in those, you could pick whatever you wanted
> It seems a little egotistical to include the special-case "jid" field,
> which is highly Jabber-centric, while all the other fields are generic.
> However, it would make a lot of sense to include more specific field
> types, like "integer", "float", "date", "time", "currency value". Then
> GUI clients could render them into something more interesting/useful
> than a bunch of text boxes -- a big selling point for end users.
> Of course, non-GUI clients must not be forgotten, so the "complex"
> field types should be expressed using optional attributes on top of the
> simple base types.
I can see your stance, but would like to point out that this is a Jabber
only spec. We are not trying to push x:data into anywhere else. So to
have a JID only type makes sense for us. It's tied to the protocol, and
is useful beyond just the form. It's useful for identifying
programmatically what field types are available in the reported side of
> And no, I don't think using <x> tags all over the place is the answer to
> this problem. Having data types that correspond to real-world needs
> makes life simpler for clients and service providers.
> The JEP needs to recognize that real-world needs drive what actually
> gets deployed. I'll make this point repeatedly: if something is not
> specified in the JEP then ad-hoc implementations will spring up, and
> inevitably this will lead to inoperability between clients. Fragmenting
> an already small market space is clearly a Bad Thing(tm) for Jabber.
I'll defer most of this since it is partly covered by validation and
that is coming in a later JEP. Everything becomes text when it goes into
XML, so having an integer field is kind of pointless. What you really want
to say is that this field can only be a number. And what the format of
the number is. Again, that's planned for the Validation JEP.
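As a sketch only: the Validation JEP is not written yet, so the check
below is an assumption about what "this field can only be a number"
might mean, not its design. The function name value_is_integer and the
sample field are made up, and the namespace is left off for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Assumed format rule: optional sign followed by ASCII digits.
INTEGER = re.compile(r"[+-]?[0-9]+")


def value_is_integer(field_xml):
    """Check that the text inside <value/> matches the integer format."""
    field = ET.fromstring(field_xml)
    # The value arrives as text no matter what the field claims to hold.
    value = field.findtext("value", default="")
    return INTEGER.fullmatch(value.strip()) is not None


print(value_is_integer("<field var='age'><value>42</value></field>"))
print(value_is_integer("<field var='age'><value>forty-two</value></field>"))
```

Either way the value travels as text. The constraint is on its format,
not on the XML type of the field.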
> 3.2 - Context
> Seems to be a new way to express an idea we already had: the "type"
> attribute as used in IQ requests. Why change the terminology? That
> only raises the bar for writing compliant clients/services. Put that
> another way: it gives more opportunities to screw up.
> One reason for changing the terminology would be to offer better support
> for multi-part forms, but the proposal does not address this any better
> than the IQ method. The <thread> object is equivalent to the 'id'
> attribute in an <iq>, except it can be used in <message> and <presence>.
> However this isn't reason enough, as I will explain in my comments on
> section 5.3 (Workflow). Forms data should live in <iq>'s and there
> needs to be a way to pass links to it in other packet types.
> Recommendation: stick with type=get/set/result as it is today.
> Optionally, make it possible to specify a different JID that will handle
> the next stage of form processing (much like the ACTION attribute in an
> HTML FORM). This way, a multipart form could propagate through a series
> of different JIDs/resources (for example as a purchase order walks
> through different levels of approval, each adding their stamp of
> approval/comments to it). Of course the default action, if none is
> specified, would be to return the completed form to its sender.
I don't think you understand what is meant by context. We aren't
suggesting that x:data in itself provides context. What we are saying is
that you have to provide structure surrounding the x:data to provide the
context.
In some cases the context may be an iq tag with get, set, result (as you
mentioned). In some cases it might be a message with the thread tag.
In others it might be presence using some other x namespace.
The point of context is that if you look at the form in a vacuum, you
cannot tell anything about it. You can't tell who to submit it to, you
can't tell what namespace/thread to put it under, you can't tell anything
except that it's a form/submission/result. The other stuff in the
packet provides the context.
Pretend that x:data is a screwdriver, and the user is a screw. When you
put the screwdriver against the screw, the screw has no idea what to do
with it. It's not until you provide context to the screwdriver by
turning it, that you give the screw any idea what it is that it should do.
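A small sketch of the same idea in code: one x:data payload dropped into
two different wrappers. The addresses, ids and helper names (make_form,
in_iq, in_message) are invented for illustration, and the jabber:client
namespace on the outer packet is left off for brevity.

```python
import xml.etree.ElementTree as ET

XDATA = "jabber:x:data"


def make_form():
    """Build a bare x:data form. On its own it says nothing about routing."""
    x = ET.Element(f"{{{XDATA}}}x")
    ET.SubElement(x, f"{{{XDATA}}}field",
                  {"var": "username", "type": "text-single"})
    return x


def in_iq(form, to, iq_id):
    """Context from an iq: the query namespace and id say what this is for."""
    iq = ET.Element("iq", {"type": "set", "to": to, "id": iq_id})
    query = ET.SubElement(iq, "{jabber:iq:register}query")
    query.append(form)
    return iq


def in_message(form, to, thread_id):
    """Context from a message: the thread ties the form to a conversation."""
    msg = ET.Element("message", {"to": to})
    ET.SubElement(msg, "thread").text = thread_id
    msg.append(form)
    return msg


print(ET.tostring(in_iq(make_form(), "service.example.com", "reg1"),
                  encoding="unicode"))
print(ET.tostring(in_message(make_form(), "user@example.com", "t1"),
                  encoding="unicode"))
```

The form element is identical in both packets. Everything that tells
the receiver what to do with it lives in the wrapper.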
> 3.3 - Errors
> This is actually a much bigger deal in real world scenarios. It is
> critically important to decide the "who, what, when, where" of error
> checking. For the end-user experience, it is important to do as many
> up-front checks as possible (eg. invalid date, string too long, etc.)
> where feedback is immediate. This means that sufficiently rich data
> types must be used (hence my comments in section 3.1)
> From a security standpoint, the server accepting the data must screen
> out dangerous/invalid input, prevent buffer overruns, and protect
> confidential data like passwords and credit cards. This is of course
> very application dependent so it falls outside the scope of this JEP.
> However, getting feedback to the user is again very important, so the
> interface for doing so must be well defined (ie. included in the JEP).
Again... covered in an upcoming JEP.
> 3.4 - Rendering
> Example2 repeatedly breaks the rule about <field type='fixed'> not being
> used to clarify a question. Actually if the words were changed from
> active voice to 'sectional headings' instead, then you'd be alright.
> For example change "First please give us some information about yourself"
> to "Personal information" instead.
I disagree. Stating that we are asking for information about yourself,
is not a description of a field. "Your first or given name goes in this
field." is a description of a field. I'll agree that the example could
be rewritten to make it more clear, but it does not break the rules.
> The use of the <value> tag all over the place is visually annoying and
> wasteful of resources. I see no need for it, other than to be able to
> split apart multiple values, but there's a better way to do that... for
> multivalued fields, make ALL the options use <options> tags, with the
> 'label' attribute being optional. Do not include the first value in the
> field element, that is illogical (to me anyways). The <field> tag
> should be viewed as a container.
Um... first off who cares about visually annoying. It's XML. It's ugly
already. As for the use of option.. you can't do that. There is a big
difference between an option that you can pick and a value that is
selected. The multiple value tags are there to handle the list-multi,
which allows you to pick multiple values. There isn't much we can do to
get around
it, and this makes it very clean for defining where the value in a field
> When the reply is sent back, instead of sending the selected value(s),
> send a comma delimited list of the selected field _numbers_. Eg if the
> first and third fields were selected, send back "0,2". This makes it
> easier and faster to do range checks on the server, since multivalued
> data is almost always stored in an array of some form anyways. It also
> helps prevent a favourite script-kiddie attack on websites: sending
> back a form with options set to illegal values.
It also locks the renderer into a single path. Display the options in the
order I gave them to you. You can't alphabetize a list and do this.
Because the list order changes. Plus, you can't represent the data in a
result if you have taken away the value fields.
Not to mention that converting the return value to numbers doesn't stop
someone from processing a form incorrectly. It could be text, or
numbers. If the processor will be hosed by bad data, it doesn't matter
what form it takes.
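A quick sketch of the ordering problem, with made-up option values. The
client alphabetizes the list before showing it, and the user picks the
first two rows on screen:

```python
# Order in which the asker sent the options (hypothetical values).
sent = ["red", "green", "blue"]

# The client decides to alphabetize before displaying.
displayed = sorted(sent)

# The user selects the first and second rows as displayed.
picked_rows = [0, 1]

# Replying with values: the asker gets exactly what the user saw.
reply_by_value = [displayed[i] for i in picked_rows]

# Replying with positions: the asker resolves them against its own order.
reply_by_index = [sent[i] for i in picked_rows]

print(reply_by_value)  # ['blue', 'green']
print(reply_by_index)  # ['red', 'green']
```

With positions, the client has to either keep the original order on
screen or carry a mapping back to it. With values, it can display the
list however it likes.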
> All other cases (non-multivalued) don't need <value> tag at all, it
> seems silly to force it upon them only for sake of symmetry, esp. since
> you can do multivalued case without <value> tag as I just described.
You just broke all XPath parsers that want to treat fields the same.
You can no longer just reach into a field and say XPath("field/value")
and get back all of the values for the field. Just to save you from
having to look at a <value/> tag. XML is designed to be structured, and
often times structure interferes with ease of readability, but it sure
lets the computer rip through things since it is orderly.
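For example, a sketch along these lines treats every field the same way,
whether it holds one value or several. The sample form and the function
name field_values are made up, and Python's ElementTree path support is
used as a stand-in for a full XPath engine.

```python
import xml.etree.ElementTree as ET

NS = {"xd": "jabber:x:data"}

# Hypothetical submission: one single-valued field and one list-multi.
FORM = """
<x xmlns='jabber:x:data'>
  <field var='first' type='text-single'>
    <value>Ryan</value>
  </field>
  <field var='features' type='list-multi'>
    <value>news</value>
    <value>search</value>
  </field>
</x>
"""


def field_values(form_xml):
    """Map each field's var to the list of its <value/> texts."""
    root = ET.fromstring(form_xml)
    return {
        field.get("var"): [v.text or "" for v in field.findall("xd:value", NS)]
        for field in root.findall("xd:field", NS)
    }


print(field_values(FORM))
# {'first': ['Ryan'], 'features': ['news', 'search']}
```

There is no special case for the single-valued field. It is just a list
of length one.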
> 3.5 - Search results
> Now you're mixing client rendering into the picture. This competes with
> the notion of "hidden" element type defined in section 3.1, since you're
> trying to restrict which fields get shown by the client.
This isn't about rendering... Nothing in the JEP is about rendering.
It's about representing data. You specify which fields will be returned
in a result, and what label they will have. That's all.
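A sketch of what a client can pull out of such a result. The exact
layout is an assumption here (a <reported/> block followed by <item/>
elements), and the sample data and function names are invented for
illustration.

```python
import xml.etree.ElementTree as ET

NS = {"xd": "jabber:x:data"}

# Assumed result layout with made-up data.
RESULT = """<x xmlns='jabber:x:data'>
  <reported>
    <field var='first' label='First Name'/>
    <field var='jid' label='Jabber ID'/>
  </reported>
  <item>
    <field var='first'><value>Pat</value></field>
    <field var='jid'><value>pat@example.com</value></field>
  </item>
</x>"""


def reported_labels(result_xml):
    """Which fields the result will contain, and the label for each."""
    root = ET.fromstring(result_xml)
    return {f.get("var"): f.get("label")
            for f in root.findall("xd:reported/xd:field", NS)}


def items(result_xml):
    """The data itself, one dict per item, with no layout implied."""
    root = ET.fromstring(result_xml)
    return [
        {f.get("var"): f.findtext("xd:value", default="", namespaces=NS)
         for f in item.findall("xd:field", NS)}
        for item in root.findall("xd:item", NS)
    ]


print(reported_labels(RESULT))
print(items(RESULT))
```

What the client does with those dicts (a table, a tree, a spoken list)
is its own business.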
> Actually, it comes back down to the fact that the data types are too
> limited. Don't think of the search results listing as being a terminal
> point, eg. some thing that the client displays after it has completed
> the form transaction. Instead, the results page is just another page in
> the set of search dialog pages. The set doesn't necessarily end there
> either: the user could select from the listing, which then would move
> into the add user transaction. All of this can be done seamlessly via
> x:data approach.
The difference being that to select something and submit it, it has to
be a form. And so is not a result set.
> What you need is a data type to represent the search results listing.
> Something like an HTML table with rows and columns. Attributes like
> "read-only" apply in this case. The user can scroll the list and select
> a row and/or column. Perhaps it would look like this:
> <field type='table' readonly='true' select='true'>
> (This needs further consideration; I'm not proposing it as-is. Perhaps
> it would be better to have nested <field> elements, for example. And
> <tr> is not exactly logical, HTML used it because they didn't have
> namespaces at the time, we could use <row> and make it clear).
> This result (row/column numbers) could then forward into the add-user
> dialog, which really should also be driven via the x:data model, instead
> of every client implementing their own variation...
> The <reported> idea isn't very flexible, and seems like a kludge to
> provide necessary functionality. Asking clients to extract fields and
> sort them into tables is too much work; the server should be doing this.
You're assuming that my client wants to take the data results and put
them into a table. The data result set is all about representing the
data as is. Not filtering it for your benefit. If a client wants to
put the results into a table and sort by a field... so be it. That's
for the Client to decide. My client wants to build it like a tree.
> The idea of allowing multiple reply packets, and merging their results,
> is again too much work for a client - think sorting order, update
> frequency, etc -- and it opens a can of worms, beginning with "how do i
> know when is it done?". For the most part, data is coming from one
> source - so send one packet. To handle multiple directory searches, let
> the user select (via drop down?) which directory to search, or search
> them all and return multiple <form> elements in the <query> result.
> If you *truly* want to do distributed searches, trust me on this one,
> you'll need much more infrastructure than this approach allows.
I agree. The problem is that the can of worms is already opened.
Jabber has supported this in the protocol from the beginning. It's a
little too late to say not to do it. We have to fit the tool to the
problem, not rewrite the problem because we don't like how the tool looks.
> 4.5 - Whitespace
> This one bites us because mac, unix, and windows cannot agree on line
> terminators. Behaviour must be rigidly defined, or else you'll get
> double lines, missing lines, and more fun. The "accept liberally"
> policy causes trouble here... does "\r\n" mean one line or two? What
> about "\r\n\r" or "\n\r"? And beware of leading whitespace issues also,
> eg. Example20 shows the data nicely indented, but i'll bet the user
> didn't type it with leading spaces/tabs. So the example is inaccurate.
> I'd suggest ignoring/banning all control characters. Use <br> to mark
> linebreaks. This would mean that client authors cannot just copy the
> string from their input widget over into the jabber packet... is that
> too much to ask? Of course that is exactly why we have a problem in the
> first place...
If you notice what the JEP says, it basically says don't worry about the
newlines. The person doing the asking is the one that should worry
about it. And when they worry about it, it should be in making sure
that it never sends any.
That said, it is agreed amongst the authors that whitespace is still an
issue, and one that we are actively seeking an answer to (we have a few
ideas, just trying to find the one that makes the most sense for now and
> 4.7 - Labels
> Reading between the lines here, labels are allowed to contain underscore
> "_", presumably to indicate keyboard accelerator shortcuts. This is an
> important feature for end users who use GUIs. As such it really should
> be explicitly documented in the JEP.
> It would be clearer to use an optional "accelerator" attribute, instead
> of piggybacking the underscore inside the label. Then there is no
> guesswork. Furthermore, it allows more creative expression of
> accelerator keys (including shift-, control- and alt-/meta- bindings).
> Yes, this becomes client/platform specific, but at least if we put it in
> the JEP then we've got a _defined_ way to make use of this feature. So
> we might have accel="a" or accel="C-x" etc.
> Also, tabbing order is an often requested feature. Most end-users only
> notice when this feature is absent, or doesn't work the way they would
> expect. Maybe an optional "taborder=N" attribute can be added. Again,
> the JEP should take a stance on this, so as to prevent many incompatible
> implementations from appearing.
I disagree that this should be stated in the JEP. That is 100% a Client
issue, and not a protocol issue. We do not cover any rendering
specifics, nor will we since this is about representing data in XML, and
Build the tab order in based on the order of fields. Massage the label
name to be what you want for your specific application.
This JEP is not about rendering, and never should be. If Joe Bob
decides that he wants to add a new feature to his OS some day, am I
going to retrofit the JEP to match it? No. There are lots of GUI
tools that don't do keyboard accelerators.
> 5.3 - Workflow
> This is perhaps the most important application for all of the complexity
> that x:data is trying to provide. Simple one-page registration forms
> can be handled through the existing namespaces reasonably well.
No... No they can't. And that's part of the problem with them. You
can't easily internationalize a form, you can't represent any data that is
not part of the DTD, and you cannot provide for the user to give you
back exactly what you want. You can't do any of those things without
breaking the DTD, and thus violating the XML.
> being able to generate forms dynamically, for virtually any client, that
> is a truly useful thing.
> The <thread> tag doesn't really buy much here. It does not record the
> status of a multi-page form, it merely gives all the requests a common
> name. The ID tag in an IQ provides this already. Really you want
> things like links to the next/previous stage, cancel, clear, etc.
> I'm not sure that I see the value in using anything but IQ's for this
> sort of transaction. Message and presence tags can be used to initiate
> a transaction, eg. to alert you that there is something to do. The
> message would include a link to the form source - another JID/namespace.
> Similar to the x:oob concept, but link to a jabber entity which handles
> the form generation/submission.
> Think of forms as something like modal dialogs in typical applications.
> They appear in direct response to user action (menu selection,
> hotkey, clicking on a link). The user works their way through, or
> cancels, before returning to the application. It makes no sense to have
> such requests get spooled offline (eg <message>), or even worse, get
> redirected to your pager/cellphone (unless of course it too speaks
> Jabber, in which case the resource priority feature that already exists
> will take care of delivering the event to you).
This has been suggested before, and the best solution is to create a new
namespace tool for providing the needed context. It's not really
something that needs to go into x:data. x:data is about a single form.
If your application will require multiple forms, then add a layer and
use x:data for each of those forms. Don't try and make x:data into a
swiss army knife.
> This is a general comment about using the <x> tag to augment the
> existing search and register namespaces. Doing so makes parsing much
> more difficult. It is conceptually incorrect because it mixes new and
> old methods for doing the same thing into the same code base. It would
> be much better to make a clean separation right up front. That way the
> old code can be left alone, and new stuff added alongside (rather than
> So I propose that clients that support x:data actually indicate it when
> they make requests. Instead of doing a query in jabber:iq:register to
> get the parameters, they should first try jabber:iq:xdata:register. If
> that fails they can fallback on the old method. In a few years when
> every client and server supports xdata, the old namespace can be
> retired, and the fallback can then be removed.
> Instead of an <x> tag inside the <query>, we can then use something like
> <form>, which better describes the <field> elements within. The form
> tag can have attributes like a name, a unique ID for tracking, and more.
> It can specify an "action" field which will handle the completed form.
> And we can have multiple <forms> in a single request/reply, useful for
> example to represent results of multiple directory searches.
> Finally you might say, there is nothing "x" about this "xdata" model.
> Well that's right, so perhaps instead of xdata we should call it what it
> is: a form. That would give us namespaces like jabber:iq:form:register
> and jabber:iq:form:search, etc.
No no no. x:data is a tool. It is not a namespace in itself. The
packet is not any harder to parse and pull data out of if you had been
parsing it correctly in the first place. As XML, you are supposed to
ignore any tags that you do not recognize. Since <x xmlns='jabber:x:data'>
never appeared in a packet before, it shouldn't break any clients that
have been written the right way.
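A sketch of an older client that knows nothing about x:data and reads
a register query the tolerant way. The sample packet, the KNOWN set and
the function name legacy_fields are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

REG = "jabber:iq:register"

# Child tags this hypothetical older client understands.
KNOWN = {"instructions", "username", "password", "email"}

QUERY = """
<query xmlns='jabber:iq:register'>
  <instructions>Choose a username and password.</instructions>
  <username/>
  <password/>
  <x xmlns='jabber:x:data'>
    <field var='username' type='text-single' label='Username'/>
  </x>
</query>
"""


def legacy_fields(query_xml):
    """Return the register fields we recognize, skipping everything else."""
    root = ET.fromstring(query_xml)
    names = []
    for child in root:
        if not child.tag.startswith("{"):
            continue  # no namespace at all: not ours, ignore it
        ns, _, local = child.tag[1:].partition("}")
        if ns == REG and local in KNOWN:
            names.append(local)
        # Anything else, including <x xmlns='jabber:x:data'/>, is ignored.
    return names


print(legacy_fields(QUERY))
# ['instructions', 'username', 'password']
```

The extra <x/> element costs this client nothing. It simply never
matches a tag the client is looking for.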
It comes back to the context. x:data is pointless without context, and
iq:register is already context enough. You do not need to create
sub-namespaces to do this. If you don't like the way this JEP behaves
then you are going to be disappointed in the future. There are several
JEPs coming down the pipe that are basically defining small tools that
can be used to extend context namespaces. Basic tools that Clients can
support very easily, and simply pull context out of the wrapper.
I can see the usefulness of stating that you support x:data in the
request, but creating a jabber:iq:x:data:register is ugly, and
completely unneeded. The jabber:iq:register provides enough context on
what the form is.
> To me, JEP-0004 does not look like a major improvement over the existing
> jabber:iq:register and jabber:iq:search namespaces. It is certainly a
> step in the right direction, but needs at the very least to have some
> room for growth options down the road. Otherwise, in two years time
> when everybody realises they wanted a "date" field type, we'll be forced
> to abandon jabber:x:data and move to a new namespace-du-jour. Or come
> up with another, even more complicated method for being backwards
Why can't we simply add a type='date'? Why does it have to be the end
of the world?
> It seems that the jabber community is just scratching the surface of the
> forms data issue. Meanwhile, the W3C has been dealing with HTML forms
> for well over a decade. We'd do well to learn from their mistakes
> (starting with <input> tag then later adding different types, then
> eventually coming up with all sorts of non-specific "next generation"
> models). I definitely think KISS principle should apply, and we can
> safely get rid of a lot of historical cruft - for example we can make
> the "type" attribute mandatory - but the basic ideas learned in HTML
> forms are good starting blocks for the Jabber model.
Oddly enough, that was the starting point for x:data. The only
difference being that x:data is above and beyond just the form. It's
about the data.
> Using <x> tags to add the x:data into existing namespaces may seem like
> a good way to introduce the concept right now, but down the road it will
> only be a hindrance to development - old versions cannot be retired
> because they are the container for the new. And we can't just "forget"
> the old protocol without breaking the established schema for the older
> namespace. Better to introduce a new namespace now - it is simpler and
> cleaner this way.
> If you've read this far.. thanks for hearing me out,
> PS. I'd be willing to rewrite the JEP (or submit a new one) to reflect
> my idea of how this x:data idea should be done. I really didn't write
> this whole rant^H^H^H^Hessay only to stir up trouble, rather I'd like to
> see a genuinely useful standard emerge.
We appreciate your input.
Ryan Eatmon reatmon at jabber.org
Jabber.org - Perl Team jid:reatmon at jabber.org