[Standards] Questions about xhtml-im

Peter Saint-Andre stpeter at stpeter.im
Tue Jul 29 05:06:20 UTC 2008

Jehan wrote:
> Hello,
> I am trying to understand the logic of 'xhtml-im'
> (http://www.xmpp.org/extensions/xep-0071.html). Is there anyone nice
> enough to explain the following points to me, please? :-)
> 1/ Section 4:
>> Lightweight text markup is then provided within an <html/> element
>> qualified by the 'http://jabber.org/protocol/xhtml-im' namespace. [14]
>> However, this <html/> element is used solely as a "wrapper" for the
>> XHTML content itself, which content is encapsulated via one or more
>> <body/> elements qualified by the 'http://www.w3.org/1999/xhtml'
>> namespace, along with appropriate child elements thereof.
> So what is the xhtml-im namespace for here if it is not used at all (as
> the direct and only child is under the normal html namespace)? This is
> maybe a stupid question as I think I have still not understood the
> complete logic behind XML namespaces...
> Or should its role be to "tell" (through its schema?) which subset of
> xhtml is authorized in it?

Right. The 'http://jabber.org/protocol/xhtml-im' namespace lets you know 
that this is the XHTML-IM integration set, not full XHTML or some other 

> Now the real and more important issues I have:

To understand XEP-0071, you need to know that it is defined very 
carefully in terms of XHTML modularization:


I think most of your confusion comes from the fact that you don't seem 
to have grokked modularization.

> 2/ in 6.1, it is said that the structure module includes the elements
> <head>, <html> (I guess this "html" means the one under the namespace
> 'http://www.w3.org/1999/xhtml', not the one under the namespace
> 'http://jabber.org/protocol/xhtml-im', doesn't it?) and <title>. 

That is true in XHTML itself.

> But in
> the meantime, it is said that under the XMPP <html> (prefixed by
> 'http://jabber.org/protocol/xhtml-im') tag, you have only one or more
> <body> ('http://www.w3.org/1999/xhtml').

Correct. Because we defined our own integration set.
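
As a rough illustration of that wrapper structure (a sketch only, not normative text from the XEP): the payload below is made up, and the helper name is_xhtml_im_wrapper is hypothetical. It uses Python's standard ElementTree to check that the outer <html/> is in the XHTML-IM namespace and that every direct child is a <body/> in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_IM_NS = "http://jabber.org/protocol/xhtml-im"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical payload, invented for this example.
PAYLOAD = (
    "<html xmlns='http://jabber.org/protocol/xhtml-im'>"
    "<body xmlns='http://www.w3.org/1999/xhtml'>"
    "<p style='font-weight:bold'>hi!</p>"
    "</body>"
    "</html>"
)


def is_xhtml_im_wrapper(elem):
    """True if elem is an XHTML-IM <html/> holding one or more XHTML <body/> children and nothing else."""
    if elem.tag != "{%s}html" % XHTML_IM_NS:
        return False
    children = list(elem)
    body_tag = "{%s}body" % XHTML_NS
    return len(children) >= 1 and all(child.tag == body_tag for child in children)


root = ET.fromstring(PAYLOAD)
print(root.tag)                   # namespace-qualified name of the wrapper
print(is_xhtml_im_wrapper(root))  # whether the wrapper has the expected shape
```

The same local name "html" in the plain XHTML namespace would fail this check, which is the point of qualifying the wrapper with its own namespace.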

> Yet as far as I know (and as confirmed by any of the 'xhtml DTDs'
> (http://www.w3.org/TR/xhtml1/#dtds) ), inside a body
> ('http://www.w3.org/1999/xhtml'), you cannot have any html
> ('http://www.w3.org/1999/xhtml'), head or even title tag.


> Hence I would think it is an error to say they are possible tags in
> the xhtml-im subset of xhtml (as a consequence, it becomes useless to
> recommend against them in section 7.1!). Or were you planning any other
> wrapper element than <html> ('http://jabber.org/protocol/xhtml-im') for
> these tags?

No. Please read up on XHTML modularization. And yes I know that it's 
confusing. Blame the W3C for that.

> 3/ Section 6.6, I don't understand what the Style attribute module
> defines:
>> The Style Attribute Module is defined as including the style attribute
>> only, as included in the preceding definition tables.
> It looks like there is no additional information beyond what was already
> in the other modules (which already included the "style" attribute for any
> tag)...
> So what is this "module" and its content?


> 4/ Section 7, I don't understand the concept behind all the
> "recommended profile" part. It looks like the whole important idea
> behind xhtml-im here is solely style, not semantic! I mean, this is the
> whole point of the importance given about "semantic web", and all the
> work which has been done for the last years in the W3C to bring real
> semantic to xhtml. And in our subset, we would want to remove all this
> as not recommended and give higher priority to the absolutely not
> semantic part? The main example is that the most important attribute for
> all tag seems to be "style"!

This is for simple IM formatting. See the description of scope.

> And several tags that I would think are very basic and important in a
> definition of xhtml-im are not recommended, like "em", or "strong", or
> all the heading tags "h1" to "h6", cite and blockquote, etc.
> -> You don't set text in bold or italic (which you can do with the
> style attribute), you emphasize it!
> -> You don't set text in a bigger font, underline it and give it
> a different font, no, you set titles, subtitles, etc.
> The style should come from the meaning of the tag, like on the web!

How so? Remember that we don't have external CSS here. :)

> If you read the abstract of the XHTML 1 specification (linked from
> XEP-0071), semantics are given a prominent place:
>> This specification defines the Second Edition of XHTML 1.0, a
>> reformulation of HTML 4 as an XML 1.0 application, and three DTDs
>> corresponding to the ones defined by HTML 4. The semantics of the
>> elements and their attributes are defined in the W3C Recommendation for
>> HTML 4. These semantics provide the foundation for future extensibility
>> of XHTML. Compatibility with existing HTML user agents is possible by
>> following a small set of guidelines.
> The web is becoming more and more semantic, 

That can be debated. :)

> it would be a shame if XMPP, which is pretty new,

10 years old in 2009. :)

> would not be semantic...

We had many discussions about structure vs. style when we defined 
XHTML-IM. I'm sure there is a lot about this in the list archives.

> 5/ Linked to the previous point, this XEP seems to describe XMPP usage
> only for IM point of view, 

Correct. It's XHTML-IM, after all.

> but it has other usages now:
>> Even within the restricted set of modules specified as defining the
>> XHTML-IM Integration Set (see preceding section), some elements and
>> attributes are inappropriate or unnecessary for the purpose of instant
>> messaging
> For instance, there can be notifications, textual data exchange, and
> most probably other cases... And for this, we may need to structure text
> (which then can be rendered according to the given structure!).

We could always define another integration set...

> 6/ In section 12.2, when you explain the meaning of "ignoring" an
> element, I can read:
>> Therefore, an XHTML-IM implementation MUST process all XHTML 1.0 child
>> elements of the XHTML-IM <html/> element even if such child elements are
>> not included in the XHTML 1.0 Integration Set defined herein, and MUST
>> present to the recipient the XML character data contained in such child
>> elements.
> I am just asking about the meaning of this sentence.

Please quote the entire section:


A user agent that implements this specification MUST conform to Section 
3.5 ("XHTML Family User Agent Conformance") of Modularization of XHTML. 
Many of the requirements defined therein are already met by Jabber 
clients simply because they already include XML parsers.

However, "ignore" has a special meaning in XHTML modularization 
(different from its meaning in XMPP). Specifically, criteria 4 through 6 
of Section 3.5 of Modularization of XHTML state:


       W3C TEXT: If a user agent encounters an element it does not 
recognize, it must continue to process the children of that element. If 
the content is text, the text must be presented to the user.

       XSF COMMENT: This behavior is different from that defined by XMPP 
Core, and in the context of XHTML-IM implementations applies only to XML 
elements qualified by the 'http://www.w3.org/1999/xhtml' namespace as 
defined herein. This criterion MUST be applied to all XHTML 1.0 elements 
except those explicitly included in XHTML-IM as described in the 
XHTML-IM Integration Set and Recommended Profile sections of this 
document. Therefore, an XHTML-IM implementation MUST process all XHTML 
1.0 child elements of the XHTML-IM <html/> element even if such child 
elements are not included in the XHTML 1.0 Integration Set defined 
herein, and MUST present to the recipient the XML character data 
contained in such child elements.


> What I understand is
> that when I encounter a tag which I recognize as being xhtml, but which
> is not in the xhtml-im subset, then I must display it "as is"?

Let's say you receive this:

<html><body><p>I like the Extensible Messaging and Presence Protocol 

In this case you would display the XML character data of the <abbr/> 
element even though it's not part of the XHTML-IM integration set.

That's just one example.
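
Here is a minimal sketch of that "ignore" behavior, under stated assumptions: the payload is invented for illustration, KNOWN is a deliberately tiny stand-in for whatever elements a client recognizes (not the real integration set), and visible_text is a hypothetical helper. An unrecognized element's markup is dropped, but its children are still processed and its character data is still presented.

```python
import xml.etree.ElementTree as ET

# Assumption: a tiny stand-in for the elements a client recognizes,
# NOT the actual XHTML-IM integration set.
KNOWN = {"body", "p", "span", "a", "br"}

# Hypothetical payload, invented for this example.
PAYLOAD = (
    "<html xmlns='http://jabber.org/protocol/xhtml-im'>"
    "<body xmlns='http://www.w3.org/1999/xhtml'>"
    "<p>I like <abbr title='made-up title'>XMPP</abbr> a lot.</p>"
    "</body>"
    "</html>"
)


def visible_text(elem, skipped):
    """Collect character data in document order, descending into unrecognized elements too."""
    local = elem.tag.rsplit("}", 1)[-1]  # strip the {namespace} prefix
    if local not in KNOWN:
        skipped.append(local)  # the markup is ignored, the content is not
    parts = [elem.text or ""]
    for child in elem:
        parts.append(visible_text(child, skipped))
        parts.append(child.tail or "")
    return "".join(parts)


body = ET.fromstring(PAYLOAD)[0]
skipped = []
print(visible_text(body, skipped))
print(skipped)
```

With this input the text inside <abbr/> still ends up in the rendered string, while "abbr" is recorded as an element whose markup was not understood.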

