I think we are saying the same thing:
1. Self-closing tags work for foreign elements
2. Self-closing tags work for valid XHTML tags if they are void tags in
HTML, because the self-closing '/' mark is not necessary and
effectively ignored when parsing.
3. However, self-closing tags for valid XHTML tags that are raw text or
normal elements in HTML (e.g. <span />, <script />, <div />) do not
work when parsed as HTML. Again, the self-closing '/' mark is ignored,
so the HTML parser considers the tag not closed: the element stays
open, which turns the valid XHTML into invalid HTML.
The last point is not some weird XHTML MIME type extra. Changing the
MIME type merely triggers supporting browsers to use the XML encoding
of HTML rather than the HTML encoding. And in the XML encoding the
markup is perfectly valid, because the self-closing tag is actually
treated as self-closing.
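
To make point 3 concrete, here is a minimal sketch in Python. It is an
illustration under assumptions, not a conforming HTML5 parser: Python's
html.parser reports "<x />" as a start-and-end tag, so the hypothetical
Html5LikeParser below overrides that to emulate the HTML rule that the
'/' is ignored. The helper names html_text_contexts and
xml_strong_text_and_tail are made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML void elements: a trailing "/" is allowed on these but has no effect.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class Html5LikeParser(HTMLParser):
    """Hypothetical helper: records which elements are open for each text run."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open (non-void) elements
        self.texts = []   # list of (text, tuple of open elements)

    def handle_starttag(self, tag, attrs):
        # Void elements never stay open; everything else does.
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Emulate the HTML syntax: the "/" is ignored, so "<strong />"
        # behaves exactly like "<strong>".
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Simplification: close everything up to and including the match.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        self.texts.append((data, tuple(self.stack)))


def html_text_contexts(markup):
    """Parse with the HTML-like rules and return (text, open elements) pairs."""
    parser = Html5LikeParser()
    parser.feed(markup)
    parser.close()
    return parser.texts


def xml_strong_text_and_tail(markup):
    """Parse as XML and return the <strong> element's own text and tail."""
    strong = ET.fromstring(markup).find("strong")
    return strong.text, strong.tail
```

For <div><strong />Hey</div>, the HTML-like pass reports "Hey" with
both div and strong still open, while the XML parser gives the strong
element no text at all and attaches "Hey" as the text following it.
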
By the way, that's not an obscure example. This happens in the wild a
lot. The most common case is <script src="[..]"></script>, which is the
only valid way to embed a script from a remote source in HTML encoding.
Notably, the short <script src="[..]" /> is not valid in HTML, but in
XML encoding it's even the canonical form. This means that many HTML
web pages, if you turn them into canonical XHTML form, will actually be
invalid in HTML encoding.
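
As a rough illustration of the <script> case, the sketch below uses
Python's xml.etree.ElementTree, whose XML serializer writes empty
elements in the short form by default, while its "html" output method
writes an explicit end tag. The function name serialize_both and the
file name example.js are placeholders chosen for this example only.

```python
import xml.etree.ElementTree as ET


def serialize_both(src):
    """Serialize an empty <script> element once as XML and once as HTML."""
    script = ET.Element("script", {"src": src})
    # Default output method is "xml": empty elements use the short form.
    as_xml = ET.tostring(script, encoding="unicode")
    # The "html" output method always writes the end tag for <script>.
    as_html = ET.tostring(script, encoding="unicode", method="html")
    return as_xml, as_html


# "example.js" is a hypothetical file name, not taken from the discussion.
xml_form, html_form = serialize_both("example.js")
print(xml_form)   # short form, only well-behaved when parsed as XML
print(html_form)  # explicit end tag, safe for an HTML parser
```

So an XML toolchain that round-trips such a page can easily produce
output that an HTML-only parser will misread.
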
And just as a reminder of why that's relevant: it means that an HTML
parser that only supports the HTML encoding and not the discouraged XML
encoding will not be able to (correctly) parse valid XHTML as is.
Also relevant: XHTML(-IM) allows for mixed content. Mixed content is
not used elsewhere in XMPP, and some XML parsers used in XMPP don't
fully support it.
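
For readers unfamiliar with the term: mixed content means text and
child elements interleaved inside the same parent. A small sketch with
Python's xml.etree.ElementTree shows why this needs explicit support:
the text after a child element lives in that child's "tail", so an API
that exposes only one text value per element silently loses it. The
helper name flatten_mixed and the sample markup are made up for this
example.

```python
import xml.etree.ElementTree as ET


def flatten_mixed(elem):
    """Return (owning tag, text) runs of an element in document order."""
    runs = []
    if elem.text:
        runs.append((elem.tag, elem.text))
    for child in elem:
        runs.extend(flatten_mixed(child))
        # Text following the child belongs to the parent, but
        # ElementTree stores it on the child as "tail".
        if child.tail:
            runs.append((elem.tag, child.tail))
    return runs


# Hypothetical sample: text, then an element, then more text.
sample = ET.fromstring("<p>Hello <strong>XMPP</strong> world</p>")
print(flatten_mixed(sample))
```
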
XHTML-IM also uses CSS, which is another language that's actually not
XML. So even if one decided not to pass the XHTML to an HTML parsing
library, but to implement the parsing/processing directly on top of the
XML parser already present for the XMPP document, one would still need
a CSS parser.
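
A short sketch of what that means in practice, assuming Python's
xml.etree.ElementTree: the XML parser hands back the style attribute as
one opaque string, and everything inside it is left to the application.
The function parse_style below is a deliberately naive, hypothetical
splitter. It ignores quoting, comments, escapes and values that contain
';' or ':', which is exactly the part a real CSS parser would have to
handle.

```python
import xml.etree.ElementTree as ET


def parse_style(value):
    """Naively split 'name: value; name: value' into a dict (not real CSS)."""
    declarations = {}
    for part in value.split(";"):
        name, sep, val = part.partition(":")
        # Skip empty fragments and fragments without a ':' separator.
        if sep and name.strip():
            declarations[name.strip().lower()] = val.strip()
    return declarations


# Hypothetical sample element with inline CSS.
span = ET.fromstring('<span style="color: red; font-weight: bold">hi</span>')
raw_style = span.get("style")      # the XML parser stops here: one string
print(parse_style(raw_style))
```
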
And again, XEP-0394 is not an attempt to re-invent HTML. (X)HTML was
invented for documents; it's simply not a good fit for markup of
non-documents like IM, mostly because messages are usually not
displayed as a document within a frame or similar, but rather in
various ways, depending on the client and platform. XEP-0394 is a
markup specifically targeted at the messaging use case and built such
that the main body is not duplicated to achieve backwards
compatibility.
Anyway, I don't think this discussion is going to lead anywhere. I do
wonder though whether any actual implementations of XHTML-IM as
described in XEP-0071 are out there. If not, that IMO confirms it
really is not needed.
Marvin
On Sun, 2026-03-15 at 15:22 -0500, Stephen Paul Weber wrote:
> > This is not correct. HTML5 HTML syntax allows for self-closing tags
> > in foreign elements (e.g. when having an svg embedded directly in
> > the document), it does not allow for self-closing tags for HTML void
> > elements (those that don't have a closing tag, like <img>), raw text
> > or normal elements (which have mandatory closing tags).
>
> This is not correct. The HTML5 specification, 13.1.2.1:
>
> > Then, if the element is one of the void elements, or if the element
> > is a foreign element, then there may be a single U+002F SOLIDUS
> > character (/), which on foreign elements marks the start tag as
> > self-closing. On void elements, it does not mark the start tag as
> > self-closing but instead is unnecessary and has no effect of any
> > kind.
>
> It "does not mark the start tag as self closing" of course because in
> an HTML parser context the element is inherently self closing and
> needs no such mark. But the syntax is explicitly supported here.
>
> > However if you do <img src=foo.png/> (which is valid in HTML, but
> > not XML) the '/' actually becomes part of the 'src' attribute value,
> > meaning that the file 'foo.png/' is used as src.
>
> Yes. This unfortunate edge case is mentioned also in the spec, however
> if using well-formed XML it is fortunately not possible to run into
> it.
>
> > For normal elements where a closing tag is mandatory, the
> > unnecessary empty '/' attribute works the same, meaning the HTML
> > parser is still waiting for the closing tag after a seemingly
> > self-closed element. As an example <div><strong />Hey</div> will
> > render "Hey" in bold when parsed as HTML (because the strong is not
> > closed), but not when parsed as XHTML.
> >
> > https://gist.github.com/mar-v-in/7aa612d173d02240b7d2124c18670ec3
> > is an example file, which when you save it with .html ending and
> > open it in a browser, it will make the last line bold, if you rename
> > the file to .xhtml, the last line won't be bold - because the file
> > ending is translated to an appropriate MIME type and the XML parser
> > is triggered. I reproduced this on both Firefox and Chromium and it
> > matches the specification.
>
> Yes sure, old XHTML mime type had all kinds of extra things like this
> and that's part of why people got grumpy about it.
>
> > Anyway, I still haven't heard of the features and functionality that
> > people aim to get by reinstating XHTML-IM that XEP-0394 couldn't
> > provide as well or even better.
>
> XEP-0394 is a non-starter for me. An attempt to re-invent HTML
> ourselves with character ranges instead of markup? An attempt to do
> markup in XML without resorting to... actually using markup? If
> anything as I said in my first post the existence of 0394 in
> experimental shows that there is a desire for a stable standard for
> rich text in XMPP using XML, and indeed we already have a much better
> one in XHTML-IM.