Hello,
first of all, this is my first attempted contribution to an Internet
standard, so I do apologize if the form is a bit off.
That said, I was implementing a parser for "XEP-0393: Message Styling",
and I ran into the issue that not much is defined in the way of escape
characters.
In Gajim, for instance, the sequence
*hello*
becomes bold, the sequence
\*hello*
does not, but the sequence
\\*hello*
also does not.
In many applications, preceding a \ with another \ escapes the second \,
so the escape character itself is taken literally and the * that follows
is not escaped and would still open the emphasis span.
However, the standard does not mention any rule regarding this (and
Gajim does not do what I expect), so it is perhaps a good idea to add a
note about this before making it final.
Kind regards,
Werner
Hi Werner,
This is not specific to the '\' character - any non-whitespace character will have
the same effect (e.g. "this text*is not bold* "); as specified in section 6.2:
"… The opening styling directive MUST be located at the beginning of the parent
block, after a whitespace character, or after a different opening styling directive.
…"
So there is no styling in either of your examples because the '*' comes
immediately after a '\', which is neither a whitespace character nor an
opening styling directive.
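
As a rough illustration of the rule quoted above, here is a minimal Python
sketch. The names SPAN_DIRECTIVES and can_open_span are hypothetical, the set of
directive characters is my reading of the XEP, and the function checks only
whether a character is in a valid position to open a span. It ignores every
other rule (matching closing directive, span contents, block handling).

```python
# Assumed set of span styling directives: strong, emphasis, strikethrough,
# preformatted. Hypothetical name, used for illustration only.
SPAN_DIRECTIVES = {"*", "_", "~", "`"}


def can_open_span(block: str, i: int) -> bool:
    """Return True if block[i] is in a valid position to open a span.

    Implements only the quoted sentence from section 6.2: the opening
    directive must be at the beginning of the parent block, after a
    whitespace character, or after a different opening styling directive.
    """
    ch = block[i]
    if ch not in SPAN_DIRECTIVES:
        return False
    if i == 0:
        # Beginning of the parent block.
        return True
    prev = block[i - 1]
    if prev.isspace():
        # Directly after a whitespace character.
        return True
    # Directly after a *different* directive that is itself a valid opener.
    return prev != ch and can_open_span(block, i - 1)


if __name__ == "__main__":
    samples = [
        ("*hello*", 0),
        ("\\*hello*", 1),        # \*hello*
        ("\\\\*hello*", 2),      # \\*hello*
        ("this text*is not bold*", 9),
    ]
    for text, pos in samples:
        print(repr(text), can_open_span(text, pos))
```

Under these assumptions, only the first sample reports a valid opening
position. A '\' in front of the '*' behaves like any other non-whitespace
character, whether there is one or two of them.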
However, the inclusion of a formal grammar would make implementing a parser easier and
help to clarify cases like this.
(The convention of escaping with '\' comes from the C language, where it was useful
to be able to represent ASCII control characters using printable characters, for
example "\n" to produce a line feed. That in turn creates the need to represent a
literal '\' character that is not an escape, and thus "\\" stands for a
literal '\'. Escaping characters in the context of human-to-human communication
seems less appropriate.)