[Standards] Need sanity check on an example in XEP-0393: Message Styling

Tedd Sterr teddsterr at outlook.com
Fri Nov 6 20:59:06 UTC 2020


The approach you're suggesting requires unbounded lookahead: from each potentially-opening directive you'd have to scan ahead, possibly all the way to the end of the line/string, in case there is a matching closing directive, just to decide whether it is indeed an opening directive. You'd then need to do the same again for every further potentially-opening directive, so you would end up parsing multiple subsections of the string multiple times.

My suggestion only requires that you look ahead one character and decide right there whether it's a valid open or not. (A valid open is not in itself a valid span, since you also require a valid close.) So you only need to parse the whole string once.
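
To make the one-character check concrete, here is a rough sketch in Python. The function name, the DIRECTIVES set and the exact conditions are my own assumptions for illustration, not wording from the XEP; it only encodes "not end of string, not whitespace, not the same directive again", and deliberately ignores any rule about what precedes the candidate.

```python
# Hypothetical helper: names and the exact rule set are assumptions,
# not taken from XEP-0393.
DIRECTIVES = "*_~`"


def is_valid_open(text: str, i: int) -> bool:
    """Decide with one character of lookahead whether text[i] can open a span."""
    if text[i] not in DIRECTIVES:
        return False
    if i + 1 >= len(text):
        return False  # lookahead is end of string
    lookahead = text[i + 1]
    if lookahead.isspace():
        return False  # an open must be followed directly by text
    if lookahead == text[i]:
        return False  # same directive again: no intervening text
    return True


# Every asterisk in '***' is rejected; only index 1 in '**text*' can open.
print([is_valid_open("***", i) for i in range(3)])
print([i for i in range(7) if is_valid_open("**text*", i)])
```

Each decision looks at text[i] and text[i + 1] only, so nothing is ever rescanned.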

For directives inside spans, there are no directives between '**' because there are no characters between the two; if there were intervening text, that text might happen to contain potential styling directives (spans can be nested, but still require a valid open and close). If you have something like '*strong_text*' then the _ would be a valid open, except it doesn't have a matching close, so it doesn't constitute a valid span, and is inactive.

For my other suggested example, '**text*', the first asterisk would be inactive, but the second would be a valid open and the third a valid close, leading to **strong*.

Don't try to be overly clever with the parsing; a lookahead of one character should be sufficient to identify directives. (Whether they are active and demarcate spans depends on matching pairs of directives.)
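
As a rough illustration of how matching pairs could then be resolved in the same left-to-right pass, here is a Python sketch. Everything in it (find_spans, the stack of pending opens, the close conditions) is an assumption of mine for the sake of the example rather than the XEP's algorithm; a close is accepted only if an earlier valid open of the same character is pending, there is text between them, and the close is not preceded by whitespace.

```python
# Hypothetical sketch: names and rules are assumptions, not XEP-0393 text.
DIRECTIVES = "*_~`"


def can_open(text, i):
    """One-character lookahead: next char exists, is not whitespace or the same directive."""
    nxt = text[i + 1] if i + 1 < len(text) else ""
    return nxt != "" and not nxt.isspace() and nxt != text[i]


def find_spans(text):
    """Return sorted (open_index, close_index) pairs for active spans."""
    spans = []
    pending = []  # indices of valid opens still waiting for a close
    for i, ch in enumerate(text):
        if ch not in DIRECTIVES:
            continue
        # Look for the innermost pending open using the same character.
        depth = next((d for d in range(len(pending) - 1, -1, -1)
                      if text[pending[d]] == ch), None)
        if (depth is not None and i - pending[depth] > 1
                and not text[i - 1].isspace()):
            spans.append((pending[depth], i))
            del pending[depth:]  # unmatched inner opens become inactive
        elif can_open(text, i):
            pending.append(i)
    return sorted(spans)


print(find_spans("***"))
print(find_spans("**text*"))
print(find_spans("*strong_text*"))
```

With '*strong_text*' the underscore is pushed as a valid open but discarded when the outer asterisk closes, so it never forms a span.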


________________________________
From: Standards <standards-bounces at xmpp.org> on behalf of Sam Whited <sam at samwhited.com>
Sent: 06 November 2020 19:02
To: standards at xmpp.org <standards at xmpp.org>
Subject: Re: [Standards] Need sanity check on an example in XEP-0393: Message Styling

To clarify, I think your logic is what I was thinking when I put that
example in initially. My thought that I was wrong mostly stems from
"Characters that would be styling directives but do not follow these
rules are not considered when matching and thus may be present between
two other styling directives."

The middle asterisk does not follow the rules (there is no text between
it and the previous opening styling directive), therefore it is not
considered when matching and may exist between two styling directives.

But I'm still not 100% convinced by either reasoning and am trying to
think how to clarify the situation.

—Sam

On Fri, Nov 6, 2020, at 13:49, Sam Whited wrote:
> But there is intervening text, the middle one you just decided was
> text and not a styling directive. Are you suggesting it doesn't count
> as text, and doesn't count as a closing directive? If so, what is it?
>
> Seems like I definitely need to clarify the rules a bit either way.
>
> —Sam
>
> On Fri, Nov 6, 2020, at 12:49, Tedd Sterr wrote:
> >  Input = ***
> >
> > Current = * (index 0) Lookahead = * (index 1)
> >
> > Is current a styling directive? No, because lookahead indicates no
> > intervening text.
> >
> > Current = * (index 1) Lookahead = * (index 2)
> >
> > Is current a styling directive? No, because lookahead indicates no
> > intervening text.
> >
> > Current = * (index 2) Lookahead = EOS
> >
> > Is current a styling directive? No, because lookahead indicates end
> > of string.
