[Standards] XEP-0301 0.5 comments [Sections 6 and beyond]

Mark Rejhon markybox at gmail.com
Wed Jul 25 20:42:26 UTC 2012

Kevin, thank you very much for your comments!
-- I think we've now covered section 6 and beyond sufficiently for Last Call.
-- Now we just need to cover sections 1 through 5, and then I'll submit a v0.6.

> My reading of 6.1.4 is that it's fine to vary the interval significantly
> - when talking about varying for low-bandwidth concerns
> I immediately thought we were talking about order-of-magnitude
> variation. If this isn't the intention I think 6.1.4 could do with
> tightening up.

[Change Made]
Added the phrase ", preferably within the range recommended by
[[[Transmission Interval(link)]]]."

An order of magnitude is extreme (e.g. 0.7 seconds versus 7 seconds).
The range is 0.3s to 1.0s (to keep within ITU-T F.700).  There's
wiggle room for extenuating reasons -- such as longer intervals during
congestion, on narrowband GPRS, or during fast surges of unimportant
real-time text (e.g. detection of a repeated character, such as a
spacebar stuck down by accident), etc.
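
As a rough illustration of "preferably within the recommended range, with
wiggle room", here is a minimal sketch. The function name, the `congested`
flag and the 2.0s fallback are assumptions made up for this example; only
the 0.3s to 1.0s range comes from the text above.

```python
# Illustrative only: names and the congestion fallback are hypothetical,
# not taken from XEP-0301.
RECOMMENDED_MIN_S = 0.3  # lower bound of the recommended range
RECOMMENDED_MAX_S = 1.0  # upper bound of the recommended range


def choose_interval(preferred_s: float, congested: bool = False,
                    congested_s: float = 2.0) -> float:
    """Return a transmission interval in seconds."""
    # Normal case: clamp the preferred value into the recommended range.
    clamped = min(max(preferred_s, RECOMMENDED_MIN_S), RECOMMENDED_MAX_S)
    if congested:
        # Extenuating case: allow a longer interval than the range suggests.
        return max(clamped, congested_s)
    return clamped
```

With these assumptions, a request for 7 seconds is pulled back to 1.0s
unless the caller explicitly signals congestion.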

>>> 6.2.1 - I suspect this should be more prominent than buried inside Implementation Notes
>> I'm glad you think this section is important enough to be part of the Protocol. [snip]
>I checked some other XEPs and decided it's probably fine where it is.

Ok, for now I'll keep the section where it is, and await comments from LC.

>> Can you provide any suggestions of any further clarifications for <rtt event='init'/>?
>Perhaps it would be worth clarifying that init can be used to indicate
>that RTT is activated prior to RTT being sent.

[Changes Made]
4.2.2 insert sentence "It can be used to signal activation of
real-time text before sending a new real-time message."
6.2.1 modified bullet "Begin transmitting real-time text (by sending
<rtt/> elements); or"
6.2.1 modified bullet "Signaling first (by transmitting <rtt
event='init'/> as the first <rtt/> element)."
This clarifies the purpose of <rtt event='init'/> and decouples it
from the disco stuff.
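
For readers who want to see what "signaling first" looks like on the wire,
here is a hedged sketch that builds a message carrying only
<rtt event='init'/>. The helper name and the JID are placeholders, and the
`seq` value of 0 is just an example starting point; the namespace
urn:xmpp:rtt:0 is the one used by XEP-0301.

```python
import xml.etree.ElementTree as ET

RTT_NS = "urn:xmpp:rtt:0"  # XEP-0301 namespace


def build_init_message(to_jid: str, seq: int = 0) -> str:
    """Build a <message/> whose only payload is <rtt event='init'/>.

    Hypothetical helper for illustration; a real client would use its
    XMPP library's stanza builder instead.
    """
    message = ET.Element("message", {"to": to_jid, "type": "chat"})
    # xmlns is written as a plain attribute so no prefix is generated.
    ET.SubElement(message, "rtt",
                  {"xmlns": RTT_NS, "event": "init", "seq": str(seq)})
    return ET.tostring(message, encoding="unicode")


print(build_init_message("juliet@example.org"))
```

No real-time text is carried here; the element only signals activation
before the first real-time message is sent.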

> >> 6.4.1 - It might be useful to reference some method of calculating this.
> >[snip]
> > -- Therefore, I don't bother to compute more than one edit per change event.
> It's worth suggesting this, then - this was what I referred to in my
> previous mail as 'oversimplifying the edit'. You're right that there's
> a trivial linear implementation if you're prepared to reset inner
> blocks that haven't changed due to bounding changes.

[Change Made]
Changes made to section "Monitoring Message Changes Instead Of Key Presses"
6.4.1 insert sentence "The difference in text, between consecutive
text change events, is typically a one character difference (e.g. key
press) or one text block difference (e.g. auto-correct, cut, paste)."
6.4.1 Other tweaks:
....change "for any text insertions and deletions" into "for a single
text block deletion and/or insertion"
....change "from before the text change event" into "from the
previous text change event"
....change "it is possible to" into "it is simple to"
....change "This is equivalent to recording a small sequence of
typing" into "Repeating this step during every text change event, is
equivalent to recording a small sequence of typing."
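
The "one edit per change event" idea can be sketched as a linear scan for
the common prefix and common suffix of the old and new text. This is only
an illustration under my own naming (`single_edit`, `apply_edit`); it is
not text from the spec, and it deliberately treats everything between the
prefix and suffix as one deleted block and one inserted block.

```python
def single_edit(old: str, new: str):
    """Return (position, deleted_count, inserted_text) turning old into new.

    Runs in linear time: only the common prefix and suffix are skipped,
    so unchanged text inside the changed region is simply re-sent.
    """
    limit = min(len(old), len(new))
    # Length of the common prefix.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    # Length of the common suffix, not overlapping the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    deleted_count = len(old) - prefix - suffix
    inserted_text = new[prefix:len(new) - suffix]
    return prefix, deleted_count, inserted_text


def apply_edit(old: str, position: int, deleted_count: int,
               inserted_text: str) -> str:
    """Apply one block deletion and/or insertion (receiver side)."""
    return old[:position] + inserted_text + old[position + deleted_count:]


print(single_edit("Hello wrold", "Hello world"))  # (7, 2, 'or')
```

Calling this on every text change event yields at most one deletion and
one insertion per event, which covers key presses, auto-correct, cut and
paste alike.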

> > [Change Made]
> > Beginning now says "It is possible for sender clients to implement
> > [[[Message Reset]]] as the only method of transmitting changes to a real-time message."
> Thanks. I'd be inclined to add something like "This method of sending
> is discouraged for general-use clients" or something. Plenty of wiggle
> room for people who feel they have a need to do it, while providing
> sensible guidance to people who really just want to know the Right
> Thing to do.

[Changes Made]
6.4.3 add sentence "This method of sending real-time text is generally
discouraged in most general-use clients".
6.4.3 change "However, disadvantages" into "However, major disadvantages"

Side note: I find "Basic Real Time Text" is quite convenient for rapid
prototyping of real-time text -- for management demos -- before
optimizing it properly for release :-)
...To preserve key press intervals in an alternative manner without
using <w/> Wait Intervals -- one can even temporarily use an
ultra-short millisecond-sized transmission interval (allowing 30
stanzas per second for a key held down) and use an internal server
such as Openfire (to keep high stanza rate away from public XMPP

> >> 6.4.4 - this looks like something discouraged, too
> Perhaps a "This sending model is unsuitable for general-purpose
> clients, but useful if mid-message editing capabilities..." would help.

[Change Made]
Mostly verbatim change (c/model/method/ and c/for/for most/ and c/but/but is/)

> >>  6.5 - " In addition, it is best to process <w/> elements using
> >> non-blocking programming techniques." - I don't really know what this
> >> is doing here.
> I think we need to assume that anyone implementing the spec has at
> least a basic competence - this seems to be stating the obvious

[Change Made]
Remove sentence.   I agree -- it'll just be quite obvious to the
implementer if they do Wait Intervals the wrong way.

> >> - this seems inconsistent with an earlier section that (I
> >> think) was recommending or mandating support for multiple full JIDs.
> I don't think you need to expose multiple RTTs to the user, just to
> track them independently.

Discussion split into separate thread.

> If it does remain, I think it can be usefully contracted to something like:
> "Implementors should be aware that processing incoming RTT can cause
> many updates to a message each second".
> This should probably also get a mention in Security Considerations as a fun DoS vector.

[Change Made]
I deleted the sentence instead (agreeing with you)
There's already a sentence in 6.6.5 saying "With
real-time text, frequent screen update can occur." -- It says pretty
much exactly the same thing.

> >> 6.6.6 seems redundant.

[Change Made]
Section moved to Interoperability Considerations, where it really
belongs -- explanation below.
(Side effect observed: Intriguingly, the section move apparently
eliminates the weighty "6.6.6".)

Before removal, I need to do consultations first, since parts of it are
not redundant -- part of the XEP-0301 accessibility audience has never
heard of Jingle before, so the reference is appropriate.  In addition,
several vendors (including Gunnar Helstrom of Omnitor) working with
www.reach112.eu provide ITU-T F.703 (Total Conversation) compliant
clients that are used for accessible emergency service in Europe. This
isn't one package but multiple vendors adhering to standards.  It's
still a very vocal topic in a certain audience.  This is a really
small section, so I'd like to give it LAST CALL scrutiny before
considering removal.   I'll now do some consultations about removing
this section, but it doesn't seem to be a blocker for LC.

> > Also, when the resource is not locked yet (i.e. recipient hasn't replied
> > yet) it is fine to send real-time text only to the bare JID.
> It isn't - the bare JID doesn't have caps, so you can't be sending
> because you know the target supports it, and the bare JID can't have
> sent you a reply to your init, so you'll still be in the state where
> you're waiting for it to send you an RTT element before you continue
> sending, according to the rules in 6.2.1. (Note that we should have
> something like MUST NOT (or maybe SHOULD NOT, at a push) in 6.2.1
> rather than 'is inappropriate').
> > This makes the real-time text show up on all concurrently-logged in resources simultaneously.
> It needn't, bare-JID handling will usually not send to all resources
> unless the clients have activated carbons.

Discussion split into separate thread.

Additional note:
Also, I've kept RFC2119 normatives away from "Implementation Notes".
There are no RFC2119 normatives beyond section 5.
(Peter recommended I remove them from "Implementation Notes", or move
relevant parts outside of Implementation Notes)

> >> 10.1 - "It is important for implementers of real-time text to educate
> I think the first sentence could be removed and for the following to be
> "It is important for users of real-time text to be made aware..."

[Change Made]
"It is important for users to be made aware of real-time text (e.g.
user consent, software notice, introductory explanation)"

> I suggest the following text.
> "An implementation MUST NOT activate sending of RTT without the user's consent".
> How implementers choose to interpret 'user's consent' is up to them,
> and seems to be a sensible balancing of allowing RTT-specific clients
> to behave sensibly when their users are expecting RTT while still
> requiring that mainstream clients don't start sending out people's
> passwords unexpectedly.

[Change Made]
Related change was already made, see above, without RFC2119 normative.
(Peter Saint Andre suggested eliminating normatives beyond section 6)

Note: Over a longer timescale, as people get more familiar with
real-time text (or "Fast Text"), it may become acceptable to
simply display a short software notice or such (and put the fine
print in that privacy policy link at the bottom of a webpage).  For
example, online customer support chat on a popular online shopping
website circa 2020, or some other mainstream situation.  If education
about real-time text is widespread by then one way or another -- then
real-time text might become largely expected and normal for various
kinds of mainstream situations (such as this or others not foreseen)
-- to the point where "user consent" merges with "just beginning to
type" for certain specific situations.  Because of this, I kept the
recommended change very general (and the lawyers can handle the rest,
if needed).

> >> 10.3 - "Use of this specification in the recommended way will cause a
> >> load that is only marginally higher than a user communicating without
> >> this specification." - do you have numbers for this? It seems quite
> >> counterintuitive, I'd expect it to increase the server load due to
> >> message routing roughly by a factor of the number of RTT transmitted
> >> between each typical <body/>.

[Change Made]
Last paragraph is now reduced from 3 sentences to 2 sentences:
-- Followed your suggestion to delete this first sentence of last
paragraph "Use of this specification [etc]..."
-- Replaced second sentence of last paragraph: "The additional
bandwidth overhead of real-time text can be very low for an existing
XMPP client, especially one already using many extensions."

I am aiming at a compromise: A sentence about bandwidth general enough
to satisfy all sides in a reasonable balance (people telling me not
to mention bandwidth, versus people concerned about bandwidth and
asking about it, versus people who want me to add several paragraphs
with math formulas and hard data).  I will not be able to satisfy all
audiences -- there's pressure in all those directions, including from
implementers who won't use the spec if it uses too much bandwidth.
There are many
ways to optimize <rtt/> bandwidth (e.g. longer transmission intervals,
elimination of short <w/> Wait Interval, slipstreaming <rtt/> into
chat states, merging of consecutive action elements, etc.) -- I have
now worded the sentence with the words "can be" -- turning it into
"possible" instead of "always-is" fact -- so I hope this is reasonably
fair to pass to LC scrutiny.   (If this v0.6 fails, I can remove it
for 0.7)

> > Therefore, it's frequently about 2 to 3x more stanzas than what would have
> > happened without real-time text.
> I think that a 2x increase in load isn't really congruent with "a load
> that is only marginally higher".

I agree it's quite subjective, and depends on so many variables (e.g.
how many stanzas is the client already sending in the first place,
versus, the increment that real-time text would contribute to this).
At this moment, I hadn't intended to mention anything about load
increases since real-world measurements show a variance of less than
1x all the way to more than 20x, depending on all the variables
(implementation, optimizations, message length, etc.)  ...  Unless
others think I should mention approximate increases for a specific
situation?  (e.g. executing a real-world test with specific variables,
and then citing the results)
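
To show why the ratio swings so much with the variables, here is a
back-of-envelope sketch. Everything in it is an assumption for
illustration: the function name, the idea that one <rtt/> stanza goes out
per transmission interval while the user is actively typing, and the
made-up baseline count. It is not a measurement.

```python
import math


def stanza_ratio(typing_seconds: float, interval_s: float,
                 baseline_stanzas: int) -> float:
    """Estimate (stanzas with RTT) / (stanzas without RTT) for one message.

    Assumes at most one <rtt/> stanza per transmission interval during
    active typing; baseline_stanzas is whatever the client already sends
    for that message (body, chat states, etc.).
    """
    rtt_stanzas = math.ceil(typing_seconds / interval_s)
    return (baseline_stanzas + rtt_stanzas) / baseline_stanzas


# Hypothetical numbers: 10s of typing, 1.0s interval, 5 baseline stanzas.
print(stanza_ratio(10.0, 1.0, 5))  # 3.0
```

Changing only the baseline (a client that already sends many stanzas
versus one that sends a single <body/>) moves the result a long way,
which is the subjectivity described above.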

> I don't believe this to be true. Most server cost for an 'optimized
> server' would be associated with the routing of the stanza, not the
> number of similar messages that will follow it, with the possible
> caveat that sending similar messages will compress quite well.

Yes, that's true.  The initialization of a connection and the routing
is the most expensive operation.  I would have thought, though, that
messages sent less than 1 second apart are more likely to get
optimized routing than messages sent 5 minutes apart.   Typing occurs
in bursts, so <rtt/> will often come in bursts, with idle periods.
But this is probably moot over the timescales of composing short
messages, anyway, except for stanza compression.
Nothing is mentioned in the spec on specific loading factors, so I'll
leave it out of the spec, then.

> >Ideally, it would be nice to be acknowledged for this idea somehow
> > *somewhere*, one way or another, even if it just generically says "Mark
> > Rejhon came up with the method of preserving key press intervals, which is
> > called "Natural Typing" at R3TF".   (The technique is called "Natural
> > Typing" within all of us at R3TF)
> > Comments?
> I think that wording (or something similar) is less confusing.
> "Invention" is a word that sets off alarm bells for me.

[Change Made]
Changed first sentence of last paragraph to:
"The technique of [[[Preserving Key Press Interval(link)]]], otherwise
called "natural typing", was created by Mark Rejhon, who is deaf."
(Kept second sentence about XSF's IPR Policy, which wasn't mentioned
that I should remove)


Thanks again for your comments & changes!
Now I just need to cover sections 1 through 5, and then I'll submit a
v0.6 to Peter.

Mark Rejhon
