[Standards] Comments on XEP-0301 (possible impact on -0308 in Section 4.2.3)

Mark Rejhon markybox at gmail.com
Sat Aug 4 09:31:39 UTC 2012

On 2012-08-03 10:53 PM, "Peter Saint-Andre" <stpeter at stpeter.im> wrote:
> On 8/3/12 2:54 PM, Paul E. Jones wrote:
> > Section 4.2.1:
> >
> > Why is "seq" only 31 bits?  Since the same memory is consumed for 31 or 32
> > bits, why not just makes it an unsigned 32-bit integer?  And why worry about
> > wrap-around?  I would allow it to occur.  Specify the behavior.
> Makes sense. For example, in XEP-0047 we say that when hitting the
> maximum we reset the sequence to zero.

-- The 31 bits come from the lowest common denominator: Java does not
have an unsigned int type.  There are workarounds.
-- Wraparound will never happen in practice if you begin with any
value reasonably below MAXINT whenever a message reset occurs.  You
only increment 15 times before a reset occurs, anyway.
-- Resetting back to the same value (instead of randomizing) will
impact the multiple-resource scenario; see the last paragraph of
section 4.6 "Keeping Real-Time Text Synchronized".
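
As a rough illustration of the seq points above, here is a minimal
Python sketch.  The names (MAX_SEQ, new_seq, next_seq) and the headroom
value are hypothetical and not taken from XEP-0301; the wrap-to-zero
fallback only mirrors the XEP-0047 behavior Peter mentioned and is an
assumption, not something the spec mandates.

```python
import random

# Largest value of a signed 32-bit integer (Java's int), i.e. 31 usable bits.
MAX_SEQ = 2**31 - 1


def new_seq(headroom=1_000_000):
    """Pick a random starting seq, leaving headroom below MAX_SEQ.

    The headroom is an arbitrary illustrative margin so that ordinary
    increments between resets never reach the maximum.
    """
    return random.randint(0, MAX_SEQ - headroom)


def next_seq(seq):
    """Increment seq by one; wrap to zero at the maximum (assumed fallback)."""
    return 0 if seq >= MAX_SEQ else seq + 1
```

Randomizing the start value on each reset, rather than reusing the same
value, is what keeps two simultaneous logins from looking like one
continuous sequence.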

> > Section 4.2.2:
> >
> > A value for "init" is that it would remove any ambiguity related to the
> > "seq" value.  The "seq" value could always start at 1 if "init" were
> > required.  The problem with "init", though, is that if a sender sends three
> > messages one after the other, the first two might go to client A and the
> > last one might go to client B.  This would happen if I have two XMPP clients
> > connected to the server and I disconnect one.  Therefore, "init" and
> > "cancel" seem pointless.  I'd suggest getting rid of them entirely.  I like
> > having "new" since that Client B I refer to would know that if it gets an
> > <rtt> that is not "new" it must be some message somewhere in the middle of
> > typing and can just ignore those until it gets a <body>, then pick up with
> > RTT on the next <rtt event="new">.
> I think that's sensible, and it would simplify the protocol a bit
> further. Thanks for bringing up the multi-resource case.

Yes, Paul's scenario is sensible, but it does not exclude init/cancel.
I already cover the multi resource scenario.  No change to protocol is
needed, keeping init/cancel for multi resource scenario is not
See the section "Keeping Real Time Text Synchronized." -- the multi
resource scenario is conveniently covered already:

Quote: "Recipient clients MUST keep track of separate real-time
message per sender, including maintaining independent seq values. For
implementation simplicity, recipient clients MAY track incoming <rtt/>
elements per bare JID. Conflicting <rtt/> elements, from separate
Simultaneous Logins, is handled via the remainder of this section.
Alternatively, recipient clients MAY keep track of separate real-time
messages per full JID and/or per <thread/>."
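
To make the quoted paragraph concrete, here is a minimal Python sketch
of recipient-side tracking.  The class and method names are
hypothetical, and the sync rule (accept event "new" or "reset" as a
fresh starting point, otherwise expect the previous seq plus one) is my
simplified reading of section 4.6, not a complete implementation.

```python
class RttTracker:
    """Track the last seen seq per sender, keyed by bare or full JID."""

    def __init__(self, per_full_jid=False):
        self.per_full_jid = per_full_jid
        self.last_seq = {}  # key -> last seq, or None when out of sync

    def key_for(self, full_jid):
        """Return the tracking key: full JID, or bare JID without resource."""
        if self.per_full_jid:
            return full_jid
        return full_jid.split("/", 1)[0]

    def receive(self, full_jid, seq, event=None):
        """Record an incoming <rtt/>; return True if it is in sync."""
        key = self.key_for(full_jid)
        if event in ("new", "reset"):
            # A new or reset message establishes a fresh starting seq.
            self.last_seq[key] = seq
            return True
        expected = self.last_seq.get(key)
        if expected is not None and seq == expected + 1:
            self.last_seq[key] = seq
            return True
        # Unexpected seq: stay out of sync until the next new/reset.
        self.last_seq[key] = None
        return False
```

With bare-JID tracking, an <rtt/> from a second resource with an
unrelated seq simply shows up as a loss of sync, which recovers at the
next "new" or "reset".  With full-JID tracking, each resource keeps its
own independent sequence.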

First, I should clarify that Activation/Deactivation methods are
essentially optional, and of interest to implementers.  One
implementer I talked to, covering more than 100 million users on
their XMPP network, requires the activation/deactivation feature for
many reasons.  Not all of those reasons are agreeable to everyone, but
it is a legitimate choice to let implementers choose their activation
method.   Paul, for
example, can choose the activation method that he prefers (an on/off
Now, init/cancel can be appropriate for the multi-resource scenario
(simultaneous login).
-- Switching from a client that always activates (or was configured to
auto-accept), to a client that always activates (or was configured to
auto-accept), would work normally, as described in Paul's scenario
-- Switching from a client that always activates to a client that
confirms first would work fine: incoming real-time text would simply
continue to be displayed (as recommended by the last paragraph of
section 6.2.1 Activation Methods) while waiting for the user to
re-confirm, if the recipient switches software while the sender is
sending real-time text.   If the software uses confirmation (accept
after confirm), a prompt would just reappear.
-- Switching from a client that activates to a client that has
real-time text turned off, will have the expected behavior of real
time text being deactivated

It is worth noting that an incoming init can be ignored if you're
already active and distinguishing real-time messages by bare JID
(e.g. the sender launching a 2nd copy of the software and activating
real-time text in the 2nd copy).  Alternatively, you can distinguish
the senders by full JID.
Section 4.6 "Keeping Real-Time Text Synchronized" last paragraph says:
"Recipient clients MUST keep track of separate real-time message per
sender, including maintaining independent seq values. For
implementation simplicity, recipient clients MAY track incoming <rtt/>
elements per bare JID. Conflicting <rtt/> elements, from separate
Simultaneous Logins, is handled via the remainder of this section.
Alternatively, recipient clients MAY keep track of separate real-time
messages per full JID and/or per <thread/>."

So, as you can see, init / cancel is fully appropriate (And backwards

One of the reasons I wrote the Activation/Deactivation methods section
is to serve as "BEST PRACTICES".   It's not required reading for the
spec, but the existence of such a section greatly enhances
interoperability between clients that choose to implement a very
specific activation/deactivation method (e.g. implemented by one
company) and clients (e.g. implemented by a different company) that
simply send <rtt/> whenever disco allows.   We have to avoid a scenario
where two willing clients do not talk to each other because they
invented an activation/deactivation method that is incompatible with
the best practices that I wrote.

Should I rename the sections as:
"Optional Activation Methods"
"Optional Deactivation Methods"
"Activation Best Practices"
"Deactivation Best Practices"

Eliminating Activation/Deactivation is not an option -- it is
unfortunately a necessary section -- but I am happy to simplify its
wording while still defining what best practices are available to
maximize interoperability between clients that implement very
different activation/deactivation methods.   It was written carefully
to maximize interoperability, but it seems some people are assuming
that the whole section needs to be supported, when it really doesn't
-- implementers are actually free to choose their own
activation/deactivation methods.

Please help me by improving the wording of the activation/deactivation
chapters, but the removal of this section is not an option.  It is
necessary for some implementers, including an implementer that
controls over 100 million users.

> > Section 6.2.1:
> > I think the activation logic is complex.  Let each user turn it on or off as
> > he sees fit.  If you send <rtt> tags to my client, whether that gets renders
> > or not depends on my local settings.  I don't see a strong need to negotiate
> > this.  Just always send <rtt> and display it (if received) whenever the user
> > enables RTT.
> The objection here is that if my client doesn't support RTT and you send
> me RTT despite that fact, I will end up receiving a lot more data than I
> need to or want to (e.g., this extra data might cost me money by running
> up my bandwidth usage).

Please see my reasoning above.
-- Activation/deactivation is an optional feature.
-- It was written in such a way as to maximize interoperability with
clients that simply do exactly what Paul wants a client to do.
-- Paul is already following the Activation Methods from section 6.2.1
without being aware of it ...

Sender from Paul's perspective.
"Begin transmitting real-time text (by sending any valid <rtt/> elements)"
(Quoted from Activation Methods)

Recipient from Paul's perspective:
"Accepting immediately (by activating in response)". (First bullet in
section 6.2.1)
Activating in response simply means the same thing: "Begin
transmitting real-time text (by sending any valid <rtt/> elements)"
That's what Paul is implying.

From Paul's perspective, if his software has RTT turned off, it's
exactly the same as:
"Ignoring (by discarding incoming <rtt/> as a last resort, without
using Deactivation Methods)." (second last bullet in section 6.2.1)

So, what Paul E. Jones described already follows the Activation
Methods *exactly*, 100% in-spec, without him being aware that he is
already doing so.
See?  I was clever.  :-)

So, the Activation Methods are actually simple; they just look
confusing at first glance.
(Confused?  Re-read
http://xmpp.org/extensions/xep-0301.html#activation_methods from
Paul's perspective shown above, and then come back to this email.)
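
The recipient-side choices walked through above can be sketched as one
small Python function.  The function name, its parameters, and the
returned labels are hypothetical shorthand for the bullets in section
6.2.1; they are not identifiers from XEP-0301.

```python
def handle_incoming_rtt(rtt_enabled, confirm_first=False):
    """Decide what a recipient does when <rtt/> arrives.

    rtt_enabled   -- the user's local on/off setting for real-time text
    confirm_first -- whether the client asks the user before activating
    """
    if not rtt_enabled:
        # "Ignoring (by discarding incoming <rtt/> as a last resort ...)"
        return "ignore"
    if confirm_first:
        # Keep displaying incoming text while the prompt is pending.
        return "display-and-prompt"
    # "Accepting immediately (by activating in response)"
    return "accept"
```

Paul's on/off client only ever takes the first and last branches, which
is why it is already in-spec.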

I'm interested in feedback to *improve* the Activation methods,
because the Activation/Deactivation section can't be removed without
disrupting quite a few important implementers (including a big one).
It is a necessary section, and I'd like some help improving/simplifying
it, please.

Obviously, I did quote "as a last resort" (ignoring <rtt/>) because
you truly want to save bandwidth when recipients do not want to
interpret <rtt/>.  However, Paul's software is welcome to simply send
<rtt/>.  If disco confirms support, senders can just send <rtt/>, and
recipients that advertise disco but choose to turn off RTT can also
choose to ignore incoming <rtt/>.  This is not a highly recommended
scenario, and Gunnar doesn't like it, but it's certainly a possible
scenario allowed by XEP-0301: having disco means you can get incoming
<rtt/>, even if your client chooses to ignore it.

Paul, any comments?

> > Section 9:
> > How does XMPP indicate that a message should be displayed LTR or RTL?  Is
> > that derived from the language indicated in the <body> tag?  This is legal:
> >
> > <body xml:lang="en">This would display left-to-right</body>
> >
> > In any case, we do need to ensure we capture directionality for languages
> > like Hebrew.
> This is not indicated at all in XMPP, because it is handled by the
> receiving application based on the Unicode characters themselves.

That's correct.  Text direction is a Unicode string feature, and not
the responsibility of XMPP.
From the perspective of XEP-0301, a real-time message is just a string
that has embedded directionality (e.g. Unicode default directionality
rules, as well as Unicode LTR / RTL override control characters).
The sender/recipient GUI already handles the directionality
automatically.  The sender reads a Unicode string from a good GUI
Unicode textbox (directionality is included) and transmits it over
XEP-0301.  The recipient passes the Unicode string to a good GUI
Unicode display control (directionality gets rendered).   So that's an
implementation detail of a different standard (Unicode), and is beyond
the scope of XEP-0301, which is just a transmission conduit for a
"string" of Unicode code points.   Many modern textbox controls on many
operating systems (e.g. all language versions of Windows, including
English Windows and Hebrew Windows) follow UAX #9 for text direction
in Unicode: http://unicode.org/reports/tr9/  So text direction has
gradually, over the years, become more and more of a GUI toolkit
responsibility.
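
To show that directionality travels with the characters themselves,
here is a minimal Python sketch using the standard unicodedata module.
The function name base_direction is hypothetical, and the
first-strong-character rule is only a simplification of the paragraph
direction rules in UAX #9; a real GUI toolkit runs the full algorithm.

```python
import unicodedata


def base_direction(text):
    """Guess 'rtl' or 'ltr' from the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # Hebrew-style and Arabic-style letters
            return "rtl"
        if bidi == "L":  # Latin and other left-to-right letters
            return "ltr"
    # No strong character found (e.g. only digits or spaces): assume LTR.
    return "ltr"
```

Nothing in the XMPP stanza is consulted here: the same string yields
the same direction on the sender and the recipient.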

Mark Rejhon
