[Standards] XEP-0301 0.5 comments [Sections 6 and beyond]

Gunnar Hellström gunnar.hellstrom at omnitor.se
Tue Jul 24 08:13:04 UTC 2012

I am adding comments marked <GH>
On 2012-07-24 01:37, Mark Rejhon wrote:
> [Part 2 of 2, continued, ultra-long discussion regarding Kevin's 
> comments]
> Note: Due to the large number of comments from Kevin, I'm focussing on 
> addressing Kevin's concerns for now.
> I'd love to hear comments from others (Gunnar, Peter, Matt, etc) on 
> the discussions between me and Kevin.
> On Mon, Jul 23, 2012 at 10:32 AM, Kevin Smith <kevin at kismith.co.uk> wrote:
>     6.1.4 - "it is acceptable for the transmission interval of <rtt/> to
>     vary" - yet earlier there was a SHOULD saying it doesn't vary, wasn't
>     there?
> [Comment]
> http://xmpp.org/extensions/xep-0301.html#transmission_interval
> In the "Transmission Interval" section, I said "approximately 0.7 
> second" and it refers to "continuously-changing message"
> Meaning this is valid: You could transmit at varying intervals and stay 
> in spec, like 0.5s, 1s, 0.7s, 0.73s, 1.1s, 5s (typing paused then 
> resumes), 0.7s, 0.3s, 0.8s, etc.   You can technically optimize for 
> surges of faster and slower typing, as long as the default average 
> interval is approximately 0.7 seconds.   Some clients will use strict 
> calculations, and send at exactly 0.7 seconds whenever the message has 
> changed.  That behavior is acceptable too.  (RealJabber does this)  ....
> Is there a minor wording change that I can make in "Transmission 
> Interval" to make it clear that the interval is allowed to vary, while 
> satisfying people who ask "is that a default value?" (that's why I 
> added the word 'default')
<GH>I understood "approximately" to mean something much closer to 
700 ms. I thought you wanted it just because of the granularity of timers 
and because there is no need to be very precise. If you let it go down to 
300 ms, you may get problems with bandwidth and servers; if you let it 
go above one second, you occasionally miss the usability goals for 
good real-time text according to ITU-T F.700/F.703.
I do not usually regard 300 as approximately 700. It would be strange 
to need to put a figure on it, but it could be "The interval should not 
vary more than 20%."
>     6.2.1 - I suspect this should be more prominent than buried inside
>     Implementation Notes
> [Comment & Question]
> I'm glad you think this section is important enough to be part of the 
> Protocol.  Activation Methods is quite an important inclusion in the 
> specification, even though some people may disagree (Gunnar prefers 
> real time text to be activated at all times, for example -- and 
> technically I agree -- but realistically, implementers want to choose 
> their own activation mechanisms).
<GH>Well, you may need an activation method anyway.
> Perhaps I could split it into a "5.1. Business Rules" section (ala 
> XEP-0085) but I'm not sure that this is appropriate.
> Alternatively, I can add it as a "6. Business Rules" (bumping 
> Implementation Notes as a section 7)
> Peter, David, et cetera from XSF, any comments?
<GH>All of chapter 6 contains items that touch the protocol, and here and 
there it contains rules. So, I think we can safely call all of 6 something 
else. I find "Business rules" a somewhat strange choice of words for this 
technical area. How about "Application details"?
Are there existing conventions in the XSF for such sections?
>     6.2.1 - I think that presence decloaking is probably a better approach
>     to this than sending init.
> [Change Made & Comment]
> /"Signalling first (by transmitting <rtt event='init'/> as the 
> first <rtt/> element.)"/
> -- The primary purpose of <rtt event='init'/> is not for disco. 
> Therefore, decloaking has nothing to do with init (unless init is used 
> for disco)
> -- That includes any reason, such as activating before typing.  Some 
> implementers want an activation feature (e.g. button, menu, 
> preferences, etc)
> -- Timing of activation can be separate from the timing of the sender 
> beginning to compose text.   The existence of <rtt event='init'/> 
> allows decoupling activation timing from actual transmission of 
> real-time text.
> -- Activation/Deactivation may occur multiple times during the same 
> chat session.  It's useful for signalling re-activation of real-time 
> text after <rtt event='cancel'/> because some implementations might 
> otherwise ignore real-time text for the remainder of the chat session 
> after  receiving <rtt event='cancel'/>.
> -- Clients can send <rtt event='init'/> even if they have some 
> real-time text to begin immediately. (i.e. <rtt event='init'/> 
> immediately followed by <rtt event='new'/>)  ... Thus, it is 
> acceptable for implementers to always send <rtt event='init'/>
> -- Theoretically <rtt event='init'/> could be made REQUIRED, but 
> that's not a good idea, especially because recipients can come online 
> after the sender has already started composing a message (includes MUC 
> and simultaneous login situations).
> Even if we eliminate the implicit discovery requirement and 
> "Determining Support" is always followed, the use of 'init' is still a 
> requirement for some implementers for activation/deactivation, to 
> decouple the timing of beginning of real-time text from the timing of 
> actual creation of a real-time message.   So, sending init is still 
> useful even if you follow Determining Support.
> Can you provide any suggestions of any further clarifications for <rtt 
> event='init'/>?
>     6.2.1 - That said, if people disagree and want another 85-ish
>     non-disco mess, I think this can be clarified a bit - at the moment it
>     sounds like disco and init discovery are alternatives, rather than
>     init only being a fallback for when disco isn't available. Perhaps
>     something like:
>     """
>     Activation of real-time text in a chat session (immediate or
>     user-initiated) can be done by:
>     * Immediately transmitting real-time text (if the feature is
>     advertised by the recipient, as described in Determining Support);
>     or
>     * Where Disco knowledge isn't available (e.g. sending to an entity for
>     which presence information isn't available, and thus the full JID
>     isn't known and can't be queried) by sending a <message/> stanza
>     containing only a "<rtt event='init'/>". In this case there MUST be no
>     further transmission of RTT elements until the recipient indicates
>     support - either by exposing information necessary to use service
>     discovery, or by replying with a (non-cancel event) RTT element of its
>     own.
> [Comment]
> One observation, section 5.1 of XEP-0085:
> http://xmpp.org/extensions/xep-0085.html#support
> "Before generating chat state notifications, a User SHOULD explicitly 
> discover whether the Contact supports the protocol defined herein ..."
> Likewise, XEP-0301 already allows implicit negotiation simply by 
> sending <rtt event='init'/> which is also valid anyway even if you do 
> "Determining Support".  The primary purpose of <rtt event='init'/> is 
> not for disco, but to decouple activation timing from creation of a 
> new real-time message.  It just happens to be the best "first rtt 
> element" to use.  (It happens to be a conveniently valid element to 
> use with or without a sending client determining disco first, in the 
> same style as XEP-0085)
> As XEP-0301 was also designed to behave as an extended chat 
> state, I'd rather keep it synchronous with XEP-0085 requirements
> (I should have studied it more closely months ago, and synced up 
> requirements much earlier -- to keep the debate simpler).
> I'll stay in sync with future changes to XEP-0085, too (if that's ever 
> done).
> [If discussion is needed on this point, let's split the reply to a 
> separate thread again -- it's a topic meriting its own separate 
> thread, since this could distract from all the other good minor 
> changes in this thread]
> Technically, that <rtt event='init'/> isn't "messy" because its 
> purpose is not designed for disco --
> it simply permits decoupling of the activation moment from the moment 
> of creating a new real-time message.
>     6.3 - "All action elements only have absolute positioning, and
>     positioning does not depend on previous action elements" - this isn't
>     true, positioning is dependent upon processing of previous action
>     elements - a deletion will effect a change of index in all subsequent
>     code points.
> [Change Made]
> /"Action elements only use absolute positioning (relative positions 
> are not used by this standard), so clients do not need to remember the 
> position value from previous action elements."/
> You do not need to keep track of the state of the previous cursor 
> position.  People with an ANSI escape code mindset (VT100, VT102, from 
> the old communications terminals days etc) were asking me if cursor 
> positions can be relative to the previous cursor position. (answer: 
> no). ... Relative cursor positions are never used.
> Bottom line: Clients don't need to remember cursor position state 
> information between action elements.  They only need it for display 
> purposes after processing an action element -- *the cursor position is 
> completely reinitialized after every action element*   (except <w/> 
> elements, which have no effect on the real-time message text or cursor 
> itself).
> The only dependency is the length: if p is not defined, then p 
> defaults to the message length.
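
To make the absolute-positioning point concrete, here is a minimal sketch in Python of how a receiver might apply action elements. It assumes the semantics discussed in this thread: <t/> inserts text at position p, <e/> erases n characters before position p, and p defaults to the message length. The default of n=1 for <e/>, the clamping of out-of-range values, and the helper name apply_action are assumptions made for illustration; they are not taken from the specification text or from RealJabber.

```python
def apply_action(message, kind, p=None, n=1, text=""):
    """Apply one action element to `message`; return (new_message, cursor).

    Assumed semantics (hypothetical helper, not the XEP's own code):
      kind "t": insert `text` at position p
      kind "e": erase n characters before position p (like backspace)
    Positions count Unicode code points, which is how Python indexes str.
    """
    length = len(message)
    if p is None:
        p = length                      # p defaults to the message length
    p = max(0, min(p, length))          # clamp out-of-range positions
    if kind == "t":
        return message[:p] + text + message[p:], p + len(text)
    if kind == "e":
        n = max(0, min(n, p))           # cannot erase before the start
        return message[:p - n] + message[p:], p - n
    raise ValueError("unsupported action element: " + kind)


# Each call carries its own absolute position; nothing is remembered
# from the previous action except the message text itself.
message, cursor = apply_action("Hello World", "e", p=11, n=5)   # erase "World"
message, cursor = apply_action(message, "t", text="there")      # p omitted
```

Note that the only state carried between the two calls is the message string: the second call derives its position from the current length, not from the earlier cursor.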
> Comments?  I have to keep the RFC4103 and VT100 "mindset" people happy 
> too.
>     6.4.1 - It might be useful to reference some method of calculating
>     this. It's not immediately obvious to me that it's trivial to work out
>     edits without resorting to something that ends up polynomial in the
>     worst case (or oversimplifying the edit), so some guidance would be
>     handy here.
> [Comment]
> -- It's actually simpler than it looks
> -- It is a linear calculation (CPU expense linearly proportional to 
> message length), not polynomial.
> -- A text change event occurs on every key press, so most of the time, 
> the message change is only 1 character between text change events!
> -- In almost all cases, text change events will generate only 1 
> character of change.   Except for things like autotext, autocorrect, 
> and pastes -- then it's a single block event.
> -- Therefore, I don't bother to compute more than one edit per change 
> event.   It's not worth optimizing for this edge case (it'll just look 
> like one larger text change).
> Because of this, I don't bother to worry about two separate text 
> changes in two parts of the message -- I'm only worried about the 
> first and last changed character.  Then I create ONE or TWO action 
> elements (a text insert and/or a delete event).  Pretty simple. 
> The programming algorithm is found here (line 592):
> http://code.google.com/p/realjabber/source/browse/trunk/Java/src/RealTimeText.java?r=24#592 
> Note: There are several simplifications that can be made for 
> implementers that choose to not implement a Remote Cursor (just define 
> position as -1), and less state information is necessary (in the simplest 
> implementations, you really only need to remember the real-time 
> message string.  This is the minimum state info to keep for JavaScript 
> state preservation while waiting between calls, other than whether 
> real-time text is active)
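
The "first and last changed character" approach described above can be sketched as follows. This is an illustrative Python version, not a transcription of the RealJabber Java code linked above; the function name compute_edit and the tuple layout are hypothetical. It scans the common prefix and common suffix once each, so the cost is linear in the message length, and it emits at most one erase and one insert.

```python
def compute_edit(old, new):
    """Return at most two actions that turn `old` into `new`.

    Actions are tuples (hypothetical layout):
      ("e", p, n)    erase n characters before position p
      ("t", p, text) insert text at position p
    """
    limit = min(len(old), len(new))

    # Length of the common prefix (first changed character).
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Length of the common suffix, not overlapping the prefix
    # (last changed character).
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1

    erase_count = len(old) - prefix - suffix
    inserted = new[prefix:len(new) - suffix]

    actions = []
    if erase_count:
        actions.append(("e", prefix + erase_count, erase_count))
    if inserted:
        actions.append(("t", prefix, inserted))
    return actions


typo_fix = compute_edit("cat", "cut")            # one erase plus one insert
paste = compute_edit("Hello World", "Hello there World")
```

Two separate changes in different parts of the message are reported as one larger change spanning both, which matches the simplification described above.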
>     6.4.3 - this says that implementations "may" do this, and I suspect
>     that it really is discouraged rather than truly optional (indeed, the
>     language elsewhere says as much).
> [Change Made]
> Beginning now says /"*It is possible for sender clients to* implement 
> [[[Message Reset]]] as the only method of transmitting changes to a 
> real-time message."/
> Although it already explains why it's discouraged, I've now removed 
> the word "may" to reduce the permissive-sounding tone.
<GH>I earlier proposed another title for this section. Calling it "Basic 
real-time text" may attract interest from implementors.
Maybe "Using 'reset' for all transmissions."

>     6.4.4 - this looks like something discouraged, too, but this isn't
>     mentioned that I can see.
> [Comment]
> 6.4.4 is useful if it's not humans generating real-time text.  For 
> example transcription bots, gateways, etc.  So it's quite 
> simple/useful to have append-only real-time text (and you can still do 
> key press intervals, if needed, unless you're outputting 
> fully-transcribed words one full word at a time)
<GH>I suggest deleting the second paragraph of 6.4.4. It tries to 
reinstate the functionality you lost by following the simple rule of the 
section, and ends up with open-ended if-statements. I can imagine other ways 
to solve mid-message editing, but I think it is out of scope. By 
mentioning that you lose key press intervals, you may also give the 
impression that this applies to all of 6.4.4, but it applies only to the 
described way of doing mid-message editing in a method that is not 
intended to have that functionality.
>     6.5 - "Upon receiving Action Elements in incoming <rtt/> elements,
>     they are added to a queue in the order they are received. This
>     provides immunity to variable network conditions, since the queueing
>     action smooth out the latency fluctuations of incoming transmission."
>     - it's not clear to me that it's the queuing that does anything to the
>     latency. Also 'action *will* smooth out'.
> [Change Made]
> /"This provides immunity to variable network conditions, since the 
> queueing action will smooth out incoming transmission (e.g. receiving 
> new <rtt/> while still processing a delayed <rtt/>)."/
> Network issues can cause huge variability in transmission interval.
> -- The sender may be sending <rtt/> elements on a 0.7s, 0.7s, 0.7s, 
> 0.7s, 0.7s
> -- The recipient may be receiving <rtt/> elements delayed 0.9s, 1.3s, 
> 1.5s, 0.0s, 0.0s  (due to network conditions)
> This results in stall-surge behaviours of real-time text that need to 
> be smoothed out during playback, for the best user experience while 
> preserving key press intervals.
> Due to the various network buffering that occurs in intermediate servers, 
> and potentially reception issues (loss of wireless reception for a 
> fraction of a second), and other reasons for huge ping variability, the 
> recipient could get the messages in 0.9s, 1.3s, 1.5s, 0.0s, 0.0s  .... 
> That means the <rtt/> received at T+1.3s would be received only 0.4s 
> after the one at T+0.9s .... therefore, the recipient is still playing 
> the previous <rtt/> while it receives the next <rtt/>.   It is 
> therefore necessary to use the queueing/buffering action for improved 
> user experience.    Also observe that after a lag (0.9s, 1.3s, 1.5s), 
> the final 3 <rtt/> are received simultaneously (1.5s, 0.0s, 0.0s), 
> causing a whopping total of 2.1s of <rtt/> action elements to be buffered 
> up for playback to the recipient's display.  That's also why I 
> included the last paragraph about speeding up playback if there's a 
> situation of excess buffering of action elements from a 
> sudden incoming surge of <rtt/> elements.
<GH>Yes, the speeding up sentence is very important. It is better to 
skip displaying key press intervals occasionally and get back to 
acceptable latency (<2 seconds end to end) if there has been a delay.
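
The buffering arithmetic in the surge example above can be checked with a small Python sketch. It rests on assumptions that are not stated in the thread: the figures 0.9s, 1.3s, 1.5s, 0.0s, 0.0s are read as gaps between successive arrivals, every <rtt/> is taken to carry 0.7s worth of key press intervals, and the receiver plays the queue back at real speed. The function name backlog_after_arrivals is hypothetical.

```python
def backlog_after_arrivals(arrival_times, playback_seconds=0.7):
    """Return the queued playback time (seconds) just after each arrival.

    arrival_times: absolute arrival times of successive <rtt/> elements.
    playback_seconds: assumed playback duration carried by each <rtt/>.
    """
    backlog = 0.0
    previous = None
    result = []
    for t in arrival_times:
        if previous is not None:
            # Playback drains the queue while waiting for the next arrival.
            backlog = max(0.0, backlog - (t - previous))
        backlog += playback_seconds
        result.append(round(backlog, 3))
        previous = t
    return result


# Arrival gaps of 0.9s, 1.3s, 1.5s, 0.0s, 0.0s turned into absolute times.
surge = backlog_after_arrivals([0.9, 2.2, 3.7, 3.7, 3.7])
```

Under these assumptions the queue holds 2.1 seconds of playback after the last arrival, which is the case where playback would be sped up to get back under the latency target.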
>      6.5 - " In addition, it is best to process <w/> elements using
>     non-blocking programming techniques." - I don't really know what this
>     is doing here.
> [Change Made]
> /"In addition, it is best to process <w/> elements asynchronously, to 
> avoid interfering with client operation."/
> This is simply a generic comment that indirectly refers to timers and 
> multithreading, rather than inserting a "Sleep" statement in the 
> middle of a single-threaded program.  A blocking sleep causes freezing 
> in user interfaces, especially with long <w/> elements: for example, 
> <w n='500'/> could cause a 1/2 second program freeze while that 
> action element is being processed, which is bad. If you're doing MUC, 
> or multiple windows, and you have lots of <w/> elements simultaneously, 
> they all need to be processed asynchronously on their respective 
> real-time messages.
<GH>This is a programming hint in the middle of protocol description. It 
might be better to delete it. It should be basic knowledge of 
communication programmers to not let the whole process hang on a timer. 
I see a small risk that the words "process asynchronously" are 
misunderstood for something complex in the transmission. Or at least 
cause extra wondering time before it is figured out that it is an 
internal hint.
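
As an illustration of the internal (non-protocol) nature of this hint, here is a minimal Python sketch using asyncio. Two real-time messages are played back concurrently, and each wait suspends only its own playback instead of blocking the whole client. The action tuples and the names play_actions and main are hypothetical, and <w/> values are assumed to be milliseconds as in the <w n='500'/> example above.

```python
import asyncio


async def play_actions(actions, display):
    """Play back one real-time message without blocking other playbacks."""
    for kind, value in actions:
        if kind == "w":
            # Non-blocking wait: other messages keep playing meanwhile.
            await asyncio.sleep(value / 1000)
        else:
            display.append(value)   # stand-in for updating a chat window


async def main():
    window_a, window_b = [], []
    # Two chat windows (or two MUC participants) played concurrently.
    await asyncio.gather(
        play_actions([("t", "H"), ("w", 50), ("t", "i")], window_a),
        play_actions([("t", "O"), ("w", 10), ("t", "K")], window_b),
    )
    return "".join(window_a), "".join(window_b)


result = asyncio.run(main())
```

A timer callback or a worker thread would serve the same purpose; the point is only that the wait is local to one real-time message.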
>     6.6 - "There are other special basic considerations" - isn't that
>     nearly oxymoronic?
> [Change Made] -- removed sentence.  The heading "Other Guidelines" is 
> self-explanatory
>     6.6.1 - "For specialized clients that send continuous real-time text
>     (e.g. news ticker, captioning, transcription, TTY gateway), a Body
>     Element can be automatically sent when messages reach a certain
>     length. This allows continuous real-time text without real-time
>     messages becoming excessively large." - Is this true? Sending a body
>     means you reset the state to the content of the body and terminate
>     that RTT message, which doesn't seem consistent with continuing RTT.
> [Change Made]
> The change made: /"...a [[[Body Element(link)]]] can be sent *and then 
> a new real-time message started immediately after*, every time a 
> message reaches a reasonable size"/
> It was meant to explain you just simply begin a new real-time message 
> in order to prevent real-time messages from becoming excessively 
> large.  Observe that in 
> http://www.marky.com/realjabber/anim/real_time_text_demo.html you can 
> begin a new real-time message immediately after sending a body element.
<GH>It requires XEP-0308 last message correction and support of the id 
rtt element to be really seamless and enable corrections just after 
shipment of the body. But I think the implementer will realize that, so 
no need to add any more complexity here.
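
For a continuous source such as a captioning or transcription feed, the "send a body, then start a new real-time message" idea can be sketched in Python as below. This is only an illustration: the 40-character limit, the splitting on word boundaries, and the function name split_into_messages are assumptions, and the actual sending of <rtt/> and <body/> elements is left out.

```python
def split_into_messages(words, max_chars=40):
    """Group a stream of words into message bodies of bounded length.

    Each returned string stands for one real-time message that ends with
    a body; the next word then starts a new real-time message.
    """
    bodies = []
    current = ""
    for word in words:
        candidate = word if not current else current + " " + word
        if current and len(candidate) > max_chars:
            bodies.append(current)   # message is full: send its body
            current = word           # and start a new real-time message
        else:
            current = candidate
    if current:
        bodies.append(current)
    return bodies


captions = split_into_messages(["aaa", "bbb", "ccc"], max_chars=7)
```

A single word longer than the limit is kept whole in its own message rather than being cut.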
> - This doesn't seem like the wrong approach if RTT is wanted
>     in a MUC (at least until we have per-MUC disco stuff), but I'm
>     somewhat worried about the effect this has as an amplification attack.
>     I don't know what we should say here, but if people can have a think
>     it'd be good.
> [Comment]
> I know MUC is a loaded can of worms, isn't it -- so a lot of things 
> are beyond scope of XEP-0301.
> So, for this reason, I even say that implementers can choose to 
> implement real-time text only for one-on-one conversation, and avoid 
> the MUC issues altogether.
> However, MUC is a requirement for some implementers.
> Technically, there are a lot of things that can be done:
> -- (within spec) Methods found in [[[Congestion Considerations]]]
> -- (beyond scope) Server-based methods of controlling this (i.e. 
> rate-limiting, bandwidth-optimizing such as merger of <rtt/> elements 
> (action elements can safely be merged in adjacent <rtt/> transmissions 
> if it's part of the same message), server-side disco for commanding 
> everybody's transmission interval, server-side signalling of changed 
> transmission intervals during changing server loads, etc)
> -- There are potential ideas beyond scope of XEP-0301 to improve the 
> MUC situation.
> ...MUC is not a requirement of XEP-0301, but I wanted to include at least 
> minimal, reasonable coverage of MUC.  It is a part of a Next 
> Generation 911 demo (InDigital Inc.  Although they rarely publicly 
> comment, search for "indigital.com" recipients 
> in standards at xmpp.org for their comments) 
> for XEP-0301 that was shown to the FCC earlier (reasoning: allows other 
> PSAP people to join and "monitor" a conversation between a caller and 
> an emergency operator.  Also technically provides a convenient 
> mechanism for transferring real-time text conversation from one person 
> to another).  Such an emergency optimized server would essentially be 
> intranet-optimized (i.e. not open to public XMPP).
> ....Also, as many of us are already familiar with, 2-person chats can 
> be turned into 3-person chats by inviting someone, and I needed to 
> include at least basic MUC info, for implementers that want to do that 
> in a reasonably seamless manner.  I feel that any MUC improvements, 
> such as server-based interval control, can be 
> specified in the future, perhaps as an extended XEP.   There are not 
> many implementations of MUC and XEP-0301 yet, and we need more field 
> experience without removing MUC from XEP-0301.   I do not foresee that 
> future MUC-specific optimizations would necessarily impact the protocol.
<GH>MUC support is important. It can be used as the multi-party bridge 
for XEP-0301 in multi-party multimedia calls, where typing and reading 
participants must have the same right to participate in rapid exchange 
as the speaking participants.
In such situations it is mainly one speaker at a time that transmits 
real-time text, so the load considerations are not terrible, but are there.

People without rtt have a tendency to send short incomplete phrase parts 
often to compensate for the lack of rtt, so in reality, rtt will "only" 
increase the load by a factor of 4 or so.

> - this seems inconsistent with an earlier section that (I
>     think) was recommending or mandating support for multiple full JIDs.
> [Comment]
> I made comments earlier.
> Even when senders send to the full JID, recipients can just process 
> real-time messages based on bare JID.
> This makes it simpler for implementers of clients to implement only a 
> single real-time message per chat window.
> It is a significant user interface complexity concern to gain the 
> capability of multiple simultaneous real-time messages in the same 
> chat window user interface.
> Also, it is intuitive behaviour because of "Keeping Real-Time Text 
> Synchronized": when a simultaneous-login user switches computers, the 
> recipient would simply see their copy of the real-time message switch 
> instantly from the partially-composed message from the old system to 
> the partially-composed message from the active system (thanks to the 
> Message Reset feature of "Keeping Real-Time Text Synchronized").   This 
> is even further enhanced if good resource locking is done, too.  This is 
> acceptable UX behaviour, as a login is meant (in 99%+ of cases) to 
> only have one typist.
>     6.6.5 - seems somewhat out of place. How many systems are there these
>     days that can't keep up with a human typist? And telling people that
>     they need to make their applications flicker-free just seems odd.
> [Comment]
> When retrofitting real-time text to an existing chat program, some 
> tend to make their software cause a repaint every key press, so it's 
> worth making a brief mention, although I agree it is quite 
> borderline from the perspective of a "specification".   At 10 key 
> presses per second for a 120WPM typist, the real-time message can be 
> repainted 10 times per second, and if the repaint is not done 
> efficiently, it can flicker or consume CPU, etc.
> Suggestions for better wording are welcome.
<GH>Just keep the last sentence and change to a should: "Clients should 
distinguish the <rtt/> streams (via full JID and/or via <thread/>) and 
keep multiple concurrent real-time messages in a similar manner 
to Multi-User Chat 
>     6.6.6 seems redundant.
> [Comment]
> It might be, but "Total Conversation" is quite significant among 
> accessibility circles in Europe. It's not used as much in North 
> America, but I must satisfy this audience, too.  Improved wording is 
> welcome, though I don't think anything in 6.6.6 affects the protocol.
<GH>It is a hint on an important service environment for the RTT media 
component. I agree that it is more of an implementation note than many 
other sections in this chapter. So, if the chapter changes title, this 
section could move out to a real implementation note chapter.
>     7 - these examples seem to be to a bare JID, and therefore can't have
>     had caps already indicate support, but lack support discovery. It'd be
>     good to note this.
> [Change Made]
> /"For simplicity, these examples use a bare JID, even in situations 
> where a full JID might be more appropriate."/
> Good point, senders still really generally ought to send to full JID, 
> even though recipients don't have to keep track of real-time messages 
> per full JID (recipients can just track per bare JID).
> Also, when the resource is not locked yet (i.e. recipient hasn't 
> replied yet) it is fine to send real-time text only to the bare JID. 
> This makes the real-time text show up on all concurrently-logged in 
> resources simultaneously.  XEP-0085 Chat States tend to work this way 
> too in most software.  Also, I've considered the 
> Activation/Deactivation scenarios in a Simultaneous Login scenario, in 
> both resource-locked and resource-unlocked states.  All of the 
> scenario combinations check out as intuitive and/or acceptable UX.
> (Once resource is locked, the real-time message pauses on unused 
> resources since the unused resources are no longer receiving any 
> real-time text, and eventually gets timed out via Stale Message handling)
<GH> Also add "and service discovery is supposed to have been done."
>     7.4.2 - this includes an RTT including a wait in the element with the
>     body - but once the body is received the RTT state is discarded and
>     the body replaces it, if I remember earlier in the XEP correctly (and
>     it was quite a while ago now).
> [Comment]
> That's right.
> -- However, it's important to finish sending <rtt/> for the whole 
> message, because it assists in verifying the integrity of real-time 
> text in some mission-critical implementations.   Basically, the text 
> in body can be compared to the real-time message, to make sure that 
> the message is identical, and if it isn't identical, it's possible to 
> display a warning indicator that the <rtt/> final text disagrees with 
> <body/> text.
> -- Technically, I could recommend that <rtt/> always be transmitted in 
> a separate <message/> from <body/>, although I don't think it is a good 
> idea to make this REQUIRED.
> -- Also, some implementations may choose to finish playing back <rtt/> 
> before displaying the <body/>, although it's true that I generally 
> recommend catching up the message immediately to prevent lagging, but 
> I don't strictly require that:
> http://xmpp.org/extensions/xep-0301.html#receiving_realtime_text
> Comments appreciated, in the light of my reasoning?
>     8 - Why are we picking out Google Talk as an XMPP exemplar?
> [Change Made] -- 3 votes were received to this date, so I've now 
> removed it even though I wanted to keep it as an example.  There are so 
> many programmers on Google Talk who don't quite realize that Google 
> Talk is XMPP.  That's why I wanted to keep at least one mention.
<GH>Good to delete it. I am not used to seeing brand names in standards.
>     8 - Why are we telling SIP clients what specs to use?
> [Change Made]
> Those are "examples" only, so I've made a change to make that clear.
> New text: /"For example, clients that use XMPP can utilize this 
> XEP-0301 specification, and clients that use SIP might utilize IETF 
> RFC4103, RFC5194 and ITU-T T.140."/
>     8 - All of this section seems somewhat out of place in a XEP.
> [Comment]
> I've managed to reduce the size of Interoperability Considerations 
> significantly (to the best of my ability), but there are several 
> people including actual implementers (outside the XMPP umbrella) that 
> are demanding this text be bigger than it is now.  Gunnar made a 
> gateway for SIP-to-XMPP interoperable real-time text, and it is a big 
> raison d'etre for keeping Section 8; he is also sharing his experiences 
> as well.  I'm also pushing back on people who want to make this section 
> bigger, by not adding too much information to it, since I 
> also agree that it's mostly out of scope of this specification.   
> Gunnar also repeatedly told me I should not make this section even 
> smaller.  Edward Tie wants me to add more TTY info to it. (I 
> slipstreamed a small sentence "This can include TTY and textphones" 
> after the gateway servers sentences.)  There are other implementers 
> outside of XMPP raising a big fuss about how to interoperate with 
> XMPP, so I think some *semblance* of section 8 is extremely critical 
> to satisfying a particularly vocal and important audience of 
> accessibility advocates.
> The current Section 8 is a compromise between what XMPP wants and what 
> accessibility implementers want, as XEP-0301 is of interest to 
> accessibility vendors, more so than other specifications, and there are 
> special reasons to make XEP-0301 interoperate with other standards 
> used in accessible communications.
<GH>I think this section has a suitable size now. It is true that we 
shall not dictate what to use on the SIP side, so your small change is 
appropriate. We can describe what is available and briefly some actions 
to do to interoperate with these implementations. More details can go 
>     10.1 - "It is important for implementers of real-time text to educate
>     users about real-time text. " - this doesn't really seem right.
> [Change Made] -- Good catch, I see the redundancy.
> /"It is important for implementors to educate users about real-time 
> text"./
<GH>It is rather the application information that should make this 
clear. We do not need to chase implementors out in the field to become 
how about:
"It is important for application information to educate users..."

>     10.1 - I think a sensible Privacy note would be to make RTT opt-in.
> [Comment]
> That depends on the market.  Mainstream client? (opt-in) 
> Accessibility-market client? (opt-out) Emergency mode?
> I am in contact with different implementers who will pounce on me if I 
> suggest either direction (opt-in versus opt-out).
<GH>This is the natural text communication. In my view Opt-in is not 
needed and not wanted, but I know others have different views. Let it be 
out of scope for this protocol specification.
>     10.2 - "also needs to also "
>     10.2 - "(e.g. deferred XEP-0200)" - just XEP-0200, I'd have thought.
> [Change Made] for both
>     10.2 - I think blaming encryption for the increased number of stanzas
>     RTT generates is a little disingenuous.
> [Change Made]
> There was a debate on this mailing list about turning off <rtt/> when 
> stanza-level encryption is done, because of the extra overhead.  I 
> instead opted to add an explanation sentence, rather than recommending 
> turning off <rtt/> when stanza-level encryption is done.
> I will re-word the sentence to avoid sounding 'disingenuous'.
> /"It is noted that real-time text can have a higher rate of message 
> stanzas, contributing to additional overhead. See [[[Congestion 
> Considerations(link)]]]"/
>     10.3 - "The nature of real-time text result in"
> [Change Made]
> /"The nature of real-time text *can* result in"/
>     10.3 - "than may otherwise happen in a non-real-time text
>     conversation. This may lead to increased" s/may/would/ s/may/will/
>     respectively will remove normative language.
> [Change Made]
>     10.3 - "including stanzas dropped by an overloaded server" - I think
>     "including stanzas dropped during a network or server error" would be
>     more appropriate.
> [Change Made]
> /"including stanzas dropped during a network issue or server error"./
> Your suggestion is good.
> Side note: It is a catch-all for error and non-error situations 
> ("network issue"), including DoS protection that kicks in, or even 
> plain server non-compliance (that doesn't return any error 
> conditions).   If networks would work perfectly, we would not even 
> need language like this.   Issues can even be obscure.  An example is 
> the HTTP layer of a BOSH connection (where the BOSH itself works 
> flawlessly, but a DoS situation causes HTTP to drop a few requests, 
> which looks like dropped stanzas to the end software).   Or routers in 
> certain countries dropping packets due to a disallowed phrase, etc. 
>  (this can manifest itself as dropped message stanzas, until a new 
> message is created or the offending text is removed) Anyway, avoiding 
> politics -- but clearly, there are all sorts of odd reasons for 
> dropped message stanzas: server non-compliance issues, DoS 
> protection kicking in, and so on.  All are conveniently caught by the 
> catch-all phrase "network issue or server error" without needing 
> further explanation.
>     10.3 - "Use of this specification in the recommended way will cause a
>     load that is only marginally higher than a user communicating without
>     this specification." - do you have numbers for this? It seems quite
>     counterintuitive, I'd expect it to increase the server load due to
>     message routing roughly by a factor of the number of RTT transmitted
>     between each typical <body/>.
> [Comment]
> Not necessarily true -- The average instant message is short, 
> often 1 to 5 words. (under 40 chars)
> Most people on chat networks don't type large messages.  (Programmer 
> types like me do, though)
> Hi!
> How are you?
> Good. You?
> What's up?
> Wanna go to movies?
> Sure.
> Star Wars XVIII?
> No, Jaws IX.
> Hey, have u heard? He did it!
> OMG, tell me!
> etc.
> Although an extreme example, you get the idea.  Someone else (Gunnar? 
> Gregg?) cited a paper that the average instant message length is less 
> than 40 characters.  Also, XMPP messages such as XEP-0085 chat states 
> are sent anyway, and <rtt/> can be slipstreamed into some of those 
> same messages.   It is true 
> that RTT will result in some people hitting Enter less often, but 
> that's just replacing earlier behaviour anyway.
> Therefore, it's frequently about 2 to 3x more stanzas than what would 
> have happened without real-time text.  On top of this, for an 
> optimized server, additional messages sent shortly after the previous 
> message have only a small additional 'cost' in resources.
> So the statement has truth to it --
> (Can someone remind me of the link to the university paper? 
>  It was also posted here over a year ago.  Maybe I should cite it, to 
> solidify this statement.)
<GH>I remember the paper, but do not have time to dig it up now. I think 
you can safely say a factor of 4 (see my comment under MUC).
>     10.3 - "Bandwidth overhead of real-time text is very low compared to
>     many other activities possible on XMPP networks including in-band file
>     transfers and audio" - This is a little disingenuous where IBB is a
>     fallback, and audio never travels over the XMPP network. I'd remove
>     the line completely.
> [Change Made]
> /"Bandwidth overhead of real-time text is very low compared to many 
> other activities possible on XMPP networks."/
> It is more generic.  I actually get questions about how much bandwidth 
> XEP-0301 uses, at least relative to other XMPP 
> technologies.  To go into further detail, I could insert some details 
> from the document that I made for Darren Sturman who said bandwidth 
> considerations make or break a standard -- and I could insert the 
> bandwidth-estimation formula I developed -- or go into generalities 
> such as "average typing speed consumes about X bytes per second".   
>  But I think this sentence should be sufficient; questions can be 
> addressed separately from the spec.   Comments?
<GH>Yes sufficient.
>     14 - (I appreciate the acknowledgement, thank you)
> [Comment]
> You're welcome!
> Any other contributors that I have forgotten?
>     14 - It's usual in XEPs that acknowledgements are done personally
>     rather than by affiliation, so I think it'd be sensible to just leave
>     the names in and remove affiliations.
> [Comment]
> OK, thanks -- I will make some inquiries if that is OK.  Some people 
> are used to RFCs where affiliation is often mentioned.  In all 
> probability, I'll be making this change.
<GH>OK for me.
>     14 - I find the comment acknowledging the invention a bit odd. It's
>     assumed that the XEP is your own work, and "invention" is a term I've
>     more commonly come across in relation to patents - I assume there
>     isn't a patent associated with this that you're assigning to the XSF?
> [Comment & Question]
> There is no patent.  Many told me I should patent it, but I've instead 
> released the idea into the open.  I believe this is the 
> world's first real-time text standard that preserves key press 
> intervals independently of transmission intervals.  (A comparison 
> animation is at 
> http://www.marky.com/realjabber/anim/real_time_text_demo.html )
>  -- Ideally, it would be nice to be acknowledged for this idea somehow 
> *somewhere*, one way or another, even if it just generically says 
> /"Mark Rejhon came up with the method of preserving key press 
> intervals, which is called "Natural Typing" at R3TF"/.   (The 
> technique is called "Natural Typing" among all of us at R3TF)
> Comments?
>     Appendix B - it's usual to just have author name, email and JID here.
>     We don't generally link out to the authors' websites.
> [Change Made]
> Thanks!
> Mark Rejhon
