[Standards] Re: Proposed XMPP Extension: Jingle Remote Control

27 May 2024

On Wednesday, 22 May 2024, 16:46:47 UTC+2, Marvin W wrote:
...
  Hi Goffi, 
Hi Marvin,

Seeing the proposition rejected is definitely disappointing, and I would like 
to have a clear statement of the reason why the Council thinks this work is 
"unacceptable".

For now, the biggest criticism I've seen is that this protocol specification 
is… specifying a protocol (which again is required by XEP-0166 for Jingle 
applications). This seems quite arbitrary to me, and I would like to have a 
clear statement on why this specification is "unacceptable" to the Council.

Especially since I've clearly stated several times that I'm open to changing 
the payload format and using an existing protocol if one proves easy to 
implement, flexible, and efficient. The experimental state exists for exactly 
that.

Thanks in advance.

...
  The use case I'm thinking of has low throughput and only short usage
 time. I might be sending 10 or 20 key events within a short time and
 then nothing for several hours.

 Technically, this can all be done with Jingle, but for just a few keys,
 the overhead of starting a Jingle session just for those keys probably
 adds way more latency than sending those keys via <message> through an
 XMPP server. And using Jingle would require way more complex software
 on both sides. 
Alright, I understand better now. That's true; because I have advanced 
features like Jingle and WebRTC implemented in my software, I'm willing to 
build on them, but they are not available everywhere. For ease of 
implementation, it could indeed be interesting to have a <message>-based 
approach, usable either via the server or via Jingle.

On the other hand, this introduces complexity of its own (the payload can 
now take two different paths). Especially for a niche feature that most 
clients won't implement, I'm not sure we should use an inferior solution 
because, hypothetically, in a niche use case within the niche feature, a 
client may not have Jingle implemented.

Jingle is a major part of the XMPP stack, and according to the IM compliance 
suites it should be implemented in any advanced client.

WebRTC is already optional; you can use any streaming transport.

...
  Without a proper specification to send keys, I would do this via non
 standardized body messages. Works, but isn't particularly nice.

 I also noticed that in cases where XEP-0174 Serverless Messaging is
 used, an additional Jingle connection probably doesn't add a lot of
 benefit either. 
That's another niche in the niche. That's really highly hypothetical, and even 
if this could be done directly, it doesn't hurt to add an additional Jingle 
connection.

But that said, I'm not firmly opposed to moving to a <message>-based payload, 
if that can unblock the situation.
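
For illustration only, a <message>-based key event could look roughly like the stanza built below. This is a minimal sketch: the namespace `urn:example:remote-control:0`, the `<key/>` element, and its `code` and `state` attributes are all hypothetical and are not defined by the proposal or by any XEP. The `jabber:client` namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, used only for this sketch.
RC_NS = "urn:example:remote-control:0"


def build_key_message(to_jid: str, code: str, pressed: bool) -> str:
    """Serialize a <message/> stanza carrying one hypothetical key event."""
    message = ET.Element("message", {"to": to_jid})
    ET.SubElement(
        message,
        f"{{{RC_NS}}}key",
        {"code": code, "state": "down" if pressed else "up"},
    )
    return ET.tostring(message, encoding="unicode")


stanza = build_key_message("device@example.org/tv", "KeyA", True)
print(stanza)
```

The same payload element could in principle be carried over a Jingle XML stream, which is the "two different paths" trade-off mentioned above.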

...
  Well, XMPP clients that also speak a ton of other protocols, including
 the one you are just creating.

 My point is that not only does this need to be specified in the
 business rules, but also a ton of other things. There are probably a
 lot of side cases that you don't cover and where I can't reasonably
 expect Council to think about them. 
Of course, a proto-XEP is not meant to be perfect in its first edition; that's 
exactly what the experimental status is for. And it's not the job of the 
Council to think about side cases - that's what standards@ and feedback from 
the whole community are for.

Maybe I got it wrong, but for me, the job of the Council is to keep technical 
work on track by ensuring that advancements in XEP status are done in order 
(i.e., X independent implementations, Y pieces of feedback, etc., as stated in 
the relevant XEPs), and by vetoing things that are really unacceptable (e.g., 
copyright issues, something totally irrelevant, offensive content, etc.). And 
it's the role of the larger community on standards@ to work on technical 
matters, side cases, ease of implementation, and optimization.

I realize that there isn't a real definition of what an "acceptable" 
proto-XEP should be; maybe this should be specified? I've seen proto-XEPs 
refused by one Council and then accepted by another, and this seems quite 
arbitrary to me.

...
  [SNIP]

 Both Jingle and especially WebRTC come with huge complexity. Your
 WebRTC library and your existing code for working with it might take
 away most of this from you, but that doesn't mean it's not there. By
 using Jingle and WebRTC you're effectively excluding clients, devices
 and platforms that can't easily run libwebrtc or any other popular
 WebRTC implementation. 
Again, WebRTC is not mandatory in my specification. Any streaming transport can 
be used, as designed by XEP-0166, including in-band via XEP-0261.

So we're just talking about Jingle, which can be implemented on any platform 
and which is required for advanced IM clients according to the current 
compliance suites.

...
  I was already guessing it's not arbitrary, but probably what made
 sense in your setup and for your usecase. However, not knowing any of
 that it *seems* arbitrary. 
Use cases are already explained in the specification. For my current 
implementation, I have implemented a controlling device in a browser and a 
basic one in a CLI (sending only keyboard events for now).

I have also implemented a controlled device in a CLI, which works with Wayland 
and desktop portal. The implementation should not be a problem on other 
platforms that I target in the long run (Windows, Mac, Android, iOS, BSD, 
etc.). Actually, it should not be a problem on any platform.

...
  The RemoteDesktop portal was clearly designed for remote desktop use
 cases, not other remote control cases. 
That's incorrect. Despite its name, the Remote Desktop portal can be used to 
send input only; the Screen Sharing part is entirely optional (and must be 
explicitly requested).

...
  However, as you already mention
 that you designed the data sent around what is needed for the
 RemoteDesktop portal, why not send the information directly in a format
 that matches the design of RemoteDesktop portal, instead of a mix of
 Web API interfaces and RemoteDesktop portal? 
The data matches, except for keyboard events that are represented using evdev 
codes on Linux, whereas I was looking for a more platform-independent 
solution. The Web API turned out to be the easiest option I've found, but I'm 
open to considering an alternative if needed.
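
To illustrate the mismatch, here is a minimal sketch of how a controlled device on Linux might translate Web API `KeyboardEvent.code` strings into evdev key codes. The table name and the helper function are hypothetical and not part of the proposal; only a handful of keys are listed, with values taken from the Linux `input-event-codes.h` header.

```python
from typing import Optional

# Partial, illustrative mapping: Web API KeyboardEvent.code -> evdev key code.
WEB_CODE_TO_EVDEV = {
    "Escape": 1,        # KEY_ESC
    "Backspace": 14,    # KEY_BACKSPACE
    "Tab": 15,          # KEY_TAB
    "Enter": 28,        # KEY_ENTER
    "ControlLeft": 29,  # KEY_LEFTCTRL
    "KeyA": 30,         # KEY_A
    "ShiftLeft": 42,    # KEY_LEFTSHIFT
    "Space": 57,        # KEY_SPACE
    "ArrowUp": 103,     # KEY_UP
    "ArrowLeft": 105,   # KEY_LEFT
    "ArrowRight": 106,  # KEY_RIGHT
    "ArrowDown": 108,   # KEY_DOWN
}


def web_code_to_evdev(code: str) -> Optional[int]:
    """Return the evdev key code for a Web API code, or None if unmapped."""
    return WEB_CODE_TO_EVDEV.get(code)


print(web_code_to_evdev("KeyA"))
print(web_code_to_evdev("Enter"))
```

Other platforms would need their own table from the same Web API codes, which is the point of using a platform-independent representation on the wire.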

...

 Also I noticed that the RemoteDesktop portal does not have a notion of
 an independent wheel, the mouse wheel is tied to the pointing device,
 why did you choose to not do it the same way? 
No, despite its method names (`NotifyPointerAxis` and 
`NotifyPointerAxisDiscrete`), the wheel device is independent of the pointer; 
in fact, no pointer coordinates are sent with wheel events.

And that makes sense: it's not the pointer coordinate that's important, but 
rather where the focus is. You can change focus with a keyboard, for instance.

As I've said in my previous message, the wheel device, while often associated 
with mice, can also be independent.

...
  [SNIP]

 The precision on a double (64 bit floating point) remain the same, no
 matter if you scale [0,1] or [0,<screen-width>]. The precision is about
 15 decimal digits which should be more than enough (you barely see
 screen coordinates with more than 4 decimal digits), even if you do
 calculations on them (which may result in a few bits of precision
 loss). 
The issue is not about the number of digits, but the fact that some numbers 
cannot be represented by doubles. The first case I'm thinking of is 1/3, which 
can lead to a rounding error and having the wrong pixel selected at the end. 
Whether or not this is a problem depends on the use cases we want to handle, 
but using pixels directly avoids this issue.

Anyway, using [0,1] is not a bad idea, as it avoids the need to transmit 
screen size and screen size updates. It can be a better solution indeed.
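
As a rough sketch of the [0,1] approach, the helpers below convert between pixel indices and normalized coordinates. The function names and the pixel-center convention (sending `(x + 0.5) / width` rather than `x / width`) are my own assumptions, not something taken from the proposal or from another protocol. Sending the center keeps the decoded value well away from pixel boundaries, so small floating-point errors should not select the neighboring pixel.

```python
def to_normalized(x_px: int, width: int) -> float:
    """Map a pixel index to [0, 1] using the center of the pixel."""
    return (x_px + 0.5) / width


def to_pixel(x_norm: float, width: int) -> int:
    """Map a normalized coordinate back to a pixel index, clamped to the screen."""
    index = int(x_norm * width)  # truncation; safe because centers sit at .5
    return min(max(index, 0), width - 1)


# Round-trip check over a few common screen widths.
for width in (3, 1366, 1920, 3840):
    assert all(to_pixel(to_normalized(x, width), width) == x for x in range(width))
print("round trip ok")
```

The controlling device never needs to know the remote screen size with this scheme; only the controlled device applies `to_pixel` with its own dimensions.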

...
  Assuming you refer to FPS games, those "lock" the cursor position to
 the screen center, so they never have that issue. To correctly
 reproduce this behavior you need a back channel to the controlling app
 so it can know the cursor position and/or lock if it is changed on the
 controlled device.

 (Above might not be correct on all platforms.)

 Also I did not intend to say that you shouldn't support movement
 vectors (like touchpads), I was just saying that absolute pointing
 could be relative to screen size, so that you don't need to know the
 absolute screen size. 
Indeed, it may be a better option. I can change that. I'll check how other 
protocols deal with this issue and may use one of them directly.

...
  The advantage of going down this rabbit hole is:
 a) We improve XMPP for other usecases
 b) You can specify this protocol using XML and use Jingle XML streams.
 As the CBOR<>XML translation will take care of creating the CBOR for
 you, you still get the CBOR for this protocol, but without the need to
 make it explicit. And in cases where people prefer to not use CBOR,
 they can still use this protocol, just with XML. It's a win-win for
 everyone (except that you as the specification author have more work). 
We already have EXI (XEP-0322) for that (I don't know how it compares to CBOR, 
though).

Again, I'm not against getting rid of CBOR if it is a show stopper for people.

...
  If going forward, you still want to specify your own
 payload/application protocol (that is, the CBOR thing that is
 transferred with the Jingle streaming transport), I'd like to ask you:
 - To evaluate if a XEP is the right place to specify such a protocol,
 or if it is more a generic thing that could well be used outside XMPP
 and maybe should also be specified elsewhere. 
I'll evaluate other specifications.

But yes, a XEP is, in my opinion, definitely the right place to specify a 
protocol. The fact that part of it is a Jingle application doesn't change the 
fact that it's globally an XMPP Extension Protocol. XEP-0166 states that the 
application payload protocol must be specified.

And even if we use XML extensively, XMPP is not about XML. We already use many 
non-XML data formats.

...
  - If you consider a XEP to be the right place and want to stick with
 your CBOR protocol, I'd like to ask you to split it into two parts: 1.
 the payload protocol (sections 8 and 9 of the proposal) and 2. The
 Jingle signaling protocol (sections 5 to 7 of your proposal). This way
 the protocol can be used and referenced easily for use outside of
 Jingle context. 
I'm willing to strike a balance between efficiency, ease of implementation, and 
flexibility. I don't care if it's CBOR or anything else. I've heard your 
arguments, and will consider using <message> or another existing protocol.

It will take time, though; I'm busy with other things at the moment, and my 
current implementation is working well. If anybody is interested in 
implementing this specification anytime soon, please contact me - I can try to 
re-order my priorities.

...
  If you feel it's possible to transition to a <message> based approach,
 this can of course be a single XEP (that will barely have anything to
 do with Jingle except for anecdotal mentioning that it can be used with
 Jingle XML stream or serverless messaging for lower latency). 
Got it. I'll evaluate the various options we've discussed.

Thank you for your time and detailed feedback - it's much appreciated.

...
  Best,
 Marvin 
Best,
Goffi
