On Wednesday, 22 May 2024, 16:46:47 UTC+2, Marvin W wrote:
Hi Goffi,
Hi Marvin,
Seeing the proposal rejected is definitely disappointing, and I would like
to have a clear statement of the reason why the Council thinks this work is
"unacceptable".
For now, the biggest criticism I've seen is that this protocol specification
is… specifying a protocol (which, again, is required by XEP-0166 for Jingle
applications). This seems quite arbitrary to me, especially since I've
clearly stated several times that I'm open to changing the payload format and
using an existing protocol if one proves easy to implement, flexible, and
efficient. The experimental state is made for that.
Thanks in advance.
The use case I'm thinking of has low throughput and only short usage
time. I might be sending 10 or 20 key events within a short time and
then nothing for several hours.
Technically, this can all be done with Jingle, but for just a few keys,
the overhead of starting a Jingle session just for those keys probably
adds way more latency than sending those keys via <message> through an
XMPP server. And using Jingle would require way more complex software
on both sides.
Alright, I understand better now. That's true; because I have advanced
features like Jingle and WebRTC implemented in my software, I'm willing to
build on them, but they are not available everywhere. For ease of
implementation, it could indeed be interesting to have a <message>-based way,
usable either via a server or via Jingle.
On the other hand, this introduces complexity of its own (the payload can now
travel by two different routes). Especially for a niche feature that most
clients won't implement, I'm not sure we should use an inferior solution
because, hypothetically, in a niche use case within the niche feature, a
client may not have Jingle implemented.
Jingle is a major stack of XMPP, and it should be implemented in any advanced
client according to IM compliance suites.
WebRTC is already optional; you can use any streaming transport.
Without a proper specification to send keys, I would do this via
non-standardized body messages. Works, but isn't particularly nice.
I also noticed that in cases where XEP-0174 Serverless Messaging is
used, an additional Jingle connection probably doesn't add a lot of
benefit either.
That's another niche in the niche. That's really highly hypothetical, and even
if this could be done directly, it doesn't hurt to add an additional Jingle
connection.
But that said, I'm not firmly opposed to moving to a <message>-based payload,
if that can unblock the situation.
Well, XMPP clients that also speak a ton of other protocols, including
the one you are just creating.
My point is that not only does this need to be specified in the
business rules, but also a ton of other things. There are probably a
lot of side cases that you don't cover and where I can't reasonably
expect Council to think about them.
Of course, a proto-XEP is not meant to be perfect at first edition; that's
exactly what the experimental status is for. And it's not the job of the
Council to think about side cases; that's what standards@ and feedback from
the whole community are for.
Maybe I got it wrong, but for me, the job of the Council is to keep technical
stuff on track by ensuring that advancements in XEP statuses are done in order
(i.e., X independent implementations, Y pieces of feedback, etc., as stated in
the relevant XEPs), and vetoing things that are really unacceptable (e.g.,
copyright issues, something totally irrelevant, offensive content, etc.). And
it's the role of the larger community on standards@ to work on technical
stuff, side cases, ease of implementation, and optimization.
I realize that there isn't a real definition of what should be an "acceptable"
proto-XEP; maybe this should be specified? Because I've seen proto-XEPs refused
by some Councils then accepted by others, and this seems quite arbitrary to
me.
[SNIP]
Both Jingle and especially WebRTC come with huge complexity. Your
WebRTC library and your existing code for working with it might take
away most of this from you, but that doesn't mean it's not there. By
using Jingle and WebRTC you're effectively excluding clients, devices
and platforms that can't easily run libwebrtc or any other popular
WebRTC implementation.
Again, WebRTC is not mandatory in my specification. Any streaming transport can
be used, as designed by XEP-0166, including in-band via XEP-0261.
So we're just talking about Jingle, which can be implemented on any platform
and is required for advanced IM clients according to the current compliance
suites.
I was already guessing it's not arbitrary, but probably what made
sense in your setup and for your use case. However, not knowing any of
that it *seems* arbitrary.
Use cases are already explained in the specification. For my current
implementation, I have implemented a controlling device in a browser and a
basic one in a CLI (sending only keyboard events for now).
I have also implemented a controlled device in a CLI, which works with Wayland
and desktop portal. The implementation should not be a problem on other
platforms that I target in the long run (Windows, Mac, Android, iOS, BSD,
etc.). Actually, it should not be a problem on any platform.
The RemoteDesktop portal was clearly designed for remote desktop use
cases, not other remote control cases.
That's incorrect. Despite its name, you can actually use the RemoteDesktop
portal solely to send input; the Screen Sharing part is entirely optional
(and must be explicitly requested).
However, as you already mention
that you designed the data sent around what is needed for the
RemoteDesktop portal, why not send the information directly in a format
that matches the design of RemoteDesktop portal, instead of a mix of
Web API interfaces and RemoteDesktop portal?
The data matches, except for keyboard events that are represented using evdev
codes on Linux, whereas I was looking for a more platform-independent
solution. The Web API turned out to be the easiest option I've found, but I'm
open to considering an alternative if needed.
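To make the mismatch concrete, here is a minimal sketch of how a controlled
device on Linux might translate Web API key identifiers into evdev codes. It
assumes the identifiers are `KeyboardEvent.code` strings; the table is
deliberately tiny and the name `web_code_to_evdev` is hypothetical (it is not
part of the proposal). The numeric values are those of the `KEY_*` constants
in Linux's `input-event-codes.h`.

```python
from typing import Optional

# Partial mapping from Web API KeyboardEvent.code values to Linux evdev
# key codes (KEY_* constants from linux/input-event-codes.h).
# Only a handful of keys are listed; a real table would cover them all.
WEB_CODE_TO_EVDEV = {
    "Escape": 1,      # KEY_ESC
    "Enter": 28,      # KEY_ENTER
    "KeyA": 30,       # KEY_A
    "ShiftLeft": 42,  # KEY_LEFTSHIFT
    "Space": 57,      # KEY_SPACE
}


def web_code_to_evdev(code: str) -> Optional[int]:
    """Return the evdev key code for a Web API code, or None if unmapped."""
    return WEB_CODE_TO_EVDEV.get(code)
```

With such a table the wire format stays platform-independent, and only the
Linux side needs to know about evdev.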
Also I noticed that the RemoteDesktop portal does not have a notion of
an independent wheel, the mouse wheel is tied to the pointing device,
why did you choose to not do it the same way?
No, despite its method names (`NotifyPointerAxis` and
`NotifyPointerAxisDiscrete`), the wheel device is independent of the pointer:
no pointer coordinates are sent with wheel events.
And that makes sense: it's not the pointer coordinate that's important, but
rather where the focus is. You can change focus with a keyboard, for instance.
As I've said in my previous message, the wheel device, while often associated
with mice, can also be independent.
[SNIP]
The precision of a double (64 bit floating point) remains the same, no
matter if you scale [0,1] or [0,<screen-width>]. The precision is about
15 decimal digits which should be more than enough (you barely see
screen coordinates with more than 4 decimal digits), even if you do
calculations on them (which may result in a few bits of precision
loss).
The issue is not about the number of digits, but the fact that some numbers
cannot be represented by doubles. The first case I'm thinking of is 1/3, which
can lead to a rounding error and having the wrong pixel selected at the end.
Whether or not this is a problem depends on the use cases we want to handle,
but using pixels directly avoids this issue.
Anyway, using [0,1] is not a bad idea, as it avoids the need to transmit
screen size and screen size updates. It can be a better solution indeed.
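A minimal sketch of that trade-off, assuming absolute positions are
normalized by dividing the pixel column by the screen width (the function
names are hypothetical and not taken from the proposal). Rounding to the
nearest pixel on the receiving side absorbs the representation error of
values such as 1/3, whereas truncating instead of rounding may select the
neighbouring pixel.

```python
def normalize(x: int, width: int) -> float:
    """Map a pixel column in [0, width) to a float in [0, 1)."""
    return x / width


def denormalize(n: float, width: int) -> int:
    """Map a normalized coordinate back to the nearest pixel column."""
    # round() absorbs the tiny representation error of n; clamp for safety.
    return min(width - 1, max(0, round(n * width)))


def round_trips(width: int) -> bool:
    """Check that every pixel column survives normalize -> denormalize."""
    return all(
        denormalize(normalize(x, width), width) == x for x in range(width)
    )
```

Calling `round_trips(1920)` checks every column of a 1920-pixel-wide screen.
This only covers the case where both sides agree on the same width; scaling
between screens of different sizes is lossy regardless of the number format.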
Assuming you refer to FPS games, those "lock" the cursor position to
the screen center, so they never have that issue. To correctly
reproduce this behavior you need a back channel to the controlling app
so it can know the cursor position and/or lock if it is changed on the
controlled device.
(Above might not be correct on all platforms.)
Also I did not intend to say that you shouldn't support movement
vectors (like touchpads), I was just saying that absolute pointing
could be relative to screen size, so that you don't need to know the
absolute screen size.
Indeed, it may be a better option. I can change that. I'll check how other
protocols deal with this issue and may use one of them directly.
The advantage of going down this rabbit hole is:
a) We improve XMPP for other use cases
b) You can specify this protocol using XML and use Jingle XML streams.
As the CBOR<>XML translation will take care of creating the CBOR for
you, you still get the CBOR for this protocol, but without the need to
make it explicit. And in cases where people prefer to not use CBOR,
they can still use this protocol, just with XML. It's a win-win for
everyone (except that you as the specification author have more work).
We already have EXI (XEP-0322) for that (though I don't know how it compares
to CBOR).
Again, I'm not against getting rid of CBOR if it is a show stopper for people.
If going forward, you still want to specify your own
payload/application protocol (that is, the CBOR thing that is
transferred with the Jingle streaming transport), I'd like to ask you:
- To evaluate if a XEP is the right place to specify such a protocol,
or if it is more a generic thing that could well be used outside XMPP
and maybe should also be specified elsewhere.
I'll evaluate other specifications.
But yes, a XEP is, in my opinion, definitely the right place to specify a
protocol. The fact that part of it is a Jingle application doesn't change the
fact that it's globally an XMPP Extension Protocol. XEP-0166 states that the
application payload protocol must be specified.
And even if we use XML extensively, XMPP is not about XML. We already use many
non-XML data formats.
- If you consider a XEP to be the right place and want to stick with
your CBOR protocol, I'd like to ask you to split it into two parts: 1.
the payload protocol (sections 8 and 9 of the proposal) and 2. The
Jingle signaling protocol (sections 5 to 7 of your proposal). This way
the protocol can be used and referenced easily for use outside of
Jingle context.
I'm willing to strike a balance between efficiency, ease of implementation, and
flexibility. I don't care if it's CBOR or anything else. I've heard your
arguments, and will consider using <message> or another existing protocol.
It will take time, though; I'm busy with other things at the moment, and my
current implementation is working well. If anybody is interested in
implementing this specification anytime soon, please contact me - I can try to
re-order my priorities.
If you feel it's possible to transition to a <message>-based approach,
this can of course be a single XEP (that will barely have anything to
do with Jingle except for anecdotal mentioning that it can be used with
Jingle XML stream or serverless messaging for lower latency).
Got it. I'll evaluate the various options we've discussed.
Thank you for your time and detailed feedback - it's much appreciated.
Best,
Marvin
Best,
Goffi