Hi Marvin,
On Monday, 27 May 2024 at 12:29:36 UTC+2, Marvin W wrote:
[SNIP]
The first protocol (sections §5-7 of your proposal), which clearly
belongs into the XSF, is how to use Jingle to signal a remote desktop
session. As you rightly point out a Jingle application format (as per
XEP-0166 §12.1) must DEFINE how the "media data" is sent. However, it
doesn't need to SPECIFY the media data format. E.g. XEP-0167 defines
you should use RTP packets as specified in RFC3550.
Pointing to an external protocol amounts to specifying the payload format, and
specifying it inside the XEP does the same thing. In some cases it's better
to use an external protocol; in other cases it's relevant, or even better, to
specify the payload directly.
That said, I'm not saying that my proposal is the best option. I'll evaluate
other protocols to see if there is a good match.
The second protocol (sections §8-9) is the one that is the media data
to send via the Jingle session. However that protocol is largely
independent of Jingle or anything else and could be and IMO better
would be specified entirely independent of that. When I said that you
should consider to evaluate if a XEP is the right place to specify such
a protocol, I was only referring to this part: As it stands right now,
this second protocol could easily be used independent of XMPP and
Jingle. I would thus see this protocol more as an RFC.
Creating another specification is not the same hurdle: it requires
additional time and energy (and writing this one already took a lot).
For a simple and small protocol like the one I've proposed, I really think
it's not worth it (given my limited available time, as I'm working extensively
on many things). The necessary work is definitely higher for an RFC, and I'm not
even sure where to start. For a bigger and more complex protocol, it would
definitely make sense, yes, but I don't think that's the case here.
Under the assumption that all remote control signals are sent using
<message> and it is up to the receiving client to decide to accept or
error them, there is no additional complexity introduced, by adding the
possibility of a different transport path. In fact, you wouldn't even
need to specify which transport path to use, you specify the stanzas to
be sent to do remote control (to maintain remote control session and
for input events). The fact that one decides to send those via Jingle
XML streams and others use serverless messaging doesn't need to be
specified, it's exactly the same stanzas being sent in both cases.
We would lose the whole point of having a Jingle application here. Jingle XML
Streams is itself a Jingle application, so if I use it, Remote Control can't be
one anymore.
We would then no longer be able to add remote control during a call session, or
to indicate the intent. We would first need to establish the XML stream, then
find another protocol to advertise the remote control session.
Jingle is a major stack of XMPP, and it should be implemented in any advanced
client according to IM compliance suites.
Except that many clients are not meant to be advanced IM clients. I
wouldn't expect a remote desktop application to strictly also be an
advanced IM client. Looking at the feature set we require from advanced
IM clients, I'm pretty sure most existing remote desktop apps do not
qualify as advanced IM client (probably not even core im client,
because group chats are definitely not common for remote desktop apps).
If we're talking about remote desktop, we need Jingle for the video stream
anyway. I took the IM example because it's the most common one, but it works
with A/V Calling compliance too.
Again, You seem to be coming from the position that a client
implementing this is already a very feature rich and advanced client
like yours, but this assumption comes with a huge amount of
restrictions.
I think that Jingle is a major feature that it is reasonable to require as a
prerequisite for remote control; it's the only "advanced" feature
necessary.
Many XEPs already do that, or assume, e.g., that XEP-0045 is implemented.
Should we get rid of those because an "advanced" feature is required?
This does not match my understanding of Council work. I see Council as
clearly a technical position, not a mostly organization position. In
fact, some of the tasks you listed are clearly on the Editor side and
shouldn't even reach Council (e.g. copyright issues and offensive
content).
[SNIP]
This topic is worth discussing, and it has been spun off into another thread,
so I won't continue here.
So we're just talking about Jingle, and this can be implemented on any
platform
I do agree that Jingle can likely be implemented on any platform, but
it might be that you can only do so using Jingle IBB (e.g. because your
network controller can only maintain a single TCP connection), in which
case using Jingle is really not an improvement.
If you are working on restricted devices, you can have a "host" device and
only establish the stream connection with the controlled device. But anyway,
on one hand people say that my proposal is too flexible, and on the other hand
we say that we should handle any niche case under the sun.
In the vast majority of cases, a streaming Jingle connection should be
relatively easy to establish.
That's incorrect. Despite its name, you can actually only use the Remote
Desktop portal to send input; the Screen Sharing part is entirely optional
(and must be explicitly requested).
I'm pretty sure you can't send absolute pointing events (via
NotifyPointerMotionAbsolute like for a drawing tablet) or touch motion
events (via NotifyTouchMotion) using the RemoteDesktop portal without
also opening a screen cast, because the PipeWire stream node of the
screen cast is a parameter for those APIs. So while some events can be
sent without the Screen Sharing, it's not entirely optional.
That's true for absolute pointing; that's why there are relative methods, and
my specification says to use the relative ones when there is no attached video
stream.
That doesn't change the fact that the Remote Desktop portal is designed to work
even if there is no ScreenCast.
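To make the fallback concrete, here is a minimal sketch of the rule I mean.
The names (`PointerEvent`, `pointer_event`, the "absolute"/"relative" labels)
are made up for illustration; they are neither the wire format of the proposal
nor the portal API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PointerEvent:
    kind: str  # "absolute" or "relative" (illustrative labels only)
    x: float
    y: float


def pointer_event(
    prev: Tuple[float, float],
    cur: Tuple[float, float],
    has_video_stream: bool,
) -> PointerEvent:
    """Pick the event type depending on whether a video stream is attached."""
    if has_video_stream:
        # Absolute coordinates only make sense relative to a shared screen.
        return PointerEvent("absolute", cur[0], cur[1])
    # Without a video stream, fall back to a movement delta.
    return PointerEvent("relative", cur[0] - prev[0], cur[1] - prev[1])
```

With no video stream, moving from (10, 10) to (15, 7) is sent as a relative
movement of (5, -3); with a stream attached, the same movement is sent as the
absolute position (15, 7).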
As I've said in my previous message, the wheel device, while often associated
with mice, can also be independent.
The RemoteDesktop portal clearly ties it to the pointer device, it's
not only that the name is NotifyPointerAxis but you also must request a
POINTER device in the session (i.e. only requesting a KEYBOARD and
TOUCHSCREEN device does not allow you to use the API for scrolling).
That's true for the freedesktop API, but that's an implementation detail;
nothing prevents having a dedicated permission.
Both the web API and the desktop portal use separate events for the wheel; I've
just followed that.
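As a rough illustration of the difference, the sketch below models device
permissions as a bitmask. The first three values follow the portal's device
type flags as I understand them; the `WHEEL` flag and both helper functions are
purely hypothetical and exist in neither the portal nor my proposal.

```python
from enum import IntFlag


class Device(IntFlag):
    KEYBOARD = 1
    POINTER = 2
    TOUCHSCREEN = 4
    WHEEL = 8  # hypothetical dedicated permission, NOT part of the portal


def portal_style_can_scroll(granted: Device) -> bool:
    # Portal behaviour as described above: scrolling is tied to POINTER.
    return bool(granted & Device.POINTER)


def dedicated_can_scroll(granted: Device) -> bool:
    # Alternative: scrolling has its own permission, independent of POINTER.
    return bool(granted & Device.WHEEL)
```

With only KEYBOARD and TOUCHSCREEN granted, the portal-style check refuses
scrolling, while an implementation with a dedicated flag could allow it without
also granting pointer control.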
I know we have EXI and I know basically nobody implements it. It's very
complex, has a lot of features and for best results requires to agree
on a common set of XML schemas. Having something else than EXI that is
easier to implement really might be a good idea, because I doubt EXI is
going to ever be successful.
Sure that could be nice, but I definitely don't have time to work on this in
the foreseeable future.
So as a summary, for me the deal breaker really is the two protocols in
one XEP, one of which is not really an XMPP protocol. If you do like
the two protocols that you built - after all, you have implementation
experience that I entirely lack, so it may be that all my concerns are
invalid and I'm happy to have that in Experimental - I just feel that
the second one should best not be at the XSF and if there's really no
better place, make it at least a separate XEP.
I have done an implementation, that's true, but I'm not an expert in remote
desktop implementations either. I'm listening to feedback and comments, and
will take them into account.
I disagree with you and Singpolyma about the wire format description for the
stream, and I don't see the problem with having such a simple and small protocol
described in the same specification. However, I'll explore alternatives and see
if there are better options.
So to summarize:
- I'll explore alternatives, notably RFB (or any other suggestion if somebody
has a good proposal). But only for remote control: by design, I want to keep
desktop screen sharing separate, and under XEP-0167 (or any future
specialised XEP if proven better).
- If I can't find a good alternative, I'll evaluate the use of <message>
instead of the current CBOR-based data, maybe in a separate XEP.
- I'll evaluate the use of a protocol usable via the server or a Jingle XML
Stream. But I'm currently really not convinced by that, for the reasons given
above and previously.
However, I won't have time to work on that for several months; I'm currently
very busy with other things. If anybody is interested in implementing remote
control in the meantime, please contact me.
Marvin
Thanks again for your time and extended explanation of your point of view.
Even if I disagree on some parts, I'm listening and many points are sensible.
Best,
Goffi