On Sun, 22 Jun 2025 at 09:46, Guus der Kinderen
<guus.der.kinderen(a)gmail.com> wrote:
> I wonder if there is room here for the proxy to cache the data that is
> transmitted. That way, the total amount of data that is transferred
> between domains (when more than one person is to receive the data) could
> be drastically reduced. That is useful in certain scenarios.
Okay, I spent a few hours on this over the past week. From the
experiments, I can report a few things:
- Proxies that terminate HTTPS are not really a thing in the HTTP
proxy world. This is unsurprising, as such a proxy would have the
ability to intercept and modify HTTPS traffic. I can understand why
the web ecosystem doesn't want this. However, this means there is no
way to do HTTPS *and* implement caching.
- To do HTTPS through a proxy, practically all implementations use
the CONNECT method to create a TCP tunnel to the remote server:port
and then do HTTPS within that tunnel.
- The CONNECT method is very open to "abuse" - it would allow the
client to connect to any remote server on any port and speak any
protocol (it isn't limited to HTTPS). The proxy can filter on the
target host/port, and many such proxies restrict outbound connections
to port 443 for example. In our ecosystem I know we have a bunch of
servers which use other ports (5281, 5443, 7443, ...).
- The CONNECT method does not pass through reverse proxies such as
nginx. I know that for Snikket, a lot of people run it behind such a
reverse proxy, and if it doesn't work in these scenarios, it won't
work for a significant number of deployments.
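For anyone unfamiliar with how CONNECT tunnelling works on the wire, here is a minimal sketch (the hostnames are placeholders, and this is illustrative rather than production code):

```python
import socket
import ssl

def build_connect_request(host, port):
    # The CONNECT request asks the proxy for a raw TCP tunnel; once the
    # proxy answers with a 200 status, the client can speak any protocol
    # inside the tunnel - hence the "abuse" concern above.
    return (f"CONNECT {host}:{port} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            f"\r\n")

def open_https_tunnel(proxy_host, proxy_port, target_host, target_port=443):
    sock = socket.create_connection((proxy_host, proxy_port))
    sock.sendall(build_connect_request(target_host, target_port).encode("ascii"))
    status_line = sock.recv(4096).split(b"\r\n", 1)[0]
    if b" 200" not in status_line:
        raise ConnectionError(f"proxy refused tunnel: {status_line!r}")
    # TLS is negotiated *inside* the tunnel, so the proxy only ever sees
    # ciphertext - which is exactly why it cannot cache the responses.
    context = ssl.create_default_context()
    return context.wrap_socket(sock, server_hostname=target_host)
```

The proxy's only decision point is the host:port in the CONNECT line, which is where the port-443-only filtering mentioned above happens.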
For anyone who wants to play with this on their own Prosody, we have a
module:
https://modules.prosody.im/mod_http_connect - just load it and
you'll find it advertised via XEP-0215. I'm not able to provide a test
server right now, as the one I would use has a reverse proxy in
front... :)
So despite my initial enthusiasm for off-the-shelf proxy support, it
seems that it might actually be worth considering other options. For
example, we could do something similar to Matrix, where you make a
normal HTTPS request to your own server, with the remote URL embedded
in the path. Your server will then transparently fetch the remote
resource and return it. Something like:
GET https://recipient.example/fetch/share.sender.example/d8c16ada-52b3-11f0-9de…
This has a number of advantages:
- It's "just" HTTPS, so should be easy enough for any client that
currently supports fetching HTTPS file shares, including web clients.
- The server is able to control the outgoing request, which ensures it
can only be used for HTTPS, and headers and methods can be restricted
(to GET, etc.).
- Caching can be optionally implemented.
- The server will see both the request and the response, but this is
already exposed to the origin server, and we have OMEMO/aesgcm already
taking care of the content encryption and signing.
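To make the second point concrete, the path-to-URL mapping could look something like the sketch below; the /fetch/ prefix and the function name are my invention for illustration, not a settled design:

```python
def to_remote_url(fetch_path):
    """Map a local /fetch/<remote-host>/<path> request to the remote URL.

    Because the receiving server constructs the outgoing URL itself, the
    scheme is always https:// - unlike a CONNECT tunnel, there is no way
    for a client to request an arbitrary protocol.
    """
    prefix = "/fetch/"
    if not fetch_path.startswith(prefix):
        raise ValueError("not a fetch path")
    host, sep, path = fetch_path[len(prefix):].partition("/")
    if not host or not sep:
        raise ValueError("malformed fetch path")
    return f"https://{host}/{path}"
```

The same code path is also the natural place to enforce a GET-only policy, strip request headers, and consult a local cache before fetching.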
I don't object to us implementing a prototype of this approach if it
seems like a route we would want to go down. I'm also open to other
suggestions.
Regards,
Matthew