Thank you Thilo,
On Thu, Oct 23, 2025 at 12:25 AM Thilo Molitor <thilo(a)eightysoft.de> wrote:
In SCRAM (RFC 5802) the client-first-message contains 3 possible values in the
GSS-header part:
1. y - The client would have used channel-binding, but the server did not offer
any.
2. n - The client does not support channel-binding (even if the server offered
any).
3. p=<cb-name> - The name of the channel-binding the client wants to use.
The RFC explains, that sending "y" when the server advertised channel-binding
support is to be used as a MITM-detection:
An attacker stripping out all *-PLUS variants can be detected this way (the
server knows it advertised them and the client says via "y", that it would
have used them if it saw them, so the server can abort the authentication in
this case).
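
A minimal sketch of the server-side check described above, assuming the
client-first-message arrives as a plain string. The function names
parse_cbind_flag and is_downgrade are hypothetical and not taken from any
SCRAM library:

```python
from typing import Optional, Tuple


def parse_cbind_flag(client_first_message: str) -> Tuple[str, Optional[str]]:
    """Return (flag, cb_name) from the start of a SCRAM client-first-message."""
    # The channel-binding flag is everything before the first comma.
    flag = client_first_message.split(",", 1)[0]
    if flag in ("y", "n"):
        return flag, None
    if flag.startswith("p=") and len(flag) > 2:
        return "p", flag[2:]
    raise ValueError("invalid channel-binding flag: %r" % flag)


def is_downgrade(client_first_message: str, server_offered_plus: bool) -> bool:
    """True if the client sent "y" although the server did offer *-PLUS."""
    flag, _ = parse_cbind_flag(client_first_message)
    # "y" means: the client would have used channel-binding but saw none offered.
    # If the server knows it offered *-PLUS, something stripped the offer in transit.
    return flag == "y" and server_offered_plus
```

With this, a server that advertised *-PLUS mechanisms would abort on
is_downgrade("y,,n=user,r=fyko+d2lbbFgONRv9qkxdawL", True).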
This works reasonably well, if there is only one single channel-binding
possible. But that's the exact assumption this XEP is going to change:
it is possible to advertise a list of different channel-bindings for the client
to choose. That's good for protocol agility, something that's important enough
to have algorithm negotiation in TLS, for example. I think we all can agree
here, that protocol agility is important and so this XEP is needed.
So now we have a problem: what to do if the server advertised a list of
channel-binding algorithms, but the client doesn't support any of these?
In the non-attacker case, the client can't send "y" in the GSS-header, because
the server would then abort the authentication because it advertised
channel-bindings. But sending "n" and continuing without channel-binding is
fine in this case.
This isn't fully hypothetical. An older client might only support tls-unique,
while the server only advertises tls-exporter (which can be used for TLS 1.2,
too, if the extended master secret is used).
Blocking the connection entirely is of course at the discretion of the client
developer, but it hinders interoperability, especially while phasing out one
channel-binding-type and introducing a new one.
Now the attacker case: Any MITM-attacker could just manipulate the list of
channel-bindings advertised using this XEP to just list some dummy mechanisms
(or mechanisms they know the client doesn't support). That would successfully
downgrade the client to non-channel-binding, if the client still sent "n" as
above. The client won't be able to distinguish the attacker case from the non-
attacker one.
How to fix this while not hindering interoperability? Implementing XEP-0474 is
one option (and an even better one). But for defence-in-depth and to support
cases when this isn't possible (for example when the SCRAM library used
doesn't support adding optional attributes), a few years back I proposed to
add a countermeasure to this XEP (the part we are discussing in this thread):
Having a MUST to implement and advertise tls-server-end-point when advertising
channel-binding types using this XEP.
tls-server-end-point is the lowest common denominator that can be implemented
by virtually everyone and even though it isn't as strong as tls-exporter or
tls-unique, it still catches many attacks (it would have detected the jabber.ru
one, for example).
Now with this MUST in place, a client seeing a list without
tls-server-end-point advertised immediately knows that some attacker tampered
with the list of channel-bindings and can abort the authentication. It never
needs to send "n" even though it supports channel-binding and can still safely
send "y" if it doesn't see any *-PLUS variants and channel-bindings announced.
(Short note: making this a SHOULD or MAY contradicts this, it has to be a MUST
here.)
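
The client-side decision described above could be sketched roughly as follows.
This is only an illustration under stated assumptions: the names
choose_gs2_flag and DowngradeSuspected are hypothetical, `advertised` is None
when the server sent no <sasl-channel-binding/> element at all, and
`supported` is the client's own list of types in preference order:

```python
from typing import List, Optional

MANDATORY_TYPE = "tls-server-end-point"


class DowngradeSuspected(Exception):
    """Raised when the advertised list looks tampered with."""


def choose_gs2_flag(plus_offered: bool,
                    advertised: Optional[List[str]],
                    supported: List[str]) -> str:
    # With the proposed MUST, an advertised list lacking the mandatory type
    # can only come from a tampered (or non-compliant) announcement.
    if advertised is not None and MANDATORY_TYPE not in advertised:
        raise DowngradeSuspected("list lacks " + MANDATORY_TYPE)
    # No *-PLUS mechanism seen: "y" lets the server detect stripping.
    if not plus_offered:
        return "y"
    # Pick the client's most preferred type that the server also advertised.
    # Without an advertised list, fall back to the client's own preference.
    if advertised is None:
        candidates = list(supported)
    else:
        candidates = [t for t in supported if t in advertised]
    if not candidates:
        # Only reachable if the client itself lacks the mandatory type.
        return "n"
    return "p=" + candidates[0]
```

For example, an older client supporting only tls-unique and
tls-server-end-point, talking to a server advertising tls-exporter and
tls-server-end-point, would end up with "p=tls-server-end-point" instead of
falling back to "n".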
To put what you wrote into actionable terms for the client developer:
"If a client sees that the server has 0440 support it MUST set
the 'y' flag regardless of the concrete binding mechanisms announced
by the server"
Is this a correct summary of what you wrote?
If this is basically the justification for making endpoint a
requirement then this strategy should be outlined in the Security
Considerations because it is definitely not obvious and Conversations
for example currently doesn’t do that.
This strategy is also somewhat at odds with the statement "Clients
using the information provided via <sasl-channel-binding/> MAY want to
indicate to the server that they do not support channel-binding (even
if they do) if no mutual supported channel-binding type was found."
which would have to be replaced then. (Which is fine)
Wouldn’t an attacker just strip out the entire 0440 announcement
though? Leaving the client developer to have to set the 'n' flag?
cheers
Daniel