[Standards] Proper SRV Record Fallback

Jonas Wielicki jonas at wielicki.name
Mon Mar 19 19:24:45 UTC 2018

Hi all,

Let’s do a neat TOFU mail to bump this thread (you’ll find a more detailed 
inline reply from me at the start of the thread).

In aioxmpp, we now (v0.10+) implement the following:

- pre-auth stream errors are treated like any other connection error (e.g. 
connection refused). This notably includes bad-format from receiving HTML 
where we expect XML. We try the next SRV record option in that case.

- if SASL is unavailable (e.g. no matching mechanism), we also try the next 
SRV record option.

- authentication failures are treated as fatal and abort everything 

- TLS errors are like any other connection error and trigger the next option.
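
To make the rules above concrete, here is a minimal sketch of the decision 
logic in Python. It is not the actual aioxmpp code: the names Failure, 
AttemptError, RETRYABLE and connect_with_fallback are made up for 
illustration, and the options are assumed to be (host, port) pairs already 
ordered by SRV priority and weight.

```python
from enum import Enum, auto
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")
Option = Tuple[str, int]  # (host, port), assumed pre-sorted per RFC 2782


class Failure(Enum):
    """Hypothetical failure classes; not aioxmpp's real exception types."""
    CONNECTION_ERROR = auto()       # e.g. connection refused
    TLS_ERROR = auto()              # e.g. invalid certificate
    PRE_AUTH_STREAM_ERROR = auto()  # e.g. bad-format after receiving HTML
    SASL_UNAVAILABLE = auto()       # e.g. no matching mechanism
    AUTH_FAILED = auto()            # credentials rejected by the server


# Failures after which the next SRV record option is tried.
# AUTH_FAILED is deliberately absent: it is fatal and aborts everything.
RETRYABLE = frozenset({
    Failure.CONNECTION_ERROR,
    Failure.TLS_ERROR,
    Failure.PRE_AUTH_STREAM_ERROR,
    Failure.SASL_UNAVAILABLE,
})


class AttemptError(Exception):
    """Raised by a connection attempt, carrying its failure class."""

    def __init__(self, failure: Failure) -> None:
        super().__init__(failure.name)
        self.failure = failure


def connect_with_fallback(
    options: Iterable[Option],
    attempt: Callable[[Option], T],
) -> T:
    """Try each option in order and return the first successful result.

    `attempt` is a caller-supplied function which connects to one option
    and raises AttemptError on failure.
    """
    last_error: Optional[AttemptError] = None
    for option in options:
        try:
            return attempt(option)
        except AttemptError as exc:
            if exc.failure not in RETRYABLE:
                raise  # fatal: do not touch the remaining options
            last_error = exc
    if last_error is None:
        raise ValueError("no SRV record options to try")
    raise last_error  # every option failed with a retryable error
```

With this, a multiplexed port that answers with HTML or a wrong certificate 
simply raises a retryable failure and the next record is tried, while 
rejected credentials propagate immediately to the caller.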

This plays nicely when connecting to a multiplexed server without ALPN 

(also, we now have ALPN support.)

Any comments?

kind regards,

On Tuesday, 9 January 2018 05:19:36 CET Travis Burtrum wrote:
> Hi all,
> I have not been able to find proper SRV record fallback behavior
> documented, and could not come to a clear consensus in the XSF MUC
> earlier, so I thought I'd bring the discussion on-list. (and hopefully
> document it in implementation notes of XEP-0368[1])
> The main question is when you should fall back to lower priority SRV
> records and when should you give up and fail the connection.
> First, what do docs say:
> RFC-6120[2] Section-3.2.1 #7 says:
> > 7. If the initiating entity fails to connect using all resolved IP
> >    addresses for a given FQDN, then it repeats the process of
> >    resolution and connection for the next FQDN returned by the SRV
> >    lookup based on the priority and weight as defined in [DNS-SRV].
> 'fails to connect' does this mean the TCP connection fails, or the XMPP
> connection fails?
> #8 might leave a hint:
> > 8. If the initiating entity receives a response to its SRV query but
> >    it is not able to establish an XMPP connection using the data
> >    received in the response, it SHOULD NOT attempt the fallback
> >    process described in the next section (this helps to prevent a
> >    state mismatch between inbound and outbound connections).
> This clearly says XMPP connection, but does it apply to #7 ?
> It is also clear I didn't think about this too hard when writing
> XEP-0368, because I clearly (to me) assume SRV fallback will happen if a
> complete XMPP connection is not successful, because under Implementation
> Notes I say:
> > Server operators should not expect multiplexing (via ALPN) to work in
> > all scenarios and therefore should provide additional SRV record(s)
> > that do not require multiplexing (either standard STARTTLS or
> > dedicated direct XMPP-over-TLS). This is a result of relying on ALPN
> > for multiplexing, where ALPN might not be supported by all devices or
> > may be disabled by a user due to privacy reasons.
> While I don't explicitly say it, if a port requires ALPN to multiplex,
> it will generally end up connecting you to a non-XMPP server without
> ALPN, meaning you will get back invalid XML, other junk, and/or an
> invalid TLS cert.
> RFC-2782[3], defining SRV records, makes no mention of this.  That
> actually makes sense because it doesn't even define possible protocols;
> UDP, for example, has no connection concept.
> Now that the docs are out of the way, on to the discussion:
> In my opinion, at least all of cannot-connect-to-port, non-XML,
> not-proper-stream and invalid TLS cert should trigger a fallback to the
> next highest priority SRV record.  Everyone in the MUC seemed to agree
> that if authentication fails, a fallback would be a bad idea.
> Sam Whited said that if a TCP connection is established fallback should
> cease, that it shouldn't have anything to do with or any knowledge of
> XMPP, and that it might have security implications to do otherwise.
> (please correct and forgive me if I misunderstood)  I disagree with
> this: I think if Eve has control over DNS (and no DNSSEC) she can return
> arbitrary records anyway, so SRV fallback doesn't matter.  If Eve
> controls a higher priority server, or the network between client and
> that server, she can trigger fallback or not regardless of what we
> decide.  The difference is that if we decide not to fall back, then she
> can effectively DoS us by messing with 1 server instead of all.
> I think my proposal is even more generic than the above: I think
> authentication-response should be the point when fallback ceases.
> Regardless of what happens before that point, you fall back to the next
> SRV record, and after authentication, whether it's successful or not,
> you no longer fall back.
> As to what actual clients do in the wild, Conversations falls back
> regardless of junk or invalid cert, Dino does not.  I am unsure what
> Gajim does (though from talking to lovetox 1.0-beta might also not
> fall back in all cases).
> Depending on what we decide, I plan to set up various domain/SRV record
> combinations for testing; probably clients and servers both need this
> type of testing, and I doubt it is done often.
> Thanks,
> Travis
> [1]: https://xmpp.org/extensions/xep-0368.html
> [2]: https://tools.ietf.org/html/rfc6120#section-3.2.1
> [3]: https://tools.ietf.org/html/rfc2782
