[Standards] Punycode in stream 'to' attr

Peter Saint-Andre stpeter at stpeter.im
Thu Dec 29 16:11:34 UTC 2016

On 12/29/16 6:07 AM, Kim Alvefur wrote:
> Hi list!
> An issue was filed against Prosody¹ for not converting punycode in
> stream headers to Unicode.  Now I'm wondering if this is really
> something the server is expected to do.
> RFC7622 § 3.2.1² states the following:
>> An entity that prepares a string for inclusion in an XMPP domainpart
>> slot MUST ensure that the string consists only of Unicode code points
>> that are allowed in NR-LDH labels or U-labels as defined in RFC5890.
> However, the next section states that:
>> An entity that performs enforcement in XMPP domainpart slots MUST
>> prepare a string as described in [previous section]
> This could be interpreted to mean that the server should perform
> the toUnicode step if the client (or component in this case) does not.

The terminology in RFC 7622 and the other PRECIS-related documents (RFCs 
7564, 7613, 7700) seems to be slightly confusing to people. In 
particular, by "preparation" these documents mean something different 
than what's in RFC 3454. I've been working to clean this up in the 
PRECIS revision documents.

In any case, the interpretation you mention is, in my opinion as the 
author of the relevant RFCs, not correct. The very next sentence of 
Section 3.2.1 beyond what you have quoted states:

    This implies that the string MUST NOT include A-labels as
    defined in [RFC5890]; each A-label MUST be converted to a U-label
    during preparation of a string for inclusion in a domainpart slot.

The preparation stage applies to clients as well as to servers. The 
difference is that the enforcing entity (typically a server) also has a 
responsibility to apply all the rules, including those listed under 
preparation, so that it can ensure that the relevant string is correct 
in all respects. In other words, the enforcing entity can't depend on 
other entities to do the right thing, but that doesn't mean those 
entities can do anything they please, such as sending A-labels or 
UTF-16.
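
To make the A-label to U-label conversion concrete, here is a minimal 
sketch in Python of the step that Section 3.2.1 describes. The function 
name a_labels_to_u_labels is hypothetical, and the sketch assumes that 
decoding the Punycode portion of each "xn--" label with Python's 
built-in "punycode" codec is sufficient for illustration. It does not 
validate the result against RFC 5890 or apply the remaining PRECIS 
rules, so it is not a complete enforcement implementation.

```python
ACE_PREFIX = "xn--"  # prefix that marks an A-label (RFC 5890)


def a_labels_to_u_labels(domainpart: str) -> str:
    """Return domainpart with every A-label decoded to Unicode.

    Illustrative only: no IDNA2008 validity checks are performed, and
    labels without the ACE prefix are passed through unchanged.
    """
    converted = []
    for label in domainpart.split("."):
        if label.lower().startswith(ACE_PREFIX):
            # Drop the ACE prefix and decode the Punycode remainder.
            # Raises UnicodeError if the remainder is not valid Punycode.
            label = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
        converted.append(label)
    return ".".join(converted)


if __name__ == "__main__":
    # Value as it might appear in a stream header 'to' attribute.
    print(a_labels_to_u_labels("xn--bcher-kva.example"))  # bücher.example
```

A sender would run something like this when preparing the string, and 
an enforcing entity would run the same step again on whatever it 
receives rather than assuming the sender already did so.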

