[Standards-JIG] LAST CALL: JEP-0106 (JID Escaping)

Peter Saint-Andre stpeter at jabber.org
Wed Apr 20 17:32:32 CDT 2005


On Wed, Apr 20, 2005 at 12:33:56PM -0500, Peter Saint-Andre wrote:
> On Thu, Apr 07, 2005 at 07:03:28PM +0100, Richard Dobson wrote:
> > Well, I'm afraid that I have to vote -1 because it still does not use the
> > internet standard of %. I know this was done because of the MSN
> > transport, but no convincing reasons why this could not be changed so we
> > can just go with the standard have been presented so far.
> 
> FYI, this item is on the agenda for tomorrow's Council meeting.
> 
> Earlier concerns about MSN gateways may be moot if we restrict this
> mechanism (as version 0.4 of JEP-0106 does) to escaping only of the
> following characters:
> 
> whitespace
> "
> &
> '
> /
> :
> <
> >
> @
> 
> My understanding is that traditional domain labels may begin only with 
> a letter or a digit (see RFC 1034 etc.). Since no letters or digits are
> on our list of potentially-encoded characters, I think there will be no
> confusion with respect to traditional domain names. I am less sure about 
> internationalized domain labels, but perhaps Joe Hildebrand can weigh in 
> on that.
> 
> However, I want to look into this further before agreeing that use of %
> is the best way to proceed.

Regarding internationalized domain labels, the ToASCII transformation
defined in Section 4.1 of RFC 3490 is relevant, especially step 3:

   3. If the UseSTD3ASCIIRules flag is set, then perform these checks:

     (a) Verify the absence of non-LDH ASCII code points; that is, the
         absence of 0..2C, 2E..2F, 3A..40, 5B..60, and 7B..7F.

     (b) Verify the absence of leading and trailing hyphen-minus; that
         is, the absence of U+002D at the beginning and end of the
         sequence.

Where (RFC 3490, Section 2):

   The term "LDH code points" is defined in this document to mean the
   code points associated with ASCII letters, digits, and the hyphen- 
   minus; that is, U+002D, 30..39, 41..5A, and 61..7A. "LDH" is an
   abbreviation for "letters, digits, hyphen".

So the characters we want to encode are prohibited in internationalized
domain labels just as they are prohibited in traditional domain labels:

SP " & ' / : < > @
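
For anyone who wants to check this mechanically, here is a minimal Python
sketch (not part of the JEP) that tests each of the nine characters against
the non-LDH ranges quoted above from step 3(a). The names NON_LDH_RANGES,
ESCAPED_CHARS, is_non_ldh, and all_prohibited are made up for illustration.

```python
# Non-LDH ASCII ranges as quoted from RFC 3490, Section 4.1, step 3(a).
NON_LDH_RANGES = [
    (0x00, 0x2C),
    (0x2E, 0x2F),
    (0x3A, 0x40),
    (0x5B, 0x60),
    (0x7B, 0x7F),
]

# The nine characters proposed for JID escaping: SP " & ' / : < > @
ESCAPED_CHARS = ' "&\'/:<>@'


def is_non_ldh(ch):
    """Return True if ch falls in one of the non-LDH ASCII ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NON_LDH_RANGES)


def all_prohibited(chars=ESCAPED_CHARS):
    """Return True if every character in chars is a non-LDH code point."""
    return all(is_non_ldh(c) for c in chars)
```

If the ranges are transcribed correctly, all_prohibited() returns True, while
letters, digits, and the hyphen-minus are reported as LDH.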

Since these are the *only* characters to which JID escaping will apply
(it MUST NOT be applied to any other characters), it should be safe
to escape these characters with:

%20 %22 %26 %27 %2F %3A %3C %3E %40
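
As a rough illustration only, escaping under this proposal could look like
the following Python sketch. The table simply pairs the nine characters with
the nine sequences listed above; the names ESCAPES and escape_node are
hypothetical, and the sketch deliberately leaves every other character
(including % itself) untouched, as the restriction above requires.

```python
# Mapping of the nine characters to the %XX sequences listed above.
ESCAPES = {
    ' ': '%20',
    '"': '%22',
    '&': '%26',
    "'": '%27',
    '/': '%2F',
    ':': '%3A',
    '<': '%3C',
    '>': '%3E',
    '@': '%40',
}


def escape_node(node):
    """Escape only the nine listed characters; pass everything else through."""
    return ''.join(ESCAPES.get(ch, ch) for ch in node)
```

For example, escape_node("pilot@af.mil") would yield "pilot%40af.mil".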

Peter

P.S. The concern about MSN gateway addresses is that any domain name
     can be an MSN address. So let's say that someone's MSN address is
     pilot@af.mil and that person's JID once transformed by an MSN
     gateway is pilot%af.mil@msn.example.com ... well, we certainly
     don't want to transform the %af in that JID to U+00AF (i.e., the
     MACRON character). But if JID escaping is strictly limited to the
     characters listed above then we are OK, because all of those
     characters are forbidden in domain labels (and thus in the first
     character of a domain name) as far as I can see. Naturally if I
     am wrong about that, please correct me. ;-)
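
     To make the MSN case concrete, here is a small Python sketch of
     unescaping that recognizes only the nine sequences above and nothing
     else. The names UNESCAPES and unescape_node are hypothetical, and
     matching only the uppercase hex forms is an assumption of this sketch,
     not something the proposal has settled.

```python
import re

# Only these nine sequences are ever unescaped (uppercase hex assumed).
UNESCAPES = {
    '%20': ' ',
    '%22': '"',
    '%26': '&',
    '%27': "'",
    '%2F': '/',
    '%3A': ':',
    '%3C': '<',
    '%3E': '>',
    '%40': '@',
}

# One alternation over the nine literal sequences.
_PATTERN = re.compile('|'.join(re.escape(seq) for seq in UNESCAPES))


def unescape_node(node):
    """Replace the nine listed sequences; leave any other %XX untouched."""
    return _PATTERN.sub(lambda m: UNESCAPES[m.group(0)], node)
```

     With this restriction, unescape_node("pilot%af.mil") returns the
     string unchanged, so the gateway-style node is not turned into a
     MACRON.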




