[standards-jig] Jabber i18n proposal

Jacek Konieczny jajcus at bnet.pl
Wed Apr 23 07:14:25 UTC 2003


At yesterday's meeting the problem of internationalizing the Jabber
protocol came up. I have some ideas on how to solve it.

This should probably be written as a JEP, but I don't think my English
is good enough to write an official document.

There is an XML attribute which can be used for language
identification: xml:lang

First, let's look at what the XML and XMPP specs say about it:

XML:

http://www.w3.org/TR/REC-xml section: 2.12

	In document processing, it is often useful to identify the
	natural or formal language in which the content is written.
	A special attribute named xml:lang may be inserted in documents
	to specify the language used in the contents and attribute
	values of any element in an XML document.

XMPP:

draft-ietf-xmpp-core-10 section: 7.2.5

	Any message or presence stanza MAY possess an 'xml:lang'
	attribute specifying the default language of any CDATA sections
	of the stanza or its child elements. An IQ stanza SHOULD NOT
	possess an 'xml:lang' attribute, since it is merely a vessel for
	data in other namespaces and does not itself contain children
	that have CDATA. The value of the 'xml:lang' attribute MUST be
	an NMTOKEN and MUST conform to the format defined in RFC
	3066[16].

The XMPP draft mentions only CDATA sections, but language identification
will also be needed for some attributes in the Jabber protocol. The draft
also states that xml:lang should not be used on the IQ stanza itself. That
is fine: it will be used on the IQ's child elements.

What the XMPP specs say about xml:lang usage is not enough for Jabber:
each new protocol should define exactly how this attribute is used.

Usually xml:lang is used to provide multiple forms of an item (in
different languages). This should be avoided in Jabber, as it can
consume a lot of bandwidth. Instead, the language can be requested in
the query. Only when an item is targeted at multiple users (e.g. published
via pub/sub) may it contain multiple language versions.

A JEP defining such a protocol should include the following information:
- which elements of the protocol are subject to i18n (e.g. form field
  names should not be translated and are not subject to i18n, but field
  labels are).
- where the xml:lang attribute may be used

Some general rules may be defined, though:
- xml:lang may be used on the top-level element of a given namespace. When
  used that way it is also a request for the language of any reply (e.g.
  when used in a form submission it requests the language of the result
  data set).
- xml:lang may also be used on some internal elements, when their
  language differs (e.g. a foreign language test may have foreign language
  questions, but native language instructions, and request native
  language replies).
- when no xml:lang is given, some preselected language should be used.
  It may be implementation specific (in which case it should be English
  if possible) or preconfigured by the user (while registering or in
  any other way).
- if the requested language is not available, another language may be
  used. It should be the language which would be used if no xml:lang was
  given, but it may be chosen using some heuristics (e.g. en instead of
  en-us).
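
The last two rules can be sketched in a few lines of Python. This is only
an illustration under assumptions: the function name pick_language, the
"drop trailing subtags" heuristic and the default of "en" are hypothetical
choices, not something defined by any JEP.

```python
def pick_language(requested, available, default="en"):
    """Return the tag from `available` used to answer a request for `requested`."""
    # Map lowercased tags back to the spelling used by the catalogue.
    by_lower = {tag.lower(): tag for tag in available}
    if requested:
        # Try the full tag first, then drop trailing subtags: en-us -> en.
        parts = requested.lower().split("-")
        while parts:
            candidate = "-".join(parts)
            if candidate in by_lower:
                return by_lower[candidate]
            parts.pop()
    # Nothing matched, or no xml:lang was given: use the preselected language.
    return by_lower.get(default.lower(), default)


print(pick_language("en-us", ["en", "pl"]))
print(pick_language("de", ["en", "pl"]))
print(pick_language(None, ["en", "pl"], default="pl"))
```

Language tags are compared case-insensitively here, so "en-US" and "en-us"
are treated as the same request.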

Here are ideas for some specific JEPs:

JEP-0004 (data gathering and reporting):
- xml:lang attribute may be used in:
	- the <query/> element: defines default language of labels and
	  values, and requests the language of the requested form,
	  filled-in form or result
	- the <field/> element: defines language of label and default
	  language of value and/or requests language of the value in
	  reply
	- the <option/> element: defines language of label and default
	  language of value
	- the <value/> element: defines language of given value and/or
	  requests language of the value in reply
	- <instructions/>, <title/> - identify language of their content
	
Only title, instructions, labels and fields are subject to language
identification.
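
How a client could work out the language of each label under these rules
is sketched below, assuming Python's standard xml.etree.ElementTree. The
helper name labelled_fields and the sample form are made up for
illustration. An xml:lang on an inner element overrides the one inherited
from its parent.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
FIELD = "{jabber:x:data}field"


def labelled_fields(element, inherited=None):
    """Yield (var, label, language) for every <field/> at or below `element`."""
    # An explicit xml:lang wins; otherwise the parent's language applies.
    lang = element.get(XML_LANG, inherited)
    if element.tag == FIELD:
        yield element.get("var"), element.get("label"), lang
    for child in element:
        yield from labelled_fields(child, lang)


# Hypothetical form: Polish by default, with one English field.
FORM = """
<x xmlns='jabber:x:data' type='form' xml:lang='pl'>
  <field var='sure' label='Jesteś pewien' type='boolean'/>
  <field var='reason' label='Reason' type='text-single' xml:lang='en'/>
</x>
"""

for var, label, lang in labelled_fields(ET.fromstring(FORM)):
    print(var, label, lang)
```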

JEP-0030 (service discovery):

- xml:lang attribute may be used in:
	- the <query/> element: defines default language of names
	  or requests language of reply
	- the <item/> element: defines language of its name
	- the <identity/> element: defines language of its name

Only item/identity names are subject to language identification.
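
A rough server-side sketch of these disco rules, again in Python with
xml.etree.ElementTree. Everything named here (COMMAND_NAMES, disco_reply,
the default language "en") is an assumption made for the illustration,
not part of any JEP. The reply repeats the requested xml:lang on
<query/>, and an item carries its own xml:lang only when the server had
to fall back to another language.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
DISCO_ITEMS = "http://jabber.org/protocol/disco#items"
QUERY = "{" + DISCO_ITEMS + "}query"
ITEM = "{" + DISCO_ITEMS + "}item"

# Hypothetical translation catalogue: command node -> {language: name}.
COMMAND_NAMES = {
    "shutdown": {"en": "Shutdown server", "pl": "Wyłącz serwer"},
}


def disco_reply(request_xml, default_lang="en"):
    """Build a disco#items result whose item names follow the request's xml:lang."""
    request = ET.fromstring(request_xml)
    query = request.find(QUERY)
    requested = query.get(XML_LANG)  # None when the client sent no xml:lang

    reply = ET.Element("iq", {"type": "result"})
    reply_query = ET.SubElement(reply, QUERY, {"node": query.get("node", "")})
    if requested is not None:
        reply_query.set(XML_LANG, requested)

    for node, names in COMMAND_NAMES.items():
        actual = requested if requested in names else default_lang
        item = ET.SubElement(reply_query, ITEM, {"node": node, "name": names[actual]})
        # Label an item only when its language differs from the query's.
        if requested is not None and actual != requested:
            item.set(XML_LANG, actual)
    return reply


REQUEST_PL = """
<iq type='get' to='server.domain'>
  <query xmlns='http://jabber.org/protocol/disco#items'
         node='http://jabber.org/protocol/commands'
         xml:lang='pl'/>
</iq>
"""

reply = disco_reply(REQUEST_PL)
print(reply[0][0].get("name"))
```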

JEP-0050 (commands):

- xml:lang attribute may be used in the <command/> element to request
  the language of the reply payload. The language used in the payload
  should be defined in the payload itself (e.g. according to the rules
  for JEP-0004 given above).


Example:

1. Requesting command list using disco:

1a. no xml:lang given:

C->S:
<iq type="get" from="user@server.domain" to="server.domain">
	<query xmlns='http://jabber.org/protocol/disco#items' 
		node='http://jabber.org/protocol/commands'/>
</iq>

S->C:
<iq type="result" from="server.domain" to="user@server.domain">
	<query xmlns='http://jabber.org/protocol/disco#items' 
		node='http://jabber.org/protocol/commands'>
		<item jid="server.domain"
			node="shutdown"
			name="Shutdown server"/>
	</query>
</iq>

1b. xml:lang given:

C->S:
<iq type="get" from="user@server.domain" to="server.domain">
	<query xmlns='http://jabber.org/protocol/disco#items' 
		node='http://jabber.org/protocol/commands'
		xml:lang='pl'/>
</iq>

S->C:
<iq type="result" from="server.domain" to="user@server.domain">
	<query xmlns='http://jabber.org/protocol/disco#items' 
		node='http://jabber.org/protocol/commands'
		xml:lang='pl'>
		<item jid="server.domain"
			node="shutdown"
			name="Wyłącz serwer"/>
	</query>
</iq>

2. Command execution:

2a. no xml:lang given:

C->S:
<iq type="set" to="server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		node='shutdown'
		action='execute'/>
</iq>

S->C:
<iq type="result" from="server.domain" to="user@server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		sessionid='123'
		node='shutdown'
		action='executing'>
		<x xmlns='jabber:x:data' type='form'>
			<title>confirmation</title>
			<field var='sure' label="Are you sure"
				type='boolean'>
				<value>0</value>
			</field>
		</x>
	</command>
</iq>

C->S:
<iq type="set" to="server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		sessionid='123'
		node='shutdown'>
		<x xmlns='jabber:x:data' type='submit'>
			<field var='sure'>
				<value>1</value>
			</field>
		</x>
	</command>
</iq>

S->C:
<iq type="result" from="server.domain" to="user@server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		sessionid='123'
		node='shutdown'
		action='completed'>
		<note type='info'> Shutting down </note>
	</command>
</iq>

2b. xml:lang given:

C->S:
<iq type="set" to="server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		node='shutdown'
		action='execute'
		xml:lang='pl'/>
</iq>

S->C:
<iq type="result" from="server.domain" to="user@server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		sessionid='123'
		node='shutdown'
		action='executing'>
		<x xmlns='jabber:x:data' type='form' xml:lang='pl'>
			<title>confirmation</title>
			<field var='sure' label="Jesteś pewien"
				type='boolean'>
				<value>0</value>
			</field>
		</x>
	</command>
</iq>

C->S:
<iq type="set" to="server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		sessionid='123'
		node='shutdown' 
		xml:lang='pl'>
		<x xmlns='jabber:x:data' type='submit'>
			<field var='sure'>
				<value>1</value>
			</field>
		</x>
	</command>
</iq>

S->C:
<iq type="result" from="server.domain" to="user@server.domain" id="c1">
	<command xmlns='http://jabber.org/protocol/commands'
		sessionid='123'
		node='shutdown'
		action='completed'>
		<note type='info' xml:lang='pl'> Wyłączam </note>
	</command>
</iq>

Greets,
	Jacek


