Hello Florian

Thanks for the mail, and thanks for sharing your experiences. I’ll try to answer your comments and questions, one at a time.

> Now, one key issue that the XEPs try to solve is that a thing - think of

> the light-bulb next to you - can not decide on its own if another thing

> is his friend or not. So, if the thing receives a friendship request, it

> basically proxies the request to its provisioning server (PS), which

> will wait for an decision from the thing's owner. This process is

> described in the Provisioning XEP and is basically a proxy mechanism for

> XMPP subscription requests.

It’s called “delegation of trust”: if the thing cannot make the decision itself, it can delegate the responsibility to one or more third parties (provisioning servers). How these reach a conclusion is specified only by example: asking the owner. A learning mechanism is also mentioned, but not specified.

> The concept of friendship is a fundamental building block of the IoT

> XEPs, yet it is nowhere formal specified what friendship actually is in

> terms of XMPP. The XEPs appear map friendship between things to

> subscription states between XMPP entities, but they do not specify if

> friendship is a symmetric relation or an asymmetric one. XMPP

> subscription states are asymmetric: Just because I'm subscribed to your

> presence, it doesn't mean that your are subscribed to mine. I believe

> that the IoT XEPs assume that an thing is a friend if both are

> subscribed to each others presence. But do we really want that?

The provisioning server keeps track of friendship relationships. Friendship is symmetric in the sense that both parties can subscribe to the presence of the other if they want to. They don’t need to.

I’ve made a note to add a description, or definition, of the friendship relationship, and of how it relates to presence subscriptions.

> While implementing we also discovered that the XEPs do not discuss an

> important protocol flow. One of the test scenarios I implemented in

> Smack consists of a provisioning server (PS), a thing, a owner and an

> XMPP entity trying to become a friend of the thing (the 'requestor').

> The requestor sends a subscription request to the thing, in order to

> befriend the thing. Now the thing asks the PS whether or not the

> requestor is a friend and the PS will immediately return that the

> requestor is not a friend, because there was no decision from the owner

> yet. Here ends the story in the XEPs. In our scenario, the owner will

> eventually accept or reject the friendship request in the PS's web

> interface. But how does the requestor get notified about the owners

> decision? Possible XEP-0324 § 3.2.4 - but do we want the requestor to

> act on recommendations send from arbitrary JIDs? We also need to discuss

> this.

If no rules are defined, a rejection is returned to the device by default, since immediate feedback is always assumed. It is also assumed that the owner is notified of the request. But since the owner might respond at a much later stage, that response is treated as a separate event. If the owner accepts the request, that event is reported asynchronously, as described in §3.2.4, as you mentioned.
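
To make the flow concrete, here is a minimal sketch in Python. It is an illustration only, not taken from the XEP: the class and method names are hypothetical, and it assumes the later notification goes to the thing, with a plain tuple standing in for the asynchronous message of §3.2.4.

```python
class ProvisioningServer:
    """Toy model of the flow; names are hypothetical, not from XEP-0324."""

    def __init__(self):
        self.friends = set()      # accepted (thing, requestor) bare-JID pairs
        self.pending = []         # requests still waiting for the owner
        self.notifications = []   # stand-in for asynchronous messages

    def is_friend(self, thing, requestor):
        pair = (thing, requestor)
        if pair in self.friends:
            return True
        # No rule yet: answer "no" immediately and queue the request
        # so that the owner can decide later.
        if pair not in self.pending:
            self.pending.append(pair)
        return False

    def owner_accepts(self, thing, requestor):
        pair = (thing, requestor)
        if pair in self.pending:
            self.pending.remove(pair)
        self.friends.add(pair)
        # Separate, later event (assumed here to be sent to the thing).
        self.notifications.append((thing, "friend", requestor))
```

The first call to is_friend() for an unknown pair returns False right away; only after owner_accepts() has run does the same question return True, and the notification is recorded as its own event.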

> My general impression is that the current IoT XEPs are to large and to

> complex. It reminds me of the XEP-0136 Messaging Archiving situation,

> where this big and complex XEP got not much traction because it is so

> heavyweight and hard to implement. And now we have XEP-0313 Message

> Archive Management, which is simple, covers most uses cases and is easy

> to extend, thus allowing the missing use cases to be added on top. We

> should think big, but write simple and modular XEPs.

And yet they are much smaller than pubsub, which to many seems to be one of the cornerstones of XMPP. They are modular; that’s why they’ve been separated into 322, 323, 324, 325, 326 and 347.

> The XEPs are written in style which possible assumes that examples are

> normative: Some sections consist mostly of figures and examples. The IoT

> XEPs need more normative texts.

More normative text, and fewer examples? Many claim the XEPs are too long. But as you say, most of the text consists of examples.

> XEP-0323 - Data

> ===============

> The data read scheme out should follow the scheme we re-introduced in

> XEP-0313: Message Archive Management: IQ-request -> data1 -> ... -> dataN ->

> IQ-result. This would allow to remove the 'done' attribute and this

> scheme is a little bit easier to implement. I going to repeat myself,

> because I said this in MAM thread years ago: I believe we always should

> use this very scheme when requesting data which we expect to be

> delivered by multiple stanzas.

Are you suggesting that using data-forms would be “easier” for transporting sensor data? And that constant polling to extract asynchronously retrieved data would be simpler? I cannot see that that would be easier by any objective standard. It also presupposes knowledge about the amount of data to be retrieved, something that might not be known at the onset of the readout.

> Timestamp is not using XEP-0082: XMPP Date and Time Profiles.

Time zones are described in §6.1.

> Why does it not re-use XSD data types? (SenML [2] also seems like a

> potential candidate. Haven had time to look into it though.)

What XSD data types are you referring to? The field types available are based on the XSD data types.

Regarding SenML: I responded to that comment in a previous mail. As it presupposes knowledge about the data being transmitted, it is not a suitable candidate for interoperable interfaces. It works for simple transport of data, if you control all end-points, or if end-points can be adapted to specific hardware. I meant to present the design principles behind XEP-0323 using the following presentation, at the last IoT SIG meeting:

https://www.slideshare.net/peterwaher/xmpp-iot-sensor-data-xep0323

> The 32/64-bit integer types: Are they signed or unsigned? I shouldn't be

> required to look in § 10. XML Schema to find that out.

Noted. Will update that.

> Example 7 shows an IQ send to a bare JID. This is likely meant to be an

> IQ addressed to a full JID.

I’ve added updating the JIDs in the examples to the TODO list.

> It's unclear if things without a node are allowed.

How do you mean? Do you mean in the request, response or data format? I’ve made a note to add some clarifying text regarding this, for devices that do not embed nodes.

> XEP-0324 - Provisioning

> ====================

> § 3.1.1 "...if the provisioning server is not available in the roster of

> the device..."

> This falsely mixes the existence of an roster entry with presence

> subscription. But the XMPP Roster and Presence Subscription are two

> (mostly) orthogonal concepts. You can have a presence subscription

> established without a related roster item, and you can have a roster

> item without being subscribed.

I agree with the second statement, but not the first. RFC 6121 §2.5.2 clearly states that removing a roster item automatically cancels any presence subscriptions.

Still, I’ve added a note, since I believe the text can be updated to avoid confusion.

> § 3.2.2 "Any resource information in the JID must be ignored by the

> provisioning server."

> Some XEPs of the IoT XEPs go with a different approach of explicitly

> disallowing the JID value to be a JID with resourcepart. I suggest

> consistency and I would favor the approach of explicitly forbidding the

> JID to contain a resourcepart instead of ignoring the resourcepart.

First, the text you quote seems to be from Note 2 in §3.2.1. But there seems to be a misunderstanding. What is meant is that rules in the provisioning server must be based on the bare JID, not the full JID, of the device. The resource part is assumed to be a random value that can change over time.
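
A small Python sketch of what "based on the bare JID" means in practice. The function names and the rule table are hypothetical and only serve as illustration; the split itself relies on the fact that the resourcepart is everything after the first "/" in a JID.

```python
def bare_jid(jid: str) -> str:
    """Drop the resourcepart, i.e. everything after the first '/'."""
    return jid.split("/", 1)[0]


def lookup_rule(rules: dict, jid: str):
    """Find the rule for a device, whatever resource it is using right now."""
    return rules.get(bare_jid(jid))


# Hypothetical rule table, keyed on bare JIDs only.
rules = {"thing@example.org": {"can_read": True}}
```

With this, lookup_rule(rules, "thing@example.org/a1b2c3") and lookup_rule(rules, "thing@example.org/x9y8z7") find the same rule, so a changing resource does not invalidate anything stored in the provisioning server.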

> § 3.2.4 "isFriend(jid) optional, for security"

> There is no gain in security if the Device asks the provisioning server

> if it really should add the JID as friend. If an attacker is able to

> spoof messages with a 'from' JID of the provisioning server, then he is

> very likely also able to spoof IQ replies too.

Very likely is not the same as guaranteed.

> § 3.5.1

> Example 20 uses a bare JID for IQ. Also why isn't the response just an

> empty IQ result?

I’ve made a note to update JIDs in examples. I’ve also made a note to update the clear cache result, as you suggest.

> XEP-0325 - Control

> =================

> Why should we us this instead of XEP-0050: Ad-Hoc Commands? I've seen

> people controlling their 2015 home automation system (Homematic) with an

> very old Psi version using XEP-0050. Ad-Hoc Commands and Data Forms are

> a key strength of XMPP.

There are several reasons. While having access to ad-hoc commands and data forms might be a strength if you use a full-stack XMPP implementation, it is not if you have a lightweight implementation, which is to be expected, especially within IoT. So the simplest use cases do not use data forms at all. It’s also possible to map control parameters and sensor data. Other reasons are the possibility to execute commands on multiple nodes at the same time, the possibility to execute multiple control commands in the same request, and the added semantics available for interoperability.

> § 3.2.2/3: The 32/64-bit integer types: Are they signed or unsigned? I

> shouldn't be required to look in § 10. XML Schema to find that out.

Noted.

> § 3.1.2: Can we have an empty result IQ instead of an empty

> <setResponse/> result?

Yes. I’ll write that on the TODO-list for XEP-0325.

> § 3.1.2 and § 6.2: Why put 'xml:lang' in <set/> instead of simply in <iq/>?

If it’s better that it be placed on the iq element, then we can move it there. By setting it on the set element, I just wanted to make sure the attribute was passed on to the specific handler, regardless of underlying implementation.

> XEP-0347 - Discovery

> ====================

> § 3.9: Claiming a thing: Why are 5 values required to claim a thing,

> when a tuple of ID and Key would be sufficient and provide the same

> security guarantees? Do we really want the user to enter 5 strings,

> every one adding another error source, in order to claim a thing? Or is

> it up to the registry/thing/whatever to decide what is sufficient to

> claim a thing?

> Claiming should be as simple as possible. I've claimed over a hundert

> devices in the past months and was always annoyed that I had to enter 5

> values.

It’s up to the manufacturer to decide what is required. And you should not have to enter anything. Instead, another method of transporting the conceptual information from device/manufacturer to owner is suggested. The XEP mentions the use of QR codes, for instance. The point here is that the number of parameters used in the claim is variable, and defined by the manufacturer. All the XEP does is require the owner to specify the same set of parameters in the claim.

Note: While the KEY might be sufficient to perform the claim from a device perspective, human users are not good at identifying devices using such information. If a human is claiming multiple devices, one way to tell the user which device is actually being claimed is to present human-readable information together with the machine-readable information. Depending on the situation, it is therefore quite possible that the manufacturer wants to add such conceptual information to the claim, to make sure the owner understands which device is being claimed.
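
As a rough sketch of the matching rule (an illustration, not the registry algorithm of the XEP): the registry below is a plain Python dict, the function name is hypothetical, and the tag names and values are made up for the example. One manufacturer registers five tags, another only two, and a claim succeeds only if it carries exactly the set the thing registered.

```python
def find_claimable(registry: dict, claim: dict) -> list:
    """Return the JIDs of things whose registered tags equal the claim's tags."""
    return [jid for jid, tags in registry.items() if tags == claim]


# Hypothetical registry content: the number of tags is the manufacturer's choice.
registry = {
    "lamp@example.org": {
        "MAN": "example.org", "MODEL": "Lamp", "SN": "1234", "V": 1.2, "KEY": "abc",
    },
    "plug@example.org": {"SN": "9876", "KEY": "xyz"},
}
```

Here find_claimable(registry, {"SN": "9876", "KEY": "xyz"}) finds the plug, while the same two-tag claim with the lamp's values finds nothing, because the lamp's manufacturer chose to register five tags.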

> Example 13 is missing 'cacheId'.

Do you mean cacheType? nodeId, sourceId and cacheType are all optional, and provide 3 additional axes of identification of nodes behind a concentrator. More about these in XEP-0326.

> <claimed/> has two semantic: success response to <mine/> *and* claimed

> notification to thing. I suggest using two different elements.

Ok. I’ll write this on the TODO list.

> Example 46 and 47: When to use a full JID and when to use a bare JID as

> value of 'jid' in <disown/>?

Example 47 contains an error. Full JIDs are only used in the stanzas, for addressing. Rules and logic are based on bare JIDs. So, the jid attribute in Example 47 should be a bare JID.

> § 5.2 Meta Tags: Why is 'V' (Version Number) of type 'numeric'? Is it

> supposed to be an integer, or a version string like "1.2.3-beta4"?

It’s numeric, to allow for numeric comparison. You might want to search for specific devices reporting a version larger than 8, for instance. You would want such a search to include devices of version 10. Having it as a number allows for only one decimal separator (major.minor). If (release.build) is also desired, or other information such as beta etc., you can always report those using other tag names. No such tag names have been specified, however.
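
A short Python illustration of the difference, using made-up device names and version values: comparing the versions as strings misses version 10, while comparing them as numbers does not.

```python
# Hypothetical devices and the value each reports in its 'V' tag.
versions = {"a@example.org": "8", "b@example.org": "9.5", "c@example.org": "10"}

# String comparison is character by character, so "10" sorts before "8".
as_text = sorted(jid for jid, v in versions.items() if v > "8")

# Numeric comparison gives the result a search for "version > 8" expects.
as_number = sorted(jid for jid, v in versions.items() if float(v) > 8)
```

Here as_text contains only b@example.org, while as_number contains both b@example.org and c@example.org.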

Best regards,

Peter