[Standards] Rayo feedback.

Kevin Smith kevin.smith at isode.com
Tue Jun 16 12:26:38 UTC 2015

Sorry this is terribly late - I’ve been reviewing the Rayo XEP prior to voting on Draft, and I had a couple of questions/comments. This only covers the first half of the XEP (up to the end of section 6), as it seemed more useful for me to get the comments out than sit on them until I’m finished.

0) The initial diagram shows SIP being used, with Jingle being optional on the other side. I think this is just an example, but is it worth calling this out more explicitly in the diagram perhaps by replacing “SIP” with “e.g. SIP” and Jingle similarly?

1) Does leading with the examples help or hinder here? I found the examples at the start of one particular use case left me more confused than I think I would have been jumping straight into what it’s trying to achieve. (No impact on going to Draft.)

2) 5.1 (Actors) places requirements that these JIDs for components/mixers can only be under subdomains - why is this? AFAIK, this is the only part of XMPP that implies any relationship between a domain and a subdomain, and it doesn’t immediately seem like a useful restriction.

3) 5.1.6 Is calling things Components the most useful terminology here, when Components have a well-established meaning in XMPP (and a Rayo server is likely to be such a component)?

4) 6.1’s reliance on a <show>chat</show> seems odd at best - wouldn’t a normal available presence be better here? I’m also not sure that the requirement for it to be directed presence is warranted - why wouldn’t broadcast presence work here?

5) 6.1 - if you want to rely on presence here, isn’t an unavailable presence the best way to signal unavailability? I don’t think it’s covered what receiving unavailable would mean here at the moment.

6) 6.2.1 Is how these metadata are handled defined?

7) 6.2.1 the uri attribute seems like it might be underspecified here. The server SHOULD try to create at the appropriate URI, but what happens if it decides not to (It’s not a MUST)? Similarly, what restrictions are there on how a client should form such a URI?

8) 6.2.1 How does the client discover the available URI schemes for to/from?

9) “Third Party” is introduced as a term here for the first time, without explanation of which party this is.

10) Use of presence for sending of notifications like this seems off. I realise this boat may have sailed, but it doesn’t seem right to me.

11) Is it right that it has to treat this first as if there’s no join, and then process the join? So if it’s trying to join something that doesn’t exist, or is invalid, it should set up the call first, and only then say the join fails?

12) 6.2.2 Introduces “system” for the first time. Which of the entities is the system?

13) 6.6.2 Is requiring the server to immediately reject the call right here? (I don’t know.) I’m wondering if it might just let it ring, for example, until it has an available controlling party.

14) 6.6.2 MUST offer simultaneously - is this required? Why might it not offer to different entities in some staged order?

15) 6.6.2 MUST wait indefinitely - why is this required? If the original caller hangs up, for example, wouldn’t the server be able to stop waiting for a controller?

16) 6.3 The identifier for calls here is always a JID, isn’t it? If that’s the case, it’d make more sense to be using JIDs here, instead of adding the layer of indirection of a URI with a fixed scheme.

17) 6.3 I think here we’re getting into the territory where presence stanzas are really not appropriate for this.

18) 6.3.4 introduces a direction attribute that I don’t think has been defined anywhere at this point.

19) 6.4 "a server SHOULD represent a mixer internally using some alternative name scoped to the client's security zone and mapped to the friendly name/URI presented to the client for the emission of events and processing of commands” - I don’t entirely understand this. If it’s an internal representation, why is this important for interop?

20) "A mixer MUST be implicitly created the first time a call attempts to join it”. Is this required, or might there be scenarios where a mixer can’t/shouldn’t be created?

21) "Mixers MUST respect the normal rules of XMPP presence subscriptions. If a client sends directed presence to a mixer, the mixer MUST implicitly create a presence subscription for the client.” - but that isn’t the normal rule for presence subs, is it?

22) Example 43: It’s not immediately obvious to me what an empty output element means here; it seems to have different semantics from the use in Example 6 of reading a document with text-to-speech.

23) Example 44: This introduces ‘active speaker detection’, but doesn’t explain what this is (or reference an explanation), I think.

24) "Once the last participant unjoins from the mixer, the mixer SHOULD be destroyed.” - in what scenarios would it be appropriate not to? Should this be discussed?

25) 6.5 "A server SHOULD implement all core components” - what are the implications for clients if the server doesn’t implement some of these?

26) 6.5.3 - a reference to SSML here would probably be appropriate.

27) "The component is created using an <output/> command, containing one or more documents to render” - I think this implies that the previous examples with <output…/> are invalid.

28) If the XML for SSML has to be escaped (which seems to be the case from the example), this should probably be called out.

29) - I’m not sure why this is a SHOULD instead of a MUST?

30) - I think a quick description of the necessary addressing here would be useful.

31) Example 69 - I think this doesn’t give the units of time for the seek except in the example title and would be worth calling out.

32) 6.5.4 I think some reference to DTMF and SRGS specs would be useful here.

33) 6.5.4 - How are the optional/extensible mechanisms discovered?

34) - the SHOULD here seems more like it should be a MUST - is there a reason to do otherwise (and are there security implications or client implications?)

35) - When would the nomatch expect to be triggered? Presumably it’s not firing off e.g. whenever anyone says anything that isn’t a DTMF when a DTMF input is configured? Can it trigger multiple times, or is it removed after a match?

36) 6.5.5 - I think the rules for what happens to the output when input begins aren’t defined. Although it’s implied that the output stops, does it continue again after input?

37) 6.5.6 says that there are options supplied, but the example shows none - should the text say they’re optional?

38) When there are joins involved, can’t there be multiple callers? If so, how does that affect e.g. "In send mode, only the audio sent by the caller is recorded.”?

39) Links like http://xmpp.org/extensions/xep-0327.html#def-component-record-initial-timeout seem to be dead ends.

40) Are x-skill and x-customer-id defined anywhere? I think the <header…/> stuff is new here (it doesn’t seem consistent with previous use of <header…/>). What are the rules for header here?

41) 6.6.2 - if the client can’t handle the call, what’re the other options than rejecting it? (MAY)

42) 6.8.1 - is feature-not-implemented an odd error to use for a protocol violation?

