[Standards] Message-IDs

Jonas Wielicki jonas at wielicki.name
Wed Feb 28 06:59:38 UTC 2018

On Monday, 26 February 2018 16:59:46 CET Simon Friedberger wrote:
> So, lest this discussion just die. Here is a proposal:
>   *
>     Client-A generates message-ID based on HASH(connection_counter,
>     server_salt). The connection_counter needs to be maintained only for
>     one connection. The server salt is server generated, anew for each
>     connection and is sent to the client.
>   *
>     Server-A checks that this is correct and uses it for MAM. This
>     should make life easier for clients because they only need to deal
>     with one ID.
>   * Two problems need to be considered here:
>       o The client needs to maintain a counter. I don't know if there
>         are cases where the client cannot persist this counter but keeps
>         a connection. In this case a sufficiently fine grained timestamp
>         to make it strictly monotonically increasing is sufficient. Even
>         though I called it a counter, it does not need to be contiguous.
>         It just needs to be increasing so that the server can easily check
>         that it is unique for a given salt value.
>       o The server needs to check the validity of the counter. If the
>         server is actually replicated and consists of multiple machines
>         this is not strictly possible. However, assuming normal
>         operations the IDs generated by the client will be fine and if
>         the servers have any mechanism for eventual consistency a
>         misbehaving client will be detected. I think this fits the XMPP
>         model of "robust cooperation".
>   *
>     Server-B gets the message via s2s. It changes the message-ID to a
>     new one and stores the original as "origin-ID".
>   *
>     Client-B gets a message with only TWO IDs. message-ID is for
>     referencing locally for MAM, origin-ID is for referencing when
>     talking to the sender i.e. read receipts.
>   *
>     If a server generates follow-up messages it makes up a new
>     sender-ID. It should maybe set a “triggered-by-ID” so the client can
>     determine that it triggered this message. Maybe this is unnecessary.
>     The server definitely must send the message it inserted back to the
>     client to ensure a common view of history.
>   *
>     If a server changes a message it can keep the sender-ID but it MUST
>     notify the client who sent the message to make sure that clients
>     have the same view of the history.
> In this proposal stanza-IDs are not required. The message-ID is
> authoritative and when rewriting the original message-ID is kept as
> origin-ID.
> From my original mail this solves C1, C2, C3, C4 and C5. Mostly just by
> defining them. This does not give us a global message-ID (C6) or
> unforgeable message-IDs (C7).
> Note, that I would prefer to have a globally unique ID. This is possible
> under the assumption that everybody tries to generate unique IDs and
> that non-unique IDs and misbehaving parties can be removed from the
> system. Essentially, it would look just like this except that the
> message-ID would have to include an ID for the originating server. That
> would allow recipients to check that connection_counter is increasing
> and the server_salt is unique for this server. The latter check might be
> hard to perform, though. It can still be solved using timestamps. This
> proposal seems much simpler, and it solves most of the problems.
> Also note, to make this a simpler change the clients could set both
> origin-ID and message-ID. The stanza-ID for MAM would turn out to be the
> same. This would be very similar to what is probably currently the most
> widespread behavior. Except that the origin-ID should be used for
> read-receipts, etc.
> Opinions?

I find the overall concept very appealing. Thank you for taking the time to 
work this out.

I think you overestimate some of the complexities there (which is good), for
example regarding clustering. If a server uses a 128-bit random number for the
server salt and we enforce the counter to be continuous and monotonic, I don’t
see any interaction between cluster nodes being needed.
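
To illustrate the clustering point, here is a minimal sketch of a per-connection
check that needs no state shared between nodes. It assumes SHA-256 as the HASH
function, an 8-byte big-endian encoding of the counter, and a counter that
starts at 0; none of these are specified in the proposal, and the names
(new_server_salt, message_id, ConnectionValidator) are hypothetical.

```python
import hashlib
import secrets
import struct


def new_server_salt() -> bytes:
    """Fresh 128-bit random salt, generated once per connection."""
    return secrets.token_bytes(16)


def message_id(counter: int, salt: bytes) -> str:
    """HASH(connection_counter, server_salt); SHA-256 is an assumption."""
    return hashlib.sha256(struct.pack(">Q", counter) + salt).hexdigest()


class ConnectionValidator:
    """State held only by the node that owns this client connection."""

    def __init__(self) -> None:
        self.salt = new_server_salt()
        self.expected = 0  # next counter value the client must use

    def accept(self, claimed_id: str) -> bool:
        # With a continuous counter there is exactly one valid next ID,
        # so the check is a single comparison against local state.
        ok = claimed_id == message_id(self.expected, self.salt)
        if ok:
            self.expected += 1
        return ok
```

Because the salt is random and tied to one connection, the node terminating
that connection can validate IDs by itself; a replayed or skipped counter value
simply fails the comparison.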

Likewise for the state keeping on the client side: If a client can keep a
connection, it should be able to keep an 8-byte counter state along with it.

What still needs to be specified is counter overflow. This could be handled
with a simple request from the client for a new salt.
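
A rough sketch of that overflow handling on the client side, under the same
assumptions as before (SHA-256, 8-byte big-endian counter). The
request_new_salt callback stands in for whatever protocol exchange would fetch
a fresh salt from the server; it is hypothetical and not part of any existing
XEP.

```python
import hashlib
import struct
from typing import Callable

MAX_COUNTER = 2**64 - 1  # largest value that fits in an 8-byte counter


class ClientIdGenerator:
    def __init__(self, salt: bytes,
                 request_new_salt: Callable[[], bytes],
                 start: int = 0) -> None:
        self._salt = salt
        self._request_new_salt = request_new_salt  # hypothetical exchange
        self._counter = start

    def next_id(self) -> str:
        # On overflow, ask the server for a new salt and restart at zero,
        # so a (counter, salt) pair is never used twice.
        if self._counter > MAX_COUNTER:
            self._salt = self._request_new_salt()
            self._counter = 0
        packed = struct.pack(">Q", self._counter)
        self._counter += 1
        return hashlib.sha256(packed + self._salt).hexdigest()
```

The server would have to reset its expected counter at the same moment it hands
out the new salt.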

I don’t see a good way to integrate the date into the message ID, though
(cc @Zash). Even if we let the server define a mandatory prefix, which it could
incidentally set to the date, we would still need a way to handle date changes
during a connection.

kind regards,