It does mean you can have a plain-text body, which is quite nice. I
think a client could simply decide that any jid-like "word" is in fact a
jid, fetch vCard data for it, and render it as the contact's name, and so
on. There would be false positives for email addresses and the like, of
course.

In other words, I might type:

I talked to Romeo today.

My client might note "Romeo" matched one of my contacts, and suggest I
meant "Romeo Montague". I affirm, and the message as sent is:

I talked to romeo@montague.example today.

A smart(ish) client then gets the vCard data and renders:

I talked to [Romeo Montague] today.

Rendering Romeo Montague as a hyperlink, perhaps with a tooltip (or
similar) with the avatar, jid, and so on.
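
As a rough sketch of that receiving-side heuristic, assuming the jid
appears in the body in its usual localpart@domain form: the pattern below
is deliberately loose rather than the real address grammar, and
VCARD_NAMES is a hypothetical stand-in for whatever vCard lookup the
client actually performs.

```python
import re

# Hypothetical stand-in for a vCard lookup; a real client would ask the server.
VCARD_NAMES = {"romeo@montague.example": "Romeo Montague"}

# Loose "jid-like word" pattern (localpart@domain). This is an assumption for
# illustration, not the full jid grammar, and it matches email addresses too.
JID_LIKE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def render(body: str) -> str:
    """Replace each known jid-like word with the contact's display name."""

    def replace(match: re.Match) -> str:
        jid = match.group(0)
        name = VCARD_NAMES.get(jid.lower())
        # Unknown addresses (for example a plain email address) stay as typed.
        return f"[{name}]" if name else jid

    return JID_LIKE.sub(replace, body)
```

With this, render("I talked to romeo@montague.example today.") gives the
bracketed form shown above, while an address with no vCard behind it is
left untouched.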

This uses no markup (and no additional inlined metadata), and no
negotiation is required, but it's reliant on heuristics. Sending:

Romeo's email address is romeo@hotmail.example

... might confuse things; luckily, it must be very rare for an email
address to look identical to a jid used by someone else.

On the plus side, caching would work well here.

