[Operators] How-to fight with SPAM accounts

Jesse Thompson jesse.thompson at doit.wisc.edu
Wed Dec 9 15:51:29 CST 2009

On 12/3/2009 3:02 PM, Peter Saint-Andre wrote:
> On 12/2/09 2:22 PM, Jesse Thompson wrote:
>> Peter Saint-Andre wrote:
>>> On 11/25/09 11:53 AM, Jesse Thompson wrote:
>>>> Peter Saint-Andre wrote:
>>>>>> I think that the key for the 'right/best' anti-SPAM XMPP solution
>>>>>> is to
>>>>>> involve regular/polite XMPP users in any way.
>>>>> I have my doubts that normal users will bother to flag messages as
>>>>> spam.
>>>>> However, given that I have only ever received a few spam messages over
>>>>> XMPP (and even those I wasn't 100% sure about), perhaps it would not be
>>>>> such a huge burden.
>>>> I like the idea of account level reputation.  The current, most
>>>> troublesome, battlefront on the war against email spam is dealing with
>>>> spammer-created freemail accounts,
>>> Most of the large, public XMPP IM services essentially offer "freechat"
>>> accounts. The use of CAPTCHAs at, e.g., jabber.org is a small hurdle.
>> CAPTCHAs won't stop them from creating accounts.
> Agreed. That's why I said the hurdle was small. :)

This idea is probably too elaborate, but I'll throw it out there since 
it would actually leverage CAPTCHAs nicely.

Would it be possible for the server to force the sender to solve a 
CAPTCHA for every "new" conversation between users that have not already 
authorized each other in their rosters?
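
As a rough sketch of that rule (not any existing server's API; the 
CaptchaGate class, the in-memory roster dict, and the example JIDs are 
all hypothetical), the server-side decision could look something like 
this in Python:

```python
# Hypothetical sketch: require a CAPTCHA for "new" conversations only.
# Rosters and solved challenges are kept in memory purely for illustration.

def bare_jid(jid: str) -> str:
    """Strip the resource part: 'user@host/resource' -> 'user@host'."""
    return jid.split("/", 1)[0].lower()


class CaptchaGate:
    def __init__(self, rosters: dict[str, set[str]]):
        # Maps a bare JID to the set of bare JIDs that user has authorized.
        self.rosters = rosters
        # (sender, recipient) pairs that have already solved a challenge.
        self.solved: set[tuple[str, str]] = set()

    def mutually_authorized(self, a: str, b: str) -> bool:
        """True only if each user has the other in their roster."""
        return b in self.rosters.get(a, set()) and a in self.rosters.get(b, set())

    def needs_captcha(self, sender: str, recipient: str) -> bool:
        s, r = bare_jid(sender), bare_jid(recipient)
        if self.mutually_authorized(s, r):
            return False  # existing relationship, never challenge
        return (s, r) not in self.solved

    def record_solved(self, sender: str, recipient: str) -> None:
        """Remember that this sender passed a challenge for this recipient."""
        self.solved.add((bare_jid(sender), bare_jid(recipient)))
```

How the challenge itself gets presented to the sender is left out here; 
the sketch only covers when one would be required.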

And here is another off-the-wall idea: is there some way to implement 
greylisting in XMPP?  The idea is to initially tempfail (if that is even 
possible in XMPP) a "conversation", but then accept it when the sending 
server retries.
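
A minimal sketch of the greylisting bookkeeping, borrowed from how it 
works for email.  It assumes the sending side really does retry after a 
temporary failure, which is exactly the open question for XMPP; the 
Greylist class and the 60-second delay are made up for illustration:

```python
import time


class Greylist:
    """Tempfail the first attempt from an unknown pair, accept later retries."""

    def __init__(self, min_delay=60.0, clock=time.monotonic):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # (sender, recipient) -> time of first attempt

    def check(self, sender, recipient):
        """Return 'tempfail' until min_delay has passed since the first attempt."""
        key = (sender.lower(), recipient.lower())
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"
```

A real deployment would also need to expire old entries and skip the 
check for pairs that are already in each other's rosters.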

Another idea: is there a way to implement content scanning?  If I 
get an unsolicited message from someone not in my roster, then can the 
client or server send the content to a service for classification?
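
To make the shape of this concrete, here is a small sketch.  The 
toy_classifier function is only a stand-in for whatever external 
classification service would actually be used, and the watchlist words 
and the 0.2 threshold are arbitrary assumptions:

```python
def toy_classifier(body: str) -> float:
    """Stand-in for a real service: fraction of words found on a tiny watchlist."""
    watchlist = {"viagra", "lottery", "winner"}
    words = body.lower().split()
    if not words:
        return 0.0
    hits = sum(word.strip(".,!?") in watchlist for word in words)
    return hits / len(words)


def handle_message(sender, roster, body, classify=toy_classifier, threshold=0.2):
    """Scan only unsolicited messages; roster contacts are delivered untouched."""
    if sender in roster:
        return "deliver"
    if classify(body) >= threshold:
        return "quarantine"
    return "deliver"
```

Passing the classifier in as a parameter means the same check could run 
in the client or on the server, with the scoring done elsewhere.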

>> Take a look at this
>> list of email "phishing reply dropbox" email addresses that we have been
>> collecting over the past year or two.
>> https://aper.svn.sourceforge.net/svnroot/aper/phishing_reply_addresses
> Nice. And some of those double as IM addresses.
>>>> and with phished account credentials
>>>> on closed systems.
>>> I think we've seen less of this on the XMPP network because we don't
>>> have very good web integration.
>> No, the phishers just ask the users to reply via email with their
>> account credentials.  The link above is a list of these reply
>> destination email accounts.
>> Or, they put up a web form somewhere.
>> You would be surprised how many users will give away their credentials
>> to anyone that asks.
> Sadly, you're probably right.
>>>> You could apply an account-level reputation system at the server as well
>>>> as the client.
>>>> An XMPP operator could set up the server to block domains whose
>>>> trustworthy account ratio is below their tolerance level.  This would
>>>> effectively block domains that have only spammers.  But it would not
>>>> block domains like jabber.org or gmail that are trustworthy but have
>>>> spammers signing up for free accounts.
>>> Agreed.
>>>> For spamming accounts in trustworthy domains, the server operator could
>>>> set it up to block accounts that meet a certain untrustworthiness
>>>> threshold.
>>> So when mydomain.com receives an inbound stanza from user at jabber.org, it
>>> would check the trust score of the sender?
>> yeah
> That could generate a lot of traffic. Perhaps it could be optimized to
> check only on the first message received in a chat session (although
> "chat session" is mostly undefined at the protocol level).

Yeah.  However, keep in mind that the server/client can inherently trust 
any traffic that is between users that have already authorized each 
other in their rosters.
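
Putting the pieces of this subthread together, a sketch of the check 
might look like the following.  The score tables, the 0.5 and 0.3 
thresholds, and the choice to treat unknown senders as trustworthy are 
all assumptions for the sake of the example, not a proposal for actual 
values:

```python
def bare_jid(jid):
    """Strip the resource part and normalize case."""
    return jid.split("/", 1)[0].lower()


def domain_of(jid):
    """Return the domain part of a JID."""
    return bare_jid(jid).rsplit("@", 1)[-1]


def should_block(sender, recipient_roster, account_scores, domain_ratios,
                 min_domain_ratio=0.5, min_account_score=0.3):
    """Decide whether to drop an inbound stanza from `sender`.

    account_scores: bare JID -> trust score in [0, 1]
    domain_ratios:  domain -> ratio of trustworthy accounts in [0, 1]
    Senders and domains with no data default to 1.0 (assumed trustworthy).
    """
    bare = bare_jid(sender)
    if bare in recipient_roster:
        return False  # roster contacts are trusted, no lookup needed
    if domain_ratios.get(domain_of(bare), 1.0) < min_domain_ratio:
        return True   # domain is mostly spammers
    return account_scores.get(bare, 1.0) < min_account_score
```

Because the roster test comes first, the reputation data would only be 
consulted for senders the recipient has no existing relationship with.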

>>>> Or, the users could do it at the client level.
>>> That seems like more work. See above about user laziness. :)
>> My thought was more that ambitious developers will be more able to
>> integrate it into the clients before it is adopted into server software
>> and deployed by the operators.  Think of it as a way to bridge the gap.
> I'm not opposed to both methods, although I think that development of
> clients and servers is about equal in speed these days.

It has more to do with necessity.  Right now, there aren't enough users 
of any one service who actually have a problem with spam to justify the 
service operator spending time implementing anti-spam.  Those users who 
do have a spam problem would gravitate to clients that support anti-spam 
in the meantime.

>> Anti-spam scanning was built into email clients well before it became
>> common on the server-side (around 2002.)  Once the servers caught up,
>> the client approach became less effective, but it is still useful in
>> some situations.
> Agreed. And the client-side approach might tie in nicely with rosters.
>>>> The key is to figure out how to collect and expose the data in a private
>>>> way.
>>> Your thoughts are welcome.
>>> Do you mean the scores need to be private, or the source data needs to
>>> be private?
>> I was initially thinking of a trust network: I trust someone who is
>> trusted by the people I trust.  I could then set it up so that people
>> who are very trustworthy are allowed to send me anonymous messages and I
>> will auto-authorize into my roster, someone who is completely foreign to
>> my trust network is blocked from sending messages, and various levels in
>> between.
>> Some of this data is already available within the server roster
>> databases, but otherwise it would have to be fed by opt-in contributors.
>>   The problem with this trust network approach is that the data could be
>> mined by spammers and phishers, so it would need to be kept private
>> somehow.
>> Otherwise, traditional DNSBLs (specifically, URIBLs of JIDs) are the way
>> to go.  It might be possible to work with the existing DNSBL providers
>> to create a new blacklist of JIDs.
> Yes, that's worth exploring, though I'd like a way to query it in XMPP
> and not over DNS.

As an analogy, with regard to cross-network IM, we have some clients that 
rely on transports, and others that implement the protocol directly. 
The DNS part of the implementation should be relatively easy since 
clients already look up SRV records, but the UI would be non-trivial 
regardless of how the client does the query.  If it's implemented using 
service discovery, then the user could configure their client to use an 
external anti-spam service just like they can use an external transport 
today.  But there has to be someone willing to run the service.

How the anti-spam plugin/service is implemented depends on where the 
data is stored.  If you want to leverage existing URIBLs, then the 
plugin would have to be capable of querying via DNS.  Are DNSBLs still 
using DNS because it is the best way to do it, or is it just legacy? 
That's something I'm not sure of.
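
For reference, a DNS-based lookup of a JID could be sketched as below. 
It follows the usual DNSBL convention (a listed entry resolves to an 
address in 127.0.0.0/8, an unlisted one does not resolve at all).  The 
zone name jidbl.example and the idea of hashing the bare JID into a 
single DNS label are assumptions; no such blacklist exists as far as 
this thread is concerned:

```python
import hashlib
import socket


def jid_query_name(jid, zone="jidbl.example"):
    """Build the DNS name to query for a JID (hypothetical encoding).

    The bare JID is hashed so that '@' and other characters never have
    to appear in a DNS label; a SHA-1 hex digest is 40 characters,
    which fits within the 63-character label limit.
    """
    bare = jid.split("/", 1)[0].lower()
    digest = hashlib.sha1(bare.encode("utf-8")).hexdigest()
    return f"{digest}.{zone}"


def is_listed(jid, zone="jidbl.example", resolve=socket.gethostbyname):
    """Return True if the blacklist zone has an entry for this JID."""
    try:
        answer = resolve(jid_query_name(jid, zone))
    except socket.gaierror:
        return False  # name did not resolve: not listed
    return answer.startswith("127.")
```

The resolve parameter is there so that the same check could be pointed 
at something other than DNS later, for example a lookup made over XMPP.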

I kind of wish that our service actually got spam so that I could try 
out some of these ideas. :-)  Spammers: hit me!


> Peter

   Jesse Thompson
   Division of Information Technology, University of Wisconsin-Madison
   Email/IM: jesse.thompson at doit.wisc.edu
