August 7, 2009

Hotmail and Google Docs being abused for spam

Historical

Rob Mueller

Founder & CTO

A user forwarded me a particularly annoying bit of spam the other day that I realised is going to be quite hard to combat.

The email was sent from a Hotmail account. Clearly the spammers have
broken the Hotmail CAPTCHA process (again), and thus are signing up
tens of thousands of accounts or more to send their spam. The main
issue is that there’s no easy “source IP” to test against RBLs for
blocking or scoring purposes. Hotmail does add an “X-Originating-IP”
header, but that’s non-standard, and in the cases I’ve seen the IPs
are not on any known blacklists.

This actually seems quite an effective process for spammers: use
newly compromised spambot machines only to send via reputable
services like Hotmail, Yahoo, etc. I believe most RBLs are built
using systems that only check the original incoming SMTP connection
(either at the SMTP stage, or via some feedback process that later
scans back through the Received headers). They generally don’t look
at custom headers like “X-Originating-IP”. So even if spam checking
software does check that header, not much RBL building software
will. As long as the spammer only uses those IPs for sending via
other “trusted” services, the IPs will probably stay off RBLs for a
long time.
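
To make the idea concrete, here is a rough Python sketch of how a spam filter might pull the address out of the X-Originating-IP header and turn it into a DNSBL query name (reversed octets followed by the list's zone). It is only an illustration under some assumptions: the zone dnsbl.example.org and all function names are made up, and the header value is assumed to be a bare IPv4 address, optionally wrapped in square brackets.

```python
import email
import ipaddress
import socket


def originating_ip(raw_message):
    """Return the IPv4 address found in X-Originating-IP, or None."""
    msg = email.message_from_string(raw_message)
    value = msg.get("X-Originating-IP")
    if value is None:
        return None
    # Assumption: the value looks like "[192.0.2.10]" or "192.0.2.10".
    candidate = value.strip().strip("[]")
    try:
        addr = ipaddress.ip_address(candidate)
    except ValueError:
        return None
    # This sketch only handles IPv4.
    return addr if addr.version == 4 else None


def dnsbl_query_name(addr, zone):
    """Build the DNS name to look up: reversed octets plus the list zone."""
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(addr, zone):
    """Return True if the DNSBL zone has an A record for this address.

    Any resolution failure (including NXDOMAIN) is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(addr, zone))
        return True
    except socket.gaierror:
        return False
```

For a message carrying `X-Originating-IP: [192.0.2.10]`, `dnsbl_query_name(originating_ip(raw), "dnsbl.example.org")` gives `10.2.0.192.dnsbl.example.org`. Of course, this only helps if the lists being queried have actually been fed data from that header, which is exactly the gap described above.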

Given the constant battle Hotmail, Yahoo, Gmail, etc have stopping
mass signups, CAPTCHAs’ days seem numbered. Already in some cases,
Google have started requiring SMS verification for new Gmail
accounts. I expect this trend to spread to other services and
companies over time, as the CAPTCHA systems employed to try and stop
abuse appear to be less and less effective every day.

The email contained a bunch of random text. That’s also not unusual,
but it makes any content analysis basically impossible.

The email contained a link to a public Google Docs page. Again,
clearly spammers have broken the Google CAPTCHA process to sign up
masses of Google Docs accounts and fill them with their spam landing
pages. Again this means that URIBLs are ineffective against these
types of emails, because they can’t go and block Google Docs domains.

The net result was that the emails in question contained very little information to block against. Some composite rules could be created (e.g. from a Hotmail account, with a Google Docs link in it), but they’re clearly far too broad and likely to result in many false positives.
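
As a rough illustration of why such a composite rule is too broad, here is a minimal Python sketch of the "Hotmail sender plus Google Docs link" check. The function names, the `@hotmail.com` suffix test and the `docs.google.com` URL pattern are all assumptions made for the example, not rules from any real filter.

```python
import email
import re
from email.utils import parseaddr

# Assumption: Google Docs links can be recognised by this host name.
GOOGLE_DOCS_LINK = re.compile(r"https?://docs\.google\.com/", re.IGNORECASE)


def body_text(msg):
    """Concatenate the decoded payload of every non-multipart part."""
    chunks = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True)
        if payload is not None:
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def matches_composite_rule(raw_message):
    """True if the From address is at hotmail.com and the body links to Google Docs."""
    msg = email.message_from_string(raw_message)
    _, sender = parseaddr(msg.get("From", ""))
    from_hotmail = sender.lower().endswith("@hotmail.com")
    has_docs_link = GOOGLE_DOCS_LINK.search(body_text(msg)) is not None
    return from_hotmail and has_docs_link
```

A perfectly ordinary email from a friend on Hotmail sharing their notes via a Google Docs link matches this rule just as well as the spam does, which is the false positive problem in a nutshell.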

At the moment, the main things we can do about this are:

Report the emails as spam to providers like Spamcop and others. This
should both reflect badly on the services that are being abused and
encourage improvements, so that those providers do look for
X-Originating-IP headers and the like to help build IP RBLs.

Report the Google Docs pages as abuse. I’d hope Google have good
internal systems to handle this, so that if a bunch of pages are
reported as abuse, they can track down similar pages and disable
them and the associated signups as well.