I'm a SpamAssassin Evangelist

March 30, 2006

When I started web hosting many years ago, I had to battle spam for other people for the first time. No longer did I have time to write to every ISP, trace every IP address, send nasty-grams, and get lots of confirmations back from sysadmins that accounts had been suspended or cancelled. I ended up hosting a friend's domain and he got more spam on his Email accounts alone than ALL of my other hosting clients *combined* - and I had over 100 clients at the time.

As I mention in my spam report, the day I set up SpamAssassin, even without training it, my friend's Inbox slowed to a trickle of a few pieces of legitimate Email. He called me asking if the server was down, because normally he'd spend a half hour downloading 800-1000 messages every time he logged in, most of them junk. As soon as I started training SpamAssassin, I was hooked.

When I stopped my full-time web hosting, and entrusted w98.us to the nice folks at LunarPages, one of the big selling points was the availability of SpamAssassin. I quickly applied the knowledge I had about training SpamAssassin, along with some clever Perl scripting, and wrote a tutorial on how to train SpamAssassin which [LunarPages.com quickly picked up as a how-to guide](http://www.lunarforums.com/web-hosting-tutorials-faqs-and-resources/how-to-train-spamassassin-updated-april-27-2010/) and later posted as a 'sticky' at the top of their 'Email' help forum.

A former coworker of mine from PriceGrabber.com was talking to me about SpamAssassin one day, and told me of a tutorial he'd seen at his hosting provider that was really helpful - turns out he hosts with LunarPages too. We got a good chuckle out of the realization that I was the one that wrote that article.

Anyhow, today that same coworker sent me a link for a new anti-spam gadget called a "Spam Cube", asking if it was a front-end to SpamAssassin. I decided to do a little homework on this device to see what it was and how it operated. I couldn't read the NY Times article he'd sent since I don't have a NY Times username, so I googled "Spam cube", found spamcube.com, and found some interesting stuff there:

For starters, their own site seemingly contradicts itself. They say repeatedly that there's no subscription fee, but say in numerous places that you have to pay them $52/year. According to Google, "subscription" can mean "In a digital library, a payment made by a person or an organization for access to specific collections and services, usually for a fixed period, eg, one year." After a fair bit of digging, the $52/year is for the anti-virus protection on messages and the site tells you in one place that it's an optional charge. Everywhere else on the site made me believe the $52/year was a mandatory payment.

And I wouldn't say it's a front-end for SpamAssassin. It's probably still using some sort of heuristic analysis of what makes an Email "spammy". They claim they don't use Bayesian filtering, so they probably have their own version of a heuristics database where they scan spam messages on their end, and upload those details to every SpamCube box. Not terribly efficient, since what I call spam may differ from what you'd call spam. Even my own wife and I disagree on what spam is - a newsletter we get as part of being an online shopper at walmart.com is legitimate non-spam (aka 'ham') to me, but tends to get flagged as spam by Elizabeth. Sure, you can catch common phrases and common obfuscations of words, like how leet-speak substitutes numbers for letters, but a shared database can't know which messages each person actually wants.

What I find funny on their site are quotes like this:

Our technology is a more sophisticated, more accurate and more affordable method of stopping spam.

Hmm, they claim to be more affordable... the first year will cost you $150+tax plus $52/year (likely also plus tax). I spent exactly $0 to get a copy of SpamAssassin. It costs me approximately $0 to maintain SpamAssassin at LunarPages, and takes about 5 seconds for me to open a bookmark to a series of Perl scripts I have in place to juggle my spam/ham Email around and train SpamAssassin in the background. It costs me exactly nothing to install, set up, and maintain SpamAssassin.

Now, I'm not a math major like my buddy Jorge, but $0 plus $0 plus $0 *should* equal $0, unless you get that "new math" involved, then heaven only knows the answer...

Plus they readily advertise that their system will filter mail for only 4 computers, though in fairness, for unlimited Email addresses on those computers. SpamAssassin can run on one central system at an ISP to protect hundreds of thousands of Email accounts per system. Again, for free.

My guess (since they don't divulge any real technical information) is that the SpamCube watches traffic on known unsecured Email ports (25, 143), catches the message mid-stream, analyzes it, and doesn't deliver the message if it seems too spammy. Since at the $150 price it's unlikely to decrypt SSL traffic on the fly, I can't see how it would monitor ports 993 and 995 for SSL-enabled IMAP or POP3 respectively, which I use almost exclusively. And there's no description on their site about what it does with the messages it decides need to be tossed out, so what happens to false-positives? Do you log into this little box on a dynamic IP address and configure it with all of your Email account information and let it download your messages? SpamCube's web site was extremely vague about how the machine operates, because the patent application is still pending.
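To make the port argument concrete, here is a minimal sketch using the standard mail port assignments (25 for SMTP, 110 for POP3, 143 for IMAP, 993 for IMAP over SSL, 995 for POP3 over SSL). The function name and the idea that the box is a passive, non-decrypting device are assumptions based on the guess above, not anything SpamCube has published.

```python
# Standard mail ports and whether the session is encrypted from the start.
# Simplification: this ignores STARTTLS, which can upgrade a plain session.
MAIL_PORTS = {
    25: ("SMTP", False),
    110: ("POP3", False),
    143: ("IMAP", False),
    993: ("IMAP over SSL", True),
    995: ("POP3 over SSL", True),
}


def readable_by_passive_box(port):
    """Return True if a box that cannot decrypt SSL could read mail on this port.

    Hypothetical helper: it only models the guess that the device sniffs
    traffic in transit without holding any keys.
    """
    _protocol, encrypted = MAIL_PORTS[port]
    return not encrypted


for port, (protocol, _encrypted) in sorted(MAIL_PORTS.items()):
    verdict = "can inspect" if readable_by_passive_box(port) else "cannot inspect"
    print(f"port {port} ({protocol}): {verdict}")
```

If that guess is right, anyone fetching mail only over ports 993 or 995 would get no filtering from an in-line device of this kind.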

The biggest difference between this device and SpamAssassin is that this device requires you to download the spam messages for analysis, whereas SpamAssassin does it when the messages are delivered to your ISP and keeps them from ever entering your mailbox. In that regard alone, the Spam Cube is no better at fighting spam than spam-filtering software running on your PC itself. Guys like my buddy Darin would still have to wait for their DSL lines to fetch 800-1000 messages every few hours, most of them spam, just to analyze them.

I'm betting that if you buy one of these, and plug it in, and have access to your Email account via webmail as well, that you'll see the spam messages in your Inbox via webmail, but that the SpamCube removes them on-the-fly as it downloads them. SpamAssassin would never let the message in there to begin with.

And with the tutorial I wrote for LunarPages on spam management, there's no reason why people can't run a script (or even set a cron job to do it for them) to maintain their own SpamAssassin training like I do in about 5 seconds every few days.
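As a rough illustration of what such a training script boils down to, here is a minimal sketch in Python (not the Perl scripts from the tutorial). It only builds the `sa-learn` command lines, using the real `--spam` and `--ham` options; the folder paths and the function name are hypothetical and would need to match wherever your host stores sorted mail.

```python
import shlex


def build_training_commands(spam_dir, ham_dir):
    """Build the sa-learn invocations for one training pass.

    spam_dir and ham_dir are hypothetical folders holding messages that
    have already been sorted by hand into spam and legitimate mail.
    """
    return [
        ["sa-learn", "--spam", spam_dir],  # teach the filter what spam looks like
        ["sa-learn", "--ham", ham_dir],    # teach it what legitimate mail looks like
    ]


# Example paths are assumptions, not LunarPages' actual layout.
commands = build_training_commands("/home/user/mail/spam", "/home/user/mail/ham")
for command in commands:
    print(shlex.join(command))
```

Each printed line could be run by hand or handed to `subprocess.run`, and a cron job could call the script every few days so the training happens without you thinking about it.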

So far, not worth the money in my opinion.

They claim 1 in ~10000 Emails will be flagged as a false positive, and that they detect spam with 96%-99% efficiency. Just yesterday (March 29 2006), SpamAssassin blocked 402 spam messages from my mailbox (see http://www.w98.us/spam/ for a running report on spam at my w98.us domain). At their advertised efficiency rate, the Spam Cube would have let 4 to 16 spam messages get through. That's about 3 to 15 messages too many. I can handle about one piece of junk mail slipping through, and that's with a lowered point setting of 3.5 in my SpamAssassin configuration where the default setting is 5.0.
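The arithmetic behind that estimate can be checked in a few lines. This is just a sketch that applies the advertised 96%-99% detection range to the 402 messages blocked that day; the function name is made up for illustration.

```python
def missed_spam(total_spam, detection_rate_percent):
    """Number of spam messages expected to slip past a filter."""
    return total_spam * (100 - detection_rate_percent) / 100


blocked_yesterday = 402
best_case = missed_spam(blocked_yesterday, 99)   # 99% caught, 1% missed
worst_case = missed_spam(blocked_yesterday, 96)  # 96% caught, 4% missed

print(round(best_case), "to", round(worst_case), "spam messages would get through")
```

That works out to roughly 4 messages at the 99% end of the range and roughly 16 at the 96% end.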

The clincher though: SpamAssassin let exactly zero spam messages into my mailbox yesterday, after scanning 402 messages. And on March 15 2006 I had the first false-positive message get flagged as spam in over two years. It only got flagged because it was a commercial Email about Roxio software I was eligible to upgrade, and they wrote in the message that it was a 'one time' upgrade sale, which added 1.6 points, pushing it over my 3.5 threshold with a total score of 3.7. One message ... I get ~100 legitimate Emails per day, so at their "1 in ~10000" false-positive rating, I'd have lost maybe 10 Emails in that time frame. That's about 10 too many.
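Here is a small sketch of both calculations in that paragraph: how a score is compared against the threshold, and how many false positives the advertised rate implies. It assumes a message is flagged once its score reaches the threshold, and it takes "over two years" as roughly 730 days; the function names are illustrative only.

```python
def is_flagged(score, threshold):
    """A message is treated as spam once its score reaches the threshold."""
    return score >= threshold


def expected_false_positives(mail_per_day, days, one_in):
    """Legitimate messages expected to be wrongly flagged at a '1 in N' rate."""
    return mail_per_day * days / one_in


roxio_score = 3.7          # total score of the Roxio message
one_time_points = 1.6      # points added by the 'one time' wording

print(is_flagged(roxio_score, 3.5))                    # flagged at the lowered 3.5 setting
print(is_flagged(roxio_score, 5.0))                    # not flagged at the default 5.0
print(is_flagged(roxio_score - one_time_points, 3.5))  # not flagged without the phrase

# ~100 legitimate Emails per day for about two years (assumed 730 days).
print(expected_false_positives(100, 730, 10000))
```

With exactly 730 days the estimate comes out a little over 7 lost messages, the same ballpark as the rough "maybe 10" figure, and it grows the further past two years you count.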

Granted, if a customer simply signs up for DSL or Cable modem and they get their free mom.and.dad@someisp.net Email address included, then yes, the Spam Cube could be a good investment. However, I've stated it before and I'll say it again: I believe that ISPs and Hosting Providers have a moral obligation to provide free, effective countermeasures to spam for all clients. They keep complaining about bandwidth costs - so why not block Darin's spam from ever being downloaded in the first place?

Overall, I don't feel that the SpamCube is going to be any more efficient or practical for blocking spam than running software on your PC that does the same thing, like Symantec's anti-spam solution as one example. The only differences are that you'd have one less software package running on your PC (and in a Windows world, less is more), and that running any sort of application in hardware instead of software (like firewalls) will typically be many times faster.