How Spammers Fool Bayesian Filters - And How to Stop Them

Effectively stopping spam over the long term requires much more than blocking individual IP addresses and creating rules based on keywords that spammers typically use. The increasing sophistication of spam tools, coupled with the growing number of spammers in the wild, has created a hyper-evolution in the variety and volume of spam. The old ways of blocking the bad guys just don't work anymore.

Examining spam and spam-blocking technology can illuminate how this evolution is taking place and what can be done to combat spam and reclaim e-mail as the efficient, effective communication tool it was intended to be.

One method used to combat spam is Bayesian filtering. Bayesian logic, named after the English mathematician Thomas Bayes, is used in decision making and inferential statistics. Bayesian filters maintain a database of known spam and ham (legitimate email). Once the database is large enough, the system ranks each word according to the probability that it will appear in a spam message.

Words more likely to appear in spam are given a high score (between 51 and 100), and words likely to appear in legitimate email are given a low score (between 1 and 50). For example, the words "free" and "sex" generally have values between 95 and 98, whereas the words "emphasis" or "disadvantage" may have a score between 1 and 4. Commonly used words such as "the" and "that", as well as words the filter has not seen before, are given a neutral score between 40 and 50 and are left out of the system's calculation.
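As a rough illustration of how such word scores might be derived, the sketch below counts how often each word appears in a small labeled set of spam and ham messages and scales the result to the 1-100 range described above. The function name `token_scores`, the sample messages, and the simple frequency ratio are assumptions made for illustration; real filters use larger corpora and additional smoothing.

```python
from collections import Counter

def token_scores(spam_msgs, ham_msgs):
    """Return a 1-100 spam score per word from two labeled message lists."""
    # Count each word once per message so long messages do not dominate.
    spam_counts = Counter(w for m in spam_msgs for w in set(m.lower().split()))
    ham_counts = Counter(w for m in ham_msgs for w in set(m.lower().split()))
    scores = {}
    for word in set(spam_counts) | set(ham_counts):
        # Fraction of spam and of ham messages that contain the word.
        spam_rate = spam_counts[word] / len(spam_msgs)
        ham_rate = ham_counts[word] / len(ham_msgs)
        prob = spam_rate / (spam_rate + ham_rate)
        # Clamp to the 1-100 scale used in the article.
        scores[word] = min(100, max(1, round(prob * 100)))
    return scores

# Hypothetical training data, far too small for real use.
spam = ["free sex pills", "free offer now"]
ham = [
    "the emphasis of the report",
    "free lunch at the meeting",
    "the disadvantage is cost",
]
scores = token_scores(spam, ham)
print(scores["sex"], scores["free"], scores["emphasis"])
```

In this toy data set, "sex" appears only in spam and lands at the top of the scale, "emphasis" appears only in ham and lands at the bottom, and "free" falls in between because it occurs in both.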

When the system receives an email, it breaks the message down into tokens, or words with values assigned to them. The system utilizes the tokens with scores on the high and low end of the range and develops a score for the email as a whole. If the email has more spam tokens than ham tokens, the email will have a high spam score. The email administrator determines a threshold score the system uses to allow email to pass through to users.
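The scoring step can be sketched as follows. This is a minimal example, not the method of any particular product: it assumes a hypothetical table of word scores, skips words in the neutral 40-50 band, and combines the remaining scores with the naive Bayes product rule that many open-source filters use. The names `spam_score`, `TOKEN_SCORES`, and `THRESHOLD`, and the threshold value of 90, are all assumptions.

```python
import math
import re

# Hypothetical word scores on the article's 1-100 scale.
TOKEN_SCORES = {"free": 97, "sex": 96, "emphasis": 2, "disadvantage": 3, "the": 45}
NEUTRAL_LOW, NEUTRAL_HIGH = 40, 50   # neutral band described in the text
THRESHOLD = 90                       # assumed administrator setting

def spam_score(message, token_scores=TOKEN_SCORES, default=45):
    """Combine the scores of non-neutral tokens into a 0-100 message score."""
    tokens = set(re.findall(r"[a-z']+", message.lower()))
    # Unknown words receive the neutral default and are then filtered out.
    scores = [token_scores.get(t, default) for t in tokens]
    scores = [s for s in scores if not NEUTRAL_LOW <= s <= NEUTRAL_HIGH]
    if not scores:
        return 50.0  # no evidence either way
    # Convert to probabilities, clamped away from 0 and 1.
    probs = [min(0.99, max(0.01, s / 100)) for s in scores]
    spam_like = math.prod(probs)
    ham_like = math.prod(1 - p for p in probs)
    return 100 * spam_like / (spam_like + ham_like)

def is_spam(message):
    return spam_score(message) >= THRESHOLD

print(is_spam("Free sex"))                          # spam-heavy tokens
print(is_spam("The emphasis on the disadvantage"))  # ham-heavy tokens
```

Raising `THRESHOLD` lets more borderline mail through to users, while lowering it blocks more spam at the cost of more false positives.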

Bayesian filters are effective at filtering spam and minimizing false positives. Because they adapt and learn based on user feedback, Bayesian filters produce better results the longer they are used within an organization. They are not, however, foolproof. Spammers have learned which words Bayesian filters consider spammy and have developed ways to insert non-spammy words into emails to lower a message's overall spam score. By adding paragraphs of text from novels or news stories, spammers can dilute the effect of high-ranking words. Text insertion has also caused normally legitimate words found in novels or news stories to acquire an inflated spam score, which may render Bayesian filters less effective over time.
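The dilution effect is easy to reproduce with a toy calculation. The sketch below assumes a hypothetical score table and the same naive Bayes product rule used by many filters; the words, scores, and the name `combined_score` are illustrative only.

```python
import math

# Hypothetical word scores on a 1-100 scale (high = spammy).
TOKEN_SCORES = {
    "free": 97, "sex": 96,               # typical spam words
    "emphasis": 2, "disadvantage": 3,    # typical ham words
    "chapter": 8, "harbour": 6,          # words lifted from a novel
}

def combined_score(words):
    """Naive Bayes combination of the known words' scores, scaled to 0-100."""
    probs = [TOKEN_SCORES[w] / 100 for w in words if w in TOKEN_SCORES]
    spam_like = math.prod(probs)
    ham_like = math.prod(1 - p for p in probs)
    return 100 * spam_like / (spam_like + ham_like)

pitch = ["free", "sex"]
padding = ["emphasis", "disadvantage", "chapter", "harbour"]

before = combined_score(pitch)            # the sales pitch on its own
after = combined_score(pitch + padding)   # the same pitch padded with prose
print(f"before padding: {before:.2f}, after padding: {after:.2f}")
```

With these assumed scores, four innocuous words are enough to pull a message from near the top of the scale to near the bottom, which is why padded spam can slip under an administrator's threshold.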

Another approach spammers use to fool Bayesian filters is to create less spammy emails. For example, a spammer may send an email containing only the phrase, "Here's the link?". This approach can neutralize the spam score and entice users to click on a link to a Web site containing the spammer's message. To block this type of spam, the filter would have to be designed to follow the link and scan the content of the Web site users are asked to visit. This type of filtering is not currently employed by Bayesian filters because it would be prohibitively expensive in terms of server resources and could potentially be used as a method of launching denial of service attacks against commercial servers.

As with any single-method approach to spam filtering, Bayesian filters are effective against certain techniques spammers use, but they are not a magic bullet for solving the spam problem. Bayesian filters are most effective when combined with other methods of spam detection.

The Solution

When used individually, each anti-spam technique has been systematically overcome by spammers. Grandiose plans to rid the world of spam, such as charging a penny for each e-mail received or forcing servers to solve mathematical problems before delivering e-mail, have been proposed with few results. These schemes are not realistic and would require a large percentage of the population to adopt the same anti-spam method in order to be effective. You can learn more about the fight against spam by visiting our website at www.ciphertrust.com and downloading our whitepapers.

Dr. Paul Judge is a noted scholar and entrepreneur. He is Chief Technology Officer at CipherTrust, the industry's largest provider of enterprise email security. The company's flagship product, IronMail, provides a best-of-breed enterprise anti-spam solution designed to stop spam, phishing attacks and other email-based threats. Learn more by visiting http://www.ciphertrust.com/products/spam_and_fraud_protection today.
