2010-05-01
Abstract
On the first anniversary of the VBSpam comparative review VB's team tested a record 20 full anti-spam solutions, together with one reputation blacklist. The number of VBSpam awards earned also reached a record high of 18. Martijn Grooten has the details.
Copyright © 2010 Virus Bulletin
This month we celebrate the first anniversary of the VBSpam anti-spam comparative review. Since we finished the first test around this time last year, somewhere in the order of 50 trillion (50,000,000,000,000) spam messages have been sent around the world. If this number weren’t sufficient to prove the extent of the spam problem, then the fact that for one day last month the members of the VB team were forced to cope without a working spam filter surely did: not only was this annoying, frustrating and distracting, but in having to delete hundreds of unwanted messages manually we ran a serious risk of accidentally deleting legitimate email from our inboxes.
Thankfully, there are plenty of solutions available to fight the spam problem and the number of solutions on offer is growing. This month, we tested a record 20 full anti-spam solutions, together with one reputation blacklist. And in a test which saw the methodology changed in several ways, the number of VBSpam awards earned also reached a record high of 18.
The core of the methodology, with all products being tested in parallel and in real time, has not changed from previous tests and can be found at http://www.virusbtn.com/vbspam/methodology/. What has changed is that for the first time we have been able to compute a ‘pre-DATA’ spam catch rate for some products.
In an SMTP transaction, the contents (header and body) of an email are preceded by a DATA command. Before this command is sent, however, the sender has already identified itself via its IP address and informed the recipient’s mail server of the domain name it claims to be sending from (EHLO/HELO), the email address of the sender (MAIL FROM) and that of the intended recipient(s) (RCPT TO). Using this information, many spam filters are capable of recognizing (suspected) spam and can thus block an email before its contents have been sent. This is important because it can save significant network resources; it can also greatly reduce the number of emails in spam folders or quarantines, thus making the task of finding false positives a lot easier.
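To illustrate, the sketch below (using Python’s standard smtplib, with hypothetical host names and addresses) walks through the envelope phase of an SMTP transaction; everything shown is available to a filter before the DATA command, and therefore before any message content, is transmitted.

```python
# Sketch of the envelope ('pre-DATA') phase of an SMTP transaction, using
# Python's standard smtplib; host names and addresses are hypothetical.
import smtplib

smtp = smtplib.SMTP('mx.example.com', 25)    # the connecting IP address is already visible
smtp.ehlo('mail.example.org')                # EHLO/HELO: the domain the sender claims to be
code, resp = smtp.mail('alice@example.org')  # MAIL FROM: the envelope sender
code, resp = smtp.rcpt('bob@example.com')    # RCPT TO: the intended recipient
# A pre-DATA filter can reject the message at any of these steps (typically
# with a 5xx response), so the DATA command is never issued and the headers
# and body never cross the wire.
smtp.quit()
```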
Since all emails in our set-up are relayed through our MTA, products see all the email coming from a fixed IP address. This means that some tweaks have to be made to products for them to filter email pre-DATA. Two products were set up to filter pre-DATA using XCLIENT, a little-known but extremely useful SMTP extension that allows a trusted relay to pass the original sender’s connection details on to the receiving server. Meanwhile, the nature of another product, Spamhaus, is such that most of its filtering happens pre-DATA already, even in our test set-up.
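The sketch below shows, again with hypothetical host names and addresses, how a relay such as ours could hand the original connection details to a filtering server using XCLIENT; the attribute names follow Postfix’s documentation of the extension, and the receiving server must of course be configured to accept XCLIENT commands from the relay.

```python
# Sketch of a relay using the XCLIENT extension (as documented for Postfix)
# to pass the original client's connection details to a downstream filter;
# host names and addresses are hypothetical, and the filtering server must
# be configured to accept XCLIENT from this relay.
import smtplib

relay = smtplib.SMTP('filter.example.com', 25)
relay.ehlo('relay.example.net')
# Hand over the IP address and host name of the machine that originally
# connected to us, so the filter can apply its pre-DATA checks to them.
code, resp = relay.docmd('XCLIENT', 'ADDR=192.0.2.45 NAME=host45.example.org')
relay.ehlo('mail.example.org')               # re-issue EHLO on behalf of the original client
relay.mail('alice@example.org')
relay.rcpt('bob@example.com')
relay.quit()
```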
It should be added that most, if not all, other products are capable of blocking email pre-DATA in a real environment; the fact that they either chose not to or were unable to use pre-DATA filtering here is due to the limitations of our test environment. All products in the test, regardless of whether they used pre-DATA filtering, have been provided with the same information for every email.
As in previous tests, the products that needed to be installed on a server were installed on a Dell PowerEdge R200, with a 3.0GHz dual core processor and 4GB of RAM. The Linux products ran on SuSE Linux Enterprise Server 11; the Windows Server products ran on either the 2003 or the 2008 version, depending on which was recommended by the vendor. (It should be noted that most products run on several different operating systems.)
To compare the products, we calculate a ‘final score’, defined as the spam catch (SC) rate minus three times the false positive (FP) rate. Products earn VBSpam certification if this value is at least 96:
SC - (3 x FP) ≥ 96
(In previous reports I had added a % sign after the number 96; some readers have pointed out that this value should not have a unit.)
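As a worked example of the formula, with figures chosen purely for illustration rather than taken from any product in this test:

```python
# Worked example of the final score calculation; the figures are chosen
# purely for illustration and are not taken from any product in this test.
def final_score(sc_rate, fp_rate):
    """Spam catch rate minus three times the false positive rate."""
    return sc_rate - 3 * fp_rate

def earns_vbspam(sc_rate, fp_rate, threshold=96):
    return final_score(sc_rate, fp_rate) >= threshold

print(final_score(99.5, 0.5), earns_vbspam(99.5, 0.5))   # 98.0 True  -> certified
print(final_score(99.0, 1.5), earns_vbspam(99.0, 1.5))   # 94.5 False -> no award
```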
The test ran from 12:00pm BST on 7 April 2010 to 9:30am on 26 April 2010. The corpus contained 249,511 emails, 247,315 of which were spam, while the other 2,196 were ham. The former were all provided by Project Honey Pot and the latter consisted of the traffic to several email discussion lists.
For a number of these discussion lists, the emails were automatically reconstructed to the state they were in when they were received by the list server: headers added by the list software were removed and EHLO/HELO, MAIL FROM and the sending IP address were reconstructed to contain their original values. This increased the number of effective senders in the ham corpus from a small number of list servers to a large number of email users from all over the world.
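The sketch below gives a rough idea of the kind of rewriting involved; the header names and the regular expression are illustrative only and do not describe the actual scripts used, which depend on the list software in question.

```python
# Simplified sketch of the kind of rewriting involved: list-server headers
# are stripped and the original envelope values recovered. The header names
# and the regular expression are illustrative only; the actual scripts used
# depend on the list software in question.
import email
import re

LIST_HEADERS = ('List-Id', 'List-Post', 'List-Unsubscribe', 'X-BeenThere')

def reconstruct(raw_message: bytes):
    msg = email.message_from_bytes(raw_message)
    for header in LIST_HEADERS:          # drop headers added by the list software
        del msg[header]
    # Recover the sending IP address and HELO name from the Received header
    # added when the list server accepted the message.
    received = msg.get_all('Received', [''])[0]
    m = re.search(r'from (\S+) \(.*?\[([0-9.]+)\]\)', received)
    helo_name, sending_ip = (m.group(1), m.group(2)) if m else (None, None)
    envelope_sender = msg.get('Return-Path', '').strip('<>')
    return msg, helo_name, sending_ip, envelope_sender
```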
Some of these discussion lists used a language of communication other than English; some even used different character sets, in particular Greek and Russian. Several products had a hard time with these, especially the Russian emails. For many an organization it may be a good idea to block all emails using the Cyrillic script, simply because no recipient is able to read them and, as a significant portion of spam these days is aimed at Russian users, this will automatically block a lot of spam too. However, we run our tests for a hypothetical organization that may have Russian employees and/or customers, and may even be based in Russia; hence we believe these emails should be identified correctly, just as emails in French and English should be. Participants had been given advance warning about the inclusion of Russian email in the ham corpus and they were given the chance to adjust their products’ settings if needed.
In previous tests, we have used our own email: both the legitimate email and the spam sent to @virusbtn.com addresses. Virus Bulletin is a real company, with real employees who receive real emails and who do not wish to see spam in their inboxes. This made the email corpus eminently suitable for testing purposes. However, privacy and confidentiality issues have meant that we have been unable to share the full details of these emails with participants, and this has become more and more of an issue. One of the purposes of the VBSpam tests is to help developers improve their products, and to do this, they need the full details of any legitimate emails they have accidentally blocked. We therefore decided to stop using our own email in the tests. However, for interest, details of products’ performance on the VB corpus (which consisted of 1,398 legitimate and 20,829 spam messages) have been included in this month’s results; these measurements did not count towards the VBSpam award.
Comparing products’ performance on the VB spam corpus against their performance on the Project Honey Pot corpus suggests that the latter is significantly easier to filter. We cannot stress enough that the spam catch rates and false positive rates in this test should be considered within the context of the test and in comparison with other products’ rates in the test, not as absolute numbers. For those who are tempted to think the Project Honey Pot corpus is ‘too easy’, it is good to know that more than 11% of the emails in this corpus were let through unblocked by at least one product.
As in the previous test, for each product no more than four false positives were counted per sender. Unlike in previous tests, emails that claimed to have been sent from @virusbtn.com addresses were not removed from the corpus; some organizations cannot afford to block email sent from their own domain, even from unknown sources. Products were not allowed to automatically block all email with senders on the @virusbtn.com domain, even though the ham corpus did not contain such emails. Finally, the ‘image spam’ and ‘large spam’ categories referenced in the test results are, respectively, spam messages containing at least one inline image, and those with a body size of over 50,000 bytes.
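For clarity, the sketch below shows one simple way these two categories could be determined from a raw message; the test’s own measurements may differ in detail.

```python
# Minimal sketch of how the two sub-categories could be determined from a
# raw message; the test's own measurements may differ in detail (here, any
# image MIME part counts as an inline image).
import email

def is_image_spam(raw_message: bytes) -> bool:
    msg = email.message_from_bytes(raw_message)
    return any(part.get_content_maintype() == 'image' for part in msg.walk())

def is_large_spam(raw_message: bytes, limit: int = 50000) -> bool:
    # The body is everything after the blank line that ends the headers.
    header_end = raw_message.find(b'\r\n\r\n')
    body = raw_message[header_end + 4:] if header_end >= 0 else raw_message
    return len(body) > limit
```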
SC rate: 99.55%
SC rate (VB corpus): 91.15%
SC rate (image spam): 99.61%
SC rate (large spam): 99.78%
FP rate: 0.14%
FP rate (VB corpus): 0.50%
Final score: 99.14
BitDefender has been submitting its product for testing since the very first anti-spam test, and the developers’ trust in their product has been rewarded with six VBSpam awards to date. They can now add a seventh award to their collection, making BitDefender the only product to have won an award in every single VBSpam test; moreover, with a very decent and somewhat improved spam catch rate and just three false positives, it achieved the second highest final score in this test.
SC rate: 98.01%
SC rate (VB corpus): 92.85%
SC rate (image spam): 95.88%
SC rate (large spam): 96.46%
FP rate: 0.23%
FP rate (VB corpus): 1.36%
Final score: 97.32
Fortinet’s FortiMail has won a VBSpam award without difficulty on each of the five occasions it has participated in our tests, but with spammers continually changing their tactics, previous accolades do not guarantee future success. However, Fortinet’s Canadian developers have been working hard to keep up to date with current trends in email and spam, and with another decent spam catch rate and just a handful of false positives, their hard work is rewarded with the product’s sixth VBSpam award.
SC rate: 98.01%
SC rate (VB corpus): 89.32%
SC rate (image spam): 97.84%
SC rate (large spam): 98.18%
FP rate: 0.00%
FP rate (VB corpus): 0.22%
Final score: 98.01
Anyone who thinks that the addition of Russian-language emails to the ham corpus would give Kaspersky an easy time (after all, the product is developed in Russia) should know that the vast majority of legitimate emails in the test were in other languages. Regardless of their language, however, all legitimate emails were correctly identified as such by Kaspersky which, combined with a decent spam catch rate, means that the product wins its sixth VBSpam award.
SC rate: 99.95%
SC rate (pre-DATA): 98.44%
SC rate (VB corpus): 96.91%
SC rate (image spam): 99.91%
SC rate (large spam): 99.85%
FP rate: 0.27%
FP rate (VB corpus): 0.86%
Final score: 99.13
Italian company Libra develops the anti-spam product Libra Esva (Esva being an acronym for Email Security Virtual Appliance) – a new face in our product line-up. Once we had started the product in VMware, the set-up process was quick and straightforward: answering a few questions in a web interface was enough to get the product up and running. Further adjustments can be made in the web interface – potential users should not be put off by the company’s Italian website: the product itself uses clear and simple English.
Libra Esva was one of the products set up to use pre-DATA filtering and blocked an impressive 98.37% of spam pre-DATA. Even more impressive was the fact that the subsequent content filtering lifted the spam catch rate to 99.95% – better than any other product in this test. A mere handful of false positives meant that the product achieved the third best final score and wins a VBSpam award on its debut appearance.
SC rate: 99.14%
SC rate (VB corpus): 95.40%
SC rate (image spam): 99.73%
SC rate (large spam): 99.78%
FP rate: 0.23%
FP rate (VB corpus): 0.50%
Final score: 98.46
Like most products this month, M86’s MailMarshal struggled with a few false positives on legitimate Russian email. But with a spam catch rate of well over 99% and a decent final score, the product easily wins another VBSpam award.
SC rate: 99.33%
SC rate (VB corpus): 94.48%
SC rate (image spam): 99.52%
SC rate (large spam): 99.74%
FP rate: 0.50%
FP rate (VB corpus): 0.72%
Final score: 97.83
As a product that had a relatively hard time filtering legitimate email from Eastern Europe in the previous test, McAfee’s Email Gateway might have been expected to struggle with the addition of Russian emails to the ham corpus. However, McAfee’s developers took measures to reduce those false positives and, while there were still some FPs, there were nowhere near enough to deny the product another VBSpam award.
SC rate: 98.84%
SC rate (VB corpus): 91.03%
SC rate (image spam): 94.37%
SC rate (large spam): 96.27%
FP rate: 0.18%
FP rate (VB corpus): 0.07%
Final score: 98.29
The second McAfee product on test saw its spam catch rate improve slightly since the last test, and with just four false positives (only one of which was in Russian), the new ham corpus caused few problems for the product. With a decent final score, another VBSpam award is added to McAfee’s collection.
SC rate: 99.33%
SC rate (VB corpus): 93.03%
SC rate (image spam): 98.54%
SC rate (large spam): 99.32%
FP rate: 1.82%
FP rate (VB corpus): 0.14%
Final score: 93.86
Being a relatively small UK company, one might expect MessageStream to have few customers who regularly receive legitimate email from Russia. It is therefore understandable that the product scored relatively poorly on these emails; all but two of the product’s false positives were Russian messages. Unfortunately, these were enough to deny the product a VBSpam award and the developers will have to show they can filter Russian email correctly too, without compromising too much on the product’s high spam catch rate.
SC rate: 99.93%
SC rate (VB corpus): 97.35%
SC rate (image spam): 99.77%
SC rate (large spam): 99.90%
FP rate: 0.23%
FP rate (VB corpus): 0.79%
Final score: 99.25
Microsoft’s Forefront Protection 2010 for Exchange Server was the clear winner of the last test, achieving the highest final score by some distance. The final scores of the various products were closer this month, but with the second highest spam catch rate and just a handful of false positives, Forefront was yet again the product with the highest final score and adds another VBSpam award to its collection.
SC rate: 99.25%
SC rate (VB corpus): 94.02%
SC rate (image spam): 98.69%
SC rate (large spam): 98.14%
FP rate: 1.55%
FP rate (VB corpus): 5.79%
Final score: 94.60
Vircom’s modusGate has had a few disappointing performances in previous tests and its developers decided to take some time to focus on improving its performance, sitting out the two previous tests. We were pleased to see the product return for this test and even more pleased to see the product catch well over 99% of all spam. The product’s false positive rate had dropped too, but unfortunately the filter became rather over-zealous for a few days, just when the volume of legitimate email reached a peak, and false positives rose again as a result. And while this may not have been a problem for many of the Canadian company’s (potential) customers, it means the product is denied a VBSpam award.
SC rate: 98.81%
SC rate (VB corpus): 87.03%
SC rate (image spam): 97.71%
SC rate (large spam): 96.49%
FP rate: 0.05%
FP rate (VB corpus): 0.00%
Final score: 98.67
MXTools Reputation Suite, which combines Spamhaus ZEN + DBL (see below) with the SURBL URI blacklist and the Server Authority domain reputation service, uses a number of techniques to block spam based on the IP address from which it is sent and on the domains present during the SMTP transaction and in the email itself. When applied well, techniques like these can block the vast majority of spam with few false positives. This is certainly the case with the techniques resold by MXTools: together they blocked 98.8% of all spam, with just a single false positive. With such a performance, MXTools easily earns a VBSpam award.
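As an illustration of domain-based blocking of this kind, the sketch below performs a DNS lookup against SURBL’s combined list; the domain queried is hypothetical, the zone name is taken from SURBL’s public documentation, and anyone querying the real service should first check its usage policy.

```python
# Sketch of a domain-based blacklist lookup of the kind SURBL provides: a
# domain taken from the message (or from the SMTP transaction) is prepended
# to the list's DNS zone, and any answer in 127.0.0.0/8 means it is listed.
# The domain queried below is hypothetical; the multi.surbl.org zone name is
# taken from SURBL's public documentation and its usage policy applies.
import socket

def domain_is_listed(domain: str, zone: str = 'multi.surbl.org') -> bool:
    try:
        socket.gethostbyname(f'{domain}.{zone}')
        return True              # an A record was returned: the domain is listed
    except socket.gaierror:
        return False             # NXDOMAIN (or lookup failure): not listed

print(domain_is_listed('spam-example.invalid'))
```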
SC rate: 99.69%
SC rate (VB corpus): 95.29%
SC rate (image spam): 99.25%
SC rate (large spam): 99.13%
FP rate: 0.23%
FP rate (VB corpus): 0.57%
Final score: 99.00
Sophos’s email appliance made its debut in the previous test with a very respectable final score, and demonstrated consistency this month with a slightly improved spam catch rate and with just five false positives. With another excellent final score the Sophos Email Appliance wins its second VBSpam award.
SC rate: 98.02%
SC rate (VB corpus): 87.86%
SC rate (image spam): 98.52%
SC rate (large spam): 98.60%
FP rate: 0.41%
FP rate (VB corpus): 1.29%
Final score: 96.79
The web interface of SPAMfighter Mail Gateway states that since we started testing the product last summer, it has blocked millions of emails, which has saved us well over 200,000 (virtual) dollars. Of course, such numbers are to be taken with a generous pinch of salt, but it is good to be reminded of the importance of a decent spam filter. The Danish product takes away its fourth VBSpam award this month, and we were particularly pleased to see the number of false positives drop to below ten, with just four senders accounting for these emails.
SC rate: 98.52%
SC rate (VB corpus): 92.82%
SC rate (image spam): 99.41%
SC rate (large spam): 98.81%
FP rate: 0.14%
FP rate (VB corpus): 0.79%
Final score: 98.11
Prior to this test, SpamTitan’s developers adjusted the product’s settings to improve its performance on legitimate email, in particular on Russian email. The product’s spam catch rate dropped a little, but so did the false positive rate, and with just three FPs the product performed better than most on the legitimate email. Another VBSpam award is well deserved.
SC rate: 98.30%
SC rate (VB corpus): 94.63%
SC rate (image spam): 96.34%
SC rate (large spam): 95.69%
FP rate: 0.96%
FP rate (VB corpus): 3.15%
Final score: 95.43
Sunbelt’s VIPRE was on course for its third VBSpam award this month when, towards the end of the test, the number of false positives suddenly increased. The product’s developers are currently looking into the issue to determine the cause of the surge in FPs, but it should be noted that the false positive rate has since returned to an acceptable level. Unfortunately for Sunbelt, the product is denied a VBSpam award this time.
SC rate: 99.53%
SC rate (VB corpus): 93.54%
SC rate (image spam): 99.42%
SC rate (large spam): 99.32%
FP rate: 0.14%
FP rate (VB corpus): 0.43%
Final score: 99.12
Symantec Brightmail Gateway put in a commendable performance in its first two tests and continues to do very well. This month it saw its spam catch rate improve slightly compared to the previous test, and this was combined with a very low false positive rate. A final score of over 99 puts it in the top five, and earns the product another VBSpam award.
SC rate: 99.79%
SC rate (pre-DATA): 99.00%
SC rate (VB corpus): 95.74%
SC rate (image spam): 99.66%
SC rate (large spam): 99.74%
FP rate: 0.55%
FP rate (VB corpus): 1.07%
Final score: 98.15
The Email Laundry, a hosted anti-spam solution from Clean Communications in Ireland, has made a rather nice video to explain what it does (youtu.be/2pPrLvPr3wE): it filters spam and malware before they reach the organization’s premises, and is also capable of archiving email and providing email continuity in case of a server crash. The company believes its strongest point is its connection-level filtering and its developers were eager to see if our test could confirm that.
The results speak for themselves: the product blocked a stunning 99% of all spam pre-DATA without any false positives, which not only makes it the best performer among the three products that filtered email pre-DATA, but also beats several solutions’ total spam catch rates. Email that passed the various connection-level tests was then scanned further for spammy content, and almost four fifths of the remaining spam was blocked this way (a further 0.79% of the spam corpus, lifting the total catch rate to 99.79%). This was not done without mistakes, though, and most of the false positives were Russian messages. While some adjustments to the content scanning might improve it even further, the product’s final score was pretty decent and The Email Laundry easily earns a VBSpam award.
SC rate: 99.00%
SC rate (VB corpus): 92.29%
SC rate (image spam): 99.03%
SC rate (large spam): 99.03%
FP rate: 0.27%
FP rate (VB corpus): 1.72%
Final score: 98.18
We are always pleased to learn that a company takes research and development seriously. Vade Retro certainly does: 60% of its anti-spam team is dedicated to R&D and the French company has developed several anti-spam techniques. These are employed in various solutions, from desktop products, to software solutions for various Windows and Linux servers, to hosted solutions. We tested a hosted solution.
Easily set up for inclusion in our test, the product caught just over 99% of spam. With just six false positives (interestingly, none of which were in Russian), Vade Retro’s performance was more than enough to win the product a VBSpam award on its debut.
SC rate: 99.13%
SC rate (VB corpus): 89.16%
SC rate (image spam): 99.26%
SC rate (large spam): 99.30%
FP rate: 0.00%
FP rate (VB corpus): 1.00%
Final score: 99.13
Back in 2002, Hungarian company Vamsoft suffered from an NDR attack and built a software tool to stop it. This eventually evolved into the full anti-spam solution ORF, which runs on both Microsoft Exchange and Microsoft IIS; we tested it using the latter. Perhaps because the company itself was the first to use the product, a lot of attention has been paid to administrators’ ability to customize the product and to generate logs; we were impressed by how easily and intuitively both can be done.
But such features would mean nothing if the product’s performance were not up to par. That, however, is not a problem. A spam catch rate of well over 99% is certainly impressive, but the fact that this is combined with zero false positives is even more so. A VBSpam award is absolutely deserved for this impressive debut.
SC rate: 98.98%
SC rate (VB corpus): 94.69%
SC rate (image spam): 98.66%
SC rate (large spam): 98.06%
FP rate: 0.23%
FP rate (VB corpus): 0.64%
Final score: 98.30
We received a kind notification via our account on Webroot’s server that customers of its Email Security Service are to be upgraded to a new version of the product. We were pleased to see that the product’s developers are working on improvements, and just as pleased to see that the product’s performance has not suffered in the meantime: a spam catch rate of almost 99% combined with just five false positives wins the product its sixth consecutive VBSpam award.
SC rate: 98.68%
SC rate (pre-DATA): 97.84%
SC rate (VB corpus): 84.08%
SC rate (image spam): 97.71%
SC rate (large spam): 96.41%
FP rate: 0.00%
FP rate (VB corpus): 0.00%
Final score: 98.68
Spamhaus’s three IP reputation lists (combined in its Zen list) are a household name in the anti-spam industry, and its recently added domain blacklist DBL helps it detect some of the spam that isn’t sent from blacklisted IP addresses. Typically deployed as the first step in a multi-layered anti-spam solution, Spamhaus lives up to its reputation: it blocked 98.68% of all spam in the test (97.84% of the spam was blocked pre-DATA) without any false positives.
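For illustration, the sketch below shows the standard way an IP reputation list such as Zen is queried over DNS: the octets of the connecting IP address are reversed and looked up under the list’s zone. The address 127.0.0.2 is the documented test entry; Spamhaus’s usage terms apply to queries against the live service.

```python
# Sketch of an IP reputation (DNSBL) lookup of the kind Spamhaus Zen
# provides: the octets of the connecting IP address are reversed and looked
# up under the list's DNS zone; any answer in 127.0.0.0/8 means the address
# is listed. 127.0.0.2 is the documented test entry; Spamhaus's usage terms
# apply to queries against the live service.
import socket

def ip_is_listed(ip: str, zone: str = 'zen.spamhaus.org') -> bool:
    reversed_ip = '.'.join(reversed(ip.split('.')))
    try:
        socket.gethostbyname(f'{reversed_ip}.{zone}')
        return True
    except socket.gaierror:
        return False

print(ip_is_listed('127.0.0.2'))   # the test entry should be reported as listed
```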
This month saw several significant changes to the test corpora, and it was interesting to see how products coped with a more international corpus of legitimate emails including different character sets.
The developers of the products that did not perform so well on this occasion will be eager to show in the next test that this was due to settings needing to be tweaked rather than a fault in the product. The top performers, of course, will need to demonstrate that their results weren’t just a coincidence and that they can perform well consistently; a complete picture of the quality of a product can only be gained by looking at the results of several VBSpam reviews and monitoring the performance of products over time.
As always, comments and suggestions from vendors, researchers and end-users are welcome. The next test is set to run throughout June; the deadline for product submission is 25 May 2010. Any developers interested in submitting a product should email [email protected].