2007-07-01
Abstract
John Graham-Cumming has the details of France's new national anti-spam service, Signal Spam.
Copyright © 2006 Virus Bulletin
On 10 May 2007 the French national online anti-spam platform, Signal Spam, was launched.
The service allows any French resident to send any spam they receive to Signal Spam for automatic handling. At the time of writing over 24,000 people have signed up to use the service and over 1 million messages have been received by Signal Spam, with an average of 30,000 messages received per day during the first 32 days of operation (happily the infrastructure of Signal Spam was built to handle 1 million messages per day).
Signal Spam is a non-profit organization (in French law it is ‘une association de loi 1901’), created as a partnership between the French government (through the Direction du développement des médias, which falls under the purview of the Prime Minister), a number of other French public bodies (such as the French data protection office: the Commission Nationale de l’Informatique et des Libertés), industry groups (such as the French ISP association: the Association des Fournisseurs d’Accès et de Services Internet, and French direct marketing groups including the Syndicat National de la Communication Directe) and private industry (including founding partner Microsoft). It receives funding from the French government as well as from member groups that join the association.
For individuals Signal Spam is entirely free of charge: the user simply visits http://www.signal-spam.fr/ and signs up for a free account. Users can opt to provide full contact information if they are willing to be contacted in case of legal proceedings concerning messages they have sent in. However, the minimal user account requires just a username, password and a valid email address (which need not be the one at which the user is receiving spam).
Once signed up there are two ways to send a message to Signal Spam: copy and paste via a web form or through a plug-in for the email client. Since full email headers are vital for the analysis of any message there is no message-forwarding option, and the preferred method is the plug-in.
Plug-ins are currently available for Mozilla Thunderbird 2.0 and Microsoft Outlook 2003 and 2007.
At the insistence of Signal Spam the plug-ins are open source (the Thunderbird plug-in is released under the MPL, GPL or LGPL; the Outlook plug-ins are released under the BSD licence) and the specification of the plug-in API is freely available (in French: https://www.signal-spam.fr/index.php/frontend/extensions/api_de_signalement).
The API itself is a simple REST interface running over HTTPS. The plug-in makes an HTTPS connection to www.signal-spam.fr using the path /api/signaler. The username and password created by the user are sent using basic authentication. The message being sent is base-64 encoded and sent as a simple POST as if it were a standard HTML form element with the name ‘message’.
The API replies with the HTTP return code 202 Accepted if the message has been received successfully, 400 Bad Request if there has been a problem with the request itself, or another standard HTTP error (signalling a bad username/password, for example).
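The request described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plug-ins' actual code: the function names are hypothetical, the form is assumed to be URL-encoded (the article says only that the message is sent like a standard HTML form element), and the request is built without being sent.

```python
import base64
import urllib.parse
import urllib.request

# Host and path as given in the article, combined into one HTTPS URL.
API_URL = "https://www.signal-spam.fr/api/signaler"


def build_report_request(username: str, password: str, raw_message: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request for one signalled message."""
    # The complete message, headers included, is base-64 encoded and
    # submitted as the form field named 'message'.
    encoded = base64.b64encode(raw_message).decode("ascii")
    # Assumption: application/x-www-form-urlencoded form encoding.
    body = urllib.parse.urlencode({"message": encoded}).encode("ascii")

    # HTTP basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")

    request = urllib.request.Request(API_URL, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


def describe_status(status: int) -> str:
    """Map the status codes named in the article to a short description."""
    if status == 202:
        return "accepted"
    if status == 400:
        return "bad request"
    return f"other HTTP error ({status})"
```

Passing the returned object to urllib.request.urlopen would perform the actual submission; a failed login would then surface as a urllib.error.HTTPError carrying the server's status code.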
Users are limited to sending a maximum of 500 messages per day. The openness of the API has already spawned a couple of third-party interfaces: a shell script and a mutt script. All the plug-ins and scripts are available at https://www.signal-spam.fr/index.php/frontend/extensions; anyone creating their own method of signalling messages is encouraged to email [email protected] so that it can be included on that page.
Currently, the Microsoft Outlook plug-in is the most popular method for sending messages to Signal Spam (accounting for almost 48% of the messages), followed by the web form (31.79%).
Once a message is received by Signal Spam it is transferred across a secure link to a separate and isolated machine where it is stored for analysis. An automatic analysis process runs constantly, picking up new messages and performing the following sequence of steps:
Extraction of the following email headers: From, Subject, User-Agent/X-Mailer, Return-Path and Date. These are stored in the message database for fast searching.
Discovery of the injection IP address. This is the most complex part of the process, and involves walking the chain of Received headers and matching them up to look for the injection IP address and evidence of forgery.
Mapping of the injection IP address to the network AS number and the name of the service provider responsible for the AS. These details are also stored in the database. The AS information is also used to determine the source country for the message.
URL extraction. All URLs present in the message are extracted and stored in the database for searching and reporting purposes.
Fingerprint creation. The message is fingerprinted using the Ephemeral and Whiplash fingerprints from Vipul’s Razor. The actual fingerprint mechanism is extensible and other algorithms can be added as needed.
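Three of the steps above (header extraction, injection IP discovery and URL extraction) can be illustrated with Python's standard email module. The sketch below is not Signal Spam's analyser: the function name and regular expressions are assumptions, and the injection IP logic is deliberately naive. It takes the bracketed IPv4 address of the earliest Received header and makes no attempt to detect forged headers, which is the hard part of the real process. The AS lookup and fingerprinting steps are omitted.

```python
import re
from email import message_from_string

# Headers the article says are stored for fast searching.
INDEXED_HEADERS = ("From", "Subject", "User-Agent", "X-Mailer", "Return-Path", "Date")
# Relays conventionally record the peer address as [a.b.c.d] in Received.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
# Simplistic URL matcher, good enough for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def analyse_message(raw: str) -> dict:
    msg = message_from_string(raw)

    # Step 1: pull out the indexed headers (None when a header is absent).
    headers = {name: msg.get(name) for name in INDEXED_HEADERS}

    # Step 2 (naive): each relay prepends its own Received header, so the
    # last one in the message is the earliest hop. No forgery checks here.
    hop_ips = []
    for received in msg.get_all("Received", []):
        match = BRACKETED_IPV4.search(received)
        if match:
            hop_ips.append(match.group(1))
    injection_ip = hop_ips[-1] if hop_ips else None

    # Step 4: collect URLs from every non-multipart body part.
    urls = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True) or b""
        urls.extend(URL_PATTERN.findall(payload.decode("utf-8", errors="replace")))

    return {"headers": headers, "injection_ip": injection_ip, "urls": urls}
```

A production analyser would also need to compare each hop's "from" and "by" names against the next header in the chain, since a spammer can prepend fake Received lines before handing the message to the first honest relay.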
Currently the database shows that the top ten message-sending countries (in messages signalled to Signal Spam) are: USA, France, China, Germany, South Korea, Poland, Russia, Brazil, UK and Israel.
If the message originated inside France (from a French ISP or other entity that manages a block of IP space), Signal Spam can automatically send that entity an anonymized report of the offending message. Any French entity that wishes to take part must join Signal Spam and provide information about the AS or IP address ranges that it controls, along with an email address at which to receive abuse reports.
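Matching an injection IP address against the address ranges registered by member entities can be pictured as follows. This is a hypothetical sketch: the registry layout, the entity names and the function name are invented for illustration, the addresses come from documentation ranges, and matching by AS number is left out.

```python
import ipaddress
from typing import Optional

# Hypothetical registry: each member declares the ranges it controls
# and the mailbox that should receive abuse reports.
REGISTRY = [
    {"name": "Example ISP", "abuse_email": "abuse@isp.example", "ranges": ["192.0.2.0/24"]},
    {"name": "Example Host", "abuse_email": "abuse@host.example", "ranges": ["198.51.100.0/25"]},
]


def find_abuse_contact(injection_ip: str, registry: list) -> Optional[str]:
    """Return the abuse address of the entity owning injection_ip, if any."""
    address = ipaddress.ip_address(injection_ip)
    for entity in registry:
        for cidr in entity["ranges"]:
            if address in ipaddress.ip_network(cidr):
                return entity["abuse_email"]
    # No registered entity controls this address: no report is sent.
    return None
```

A linear scan is adequate for a small membership list; a larger registry would more likely use a prefix tree or a sorted interval lookup.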
Abuse reports are generated automatically when the AS or IP address range matches a registered entity in the Signal Spam database and are sent using the ARF (Abuse Reporting Format, see http://www.mipassoc.org/arf/) specification. Prior to inclusion in the ARF report the message is anonymized by removing the headers To, Cc, Bcc, Apparently-To, Delivered-To, In-Reply-To, References, Reply-To and by removing email addresses from any Received header. The following shows an example ARF message as sent by Signal Spam:
From: <[email protected]>
Date: Thu, 8 Mar 2007 17:40:36 EDT
Subject: FW: Earn money
To: <[email protected]>
MIME-Version: 1.0
Content-Type: multipart/report; report-type=feedback-report; boundary="part1_13d.2e68ed54_boundary"

--part1_13d.2e68ed54_boundary
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

This is an email abuse report for an email message received from IP 10.67.41.167 on Thu, 8 Mar 2007 14:00:00 EDT.

--part1_13d.2e68ed54_boundary
Content-Type: message/feedback-report

Feedback-Type: abuse
User-Agent: SignalSpam/0.1
Version: 0.1
Original-Mail-From: <[email protected]>
Received-Date: Thu, 8 Mar 2007 14:00:00 EDT
Source-IP: 10.67.41.167
Reported-Domain: example.net
Reported-Uri: http://example.net/earn_money.html
Reported-Uri: mailto:[email protected]

--part1_13d.2e68ed54_boundary
Content-Type: message/rfc822
Content-Disposition: inline

From: <[email protected]>
Received: from mailserver.example.net (mailserver.example.net [10.67.41.167]) by example.com with ESMTP id M63d4137594e46; Thu, 08 Mar 2005 14:00:00 0400
To: <Undisclosed Recipients>
Subject: Earn money
MIME-Version: 1.0
Content-type: text/plain
Message-ID: [email protected]
Date: Thu, 02 Sep 2004 12:31:03 0500

Spam Spam Spam Spam Spam Spam

--part1_13d.2e68ed54_boundary--
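The anonymization applied before a message is embedded in an ARF report (dropping the recipient-related headers and stripping email addresses from Received headers) can be sketched as below. The header list comes from the article; the function name and the email address pattern are assumptions, and the real implementation may well differ.

```python
import re
from email import message_from_string

# Headers the article says are removed before the message is reported.
REMOVED_HEADERS = {
    "to", "cc", "bcc", "apparently-to", "delivered-to",
    "in-reply-to", "references", "reply-to",
}
# Assumed pattern: an address with optional surrounding angle brackets.
EMAIL_ADDRESS = re.compile(r"<?[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}>?")


def anonymize(raw: str) -> str:
    msg = message_from_string(raw)
    original = msg.items()  # (name, value) pairs in their original order

    # Clear every header, then re-add the survivors so order is preserved.
    for name in set(msg.keys()):
        del msg[name]
    for name, value in original:
        lowered = name.lower()
        if lowered in REMOVED_HEADERS:
            continue
        if lowered == "received":
            # Recipient addresses often appear here as "for <user@host>".
            value = EMAIL_ADDRESS.sub("", value)
        msg[name] = value
    return msg.as_string()
```

Note that serializing the message again may re-fold long header lines, so the output is equivalent to the input rather than byte-for-byte identical.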
Another automatic feature of Signal Spam is the identification of messages that come from genuine e-marketers, and are therefore not spam, so that the user can be told the correct unsubscribe procedure. Since many users are expected to signal messages that are from legitimate e-marketers, Signal Spam built a system that identifies these messages and replies automatically to the user.
Any French marketer can join Signal Spam and provide details of their newsletters and marketing mails to Signal Spam along with a message on how to unsubscribe from each of them. When Signal Spam identifies a message from one of these partners it replies to the user who sent in the message with the appropriate information to help them unsubscribe.
Most of the Signal Spam website and software is invisible to the public and consists of an administration interface for the creation of reports, database searching and an interface to other parts of the French government (for example, the French Gendarmerie will have access when investigating cybercrimes).
At the most basic level an individual message (known as a ‘signalement’) can be viewed. Figure 4 shows how a message appears in the interface once it has passed through automatic analysis. In addition to the fields shown in the image the full message can be viewed, as well as the extracted URLs and the message fingerprints.
It’s very early days for Signal Spam, but the system is up and running and getting a lot of publicity in France, and Signal Spam has indicated that it is interested in sharing the entire system with other countries so that they can set up their own spam databases.
But the success of the system in helping in France’s fight against spam remains to be seen.
John Graham-Cumming will be presenting a paper entitled ‘The Spammers’ Compendium: five years on’, at VB2007 in Vienna (19–21 September). The full programme and details of how to register for the conference can be found at http://www.virusbtn.com/conference/vb2007/.