France’s anti-spam database

2007-07-01

John Graham-Cumming

Independent researcher, France

Editor: Helen Martin

Abstract

John Graham-Cumming has the details of France's new national anti-spam service, Signal Spam.

Table of contents

Introduction
Open source and an open API
Message analysis
Automated abuse reports
Unsubscribe assistance
Management backend
Conclusion

Introduction

On 10 May 2007 the French national online anti-spam platform, Signal Spam, was launched.

The service allows any French resident to send any spam they receive to Signal Spam for automatic handling. At the time of writing over 24,000 people have signed up to use the service and over 1 million messages have been received by Signal Spam, with an average of 30,000 messages received per day during the first 32 days of operation (happily the infrastructure of Signal Spam was built to handle 1 million messages per day).

Figure 1. Total number of messages sent to Signal Spam in the first 32 days of operation.

Signal Spam is a non-profit organization (in French law it is ‘une association de loi 1901’), created as a partnership between the French government (through the Direction du développement des médias, which falls under the purview of the Prime Minister), a number of other French public bodies (such as the French data protection office: the Commission Nationale de l’Informatique et des Libertés), industry groups (such as the French ISP association: the Association des Fournisseurs d’Accès et de Services Internet, and French direct marketing groups including the Syndicat National de la Communication Directe) and private industry (including founding partner Microsoft). It receives funding from the French government as well as from member groups that join the association.

For individuals Signal Spam is entirely free of charge: the user simply visits http://www.signal-spam.fr/ and signs up for a free account. Users can opt to provide full contact information if they are willing to be contacted in case of legal proceedings concerning messages they have sent in. However, the minimal user account requires just a username, password and a valid email address (which need not be the one at which the user is receiving spam).

Once signed up there are two ways to send a message to Signal Spam: copy and paste via a web form or through a plug-in for the email client. Since full email headers are vital for the analysis of any message there is no message-forwarding option, and the preferred method is the plug-in.

Open source and an open API

Plug-ins are currently available for Mozilla Thunderbird 2.0 and Microsoft Outlook 2003 and 2007.

At the insistence of Signal Spam, the plug-ins are open source (the Thunderbird plug-in is released under MPL, GPL or LGPL; the Outlook plug-ins are released under the BSD licence) and the plug-in API’s specification is freely available (in French: https://www.signal-spam.fr/index.php/frontend/extensions/api_de_signalement).

Figure 2. Plug-in interaction with Signal Spam.

The API itself is a simple REST interface running over HTTPS. The plug-in makes an HTTPS connection to http://www.signal-spam.fr/ using the path /api/signaler. The username and password created by the user are sent using basic authentication. The message being sent is base-64 encoded and sent as a simple POST as if it were a standard HTML form element with the name ‘message’.

The API replies with the HTTP return code 202 Accepted if the message has successfully been received, 400 Bad Request if there has been a problem with the request itself or another standard HTTP error (signalling a bad username/password for example).
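
The request described above can be sketched in Python as follows. This is an illustration rather than the plug-ins' actual code: the function names are hypothetical, the URL is assumed to be the HTTPS form of the host and path given above, and the form is assumed to be URL-encoded (the specification may require a different form encoding). The request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: the HTTPS form of the host and path named in the article.
API_URL = "https://www.signal-spam.fr/api/signaler"


def build_report_request(username: str, password: str,
                         raw_message: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST that reports one raw email message."""
    # The raw message, headers included, is base64-encoded and sent as the
    # form field named 'message'. URL-encoding the form is an assumption.
    body = urllib.parse.urlencode(
        {"message": base64.b64encode(raw_message).decode("ascii")}
    ).encode("ascii")
    # HTTP basic authentication: base64 of "username:password".
    credentials = base64.b64encode(
        f"{username}:{password}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def describe_status(status: int) -> str:
    """Map the status codes mentioned in the article to a short label."""
    if status == 202:
        return "accepted"
    if status == 400:
        return "bad request"
    return "other HTTP error"
```

A client would pass the returned object to urllib.request.urlopen and feed the response status to describe_status; a wrong username or password would surface as one of the other standard HTTP errors.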

Users are limited to sending a maximum of 500 messages per day. The openness of the API has already spawned a couple of third-party interfaces: a shell script and a mutt script. All the plug-ins and scripts are available at: https://www.signal-spam.fr/index.php/frontend/extensions; anyone creating their own method of signalling messages is encouraged to email [email protected] so that it can be included on that page.

Currently, the Microsoft Outlook plug-in is the most popular method for sending messages to Signal Spam (accounting for almost 48% of the messages), followed by the web form (31.79%).

Figure 3. Percentage of messages sent to Signal Spam by method.

Message analysis

Once a message is received by Signal Spam it is transferred across a secure link to a separate and isolated machine where it is stored for analysis. An automatic analysis process runs constantly, picking up new messages and performing the following sequence of steps:

Extraction of the following email headers: From, Subject, User-Agent/X-Mailer, Return-Path and Date. These are stored in the message database for fast searching.
Discovery of the injection IP address. This is the most complex part of the process, and involves walking the chain of Received headers and matching them up to look for the injection IP address and evidence of forgery.
Mapping of the injection IP address to the network AS number and the name of the service provider responsible for the AS. These details are also stored in the database. The AS information is also used to determine the source country for the message.
URL extraction. All URLs present in the message are extracted and stored in the database for searching and reporting purposes.
Fingerprint creation. The message is fingerprinted using the Vipul’s Razor Ephemeral and Whiplash fingerprints. The actual fingerprint mechanism is extensible and other algorithms can be added as needed.
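
Three of the steps above (header extraction, a first guess at the injection IP address, and URL extraction) can be sketched with Python's standard library. This is a deliberately naive illustration, not Signal Spam's implementation: the function name, the regular expressions and the result layout are assumptions, and the IP step simply takes the bracketed address from the oldest Received header without the cross-matching and forgery checks the real process performs.

```python
import re
from email import message_from_string

# Headers the article says are stored for fast searching.
STORED_HEADERS = ("From", "Subject", "User-Agent", "X-Mailer",
                  "Return-Path", "Date")
# Assumption: relays record the peer address as "[a.b.c.d]".
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
# Assumption: a simple pattern for http(s) and mailto URLs.
URL_PATTERN = re.compile(r"""(?:https?://|mailto:)[^\s<>"']+""", re.IGNORECASE)


def analyse(raw: str) -> dict:
    """Return stored headers, a candidate injection IP and the body URLs."""
    msg = message_from_string(raw)
    headers = {name: msg[name] for name in STORED_HEADERS
               if msg[name] is not None}

    # Each relay prepends its Received header, so the last one in the
    # message is the oldest hop. Naive: trust it and take its bracketed IP.
    candidate_ip = None
    for hop in reversed(msg.get_all("Received") or []):
        match = IP_IN_BRACKETS.search(hop)
        if match:
            candidate_ip = match.group(1)
            break

    # Only single-part text bodies are scanned in this sketch.
    payload = msg.get_payload()
    body = payload if isinstance(payload, str) else ""
    return {
        "headers": headers,
        "candidate_ip": candidate_ip,
        "urls": URL_PATTERN.findall(body),
    }
```

Because spammers can insert fake Received headers below the genuine ones, a production version would have to check that each hop's "by" host matches the next hop's "from" host before trusting the oldest entry.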

Currently the database shows that the top ten message-sending countries (in messages signalled to Signal Spam) are: USA, France, China, Germany, South Korea, Poland, Russia, Brazil, UK and Israel.

Automated abuse reports

If the message originated inside France (from a French ISP or other entity that manages a block of IP space) then Signal Spam can automatically send that entity an anonymized report of the offending message. Any French entity that wishes to take part must join Signal Spam and provide information about the AS or IP address ranges that it controls, along with an email address to receive abuse reports.
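
One way to picture the registry of participating entities is as a table mapping IP address ranges to abuse addresses. The sketch below is hypothetical: the ranges are documentation addresses, the contact addresses are invented, and the real database also matches on AS number, which is omitted here.

```python
import ipaddress
from typing import Optional

# Hypothetical registry: CIDR range -> address that receives abuse reports.
REGISTERED_RANGES = {
    "192.0.2.0/24": "abuse@isp-a.example",
    "198.51.100.0/25": "abuse@isp-b.example",
}


def find_abuse_contact(source_ip: str) -> Optional[str]:
    """Return the abuse address registered for source_ip, or None."""
    address = ipaddress.ip_address(source_ip)
    for cidr, contact in REGISTERED_RANGES.items():
        if address in ipaddress.ip_network(cidr):
            return contact
    return None
```

A report would only be generated when the lookup returns a contact; messages injected from unregistered address space are simply stored.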

Abuse reports are generated automatically when the AS or IP address range matches a registered entity in the Signal Spam database and are sent using the ARF (Abuse Reporting Format, see http://www.mipassoc.org/arf/) specification. Prior to inclusion in the ARF report the message is anonymized by removing the headers To, Cc, Bcc, Apparently-To, Delivered-To, In-Reply-To, References, Reply-To and by removing email addresses from any Received header. The following shows an example ARF message as sent by Signal Spam:

From: <[email protected]>
Date: Thu, 8 Mar 2007 17:40:36 EDT
Subject: FW: Earn money
To: <[email protected]>
MIME-Version: 1.0
Content-Type: multipart/report;
 report-type=feedback-report;
 boundary="part1_13d.2e68ed54_boundary"

--part1_13d.2e68ed54_boundary
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

This is an email abuse report for an email message received from IP
10.67.41.167 on Thu, 8 Mar 2007 14:00:00 EDT.

--part1_13d.2e68ed54_boundary
Content-Type: message/feedback-report

Feedback-Type: abuse
User-Agent: SignalSpam/0.1
Version: 0.1
Original-Mail-From: <[email protected]>
Received-Date: Thu, 8 Mar 2007 14:00:00 EDT
Source-IP: 10.67.41.167
Reported-Domain: example.net
Reported-Uri: http://example.net/earn_money.html
Reported-Uri: mailto:[email protected]

--part1_13d.2e68ed54_boundary
Content-Type: message/rfc822
Content-Disposition: inline

From: <[email protected]>
Received: from mailserver.example.net
 (mailserver.example.net [10.67.41.167]) by
 example.com with ESMTP id M63d4137594e46; Thu, 08 Mar 2005
 14:00:00 -0400
To: <Undisclosed Recipients>
Subject: Earn money
MIME-Version: 1.0
Content-type: text/plain
Message-ID: [email protected]
Date: Thu, 02 Sep 2004 12:31:03 -0500

Spam Spam Spam
Spam Spam Spam
--part1_13d.2e68ed54_boundary--
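
The anonymization applied before a message is embedded in a report can be sketched as follows. This is a minimal illustration assuming a simple single-part message: the function name and the email address pattern are assumptions, and the article does not say what, if anything, replaces the addresses removed from Received headers (here they are simply deleted).

```python
import re
from email import message_from_string
from email.message import Message

# Headers the article lists as removed before the message is reported.
REMOVED_HEADERS = {"to", "cc", "bcc", "apparently-to", "delivered-to",
                   "in-reply-to", "references", "reply-to"}
# Assumption: a simple pattern for addresses, with optional angle brackets.
EMAIL_ADDRESS = re.compile(r"<?[\w.+-]+@[\w-]+(?:\.[\w-]+)+>?")


def anonymize(raw: str) -> Message:
    """Return a copy of the message with recipient details stripped."""
    original = message_from_string(raw)
    cleaned = Message()
    # Copy headers in their original order, dropping or scrubbing as needed.
    for name, value in original.items():
        lowered = name.lower()
        if lowered in REMOVED_HEADERS:
            continue
        if lowered == "received":
            value = EMAIL_ADDRESS.sub("", value)
        cleaned[name] = value
    cleaned.set_payload(original.get_payload())
    return cleaned
```

Header names are compared case-insensitively because mail software is not consistent about capitalization; the From header is left intact since it identifies the sender rather than the reporting user.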

Unsubscribe assistance

Another automatic feature of Signal Spam is identifying messages that come from genuine e-marketers rather than spammers, and informing the user of the correct unsubscribe procedure. Since it’s expected that many users will signal messages that are from legitimate e-marketers, Signal Spam built a system that identifies these messages and replies automatically to the user.

Any French marketer can join Signal Spam and provide details of their newsletters and marketing mails to Signal Spam along with a message on how to unsubscribe from each of them. When Signal Spam identifies a message from one of these partners it replies to the user who sent in the message with the appropriate information to help them unsubscribe.

Management backend

Most of the Signal Spam website and software is invisible to the public and consists of an administration interface for the creation of reports, database searching and an interface to other parts of the French government (for example, the French Gendarmerie will have access when investigating cybercrimes).

At the most basic level an individual message (known as a ‘signalement’) can be viewed. Figure 4 shows how a message appears in the interface once it has passed through automatic analysis. In addition to the fields shown in the image the full message can be viewed, as well as the extracted URLs and the message fingerprints.

Figure 4. A message analysed by Signal Spam.

Conclusion

It’s very early days for Signal Spam, but the system is up and running and getting a lot of publicity in France. Signal Spam has indicated that it is interested in sharing the entire system with other countries so that they can set up their own spam databases.

But the success of the system in helping in France’s fight against spam remains to be seen.

John Graham-Cumming will be presenting a paper entitled ‘The Spammers’ Compendium: five years on’, at VB2007 in Vienna (19–21 September). The full programme and details of how to register for the conference can be found at http://www.virusbtn.com/conference/vb2007/.
