2010-08-01
Abstract
Sender authentication is a hot topic in the world of email. It has a number of uses and a number of suggested uses. Which ones work in real life? Which ones don’t quite measure up? Can we use authentication to mitigate spoofing? Can we use it to guarantee authenticity? And how do we authenticate email, anyway? Terry Zink provides the answers to these questions and more, this month focusing on SenderID.
Copyright © 2010 Virus Bulletin
In my previous article (see VB, July 2010, p.16), we saw how SPF can be used to authenticate messages from people that we want to hear from, and discard messages from senders who are merely pretending to be those people. Yet SPF has a drawback: visual cues that a regular person uses to identify who a message is from are not always addressed by SPF and can be exploited by spammers.
In email, there are usually two ‘senders’ of a message:
The sender, or from address, in the envelope sender. This is the MAIL FROM address specified in the SMTP transaction and is called the P1 From.
The sender, or from address, in the message headers. This is the From: address that you see in your email client and is called the P2 From. Sometimes there is a Sender: address in which case you might see a ‘sent on behalf of' message displayed in your email client.
In many cases, the P1 and P2 Froms are the same. However, this is not always the case. When mail is sent on behalf of someone else, they can be different. For example, if a large company such as Oceanic Airlines wants to send out a communication to their subscriber list, they might use a third-party mailer to do it for them – for example, Big Communications, Inc. In this case, the email would have a P1 From of [email protected], while the P2 From would be [email protected]. To the end-user, it would appear that the message was from [email protected], the airline that they know and trust. Of course, the message didn’t come from them, it came from Big Communications’ mail servers.
An SPF check is done against the domain bigcommunications.com and the sending IP address is checked against it as well. Everything checks out alright, and everyone wins: the mail is authenticated, and the members of Oceanic’s user base see the company’s communications which have been outsourced to a large mailing company.
The problem arises when someone attempts to exploit how SPF works. What if a spammer were to put a domain in the P1 From that didn’t have an SPF record? And what if they put a trusted domain into the P2 From? This way, the message would return an SPF check of none, and the user would see the trusted domain in their inbox.
In Figure 2, the email that the user sees is from Twitter support (this is from an actual phish I received). The P1 From, the one that an SPF check is performed on, is ‘from’ the domain @yahoo.com, which has no SPF records. The SPF check returns a none result and the message sails through to the user’s inbox. The user sees that the message is ‘from’ Twitter and trusts it. The average email user doesn’t know the difference between P1 and P2 Froms. For the most part, this doesn’t matter because they are frequently the same. But when a spammer is spoofing a trusted domain, it absolutely does matter because the sender they see in their inbox is not the sender the anti-spoofing mechanism was analysing. The natural end-user assumption is that the message is legitimate – after all, wouldn’t a filter have flagged this message as spam?
The scenario described above is one of the biggest concerns we encounter. SPF doesn’t address this case.
SenderID is a protocol advanced by Microsoft that deals with the problem of email authentication in a similar manner to SPF, but weighs more heavily against spoofing than SPF does. From Microsoft [1]:
'The Sender ID Framework is an email authentication technology protocol that helps address the problem of spoofing and phishing by verifying the domain name from which email messages are sent. Sender ID validates the origin of email messages by verifying the IP address of the sender against the alleged owner of the sending domain.'
This is similar to SPF. SPF works to validate the origin of email messages by verifying the IP address of the sender against the authorized IPs that are allowed to send mail from the domain in the envelope sender. SenderID does this as well, but it can be implemented to work on either the envelope sender or another address in the message headers.
SenderID introduces the concept of a Purported Responsible Address (PRA). Acquisition of the PRA is described in RFC 4407 [2]. Briefly, PRA is obtained by examining the message headers and extracting one of the following fields:
1. Look for the first non-empty Resent-Sender header. If it exists, use the domain in this field. If not, proceed to step 2.
2. Look for the first non-empty Resent-From header. If it exists, use the domain in this field. If not, proceed to step 3.
3. Look for the Sender header. If it exists, use the domain in this field. If not, proceed to step 4.
4. Look for the From header (not to be confused with the MAIL FROM, or envelope header). If it exists, use the domain in this field. If not, the message is malformed and we cannot extract a PRA.
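The four steps above can be sketched in a few lines of Python using only the standard library. This is a simplified illustration of the order described in this article, not a complete RFC 4407 implementation; the function name `extract_pra_domain` and the example addresses are assumptions made for the sketch. One deviation: a header that is present but contains no usable address is skipped rather than treated as an error.

```python
from email import message_from_string
from email.utils import parseaddr

# Header order follows the four steps listed above (a simplification of RFC 4407).
PRA_HEADERS = ("Resent-Sender", "Resent-From", "Sender", "From")


def extract_pra_domain(raw_message):
    """Return the domain of the Purported Responsible Address, or None."""
    msg = message_from_string(raw_message)
    for header in PRA_HEADERS:
        # get_all returns every occurrence in order; use the first usable one.
        for value in msg.get_all(header, []):
            _, address = parseaddr(value)
            if "@" in address:
                return address.rsplit("@", 1)[1].lower()
    return None  # no usable header: the message is malformed, no PRA


# Hypothetical message: Sender takes priority over From.
example = (
    "From: Support <support@trusted.example>\n"
    "Sender: bulk@mailer.example\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(extract_pra_domain(example))  # mailer.example
```

Because the Sender header is consulted before From, a message sent on behalf of someone else yields the domain of the party that actually sent it.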
Most of the time, the PRA will turn out to be the email address in the From: field that shows up in the email client. It’s also the one that is most useful to the end-user because that’s the one they actually see. Anyhow, a SenderID check extracts the domain from the PRA and performs a check on it by looking at the domain’s SenderID record and then performing the same actions as a regular SPF check (hard fail, soft fail, etc.).
However, things are a bit more complicated than this. Not only does SenderID introduce the concept of PRA, it also introduces new syntax for SPF records.
SenderID records begin with a version identifier (2.0) and may also include a scope upon which the SenderID check may be applied. The rest of the syntax is the same as SPF. A domain that explicitly specifies SenderID-compliant records could use the following syntax:
Example 1
spf2.0/mfrom,pra mx ip4:192.168.0.100 -all
This defines an SPF record that can be used for either MAIL FROM or PRA checks. If the IP is in the domain’s MX record or is 192.168.0.100, return a pass. Otherwise, return a hard fail.
Example 2
spf2.0/pra mx ip4:192.168.0.100 ~all
This defines an SPF record that can be used only for PRA checks. If the IP is in the domain’s MX record or is 192.168.0.100, return a pass. Otherwise, return a soft fail.
Example 3
spf2.0/mfrom mx ip4:192.168.0.100 ?all
This defines an SPF record that can be used only for MAIL FROM checks. If the IP is in the domain’s MX record or is 192.168.0.100, return a pass. Otherwise, return a neutral.
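To make the record layout concrete, here is a minimal Python sketch that splits a record into its scopes and terms and reports the result named by the trailing `all` mechanism. The helper names `parse_record` and `default_result` are hypothetical, and the sketch does not evaluate `mx` or `ip4` mechanisms. The result the article calls a hard fail is written as `fail` here.

```python
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_record(txt):
    """Split a TXT record into (scopes, terms); None if not SPF/SenderID."""
    version, _, rest = txt.strip().partition(" ")
    version = version.lower()
    if version == "v=spf1":
        scopes = {"mfrom"}  # classic SPF applies to the envelope sender
    elif version.startswith("spf2.0/"):
        scopes = set(version[len("spf2.0/"):].split(","))
    else:
        return None
    return scopes, rest.split()


def default_result(terms):
    """Result named by the 'all' mechanism, or 'neutral' if there is none."""
    for term in terms:
        if term[0] in QUALIFIERS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term  # no qualifier means pass
        if mechanism == "all":
            return QUALIFIERS[qualifier]
    return "neutral"


scopes, terms = parse_record("spf2.0/mfrom,pra mx ip4:192.168.0.100 -all")
print(sorted(scopes), default_result(terms))  # ['mfrom', 'pra'] fail
```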
Thus, the SenderID record indicates whether to check against the domain in the MAIL FROM, PRA, both or neither. The question naturally arises: how do we know whether to extract and check the domain in the PRA or the domain in the MAIL FROM? The answer is that it depends on how you want to implement it.
Example 1
Suppose you are running an email service and you want to implement SenderID on the PRA only. That means you will extract the domain in the PRA and not extract the domain in the envelope sender (the MAIL FROM). You look up the domain’s TXT record, which is the following (both SenderID and SPF records are stored in a domain’s TXT record; only the syntax is different):
spf2.0/mfrom,pra mx ip4:192.168.0.100 -all
First, we see that this domain supports SenderID. Success! Second, the record indicates that the TXT record can be used to verify either the domain in the MAIL FROM or the domain in the PRA. If the transmitting IP is 192.168.0.100 or is one of the IP addresses of the hosts listed in the domain’s MX record, then we have a SenderID pass. Otherwise return a hard fail.
Example 2
From the example above, suppose that the TXT record instead was the following:
spf2.0/mfrom mx ip4:192.168.0.100 -all
This record only specifies the IPs in the MAIL FROM domain that are authorized to send mail. It says nothing about which IPs in the PRA are permitted. Therefore, since we are checking the domain from the PRA, the result of this SenderID check is a none.
SenderID allows the implementer the flexibility to protect either the envelope sender or sender in the message headers (usually the From: address). However, the standard does not specify which one should be checked so it is up to the implementer (the email receiver) to decide how to do it. In the real world it is most commonly done on the PRA.
A major difference between SenderID and SPF is that SenderID allows the spam filter to check TXT records of the envelope sender or the PRA. However, SPF requires that they are checked on the envelope sender.
If a spam filter extracts the domain in the envelope sender and performs an SPF check, then when it queries DNS it must find a v=spf1 record in order to do an SPF check. If it does not, it returns SPF none.
If a spam filter extracts the domain in the PRA and performs a SenderID check, then when it queries DNS it can do a check on either an spf2.0 record or a v=spf1 record. Section 3.4 of RFC 4406 says the following:
'In order to provide compatibility for these domains, Sender ID implementations SHOULD interpret the version prefix ‘v=spf1’ as equivalent to ‘spf2.0/mfrom,pra’, provided no record starting with ‘spf2.0’ exists.'
In other words, if you have a SenderID implementation that checks the envelope sender (i.e. just like SPF), this will function exactly like regular SPF. On the other hand, if you have a SenderID implementation that checks the PRA (which is much more likely to be the case), but no SenderID record exists, then it falls back to using the SPF record to check the PRA. Thus, the recommended behaviour of a SenderID implementation is that existing SPF records protect either the MAIL FROM or the PRA, whichever is being checked.
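The selection rule can be sketched as follows. This is a minimal Python illustration of the fallback quoted from RFC 4406, assuming the TXT records for the domain have already been fetched; the function name `select_record` is hypothetical. A `None` return corresponds to a check result of none.

```python
def select_record(txt_records, scope):
    """Pick the record to evaluate for scope 'mfrom' or 'pra', or None."""
    spf2 = [r for r in txt_records if r.lower().startswith("spf2.0/")]
    spf1 = [r for r in txt_records if r.lower().startswith("v=spf1")]
    for record in spf2:
        version = record.split(" ", 1)[0].lower()
        if scope in version[len("spf2.0/"):].split(","):
            return record
    if not spf2 and spf1:
        # Fallback: with no spf2.0 record published, v=spf1 is treated
        # as equivalent to spf2.0/mfrom,pra.
        return spf1[0]
    return None  # no applicable record: the result is "none"


# Example 2 above: an mfrom-only record says nothing about the PRA.
print(select_record(["spf2.0/mfrom mx ip4:192.168.0.100 -all"], "pra"))  # None
```

Note that the fallback applies only when no record starting with spf2.0 exists at all, so a domain that publishes an mfrom-only SenderID record alongside a v=spf1 record still yields none for a PRA check.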
The RFC goes on to say the following:
'Administrators who have already published ‘v=spf1’ records SHOULD review these records to determine whether they are also valid for use with PRA checks. If the information in a ‘v=spf1’ record is not correct for a PRA check, administrators SHOULD publish either an ‘spf2.0/pra’ record with correct information or an ‘spf2.0/pra ?all’ record indicating that the result of a PRA check is explicitly inconclusive.'
The reason for this warning is that mail may legitimately behave differently with respect to the envelope sender than with respect to the PRA. Because SPF was designed to protect the MAIL FROM, a v=spf1 record is not necessarily correct for the PRA. As the warning above states, to prevent any confusion, domain administrators should explicitly publish SenderID records: either one with correct information for the PRA, or one that declines to say whether the PRA is protected (i.e. returns neutral).
Why does this matter? It matters because while most of the time the MAIL FROM and PRA are the same, many times they are not. The most common occurrence of this is newsletters. Let’s revisit our previous example. Oceanic Airlines has contracted Big Communications, Inc. to send its mail campaigns.
MAIL FROM: [email protected] -> This is what an SPF check is performed on
bigcommunications.com v=spf1 ip4:292.14.15.0/24 -all
From: [email protected] -> This is the PRA and it is what the SenderID check is performed on
oceanic.com v=spf1 ip4:258.14.15.0/24 -all
From this, if Big Communications, Inc. sends a message from 292.14.15.75, this would pass an SPF check because it is in the range of 292.14.15.0/24. However, a SenderID check looks up oceanic.com, sees that the sending IP is not in the range 258.14.15.0/24 and assigns a SenderID fail. This is incorrect because neither Oceanic Airlines nor Big Communications, Inc. meant for the domain in the PRA to be checked. They both published SPF records, not SenderID records. SenderID assumes that a PRA check can be done against an SPF v1 record, but neither Oceanic nor Big Communications has made that explicit and in this case it has caused a false positive. Thus, the trade-off when performing a SenderID check on an SPF record is that you catch more spoofed spam, but introduce more false positives.
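The false positive can be reproduced with a short Python sketch. The article's illustrative ranges are not valid IPv4 addresses, so this sketch substitutes ranges from the reserved documentation blocks (192.0.2.0/24 and 198.51.100.0/24); those ranges, the `RECORDS` table and the `check` function are assumptions for illustration. Only `ip4:` mechanisms and the trailing `all` are evaluated.

```python
import ipaddress

# Stand-in records; the ranges are substitutes from the documentation blocks.
RECORDS = {
    "bigcommunications.com": "v=spf1 ip4:192.0.2.0/24 -all",
    "oceanic.com": "v=spf1 ip4:198.51.100.0/24 -all",
}
ALL_RESULTS = {"-": "fail", "~": "softfail", "?": "neutral"}


def check(domain, sending_ip):
    """Evaluate the ip4: mechanisms and the 'all' term of a domain's record."""
    record = RECORDS.get(domain)
    if record is None:
        return "none"
    ip = ipaddress.ip_address(sending_ip)
    for term in record.split()[1:]:
        if term.startswith("ip4:") and ip in ipaddress.ip_network(term[4:]):
            return "pass"
        if term.lstrip("+-~?") == "all":
            return ALL_RESULTS.get(term[0], "pass")
    return "neutral"


sending_ip = "192.0.2.75"  # one of the third-party mailer's servers
spf_result = check("bigcommunications.com", sending_ip)  # MAIL FROM domain
pra_result = check("oceanic.com", sending_ip)  # PRA domain via v=spf1 fallback
print(spf_result, pra_result)  # pass fail
```

The same message passes when judged on the envelope sender and fails when judged on the PRA, which is exactly the mismatch described above.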
| Feature | SPF | SenderID |
|---|---|---|
| DNS records | v=spf1 | spf2.0 |
| Domain that it works on | Envelope sender (P1) | PRA (P2 – much more common) or envelope sender (much less common) |
| How does it treat SPF records? | Works as normal | Treats it like a SenderID record if the SenderID record does not exist |
| How does it treat SenderID records? | Ignores it | Works as normal |
| Strengths | - Can stop some phishing, good for some whitelisting | - Better at stopping phishing (or spoofing) that tricks the user visually |
| | - Can prevent backscatter by only sending bounces to messages that pass an SPF check | - The PRA derives from actual Resent-* headers and Sender and From headers; this makes validation on forwarded mail theoretically possible |
| | - Can reject some messages in SMTP before accepting any data | |
| Weaknesses | - Doesn’t catch phishing when the P1 From is neutral or none and the PRA is spoofed | - Prone to false positives when mail is sent on behalf of another |
| | - Doesn’t work on forwarded mail | - Doesn’t work on forwarded mail |
Table 1. Comparison of SPF and SenderID features.
Email forwarding is a major issue with SPF and SenderID. There is no official standard on how email is to be forwarded (in terms of rewriting the headers). Suppose that Mail Server A sends a message and everything complies with SenderID or SPF – the envelope sender is correct, the domain has its SPF or SenderID records set up correctly, and so forth. The message goes through some internal routing, but then is subsequently forwarded by another outside mail server with no change to the email headers. Or, consider the case of receiving mail at one mail host on your network which then relays it to a central mail server.
What happens?
Since the receiving email server sees only the last hop as the transmitting IP, it checks that IP against the SPF/SenderID record for the domain in the envelope sender/PRA. Since nothing has been rewritten in the message headers, that domain still belongs to the original sender, whose record does not list the forwarder’s IP, so the message fails sender authentication.
Figure 3. Tony’s mail is forwarded from my Hotmail account to my Gmail account, to my personal server.
In the above diagram, Tony sends a mail to my Hotmail account, which forwards to my Gmail account, which forwards to my personal mail server. My personal mail server performs an SPF check on IP3, Gmail’s outbound IP, which is not in Tony’s SPF record and therefore will generate an SPF/SenderID failure.
The creators of SPF admit [3] that this is a problem and suggest whitelisting the IP as a possible workaround.
The reality is that the whitelisting of mail servers has a very long tail – you will be forever finding new mail servers that you have to whitelist. When you think you’ve found one forwarder, another one pops up.
One technique is to tweak the recommended implementation. Instead of rejecting mail that fails an authentication test (as recommended by SPF and SenderID), score it aggressively. For example, suppose we have a spamminess scale based upon probability that runs from 1 to 10, with 1 being non-spam and 10 being spam, and that a message scoring higher than 5 is considered spam. The recommendations for SPF and SenderID are to reject mail based on a test failure, so a failure would effectively carry a weight of 10. Even when combined with other elements in the mail that reduce its spamminess, the score is unlikely to fall beneath the spam threshold. Instead, an authentication failure can be scored at a weight of 4.8 – nearly enough to get the message over the spam threshold, but not quite.
Most spam contains elements that mark it somewhat spammy anyhow, while non-spam contains elements that make it non-spammy. A message with an authentication failure will often have other elements that will push it over the spam threshold, while a non-spam message with a failure will usually be able to be kept under the threshold. Of course, there are times when spam will stay under (false negatives) and non-spam gets pushed over (false positives), but it is generally better to err on the side of reduced false positives.
Thus, rather than rejecting on a hard SPF failure, for most people, using it as a heavier weight makes more sense. Some receivers want to automatically reject mail that fails SenderID or SPF checks but this implementation is not right for everyone.
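The weighting approach can be sketched as follows. The threshold of 5 and the weight of 4.8 come from the text above; the `classify` function and the `other_score` parameter (the combined contribution of every other filter rule, positive for spammy elements and negative for non-spammy ones) are hypothetical simplifications of how a real filter combines evidence.

```python
SPAM_THRESHOLD = 5.0       # scores above this are treated as spam
AUTH_FAILURE_WEIGHT = 4.8  # weight given to an SPF/SenderID failure


def classify(auth_result, other_score):
    """Combine an authentication result with the rest of the filter's score."""
    score = other_score
    if auth_result == "fail":
        score += AUTH_FAILURE_WEIGHT
    label = "spam" if score > SPAM_THRESHOLD else "not spam"
    return label, score


# A forwarded legitimate message: fails authentication, otherwise clean.
print(classify("fail", -0.5)[0])  # not spam
# A spoofed message with mildly spammy content is pushed over the threshold.
print(classify("fail", 1.5)[0])   # spam
```

An authentication failure alone never crosses the threshold in this sketch; it takes at least a small amount of additional evidence to mark the message as spam.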
How would a spammer get around SPF? One way is the method used by Spammer-X in his book Inside the Spam Cartel. Spammer-X is a retired spammer and reveals a lot of the details of his former career in his book. According to Spammer-X, SPF stops novice spammers but not the professionals. The best way to beat SPF is to join it.
First, Joe Spammer rents a dedicated spam host in a spammer-friendly location, such as Russia.
Next, he registers 100 domain names, each of which is registered under a fake name and address.
Next, DNS entries for each of the hosts are set up, including a valid pointer record (PTR), an MX record and reverse DNS entries for each domain.
In other words, spammers do everything that owners of legitimate domains do when they set up a domain (although owners of legitimate domains don’t use fake names and addresses, of course).
Next, a self-published SPF record is appended to each domain’s DNS entry, identifying the host as a valid, self-created SPF host that is responsible for any email coming from its domain. An example for superspammer.com might be the following:
v=spf1 mx ptr a:spammerplayground.superspammer.com -all
Reading this, we see that the following are permitted to send mail for this domain: any IP belonging to a host in the domain’s MX record (i.e. get the MX record of the domain in the envelope sender); any IP whose reverse DNS name ends in superspammer.com; and the IP in the A-record of spammerplayground.superspammer.com.
With all of these set up, a spammer can send mail from any of these 100 domains and they will all happily pass SPF checks because the IPs are authorized to send mail.
What if the spammer did this:
v=spf1 mx ptr a:spammerplayground.superspammer.com ?all
This is yet another evasion technique: even if the mail is not authenticated it falls back to a neutral. In other words, if the domain is spoofed, a spam filter should not treat it as such and should accept the mail.
The flaw in this theory lies in its premise. Spammer-X goes on to say that the majority of spam filters will treat email with an SPF pass with a higher level of legitimacy, but my own internal statistics suggest that SPF-authenticated mail is still marked as spam around 15% of the time. So, mail that is verified by SPF is by no means guaranteed to be valid. Mail that is verified by SPF and comes from a source that you trust is treated with a higher level of legitimacy, but an SPF pass does not earn that treatment all on its own.
Secondly, if a domain that passes SPF checks were found to be sending spam, it could get blacklisted very quickly. Spam filters could use such domains to build a reputation list.
Spammer-X does have a point, however; a flaw in SPF is that there is no external third-party verification of SPF records – anyone can sign up for it. VeriSign, for example, goes out and verifies websites to make sure that they are secure when their owners sign up for SSL. If it isn’t a good website, it won’t get a ‘Verified by VeriSign’ stamp. However, there is no equivalent ‘Signed by SPF’ authority that makes sure that whoever signs up for it truly deserves to get it.
SenderID and SPF both have their strengths and weaknesses. They are similar, but are different enough that the employment of one will yield trade-offs that would not exist had you used the other. Here are some guidelines for the implementation of both:
Do not use SPF records that end with +all. This provides no protection at all – it means that if anyone spoofs your domain, a receiver should accept mail from it. ?all also provides little protection: a domain that publishes ?all can use its record to authenticate its own mail, but it is implicitly saying that SPF/SenderID should not be used to detect spoofing.
Do not include the ptr mechanism in your SPF record. While the standard permits it, Hotmail does not support the use of records with a PTR. Such inclusion may induce fails and result in mail being junked and/or deleted. The inclusion of a PTR within an SPF record will create unnecessary DNS traffic, be more prone to errors and will not function in implementations where SPF records are cached on local servers.
Use ~all when you don’t control all IPs that use your domain to send mail. Some mobile employees send email from hotels or other ‘guest’ email servers when working remotely. The best option in this case is for mobile users to send email over a VPN connection or by using a web-based email client. This way their email flows through your regular email servers and you don’t need to make any changes to your SPF record.
This isn’t always possible, in which case you may wish to include a ~all in your SPF/SenderID records. Their mail will still fail a check, but the ~all tells the receiver not to reject the mail and instead to assign the failure a lighter weight and use it as one consideration within a spam filter.
There really is no elegant workaround in the absence of webmail because there’s no guarantee that a hotel will be SenderID or SPF-compliant. There isn’t an elegant workaround to the problem of forwarded mail, either.
We’ve now seen SPF and SenderID and how they work to authenticate mail and detect spoofing. They are relatively simple protocols and while that’s an advantage, it is also a drawback. They tie IP addresses to domains, and IPs change and have to be updated. Isn’t there a better way? And can we get around the issue of forwarded mail?
That’s a subject for my next article.
[1] http://www.microsoft.com/mscorp/safety/technologies/senderid/default.mspx [page no longer exists].