Research paper shows it may be possible to distinguish malware traffic using TLS

Posted by Martijn Grooten on Jun 22, 2017

Researchers at Cisco have published a paper (PDF) describing how it may be possible to use machine learning to distinguish malware command-and-control (C&C) traffic using TLS from regular enterprise traffic, and to classify malware families based on their encrypted C&C traffic.

The need for malware to communicate with its operators, so that it can receive instructions and exfiltrate information from infected systems, is a weak point – it can't easily hide its activity from security products scanning network traffic. For this reason, the trend among malware of using SSL/TLS – the protocol over which a significant portion of today's web and email traffic is sent – is an understandable one.

A good encryption protocol makes encrypted content indistinguishable from random noise, but while TLS uses top-class encryption standards, it cannot avoid the use of metadata that can give away some essential details of the communication.

Even if one ignores the remote IP address and the domain sent in the certificate, both of which can help detect a known malware family, TLS includes explicit metadata, such as the cipher suites and TLS extensions offered and used, as well as more implicit metadata, such as the length and frequency of the packets and the variation seen in them.
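To make the idea of explicit and implicit metadata more concrete, here is a minimal sketch of how one flow's unencrypted TLS metadata might be turned into a numeric feature vector. It is not taken from the paper: the function name `flow_features`, the short lists of tracked cipher-suite and extension codes, and the choice of mean and standard deviation as summary statistics are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Illustrative (assumed) vocabulary of codes to track; a real system would
# derive these lists from the traffic it actually observes.
CIPHER_SUITES = [0x002F, 0x0035, 0xC02F, 0xC030]
EXTENSIONS = [0x0000, 0x000A, 0x000B, 0x0023]


def flow_features(offered_suites, offered_extensions, packet_lengths, inter_arrival_times):
    """Turn the unencrypted metadata of one TLS flow into a flat feature vector."""
    features = []
    # Explicit metadata: one binary indicator per tracked cipher suite and extension.
    features += [1.0 if s in offered_suites else 0.0 for s in CIPHER_SUITES]
    features += [1.0 if e in offered_extensions else 0.0 for e in EXTENSIONS]
    # Implicit metadata: average and spread of packet sizes and packet timing.
    for series in (packet_lengths, inter_arrival_times):
        if series:
            features += [mean(series), pstdev(series)]
        else:
            features += [0.0, 0.0]
    return features
```

Calling `flow_features([0x002F, 0xC02F], [0x0000], [100, 300], [0.5, 0.5])` yields eight indicator values followed by four statistics. None of this requires decrypting the payload.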

The Cisco researchers trained their machine-learning classifier using a combination of malicious TLS traffic and legitimate enterprise TLS traffic. The classifier was able to identify the TLS traffic of most malware families with high accuracy – even that of families that had not been present in the training set.
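As a rough illustration of what training on labelled flows means, the sketch below fits a tiny logistic-regression model in plain Python. Logistic regression is a stand-in chosen here for simplicity, not a description of the researchers' classifier, and the two features, the four flows and their labels are fabricated toy data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, learning_rate=0.5, epochs=500):
    """Fit weights and a bias by full-batch gradient descent on the log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Difference between the predicted probability and the true label.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias


def malicious_probability(weights, bias, x):
    """Return the model's estimated probability that flow x is malicious."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Fabricated toy data: [offers_legacy_cipher, offers_session_ticket_extension]
flows = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [0, 0, 1, 1]  # 0 = benign, 1 = malicious (made up for illustration)
weights, bias = train_logistic(flows, labels)
```

On this toy data the model learns to lean on the first feature, so `malicious_probability(weights, bias, [1.0, 0.0])` ends up above 0.5 and the same call with `[0.0, 1.0]` ends up below it. Real traffic is far noisier and would need many more features and samples.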


The research is very much a work in progress and, as befits a good research paper, its authors openly admit the limitations of their work. For instance, the malware was run in Windows XP-based sandboxes, which could have helped the detection: malware often inherits TLS properties from the operating system on which it runs. At the same time, malware is most likely to live on older operating systems, making this set-up not too different from a real-world scenario.

It is also important to note that the classifier was not able to say anything about the content of the traffic; it would thus be useless as part of a data-loss prevention system. TLS, especially in its most recent versions, is one of the strongest Internet protocols, and the fact that it properly protects content is a very good thing, even if it can be frustrating for malware analysis and detection.

At Virus Bulletin, we have repeatedly shown how malicious web traffic can be blocked by security products. Organizations using a web security gateway will have to make a decision as to whether to have it inspect TLS-encrypted web traffic as well. While I think that, in most scenarios, inspecting the traffic is a compromise worth making, this research shows that one may be able to block malware's ability to connect to its owners without being able to decrypt the traffic.


