2013-12-02
Abstract
In the latest of his ‘Greetz from Academe’ series, highlighting some of the work going on in academic circles, John Aycock looks at a tool designed to detect JavaScript containing malicious evasions.
Copyright © 2013 Virus Bulletin
As the weather here in Calgary changes from the depressing snowfalls of October to the embittering snowfalls of November and the downright irritating snowfalls of December, my thoughts turn to Christmas. Specifically, how is Santa able to compile his lists of good and bad JavaScript code? It’s a perplexing problem.
But fear not, for Kapravelos et al. are here to help, with their paper ‘Revolver: An Automated Approach to the Detection of Evasive Web-based Malware’ [1]. The paper was presented at the 2013 USENIX Security Symposium in August, and the researchers’ aim is to detect automatically when JavaScript has had malicious evasions added to it. Their tool, Revolver, is interesting in part because it leverages existing resources that many anti-malware companies either already have, or could put together in short order: a malicious JavaScript detector, and corpora of benign and malicious JavaScript code.
Revolver operates on the premise that malicious JavaScript evolves over time as it is tweaked by the bad guys to avoid detection. This implies that earlier versions of the malicious code probably exist in security researchers’ repositories – whether picked up via honeyclients, submitted to detectors by the bad guys themselves, or delivered by magical sleigh and reindeer, it matters not. If a new sample is classified by some detector as benign, but the sample’s code looks suspiciously similar to an older sample classified as malicious, then the new sample may have been modified to include evasive code.
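The premise boils down to a simple decision rule, sketched here in Python. This is an illustration only and not the authors' code: the function name, the verdict strings and the 0.8 threshold are all assumptions made for the example, and the similarity score is taken to come from some comparison step that yields a value between 0 and 1.

```python
def is_candidate_evasion(new_verdict, old_verdict, similarity, threshold=0.8):
    """Flag a new sample as a possible evasion of an older malicious one.

    new_verdict / old_verdict: "benign" or "malicious", as reported by
    whatever detector is available (hypothetical labels).
    similarity: score in [0, 1] from a code-similarity step.
    threshold: illustrative cut-off, not a value taken from the paper.
    """
    return (
        new_verdict == "benign"
        and old_verdict == "malicious"
        and similarity >= threshold
    )


# A benign-looking sample that closely resembles known malware is suspicious.
print(is_candidate_evasion("benign", "malicious", 0.93))
# A benign sample that merely shares a little code with malware is not.
print(is_candidate_evasion("benign", "malicious", 0.40))
```

Only the combination matters: a changed verdict alone, or similarity alone, tells us nothing about evasion.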
The tricky part is detecting when two pieces of code, possibly obfuscated on purpose, are suspiciously similar. Surprisingly, most of the similarity analysis performed by Revolver is static, with the results of some dynamic analysis thrown in to pick up dynamically generated code and note which code was actually executed. Unnecessary detail that could throw off comparison is abstracted away from the JavaScript and, in keeping with the Christmas theme, an abstract syntax tree (AST) representation of the JavaScript code is used. (The tree nodes are decorated with dynamically gathered execution information for that festive look.)
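To make the abstraction step concrete, here is a toy sketch in Python. It assumes the AST arrives as nested dictionaries in an ESTree-like shape (a "type" key plus child nodes), with a hypothetical "executed" flag standing in for the dynamically gathered decoration. The article does not spell out exactly which details Revolver discards, so dropping identifier names and literal values is an assumption made for the example.

```python
def abstract(node):
    """Reduce an AST node to (node_type, executed, children).

    Identifier names, literal values and any other non-node fields are
    discarded, so renamed variables or re-encoded strings no longer
    affect comparison.
    """
    children = []
    for value in node.values():
        if isinstance(value, dict) and "type" in value:
            children.append(abstract(value))
        elif isinstance(value, list):
            children.extend(
                abstract(item)
                for item in value
                if isinstance(item, dict) and "type" in item
            )
    return (node["type"], bool(node.get("executed", False)), tuple(children))


# Two calls that differ only in names and string contents.
first = {
    "type": "CallExpression",
    "executed": True,
    "callee": {"type": "Identifier", "name": "run", "executed": True},
    "arguments": [{"type": "Literal", "value": "payload-a", "executed": True}],
}
second = {
    "type": "CallExpression",
    "executed": True,
    "callee": {"type": "Identifier", "name": "x9", "executed": True},
    "arguments": [{"type": "Literal", "value": "payload-b", "executed": True}],
}

print(abstract(first) == abstract(second))
```

After abstraction the two trees are indistinguishable, which is the point: superficial renaming should not defeat the comparison.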
To trim the search space down, ASTs are summarized as fixed-length vectors, where each vector element is the frequency with which a particular type of AST node appeared. This summary allows the researchers to look efficiently for the nearest neighbours in the JavaScript corpora, filtering out code that is unlikely to match. They then take a linearized representation of the ASTs (basically strings) and use the edit distance between them as a similarity measure. There’s more to their technique, but suffice it to say that they employed much cleverness in their design.
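The two-stage idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a reconstruction of Revolver: the five node types in NODE_TYPES, the Euclidean distance used for the neighbour search, the brute-force search itself and the length-normalized similarity score are all choices made for the example.

```python
from collections import Counter
import math

# Hypothetical fixed vocabulary of AST node types; a real one would be larger.
NODE_TYPES = ("CallExpression", "Identifier", "Literal",
              "IfStatement", "BinaryExpression")


def summarize(sequence):
    """Fixed-length vector: how often each node type appears."""
    counts = Counter(sequence)
    return [counts[node_type] for node_type in NODE_TYPES]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest(query, corpus, k=1):
    """Names of the k corpus entries whose summaries are closest to the query."""
    q = summarize(query)
    return sorted(corpus, key=lambda name: euclidean(q, summarize(corpus[name])))[:k]


def edit_distance(a, b):
    """Levenshtein distance between two linearized ASTs (lists of node types)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def similarity(a, b):
    """1.0 for identical sequences, falling towards 0.0 as they diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


corpus = {
    "old_exploit": ["CallExpression", "Identifier", "Literal"],
    "unrelated": ["BinaryExpression", "Literal", "Literal",
                  "BinaryExpression", "Literal", "Literal"],
}
# The old exploit with a check bolted on the front.
query = ["IfStatement", "Identifier", "CallExpression", "Identifier", "Literal"]

for name in nearest(query, corpus, k=1):
    print(name, edit_distance(query, corpus[name]))
```

The cheap vector comparison picks out old_exploit as the only neighbour worth a closer look, and the more expensive edit distance is then computed for that candidate alone: two insertions separate it from the query.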
I had originally chosen to look at this paper because of its evasion detection technique, but reading through it unfortunately highlights the academic perception of anti-virus technology. I thought we were past the anti-virus-as-glorified-string-search notion, but perhaps not. To quote: ‘attackers may obfuscate their code so that it does not match the string signatures used by antivirus tools’ (pp.637–8) and ‘code obfuscation is effective against tools that rely on signatures, such as antivirus scanners’ (p.639). I could quote more, but this gives the general flavour.
Antiquated perceptions aside, however, Revolver isn’t just an academic proof of concept, but is designed to scale. With only four machines, the researchers were able to process just under 600,000 samples per day, which was as much as their detection ‘oracle’ could feed them. Lots of tuning and algorithmic tricks are used to allow the system to scale up, and all the details are given in the paper; a good implementer should be able to reproduce Revolver from the description.
For sorting out the good and bad JavaScript, Saint Nick need not play Russian Roulette with Revolver. Happy holidays!
[1] Kapravelos, A.; Shoshitaishvili, Y.; Cova, M.; Kruegel, C.; Vigna, G. Revolver: An Automated Approach to the Detection of Evasive Web-based Malware. Proceedings of the 22nd USENIX Security Symposium, 2013, pp.637–651. http://seclab.cs.ucsb.edu/media/uploads/papers/usenix2013_revolver.pdf.