Experiences in malware binary deobfuscation

Hassen Saidi SRI International
Phil Porras SRI International
Vinod Yegneswaran SRI International

Malware authors employ a myriad of evasion techniques to impede automated reverse engineering and static analysis efforts. Among the most popular are 'code obfuscators', which rewrite the original binary code into an equivalent form that provides identical functionality while defeating signature-based detection systems. Such obfuscation significantly complicates static analysis, making it challenging to uncover the malware's intent and the full spectrum of its embedded capabilities. While code obfuscation techniques are commonly integrated into contemporary commodity packers, from the perspective of a reverse engineer, deobfuscation is often a necessary step that must be conducted independently after unpacking the malware binary.

In this paper, we describe a set of techniques for automatically reversing the effects of code obfuscators, with the objective of completely recovering the original malware logic. We have implemented a set of generic deobfuscation rules as a plug-in for the popular IDA Pro disassembler. As case studies, we use the sophisticated obfuscation strategies employed by two infamous malware instances from 2009: Conficker C and Hydraq (the binary associated with the Aurora attack). In both instances, our deobfuscator enabled a complete decompilation of the underlying code logic. This work was instrumental in the comprehensive reverse engineering of the heavily obfuscated P2P protocol embedded in the Conficker worm. The plug-in is integrated with the Hex-Rays decompiler to provide a complete reverse engineering of malware binaries from binary form to C code, and is available for free download on the SRI Malware Threat Center website: http://www.mtc.sri.com/deobfuscation/.
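
To make the notion of a generic deobfuscation rule concrete, the following is a minimal sketch of a peephole rewriting pass over a simplified instruction list. It is illustrative only: the tuple-based instruction format, the function name `simplify`, and both rewrite rules are assumptions chosen for this example, and are not the rules implemented in the plug-in or tied to the IDA Pro API.

```python
# Toy peephole deobfuscation pass over a simplified instruction list.
# Hypothetical example: the instruction encoding and both rules are
# assumptions for illustration, not the plug-in's actual rule set.

from typing import List, Tuple

Instr = Tuple[str, ...]  # e.g. ("push", "0x401000") or ("ret",)


def simplify(instrs: List[Instr]) -> List[Instr]:
    """Apply two rewrite rules repeatedly until nothing changes."""
    changed = True
    while changed:
        changed = False
        out: List[Instr] = []
        i = 0
        while i < len(instrs):
            cur = instrs[i]
            nxt = instrs[i + 1] if i + 1 < len(instrs) else None
            if cur[0] == "push" and nxt == ("ret",):
                # Rule 1: "push X; ret" transfers control to X,
                # so it is rewritten as a direct "jmp X".
                out.append(("jmp", cur[1]))
                i += 2
                changed = True
            elif cur[0] == "push" and nxt == ("pop", cur[1]):
                # Rule 2: "push R; pop R" leaves register R and the
                # stack pointer unchanged, so the pair is dropped.
                i += 2
                changed = True
            else:
                out.append(cur)
                i += 1
        instrs = out
    return instrs


if __name__ == "__main__":
    obfuscated = [
        ("push", "ebx"),
        ("push", "eax"),
        ("pop", "eax"),
        ("pop", "ebx"),
        ("push", "0x401000"),
        ("ret",),
    ]
    print(simplify(obfuscated))
```

The pass iterates to a fixed point because removing one junk pair can expose another, as with the nested `push ebx` / `pop ebx` pair in the usage example. A real deobfuscator would operate on disassembled code with full operand and control-flow information rather than on tuples.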