Defeating polymorphism: beyond emulation

Adrian E. Stepan Microsoft

The most used method of detecting malware relies on signatures extracted from the malware body. Attempting to defeat this method and evade detection, malware writers have resorted to code obfuscation techniques, thus creating polymorphic viruses.

There are several well-known methods of decrypting polymorphic viruses, such as emulation, cryptanalysis (X-Ray) and dedicated decryption routines. Each of these methods has some limitations: X-Ray can only handle simple decryptions; dedicated routines require significant development effort and neither scales well with the number of detected viruses. Emulation doesn't have these weaknesses but emulating code is significantly slower than executing it on a real CPU. Therefore a very complex polymorphic virus would take unreasonably long to emulate until it is decrypted.

This paper proposes a new method of dealing with polymorphic malware. The method relies on dynamically disassembling the analysed code and performing just-in-time compilation targeted for the host CPU. The code obtained as a result can be safely executed on the host CPU, with little degradation in execution speed, compared to the original code. This provides the same flexibility as emulation, but performance, in terms of speed, is dramatically improved. Additionally, the method could be used for other purposes, such as generic unpacking of packed executables, and behaviour-based analysis of complex code.