2007-12-01
Abstract
In the second installment of this three-part series on exepacker blacklisting, Robert Neumann provides a more detailed look at the different types of blacklisting and describes the tools that are available for analysis.
Copyright © 2007 Virus Bulletin
In the first part of this article (see VB, October 2007, p.14) Gabor Szappanos presented a general overview of exepacker blacklisting and considered both the positive and negative aspects of the practice. In this, the second part of the article, we continue with more detailed information about the different types of blacklisting, and take a look at the tools that are available for use during analysis.
As mentioned in part one of this article, we can divide executable packaging tools into four main categories. The categories separate the tools based upon their primary purpose and common behaviour:
Compressors: the only goal of these tools is to decrease the size of the executable using either common or custom-made compression algorithms. The likelihood of them containing any anti-debugging code is usually close to zero. The best-known tools in this category are UPX, FSG, MEW, PECompact and Upack.
Cryptors: this category covers packers which utilize simple encryption algorithms to make reverse engineering more difficult. They usually have basic anti-debugging code, but no compression. A few well-known cryptors are Yoda Crypter, UPolyx and Morphine.
Protectors: these are the ‘big guns’, combining multiple compression and encryption algorithms along with complex anti-debugging code, and sometimes even custom-made virtual machines. The most representative of this category are ASProtect, Armadillo, SVKP and Themida.
Installers: this category is somewhat different from the other three – these are applications that are capable of creating self-installing packages. We decided to include these applications in a separate category since we are seeing a fair amount of malware using them now. NSIS, Inno Setup and Wise Setup are the most common.
A little over two years ago we realized that there was a need for some kind of united effort among AV researchers to help each other deal with different types of packers, so a mailing list and a collection of known packers were born. The idea was welcomed within the AV community and the mailing list has been growing steadily ever since. Meanwhile, the collection of packers has grown to cover around 95% of the known packers (both public and non-public). The current collection can be broken down into the various categories as follows (also see Figure 1):
76 different cryptors, 172 versions in total
49 different compressors, 305 versions in total
50 different protectors, 291 versions in total
20 different installers, 198 versions in total
Overall this means almost 200 different applications and close to a thousand different versions. Only the Win32 packers are counted here, since DOS-based malware no longer forms part of our daily work. Just for the love of the numbers, the entire collection, including the non-Win32 packers, consists of 264 different packers totalling 1,195 different versions – and it grows by around 6–8% every month.
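The Win32 figures listed above can be checked with a quick tally. The short sketch below only re-adds the numbers quoted in the text; the variable names are ours and purely illustrative.

```python
# Win32 collection figures quoted in the article:
# category -> (distinct packers, total versions)
win32_collection = {
    "cryptors": (76, 172),
    "compressors": (49, 305),
    "protectors": (50, 291),
    "installers": (20, 198),
}

win32_packers = sum(packers for packers, _ in win32_collection.values())
win32_versions = sum(versions for _, versions in win32_collection.values())

# Totals for the entire collection, as quoted in the article.
all_packers, all_versions = 264, 1195

print(win32_packers, win32_versions)   # 195 966
print(all_packers - win32_packers,     # packers outside the Win32 set: 69
      all_versions - win32_versions)   # versions outside the Win32 set: 229
```

So 'almost 200' is 195 applications and 'close to a thousand' is 966 versions, with the remainder of the full collection falling outside the Win32 categories.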
Nowadays it is hard to believe that any anti-virus product could survive without some kind of exepacker support. Most product developers aim for combined solutions such as native unpacking together with the use of a powerful emulator. As there are different goals to achieve, there are different approaches and solutions for each.
Native packer support: this method is the most powerful, but also the most time consuming. An analyst has to fully reverse engineer the given packer, mapping all the compression and encryption algorithms in the usual massive amount of assembly code, then rewriting them in a high-level programming language. Once this has been done it will be very easy to unpack the specific packer (at least until the next version appears) and a working unpacked executable can be obtained within seconds. There is a downside though: malware authors try to trick the native recognizers of AV engines through trivial or non-trivial modifications (e.g. PE-Patch, various UPX cryptors, fake section names etc.) and even slight modifications can render the unpacking process impossible. Some packers are open source (e.g. UPX, PeX, Morphine), so altering them is an easy task, as has been observed before (e.g. PeX/Bagle).
Emulator-based unpacking: creating a powerful emulator can save a lot of research time – it is not a quick or easy process, but the benefits will be enjoyed in the long run. Once an emulator has been created we no longer have to worry about each new packer version and very few of the small custom-made ones will pose any problem. However, as with every good thing there is a downside: emulator-based unpacking comes at the cost of performance since emulating through packer code is much slower than unpacking the same with native support.
Hybrid solutions: to combine the best of two worlds, namely native support and emulator-based unpacking, some AV vendors have come up with hybrid solutions [1]. In this case the common compression and encryption algorithms are supported by native code and a specific emulator is used with the support of a custom script-like language. These scripts control the whole unpacking process by utilizing both native code and the emulator, whenever they are needed. This method requires a lot less CPU time compared to the emulator-only unpacking, yet it is very flexible and easily expandable.
Simple blacklisting: blacklisting is probably the easiest solution. Whenever we decide that a packer should be blacklisted, one generic detection is enough to make it happen. It is not time consuming by any means and new versions of the same packer can be added very quickly.
Whichever form of exepacker support is used, researchers will need the help of a couple of different tools for sample analysis.
At the outset we have no information as to whether a malware sample is packed. A well trained eye using a simple hex editor is usually able to judge if a file is packed, but it takes quite a long time to gain such experience. Otherwise, the most common way to discern whether a sample is packed is to check the file’s entropy: compressed or encrypted data looks close to random, so a packed file scores noticeably higher than ordinary code and data. PEiD is a handy tool capable of calculating the entropy of a given executable (alongside many other useful details). Of course it also has an internal database of known packers, but let’s look at an approach for dealing with an unknown and possibly new packer.
If the calculated entropy indicates that our executable is packed, then we need to take a closer look with the help of a disassembler or debugger.
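For readers who want to experiment, the entropy check can be sketched in a few lines. This is a generic Shannon entropy calculation over byte values, not PEiD's actual implementation (which is not described here), and the 7.0 bits-per-byte cut-off is an assumed, illustrative threshold that would need tuning against real samples.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumed cut-off, for illustration only: well-compressed or encrypted data
# tends to sit close to 8 bits per byte, plain code and data clearly lower.
PACKED_THRESHOLD = 7.0


def looks_packed(data: bytes, threshold: float = PACKED_THRESHOLD) -> bool:
    """Rough heuristic: flag data whose entropy reaches the threshold."""
    return shannon_entropy(data) >= threshold


if __name__ == "__main__":
    import sys

    # Usage: python entropy.py <file>
    with open(sys.argv[1], "rb") as handle:
        contents = handle.read()
    print(f"{shannon_entropy(contents):.3f} bits/byte, "
          f"looks packed: {looks_packed(contents)}")
```

A whole-file figure is only a first hint: a small packed payload inside a large resource section can be averaged away, so computing the value per section is a natural refinement.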
Processing the sample with the IDA disassembler will give us enough information to judge whether we need to reach for a debugger, or whether the deadlist along with IDA’s features is sufficient to complete the task. Static disassembly is usually sufficient if we are facing a packer from the cryptor or compressor category; however, it quickly becomes painful once the sample has multiple layers of compression and/or encryption.
At this point our second best friend is VMware (or any similar virtual machine) – unless we happen to have an additional dedicated computer that is isolated from the network such that a possible outbreak won’t affect it. Since there are many different ways to detect the use of virtual environments, we must either try to prevent that happening (e.g. by tweaking VMware’s config file) or get ourselves another PC which can be sacrificed at the altar of science [2]. Regardless of this, we will certainly need a debugger to be able to trace through thousands of lines of packer code before finally reaching the original entry point of our executable.
Unless for some reason we need a ring 0 debugger, the slightly outdated OllyDbg is the best suited for the task. It is quite a powerful ring 3 debugger on its own, but when used with the excellent user-made plug-ins such as OllyScript, OllyDump or Olly Advanced, it can be extended to an even greater level. There are other ring 3 debuggers such as the newcomer Immunity and the aging TRW, but overall we consider OllyDbg to be the best choice for this task.
In case OllyDbg doesn’t suit our needs and we are desperately searching for a ring 0 debugger, the options available are rather frustrating. A few years ago we would have recommended SoftIce in the blink of an eye, but unfortunately support was dropped for the kernel debugger last year. However, we can still use SoftIce up to Windows XP SP2, and unless vast amounts of Vista-specific ring 0 malware appear on the horizon in the near future, we shouldn’t be too worried. Besides SoftIce there are other options available for kernel debugging. We can either use the not-so-pretty, but still quite decent tool Windbg (with two computers or connecting it to a VMware client through a named pipe) or the SoftIce heritage-like Syser debugger.
Once we arrive at the original entry point of the executable we can consider our task to be complete, at least for now. A proper memory dump (and, depending on the feature set of the packer, rebuild of the import table, restoration of stolen code parts and so on), combined with static analysis and generic detection is usually enough to determine what’s inside, but that’s beyond the scope of this article.
Here is a quick overview of the tools we should have in our arsenal:
Hex editors: Hiew or PE Explorer, depending on whether one prefers console or Windows-based applications (both have a built-in disassembler).
Disassemblers: an outstanding product, IDA is unquestionably our recommendation.
Debuggers: OllyDbg, SoftIce, Windbg – the choice is really up to personal preference.
Other tools: PEiD and RDG Packer Detector are must-haves for known packer detection.
The volume of daily incoming malware has reached such a high level that processing and unpacking each and every piece of packed malware manually is no longer possible – and would be a waste of precious human resources. On average, 44% of all incoming samples have some kind of packer on them [3]. About 20% of all samples can easily be unpacked by native support (the likes of UPX, FSG, Upack etc.), but the remaining 20–24% still gives us a run for our money.
To be able to utilize further blacklisting, we need to separate the samples according to packer type. Once we have a handy list of the most common packers we can decide what kind of support to plan for them. For the purposes of gathering such information the use of PEiD and RDG alone might not be enough. First, they are GUI-oriented applications, which complicates automation; second, they are not updated on a regular basis (closer to yearly in the case of PEiD).
On close examination it turns out that both tools basically work with large collections of packer-specific sequences, along with some advanced detection methods. The key is the sequences – it’s quite a simple task to collect a few external PEiD databases, sort out the duplicated detections, remove the junk and merge our own custom sequences into it. Now we can code our own packer detector which will fully suit our needs, and can be run through large collections of malware to gather accurate statistics. (Note: Metasploit Framework has a built-in packer detector using an external PEiD database.)
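The merge-and-match idea can be sketched as follows. The sketch assumes the text layout commonly used by external PEiD databases (a `[name]` header followed by `signature = ...` with `??` wildcards and `ep_only = true/false`). The packer names and byte sequences are invented for the example, and the file offset of the entry point is assumed to be supplied by the caller rather than computed from the PE header.

```python
import re
from typing import Dict, List, Optional, Tuple

Pattern = Tuple[Optional[int], ...]  # None stands for a '??' wildcard byte
HEX_BYTE = re.compile(r"[0-9A-Fa-f]{2}")


def parse_pattern(value: str) -> Optional[Pattern]:
    """Convert '60 E8 ?? ??' into a tuple; return None for junk entries."""
    tokens = value.split()
    if not tokens or not all(t == "??" or HEX_BYTE.fullmatch(t) for t in tokens):
        return None
    return tuple(None if t == "??" else int(t, 16) for t in tokens)


def parse_db(text: str) -> List[dict]:
    """Parse one PEiD-style database, skipping entries without a valid signature."""
    entries: List[dict] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = {"name": line[1:-1], "pattern": None, "ep_only": False}
            entries.append(current)
        elif current is not None and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.lower() == "signature":
                current["pattern"] = parse_pattern(value)
            elif key.lower() == "ep_only":
                current["ep_only"] = value.lower() == "true"
    return [entry for entry in entries if entry["pattern"]]


def merge_dbs(*texts: str) -> Dict[Tuple[Pattern, bool], str]:
    """Merge databases; identical signatures are kept once (first name wins)."""
    merged: Dict[Tuple[Pattern, bool], str] = {}
    for text in texts:
        for entry in parse_db(text):
            merged.setdefault((entry["pattern"], entry["ep_only"]), entry["name"])
    return merged


def matches_at(data: bytes, offset: int, pattern: Pattern) -> bool:
    """Check whether the pattern matches the data at the given file offset."""
    if offset < 0 or offset + len(pattern) > len(data):
        return False
    return all(p is None or data[offset + i] == p for i, p in enumerate(pattern))


def detect(data: bytes, ep_offset: int, merged: Dict[Tuple[Pattern, bool], str]) -> List[str]:
    """Return the names of all signatures that match the sample."""
    hits = []
    for (pattern, ep_only), name in merged.items():
        if ep_only:
            found = matches_at(data, ep_offset, pattern)
        else:  # signature may appear anywhere in the file
            found = any(matches_at(data, off, pattern)
                        for off in range(len(data) - len(pattern) + 1))
        if found:
            hits.append(name)
    return hits


# Invented databases and sample bytes, for illustration only.
db_a = "[ExamplePacker 1.0]\nsignature = 60 E8 ?? ?? 00 00\nep_only = true\n"
db_b = ("[ExamplePacker 1.0 (duplicate)]\nsignature = 60 E8 ?? ?? 00 00\nep_only = true\n"
        "[Broken entry]\nsignature = 60 ZZ\nep_only = true\n")
merged = merge_dbs(db_a, db_b)
sample = bytes.fromhex("909060e812340000c3")
print(detect(sample, 2, merged))  # ['ExamplePacker 1.0']
```

Because the detector is a plain function over bytes, it can be dropped into a batch job and run across a whole sample collection, which is exactly what the GUI tools make awkward.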
I’m sure that most of the big AV vendors are taking advantage of automated sample processing systems by now – we certainly are. Since human resources are limited almost everywhere, it was an obvious step to automate some tasks in order to free up highly valuable researcher time. Combining the automated systems with the above-mentioned packer detector gives us the opportunity to do whatever we want with a specific packer.
As stated in the first part of the article, the main problem with blacklisting is that we cannot simply blacklist all packers on arrival. Our current approach is to blacklist all the ‘pure’ black ones, as they will never be found on anything but malware. The middle or ‘grey’ category is always going to involve some kind of risk management due to the small – but significant – number of potential false positives. Having a powerful Win32 emulator can be the solution here.
We don’t touch the white packers for obvious reasons – most of them are commercial products intended to protect other shareware applications, and it is just unfortunate when malware authors use them. Aiming for native support is the only negotiable – and most of the time pretty rough – way to handle this category.
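In an automated processing system, the black/grey/white policy described above might be expressed as a simple routing table. This is a minimal sketch: the packer names and their colour assignments are placeholders, and sending unknown packers to manual analysis is our assumption rather than a rule stated in the text.

```python
from enum import Enum


class Handling(Enum):
    BLACKLIST = "generic detection on the packer itself"
    EMULATE = "unpack with the Win32 emulator, then scan the result"
    NATIVE = "unpack with native support, then scan the result"
    MANUAL = "queue for manual analysis"  # assumed default for unknown packers


# Placeholder classification table; a real one would be maintained by analysts.
PACKER_COLOUR = {
    "ExampleMalwareOnlyCryptor": "black",
    "ExampleFreewareCompressor": "grey",
    "ExampleCommercialProtector": "white",
}

HANDLING_BY_COLOUR = {
    "black": Handling.BLACKLIST,  # never seen on anything but malware
    "grey": Handling.EMULATE,     # false-positive risk, so look inside first
    "white": Handling.NATIVE,     # legitimate commercial use is common
}


def route(packer_name: str) -> Handling:
    """Map a detected packer name to the handling policy for its colour."""
    colour = PACKER_COLOUR.get(packer_name)
    return HANDLING_BY_COLOUR.get(colour, Handling.MANUAL)


for name in ("ExampleMalwareOnlyCryptor", "SomethingNew"):
    print(name, "->", route(name).name)
```

Keeping the colour table separate from the handling table means a packer can be reclassified, say from grey to black, without touching the routing logic.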
Exepackers are here to stay for a long while; their roots go back to the shady days of DOS and their future is yet to be seen.
Taking a look at the global picture clearly shows a few things: commercial software developers are stuck right now, with new and ground-breaking ideas sparse on the horizon. The heavyweights of the past years such as ASProtect or Armadillo have entered into a state close to hibernation. An update appears for them once in a while, but these are generally only bug fixes. The only active (and promising) members of this category right now are Themida, the successor of the old Xtreme-Protector, and VMProtect. Both of these feature a built-in custom virtual machine.
One should never paint the devil onto the wall, as the old saying goes, but I for one wouldn’t like the idea of dealing with new families of malware where one of the previously mentioned virtual machines is properly implemented into the code. Do we know that this is going to happen? We can be fairly confident since our experience shows that whenever a new professional product gets out of the door and a legal (or more likely illegal) copy finds its way to the various RCE (Reverse Code Engineering) and AV-related boards, we can expect malware to be packed with it within a few days. It’s an unfortunate situation where both parties take advantage of the very same sources. We can only be thankful that these products haven’t yet been used maliciously to their full potential.
Vista has also put a new twist on things with its new driver policy: we can say goodbye to ring 0-based solutions, as everything is returning to ring 3 once again.
Whether as a result of the above or for other reasons, malware authors nowadays more often develop and maintain their own custom packers. This gives them the opportunity to alter the source whenever they want to, which is a powerful option for them in the fight against AV products.
Since size doesn’t really matter any more, as most of the world has entered into the age of broadband Internet, the possibility of new basic compressors suddenly appearing is rather low. Upack was the last real crusader in this area, and it is pretty dead (unless we count the ever-growing number of PECompact betas).
The only real live and growing category of packers remains the pure black ones. We can expect new black packers to continue to appear from time to time, as creating a small cryptor isn’t really time consuming and an as-yet-unknown packer will always be capable of hiding malware for a couple of days or, under very extreme circumstances, weeks. However, this behaviour is their main weakness as well: since these packers will never be used on operating system files, or even on shareware applications, we won’t have to think twice about blacklisting them.
Talking about different tools and approaches in theory is like explaining to someone how it feels to ride a bicycle without letting him try. If we want to work quickly and efficiently, we always have to be capable of making quick decisions about what tool or application best suits the current situation. Knowing this is only part of the success; mastering their usage to a level where it feels like second nature is the other part. Hiew, OllyDbg and SoftIce are all very powerful tools on their own, although selecting them for the right task is sometimes more complicated than it would seem.
In the third and closing part of the article we will look at how all this works in reality.