2012-08-01
Abstract
As a form of anti-debugging/anti-emulation, some malicious programs insert garbage code within their instructions. Raul Alvarez looks at the use of garbage code and unsupported or rarely used APIs by recent malware.
Copyright © 2012 Virus Bulletin
As a form of anti-debugging/anti-emulation, some malicious programs insert garbage code within their instructions. Garbage code is code that is not needed by the malware to carry out its malicious actions, and it keeps the anti-virus researchers busy reading irrelevant information. FPU (floating point unit) instructions are also used to confuse anti-virus emulators, making it harder to produce decrypted or readable information.
And, as if inserting all that useless code wasn’t enough, the latest malware also uses unsupported or rarely used APIs to make analysis more difficult. Most of these APIs are not supported by anti-virus engines due to the fact that they are not used in normal programming – the anti-virus engines bypass or skip these APIs when emulating the malware code. However, there is a problem with this approach: what if those unsupported or seldom-used APIs are actually needed by the malware? What if we really have to emulate those APIs in order to follow the malware’s execution? Malware authors are aware of the practice of skipping over such API executions, which gives them the opportunity to use it to their advantage.
Unsupported and rarely used APIs that have callback functionality play an important role in the exploitation of code skipping. Since their executions will be skipped, malware authors include them as a means to achieve their malicious goal.
Jumping over such APIs can have an impact on analysis and emulation. There are two possible scenarios:
Emulation by anti-virus software will be aborted prematurely because the call to the routine to decrypt or execute the malware will be skipped.
Anti-virus engineers won’t be able to observe how the malware performs its malicious actions, since, during the analysis, the callback function will be executed without breaking into the beginning of the malware routine.
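The two scenarios can be illustrated with a toy model. The Python sketch below does not describe how any particular anti-virus engine works: the program representation, the `SUPPORTED` set and the `emulate` function are all hypothetical, chosen only to show that skipping an unsupported API which takes a callback means the routine behind that callback is never reached.

```python
# Toy model: a "program" is a list of (api_name, callback) steps, and the
# emulator either skips unsupported APIs or honours their callback argument.
# Every name here is hypothetical and only illustrates the idea.

SUPPORTED = {"GetTickCount", "ExitProcess"}  # assumed set of modelled APIs


def emulate(program, follow_callbacks):
    """Return the list of routines reached during emulation."""
    reached = []
    for api_name, callback in program:
        if api_name == "ExitProcess":
            break  # emulation ends here
        if api_name in SUPPORTED:
            continue  # modelled API without a callback: nothing to follow
        # Unsupported API: a naive engine skips it entirely.
        if follow_callbacks and callback is not None:
            reached.append(callback())
    return reached


def malware_routine():
    return "payload reached"


program = [
    ("GetTickCount", None),
    ("timeSetEvent", malware_routine),  # the callback carries the real code
    ("ExitProcess", None),
]

skipped = emulate(program, follow_callbacks=False)
followed = emulate(program, follow_callbacks=True)
print("skipping unsupported APIs:", skipped)
print("following callbacks:      ", followed)
```

With `follow_callbacks=False` the emulation reaches the exit without ever touching the payload, which corresponds to the first scenario above.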
In this article we will look at a handful of APIs that appear harmless, but which are exploited by malware. The callback parameter of the API is used to execute the malware routine.
The code shown in Figure 1 looks normal. Looking at the call e1421f1f.00401000 at address 0040109C (highlighted in blue-green), we can easily tell that the main malware routine should start at 00401000. But look again: before that call can execute, there are two instructions:
    push -1
    call Sleep
These instructions put the application to sleep for an infinite amount of time. How, then, can the malware routine at 00401000 be triggered? If we are debugging this, we can skip the call to the Sleep API and proceed directly to the malware routine. If an anti-virus engine is emulating this, skipping the Sleep API is the best step to consider.
What if there is no call to e1421f1f.00401000 after the call to Sleep? Can the malware at 00401000 still be triggered? Yes, the malware routine can still be triggered even without a call to e1421f1f.00401000. Take a look at the instructions from 0040107B to 0040108B (highlighted in yellow), located before the call to the Sleep function.
    PUSH 0
    PUSH 0
    PUSH e1421f1f.00401000    ; Entry address
    PUSH 1
    PUSH 2C9
    CALL <JMP.&winmm.timeSetEvent>
The block of code above uses the timeSetEvent API, a multimedia timer function.
As defined by MSDN [1], ‘The timeSetEvent function starts a specified timer event. The multimedia timer runs in its own thread. After the event is activated, it calls the specified callback function or sets or pulses the specified event object.’
The third parameter of timeSetEvent is the pointer to the callback function that will be executed once the timer event fires. In this case, the callback points to the very beginning of the malware routine.
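Because Win32 API arguments are pushed right to left, the last PUSH before the CALL is the first parameter. The short Python sketch below maps the operands from the listing above onto the parameter names of the documented timeSetEvent prototype (uDelay, uResolution, lpTimeProc, dwUser, fuEvent). The `decode_args` helper is purely illustrative and not part of any tool.

```python
# Parameter names follow the documented prototype of timeSetEvent.
TIME_SET_EVENT_PARAMS = ("uDelay", "uResolution", "lpTimeProc", "dwUser", "fuEvent")


def decode_args(pushes, param_names):
    """Map PUSH operands (in listing order) onto parameter names."""
    if len(pushes) != len(param_names):
        raise ValueError("push count does not match parameter count")
    # The last value pushed is the first parameter, so reverse the list.
    return dict(zip(param_names, reversed(pushes)))


# Operands in the order they appear in the listing.
pushes = [0x0, 0x0, 0x00401000, 0x1, 0x2C9]
args = decode_args(pushes, TIME_SET_EVENT_PARAMS)

for name, value in args.items():
    print(f"{name:12} = {value:#x}")
```

Read this way, the call requests a delay of 0x2C9 (713) milliseconds, and a flag value of 0 corresponds to a one-shot timer that invokes the function at 00401000.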
Even if we have an infinite sleep mode, the malware will still be triggered because of the timeSetEvent API. It would be easy to overlook the block of code that triggers the malware thanks to the deceptive nature of the code structure. Using visual inspection, we could easily conclude where we need to go to find the malware routine, which would lead us to different code altogether. Alternatively, the malware may have a totally different call instruction which will not point us to the right malware routine.
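The reason the infinite sleep does not prevent the callback from running is that the timer runs in its own thread. The sketch below is only an analogue written with Python's standard `threading` module, not the Windows API: `threading.Timer` stands in for timeSetEvent, and a blocking wait stands in for the call to Sleep. A finite timeout is used solely so that the example terminates.

```python
import threading

fired = threading.Event()


def malware_routine():
    # Stand-in for the callback at 00401000.
    fired.set()


# Analogue of timeSetEvent: schedule a one-shot callback on its own thread.
timer = threading.Timer(0.05, malware_routine)
timer.start()

# Analogue of Sleep(-1): the main thread blocks here. The timeout exists
# only to keep the example from hanging if something goes wrong.
callback_ran = fired.wait(timeout=2.0)
timer.join()
print("callback ran while the main thread was blocked:", callback_ran)
```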
Perhaps the timeSetEvent callback is too easy to spot. Our next case will show a typical garbage code insertion with many APIs that are not relevant to the malware routine. If we follow the code in debugging, it will take us a long time to figure out what the malware actually does.
Figure 2 shows a typical code listing at the entry point of a piece of malware that is heavily injected with garbage code. It has API calls that don’t affect the malware structure or executions. The malware author’s goal is to make the analysis process longer and to throw off any emulation attempt by anti-virus software. For this particular sample, the whole 2,244 bytes of code (not including the different subroutine called) are irrelevant to the malware. (The parts of code highlighted in red are the irrelevant API calls.)
If during the analysis we keep skipping over those subroutines and unsupported APIs, we run the risk of skipping over a rarely used API that might be important in the malware’s execution. Yes, one of those meaningless-looking APIs is actually responsible for executing the payload of the malware.
Figure 3 shows a continuation of the code shown in Figure 2. It doesn’t look any different from the unwanted code in the previous snapshot: just more junk code and junk API calls. But take a closer look at the code before the call to ExitProcess. The code starting at 00403703 (highlighted in yellow) is as follows:
    PUSH 28
    PUSH 4E
    PUSH 904ef9a1.004063C7
    CALL 904ef9a1.004058C6    ; JMP to glu32.gluQuadricCallback
The rarely used glu32.gluQuadricCallback API is responsible for launching the malware. It has a callback mechanism that points to the beginning of the malware routine.
As defined by MSDN [2], ‘The gluQuadricCallback function defines a callback for a quadric object’, and ‘The gluQuadricCallback function is called when an error is encountered. Its single argument is of type GLenum, and it indicates the specific error that occurred. Character strings describing these errors can be retrieved with the gluErrorString call.’
The gluQuadricCallback function is used mostly in graphics applications. We seldom see it used in malware, and its callback capability is the reason for its inclusion here: the callback function is called when the API encounters an error. Given that it is very uncommon to see this function in malware code, our first thought would be to skip or step over it during debugging.
But unlike with timeSetEvent, discussed earlier, the starting point of the callback function in the malware is not obvious. With timeSetEvent, the callback function’s address is passed as one of the parameters, whereas in this sample the callback address used by gluQuadricCallback is stored within the GLUquadric object, which is the first parameter of the gluQuadricCallback API.
The GLUquadric object for this sample starts at 004063C7:
    PUSH 904ef9a1.004063C7
    CALL 904ef9a1.004058C6    ; JMP to glu32.gluQuadricCallback
If we go back to the instruction at address 004036D2 (highlighted in green):
MOV DWORD PTR DS:[4063D7],904ef9a1.004043DD
This instruction copies 004043DD (the starting location of the malware routine) to DWORD PTR DS:[4063D7]. The location 4063D7 is inside the GLUquadric object found at 004063C7.
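The relationship between the two addresses is simple arithmetic: 4063D7 lies 0x10 bytes past the start of the object at 004063C7. The Python sketch below models the effect of the MOV instruction on a zero-filled buffer. The 0x20-byte buffer size and the treatment of the object as flat memory are assumptions made for illustration; the article does not describe the internal layout of the GLUquadric object.

```python
import struct

OBJECT_BASE = 0x004063C7   # address pushed as the GLUquadric object
PATCHED_ADDR = 0x004063D7  # destination of the MOV at 004036D2
ROUTINE = 0x004043DD       # value written by the MOV

offset = PATCHED_ADDR - OBJECT_BASE  # distance from the start of the object

# Model the object as a zero-filled buffer (the size is an assumption).
obj = bytearray(0x20)

# Equivalent of: MOV DWORD PTR DS:[base + offset], ROUTINE (little-endian).
struct.pack_into("<I", obj, offset, ROUTINE)

(stored,) = struct.unpack_from("<I", obj, offset)
print(f"offset {offset:#x} inside the object holds {stored:#010x}")
```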
The actual sequence of instructions without the garbage code should look something like this:
    MOV DWORD PTR DS:[4063D7],904ef9a1.004043DD
    ...
    ...
    ...
    PUSH 28
    PUSH 4E
    PUSH 904ef9a1.004063C7
    CALL 904ef9a1.004058C6    ; JMP to glu32.gluQuadricCallback
Visually, we would not suspect that a graphics-related API would be used by the malware to jump to its malicious routine, especially when it is wrapped up with other junk APIs and junk code. When we are tired of skipping, stepping over and executing irrelevant code during a debugging session, the tendency is not to notice that a completely innocuous-looking API will do the trick.
The last case for discussion in this article is not about time or graphics, but about fonts. Yes, fonts, which have nothing to do with infection, downloading files, or code injection. We do not even have a GUI to concern ourselves with fonts. Any font-related API will certainly be categorized as unsupported by anti-virus software, and anti-virus engineers are likely to skip over it (myself included). But the idea of a font-related API deserves a quick look.
Figure 4 shows a snapshot of a piece of malware from the entry point that looks like a simple GUI application. We notice that after calling EnumFontFamiliesExW, there is a call to exit the process. It is curious that the program appears to do so little before exiting. Knowing that any API can be a trigger for the malware, the logical choice is to look up the definition of EnumFontFamiliesExW.
As defined by MSDN [3], ‘The EnumFontFamiliesEx function enumerates all uniquely named fonts in the system that match the font characteristics specified by the LOGFONT structure. EnumFontFamiliesEx enumerates fonts based on typeface name, character set, or both.’
There is nothing unusual about this API, except that, like timeSetEvent and gluQuadricCallback, it is capable of calling a separate function. Similar to timeSetEvent, the callback function pointer is one of the parameters of EnumFontFamiliesEx:
    PUSH 0                      ; Flags = 0
    PUSH 23612a08.001009F2      ; lParam = 1009F2
    PUSH 23612a08.00100A5E      ; Callback = 23612a08.00100A5E
    PUSH 0                      ; pLogfont = NULL
    PUSH EAX                    ; hDC
    CALL <JMP.&gdi32.EnumFontFamiliesExW>
The starting location of the malware routine at 00100A5E can be seen straight after the call to ExitProcess. But if the API is unsupported by the anti-virus engine, the emulation will not pass through the malware routine and will proceed straight to the exit.
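To reach the routine, an emulator would need to model the enumeration behaviour: the documented contract is that the callback is invoked for each matching font and that enumeration continues while the callback returns a nonzero value. The Python sketch below is a hypothetical stub along those lines. The function names, the single made-up font entry and the simplified callback arguments are all assumptions; a real callback receives pointers to font structures rather than strings.

```python
def enum_font_families_ex_stub(callback, l_param, fonts=("Arial",)):
    """Hypothetical emulator stub: invoke the callback once per font.

    Enumeration continues while the callback returns a nonzero value.
    The font list is made up; one entry is enough to reach the callback.
    """
    result = 1
    for font in fonts:
        # Arguments stand in for (LOGFONT*, TEXTMETRIC*, FontType, lParam).
        result = callback(font, None, 0, l_param)
        if result == 0:
            break
    return result


calls = []


def routine_at_00100A5E(logfont, textmetric, font_type, l_param):
    # Stand-in for the malware routine; records that it was reached.
    calls.append((logfont, hex(l_param)))
    return 0  # stop the enumeration after the first call


ret = enum_font_families_ex_stub(routine_at_00100A5E, 0x001009F2)
print("callback invocations:", calls)
```

Even this minimal model is enough for the emulation to enter the code at 00100A5E instead of falling through to ExitProcess.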
We now have an idea that not all unsupported, rarely used, or unheard-of APIs are irrelevant from the point of view of analysis. It will now take us longer to analyse malware containing garbage code, yet this will give us the opportunity to learn about the other capabilities of those APIs.
Remember: if a meaningless-looking API has a callback parameter or can call another function, it is likely to be one of those interesting APIs that we need to support.
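One way to act on this rule during triage is to keep a table of APIs known to accept a callback and to flag the callback argument of each such call as a code address worth following. The Python sketch below is a minimal illustration. The table contains only the three APIs named in this article, and the trace format, the handle value and the `callback_targets` helper are hypothetical. It would not catch the gluQuadricCallback sample discussed earlier, where the address sits inside the object rather than in the argument list.

```python
# Zero-based index of the callback argument, in prototype order.
# The table is illustrative and deliberately tiny.
CALLBACK_ARG_INDEX = {
    "timeSetEvent": 2,         # lpTimeProc
    "EnumFontFamiliesExW": 2,  # lpProc
    "gluQuadricCallback": 2,   # fn
}


def callback_targets(call_trace):
    """Yield (api, address) for calls whose callback argument is set."""
    for api, args in call_trace:
        index = CALLBACK_ARG_INDEX.get(api)
        if index is not None and index < len(args) and args[index]:
            yield api, args[index]


# Hypothetical trace: (API name, arguments in prototype order).
trace = [
    ("GetTickCount", []),
    ("timeSetEvent", [0x2C9, 1, 0x00401000, 0, 0]),
    ("EnumFontFamiliesExW", [0x1234, 0, 0x00100A5E, 0x001009F2, 0]),
]

targets = list(callback_targets(trace))
for api, address in targets:
    print(f"{api}: follow code at {address:08X}")
```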
[2] gluQuadricCallback. http://msdn.microsoft.com/en-us/library/dd368679(v=vs.85).aspx.
[3] EnumFontFamiliesEx. http://msdn.microsoft.com/en-us/library/dd162620(v=vs.85).aspx.