Inside W32.Xpaj.B’s infection – part 2

2014-02-03

Liang Yuan

Symantec, China

Editor: Helen Martin

Abstract

Xpaj.B is one of the most complex and sophisticated file infectors in the world. It is difficult to detect, disinfect and analyse. Liang Yuan provides a deep analysis of its infection.

Table of contents


Polymorphic stack-based virtual machine
Execution route
Conclusion

Xpaj.B is one of the most complex and sophisticated file infectors in the world. It is difficult to detect, disinfect and analyse. This two-part article provides a deep analysis of its infection. Part 1 dealt with the initial stages of infection [1], while this part concentrates on the implementation of the small polymorphic stack-based virtual machine that the virus writes to the target subroutines.

Polymorphic stack-based virtual machine

Once the target subroutines have been found, the virus writes a small polymorphic stack-based virtual machine to them. The implementation of the virtual machine is highly polymorphic, and it can be generated with the following features:

Random size of stack frame and stack offset
Instructions with random registers and stack offset
Junk instructions with random opcode, register, stack offset and immediate value
Random appearance of junk instructions (due to the varied number of junk instructions)
Random instruction pairs.

Because the overwritten areas of subroutines are mostly separate from each other, when one overwritten area runs out of space, the virus writes a jmp instruction at its end to jump to the next overwritten area, and continues writing the code there.

When the infected file is executed, and once the instruction that calls the overwritten subroutines or the redirected call instruction is executed, the virtual machine starts to work. First, it calls a subroutine named ‘get_base’ to get the base address (as shown in Figure 1). The base address is the return address of the ‘call get_base’ instruction. Then it locates the encrypted array by using the base address; the encrypted array is used to describe the sequence of operations executed by the virtual machine. It then executes the operations one by one until it reaches the end of the sequence (as shown in Figure 1 – note that this is a clean virtual machine without junk code and the stack offsets may be different for other infections). The sequence of operations encoded by the array forms a program that locates the address of the ZwProtectVirtualMemory API, calls this API to modify memory protection of the section containing the virus code or data, then constructs and executes the decryptor to decrypt the virus body, and constructs and executes the jumper to execute the payload.

Figure 1. Execution of operations.

The virus uses three DWORDs to describe one operation, with the following structure:

struct operation {
  DWORD Offset;    // +0 offset of the operation's address from the base address
  DWORD Argument1; // +4 first argument for the operation
  DWORD Argument2; // +8 second argument for the operation
} operation_info;

When executing an operation, it decrypts its offset and arguments from the array, saves the arguments to the specified stack offsets, then computes the operation address by using the base address, and calls it to execute the operation. At the same time, it updates the position of the array for the next operation. It continues to execute operations until it reaches the end of the sequence. In most cases, there are 0xd5 operations in the sequence.
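The decode step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the XOR routine `xor_decrypt` and its key are hypothetical stand-ins for the real decryption (which is shown in Figure 1, not reproduced here), and the names `Operation`, `iter_operations` and `operation_address` are mine, not the virus's.

```python
import struct
from typing import Callable, Iterator, NamedTuple

class Operation(NamedTuple):
    offset: int      # offset of the operation's address from the base address
    argument1: int
    argument2: int

def xor_decrypt(dword: int, key: int) -> int:
    # Hypothetical stand-in cipher; the real routine is different.
    return (dword ^ key) & 0xFFFFFFFF

def iter_operations(blob: bytes, key: int,
                    decrypt: Callable[[int, int], int] = xor_decrypt
                    ) -> Iterator[Operation]:
    """Yield one Operation per 12-byte record (three little-endian DWORDs)."""
    usable = len(blob) - len(blob) % 12
    for pos in range(0, usable, 12):
        raw = struct.unpack_from("<3I", blob, pos)
        yield Operation(*(decrypt(d, key) for d in raw))

def operation_address(base: int, op: Operation) -> int:
    """Address that the dispatcher calls: base address plus decoded offset."""
    return (base + op.offset) & 0xFFFFFFFF
```

A dispatcher loop would then store `argument1` and `argument2` at their stack offsets and call `operation_address(base, op)` for each record in turn.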

For the version of Xpaj.B I analysed, there are seven basic operations. To obtain a clean virtual machine and better understand how it works, it was necessary to patch the code. The following operations were derived from the clean virtual machine:

Call – call the address at the top of the stack and save the result on the top of the stack (as shown in Figure 2).
Figure 2. Call operation (for the stack offsets see Figure 1).
Get_PEB – push fs:[xxx] to the stack. xxx is the value at the top of the stack, which is always 0x30 in order to get the PEB and locate the address of the ZwProtectVirtualMemory API function (as shown in Figure 3).
Figure 3. Get_PEB operation (for the stack offsets see Figure 1).
Push_argument1 – push argument1 to the top of the stack (as shown in Figure 4).
Figure 4. Push_arg1 operation (for the stack offsets see Figure 1).
Load – load one DWORD onto the top of the stack from the memory location specified by vm_esp, argument1 and argument2 (as shown in Figure 5).
Figure 5. Load operation (for the stack offsets see Figure 1).
Store – store the DWORD from the top of the stack to the memory location which is specified by vm_esp, argument1 and argument2 (as shown in Figure 6).
Figure 6. Store operation (for the stack offsets see Figure 1).
Add – add the two numbers at the top of the stack and push the result to the stack (as shown in Figure 7).
Figure 7. Add operation (for the stack offsets see Figure 1).
Je – compare the two values at the top of the stack. If they are not equal, continue with the next operation; if they are equal, add argument1 to the array position so that a different operation is executed next (as shown in Figure 8).
Figure 8. Je operation (for the stack offsets see Figure 1).
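To make the control flow concrete, here is a toy Python interpreter for three of the seven operations (Push_argument1, Add and Je). It is a sketch, not a reconstruction of the virus's code: it assumes that Add and Je pop their operands, and that Je's argument1 is counted in operations rather than bytes. The remaining operations (Call, Get_PEB, Load, Store) are deliberately left out because they depend on process memory.

```python
from typing import List, Tuple

MASK = 0xFFFFFFFF

def run(program: List[Tuple[str, int, int]]) -> List[int]:
    """Execute (name, argument1, argument2) triples and return the final stack."""
    stack: List[int] = []
    pc = 0
    while pc < len(program):
        name, arg1, _arg2 = program[pc]
        pc += 1  # the array position is advanced for the next operation
        if name == "push_argument1":
            stack.append(arg1 & MASK)
        elif name == "add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif name == "je":
            b, a = stack.pop(), stack.pop()
            if a == b:
                pc += arg1  # assumption: argument1 counts operations
        else:
            raise ValueError(f"operation not modelled: {name}")
    return stack

# 2 + 3 == 5, so Je skips the next operation.
program = [
    ("push_argument1", 2, 0),
    ("push_argument1", 3, 0),
    ("add", 0, 0),
    ("push_argument1", 5, 0),
    ("je", 1, 0),
    ("push_argument1", 0xBAD, 0),
    ("push_argument1", 0x600D, 0),
]
```

Calling `run(program)` leaves only 0x600D on the stack; changing the compared constant from 5 to 6 lets the 0xBAD push execute as well.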

I also let the virus build a polymorphic version of the virtual machine with the same stack frame size and stack offsets as the clean virtual machine. Figure 9 shows the difference between the call operations from the two virtual machines.

Figure 9. Difference between call operations (for the stack offsets see Figure 1).

Xpaj.B builds its polymorphic virtual machine in a very similar way to that in which the virtual machine itself works. The builder code implements a number of operations and an interpreter that is controlled by encrypted binary data stored inside the virus. The virus decrypts the binary data, and the sequence of operations encoded by it forms a program which builds the virtual machine. For the variant I analysed, the size of the binary data was 0x288. Figure 10 and Figure 11 show how Xpaj.B uses the binary data to build the main frame of the virtual machine.

Figure 10. Construct main frame of VM.

Figure 11. Construct main frame of VM.

I wrote an IDAPython decryption script that emulates the function named ‘decrypt_dword’ (shown in Figure 10 and Figure 11) to get the called addresses, and added some comments describing what the code at each address does. (As shown in Figure 12 and Figure 13, xxx, nn and reg in the comments are specified by the binary data; nn is derived from the stack offsets list for the virtual machine.)

Figure 12. Handles and encrypted binary data.

Figure 13. Handles and encrypted binary data.

Now the key is to analyse the binary data. I created a Python script to do this, get the xxx operators and print the main procedure of building the virtual machine. The output is as follows:

zero flag_constructing_junk_code
set using_random_junk_ins as true
generate the size of stack frame and stack offsets for vm internal use
push ebp 
mov ebp,esp 
sub esp, xx
set flag_constructing_junk_code as true
push regs
zero flag_constructing_junk_code
save the following ins to ins_log
add the following ins to branch_ins_in_VM
call next_ins_va
set flag_constructing_junk_code as true
get one free reg for internal use
mov reg, dword ptr [ebp+nn]
zero flag_constructing_junk_code
save the following ins to ins_log
add reg, imm32
...
mov reg, dword ptr [ebp+nn]
add dword ptr [ebp+nn], reg
set specified reg as free
ins that jmp to the dispatcher to execute the next operation
save the following ins to ins_log
success

Note that the construction of junk instructions is not included in the log. One subroutine, named ‘junk_code_construction’, is responsible for constructing the junk code. It is called in every iteration if flag_constructing_junk_code is true (as shown in Figure 10). It takes a seed as an argument, which controls the chance of constructing a junk instruction: the smaller the seed, the greater the chance. The virus tries to create as many junk instructions as possible, but the size of the overwritten areas is limited. If the space runs out, it enlarges the seed (thereby decreasing the number of junk instructions) and rebuilds the virtual machine, repeating until it succeeds. The subroutine can construct five different types of junk instruction (some of which can be seen in Figure 9):

Mov/add/or/adc/sbb/and/sub/xor reg, dword ptr [ebp+nn]

Mov/add/or/adc/sbb/and/sub/xor reg1, reg2

Mov/add/or/adc/sbb/and/sub/xor reg, imm32(random)

Mov/add/or/adc/sbb/and/sub/xor dword ptr [ebp+nn], reg

Mov/add/or/adc/sbb/and/sub/xor dword ptr [ebp+nn], imm32(random)

If using_random_junk_ins is false, the virus either uses the mov instruction directly, or else it chooses one from: add, or, adc, sbb, and, sub and xor to construct the junk instruction.
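The seed-and-retry behaviour can be illustrated with a short Python model. Everything here is an assumption made for illustration: the chance of emitting junk is taken to be 1/seed, the seed is doubled on each retry, capacity is counted in instructions rather than bytes, and the function names are hypothetical. Only the five operand forms and the opcode list come from the description above.

```python
import random
from typing import List, Tuple

OPCODES = ["mov", "add", "or", "adc", "sbb", "and", "sub", "xor"]
ALL_REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def make_junk(rng: random.Random, free_regs: List[str],
              unused_offsets: List[int]) -> str:
    """Build one junk instruction in one of the five forms listed above."""
    op = rng.choice(OPCODES)
    slot = f"dword ptr [ebp{rng.choice(unused_offsets):+#x}]"
    imm = f"{rng.getrandbits(32):#010x}"
    forms = [
        (rng.choice(free_regs), slot),                  # reg, [ebp+nn]
        (rng.choice(free_regs), rng.choice(ALL_REGS)),  # reg1, reg2
        (rng.choice(free_regs), imm),                   # reg, imm32
        (slot, rng.choice(ALL_REGS)),                   # [ebp+nn], reg
        (slot, imm),                                    # [ebp+nn], imm32
    ]
    dst, src = rng.choice(forms)
    return f"{op} {dst}, {src}"

def build_with_junk(real_ins: List[str], capacity: int, rng: random.Random,
                    free_regs: List[str], unused_offsets: List[int],
                    seed: int = 2, max_seed: int = 1 << 16
                    ) -> Tuple[int, List[str]]:
    """Rebuild with a larger seed (less junk) until the result fits."""
    while seed <= max_seed:
        out: List[str] = []
        for ins in real_ins:
            if rng.randrange(seed) == 0:  # assumption: chance is 1/seed
                out.append(make_junk(rng, free_regs, unused_offsets))
            out.append(ins)
        if len(out) <= capacity:
            return seed, out
        seed *= 2  # assumption: the real growth rule is not known
    raise RuntimeError("virtual machine does not fit in the available space")
```

The destination of each junk instruction is always a free register or an unused stack offset, so the real instructions are unaffected however much junk is interleaved.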

When constructing the instructions that are used to jump to the dispatcher (instructions at the bottom of Figures 2–9), the virus tries to add junk instructions among them from the five types listed above. It randomly selects one of the following instruction pairs in order to jump to the dispatcher (nn is the stack offset that stores the dispatcher address):

Pair 1
```
push dword ptr [ebp+nn]
retn
```
Pair 2
```
jmp dword ptr [ebp+nn]
```
Pair 3
```
mov reg, ebp
jmp dword ptr [reg+nn]
```
Pair 4
```
mov reg, dword ptr [ebp+nn]
jmp reg
```

Pair 5

mov reg, nn
add reg, ebp
jmp dword ptr [reg]

You might notice that the junk instructions are very similar to some of the virtual machine’s instructions (as shown in Figure 9). How does Xpaj.B construct the junk instructions? As can be seen in the first few lines of the output, it first creates a stack frame of the specified size (large enough for the virtual machine) and the list of stack offsets for the virtual machine’s internal use; these stack offsets store the local variables of the virtual machine. It also creates an array of eight items that records which registers are free or busy: array[0] represents eax; array[1] represents ecx; and so on. The value of an array item can be 0, 1 or 2. A value of 2 marks ebp and esp, which can’t be used as the destination of a junk instruction; 0 indicates that the register is free and can be used; 1 indicates that the register is busy and can’t be used. The array is initialized to [0,0,0,0,2,2,0,0] – this means that all registers except for ebp and esp are free at the beginning. Xpaj.B updates the array according to the context when building the instructions of the virtual machine. If it wants to use a register, it chooses one at random from the free registers and sets it as busy. Once the register is no longer used by the following instructions, it sets it as free again. As a result, the virus can construct junk instructions by using registers that are free and stack offsets that the virtual machine doesn’t use. Note that busy registers, as well as ebp and esp, can still be used as the source operand of a junk instruction.
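The register bookkeeping is simple enough to model directly. The following Python sketch mirrors the eight-item array and its three states; the class and method names are hypothetical, and the register order assumes the standard x86 encoding order (eax, ecx, edx, ebx, esp, ebp, esi, edi), which matches the initial array given above.

```python
import random
from typing import List

REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
FREE, BUSY, RESERVED = 0, 1, 2

class RegisterPool:
    """Tracks which registers junk instructions may overwrite."""

    def __init__(self, rng: random.Random) -> None:
        self.rng = rng
        self.state: List[int] = [0, 0, 0, 0, 2, 2, 0, 0]

    def acquire(self) -> int:
        """Pick a free register at random and mark it busy."""
        free = [i for i, s in enumerate(self.state) if s == FREE]
        if not free:
            raise RuntimeError("no free register available")
        idx = self.rng.choice(free)
        self.state[idx] = BUSY
        return idx

    def release(self, idx: int) -> None:
        """Mark a busy register as free again."""
        if self.state[idx] != BUSY:
            raise ValueError(f"{REG_NAMES[idx]} is not busy")
        self.state[idx] = FREE

    def junk_destinations(self) -> List[str]:
        """Registers a junk instruction may write to: free ones only."""
        return [REG_NAMES[i] for i, s in enumerate(self.state) if s == FREE]

    def junk_sources(self) -> List[str]:
        """Any register may be read, including busy ones, ebp and esp."""
        return list(REG_NAMES)
```

Because junk only ever writes to registers returned by `junk_destinations()`, it cannot disturb a value that the surrounding virtual machine code still needs.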

From the output result log, we can see the main frame of the virtual machine. But the virtual machine is not ready yet – it needs to be fixed. The virus records some instruction information when building the virtual machine. The information will be used to fix the instructions. It uses the following structures to log the information:

struct branch_ins_info_in_VM
{
  DWORD operand_va;//+0 start address of the operand of the branch instruction
  DWORD index;//+4 for indexing the destination address of the branch ins
} branch_ins_info_in_VM;
struct branch_ins_in_VM
{
  DWORD item_num;//+0
  branch_ins_info_in_VM info[item_num];//+4
} branch_ins;
/*

For example:

0012D4F4 00000002--> total items
0012D4F8 00C23D9A--> see Figure 12
0012D4FC 00000000--> see Table 1
0012D500 00C23E81--> see Figure 8 jnz dispatcher
0012D504 00000004--> see Table 1
*/
struct ins_log_info
{
  DWORD index;//+0 
  DWORD va;//+4 virtual address of the instruction
} ins_log_info;
struct ins_log
{
  DWORD item_num;//+0
  ins_log_info info[item_num];//+4
} ins_log;

Once the main frame of the virtual machine has been built successfully, the virus fixes the places shown in Figure 14 (this is clearer if you compare it with Figure 1). This is necessary because the destination addresses of branch instructions (call, jmp, jcc and so on) are not always known up front. The virus initially uses the address of the next instruction as the operand, which makes the branch instruction easy to fix later (as shown in Figure 15). When fixing other places, the virus needs to search the ins_log structure to get the relevant instruction address for a given index. A subroutine named ‘get_va_from_ins_log_by_index’ does this: it iterates through the info field of the ins_log structure and returns the virtual address that matches the given index. If no match is found, it returns the virtual address of the last ins_log_info in the array (as shown in Figure 16). The ins_log information for the variant I analysed is shown in Table 1.

Figure 14. Tweaked places.

Figure 15. Fixes the branch ins in VM.

Figure 16. get_va_from_ins_log_by_index.

Index	VA (from mapped image)	Description
0	00C23E1D	Destination address of call get_base
1	00C23D99	VA of instruction ‘call get_base’
2	Not used
3	00C23DA1	see Figure 17
4	00C23DBF	see Figure 17
5	00C23E45	address of push_arg1 operation
6	00C23E51	address of load operation
7	00C23E3D	address of get_peb operation
8	00C23E62	address of store operation
9	00C23E73	address of add operation
0xA	00C23E7B	address of je operation
0xB	00C23E90	patched end address
0xC	00C23E24	address of call operation
0xD	00C23E19	part of jumper
0xE	00C23DB6	see Figure 17
0xF	00C23DC7	see Figure 17
0x10	00C23DE8	see Figure 17
0x11	00C23DFB	see Figure 17

Table 1. Information about ins_log.
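The lookup and the branch fix can be checked against the values in Table 1 and the branch_ins example. The Python sketch below assumes the standard x86 near-branch encoding, in which the 32-bit operand is the destination minus the address of the byte following the operand; the virus's own fix-up code is the one in Figure 15. The order of the entries in `INS_LOG` is arbitrary here, and `rel32_operand` is a hypothetical helper name.

```python
import struct
from typing import List, Tuple

# Subset of Table 1 as (index, va) pairs; the in-memory order is an assumption.
INS_LOG: List[Tuple[int, int]] = [
    (0, 0x00C23E1D),  # destination address of call get_base
    (1, 0x00C23D99),  # VA of instruction 'call get_base'
    (4, 0x00C23DBF),  # destination of 'jnz dispatcher' in the example
]

def get_va_from_ins_log_by_index(ins_log: List[Tuple[int, int]],
                                 index: int) -> int:
    """Return the VA logged for index, or the last entry's VA if not found."""
    for idx, va in ins_log:
        if idx == index:
            return va
    return ins_log[-1][1]

def rel32_operand(operand_va: int, target_va: int) -> bytes:
    """Encode a near call/jmp/jcc operand: target minus end of the operand."""
    rel = (target_va - (operand_va + 4)) & 0xFFFFFFFF
    return struct.pack("<I", rel)

# branch_ins example: operand at 00C23D9A, index 0 -> call get_base
call_fix = rel32_operand(0x00C23D9A, get_va_from_ins_log_by_index(INS_LOG, 0))
# branch_ins example: operand at 00C23E81, index 4 -> jnz dispatcher
jnz_fix = rel32_operand(0x00C23E81, get_va_from_ins_log_by_index(INS_LOG, 4))
```

With these inputs the call operand works out to a forward displacement of 0x7F and the jnz operand to a backward displacement of 0xC6, which is consistent with the dispatcher sitting before the operations in memory.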

Figure 17. Fixes place 2 (for the index, see Table 1).

In addition to the branch instructions, the virus fixes the following places:

Place 2 in Figure 14: fixes the instruction which is used to locate the encrypted array (as shown in Figure 17).
Place 3 in Figure 14: fixes the instruction that is used to get the address of the dispatcher (as shown in Figure 18).
Figure 18. Fixes place 3 (for the index, see Table 1).
Places 4, 5, 6 in Figure 14: fixes the instructions for initializing the key to decrypt the VM array (as shown in Figure 19).
Figure 19. Fixes places 4, 5, 6 (for the index, see Table 1).

After the instructions have been fixed, the virus starts to fix the operation structure array, including the offset and argument fields. It first decrypts the encrypted operation structure array. After decryption, the offset field of each operation structure holds an index (as shown in Figure 20), which the virus can use to get the operation addresses. Next, it fixes the offsets of the operation structure array (as shown in Figure 21). It fills the argument fields with random DWORDs, then fixes the argument fields of the operation structure array in order to ensure that the virtual machine executes correctly. After that, it encrypts the array and writes it to the inserted section (as shown in Figure 22), so that the virtual machine can decrypt the array correctly at run time (as shown in Figure 1). At this point, the virtual machine is ready.
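The offset fix can be sketched as a simple index-to-offset translation. This Python example assumes that each offset is the unsigned difference between the operation's virtual address and the base address, and that the base address is the return address of ‘call get_base’ (00C23D99 plus the five bytes of a near call, i.e. 00C23D9E); the actual code is the one in Figure 21. `fix_offsets` is a hypothetical name, and the re-encryption step is omitted.

```python
from typing import Dict, List, Tuple

Op = Tuple[int, int, int]  # (offset or index, argument1, argument2)

def fix_offsets(ops: List[Op], ins_log: Dict[int, int], base: int) -> List[Op]:
    """Replace the index in each offset field with (operation VA - base)."""
    fixed: List[Op] = []
    for index, arg1, arg2 in ops:
        offset = (ins_log[index] - base) & 0xFFFFFFFF
        fixed.append((offset, arg1, arg2))
    return fixed

# Two rows from Table 1: 0xC is the call operation, 5 is push_arg1.
ins_log = {0xC: 0x00C23E24, 5: 0x00C23E45}
base = 0x00C23D99 + 5  # return address of 'call get_base' (assumption)
fixed = fix_offsets([(0xC, 0, 0), (5, 0x30, 0)], ins_log, base)
```

Here the call operation ends up at offset 0x86 and push_arg1 at offset 0xA7 from the base address.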

Figure 20. Decrypted operation array.

Figure 21. Fixes the offsets.

Figure 22. Encrypts the array and writes it to inserted section.

Finally, it updates the checksum in the PE header to complete the infection process.

Execution route

Let’s look at the main execution route of Xpaj.B. Usually there are three routes (as shown in Figure 23). When the infected file is executed, once either the instruction that calls the first overwritten subroutine (Route 3 in Figure 23), the instruction that calls the other overwritten subroutines (Route 2), or the redirected call instruction (Route 1) is executed, the return address of the call is saved into the stack (for Route 2, the address is the return address of the instruction ‘call_start_address_of_first_overwritten_subroutine’) and the virtual machine starts to work. The virtual machine locates the address of the ZwProtectVirtualMemory API and calls this API to modify memory protection of the area that contains the encrypted virus body, then it constructs and executes the decryptor to decrypt the virus body, and constructs and executes the jumper to execute the virus code.

Figure 23. Execution routes.

When the virus starts, it gets the return address from the stack and converts it to an RVA. Then it iterates through the patch structure list to find the patch structure that matches the call instruction. If it fails to find one, this means that the executed call instruction is the call to the first overwritten subroutine, so it uses the patch structure of the first overwritten subroutine as the matched patch structure. It decrypts the matched patch structure and executes the code from its code field. If the reloc_count field of the matched patch structure is not zero, it will fix the relocations, storing them in the reloc_offset field of the matched patch structure. This allows the infected executable to continue working.
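The selection logic amounts to a lookup with a fallback. The Python sketch below is illustrative only: the real patch structure is described in part 1 [1], and the `return_rva` field used as the match key here is a hypothetical simplification, as are the class and function names.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PatchStructure:
    return_rva: Optional[int]  # hypothetical key: RVA the call returns to
    code: bytes                # original code to execute after decryption

def select_patch(return_va: int, image_base: int,
                 patches: List[PatchStructure],
                 first_subroutine_patch: PatchStructure) -> PatchStructure:
    """Find the patch for the executed call, else fall back to the first one."""
    rva = return_va - image_base
    for patch in patches:
        if patch.return_rva == rva:
            return patch
    # No match: the call targeted the first overwritten subroutine.
    return first_subroutine_patch
```

The fallback branch is what lets a single entry point serve all of the execution routes without the virus having to record how it was reached.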

Conclusion

Xpaj.B is not only one of the most sophisticated file infectors but also one of the stealthiest. It uses several techniques to prevent detection and remain under the radar. These techniques demonstrate that the authors favour discretion over efficiency and want the virus to persist for as long as possible once the infection has occurred.

Bibliography

[1] Yuan, L. Inside W32.Xpaj.B’s infection – part 1. Virus Bulletin, January 2014, p.13. http://www.virusbtn.com/pdf/magazine/2014/201401.pdf.
