2014-02-03
Abstract
Xpaj.B is one of the most complex and sophisticated file infectors in the world. It is difficult to detect, disinfect and analyse. Liang Yuan provides a deep analysis of its infection.
Copyright © 2014 Virus Bulletin
Xpaj.B is one of the most complex and sophisticated file infectors in the world. It is difficult to detect, disinfect and analyse. This two-part article provides a deep analysis of its infection. Part 1 dealt with the initial stages of infection [1], while this part concentrates on the implementation of the small polymorphic stack-based virtual machine that the virus writes to the target subroutines.
Once the target subroutines have been found, the virus writes a small polymorphic stack-based virtual machine to them. The implementation of the virtual machine is highly polymorphic, and it can be generated with the following features:
Random size of stack frame and stack offset
Instructions with random registers and stack offset
Junk instructions with random opcode, register, stack offset and immediate value
Random appearance of junk instructions (due to the varied number of junk instructions)
Random instruction pairs.
Because the overwritten areas of subroutines are mostly separate from each other, when one overwritten area runs out, the virus will write a jmp instruction at its end in order to jump to the next overwritten area, and continue to write the code.
When the infected file is executed, and once the instruction that calls the overwritten subroutines or the redirected call instruction is executed, the virtual machine starts to work. First, it calls a subroutine named ‘get_base’ to get the base address (as shown in Figure 1). The base address is the return address of the ‘call get_base’ instruction. Then it locates the encrypted array by using the base address; the encrypted array is used to describe the sequence of operations executed by the virtual machine. It then executes the operations one by one until it reaches the end of the sequence (as shown in Figure 1 – note that this is a clean virtual machine without junk code and the stack offsets may be different for other infections). The sequence of operations encoded by the array forms a program that locates the address of the ZwProtectVirtualMemory API, calls this API to modify memory protection of the section containing the virus code or data, then constructs and executes the decryptor to decrypt the virus body, and constructs and executes the jumper to execute the payload.
The virus uses three DWORDs to describe one operation, with the following structure:
struct operation {
    DWORD Offset;    // +0 operation address offset to the base address
    DWORD Argument1; // +4 Argument1 for the operation
    DWORD Argument2; // +8 Argument2 for the operation
} operation_info;
When executing an operation, it decrypts its offset and arguments from the array, saves the arguments to the specified stack offsets, then computes the operation address by using the base address, and calls it to execute the operation. At the same time, it updates the position of the array for the next operation. It continues to execute operations until it reaches the end of the sequence. In most cases, there are 0xd5 operations in the sequence.
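From the analyst's side, the three-DWORD layout and the address computation described above can be modelled roughly as follows. This is a minimal Python sketch, not the virus's code: it assumes the array has already been decrypted (the decryption routine appears only in the figures), that the operation address is the base address plus the offset wrapped to 32 bits, and that all values are little-endian. The names `Operation`, `parse_operations` and `operation_address`, and all sample values, are hypothetical.

```python
import struct
from typing import List, NamedTuple


class Operation(NamedTuple):
    offset: int      # +0 operation address offset relative to the base address
    argument1: int   # +4
    argument2: int   # +8


def parse_operations(blob: bytes) -> List[Operation]:
    """Split an already-decrypted array into 12-byte operation records."""
    if len(blob) % 12:
        raise ValueError("array length is not a multiple of three DWORDs")
    return [Operation(*fields) for fields in struct.iter_unpack("<III", blob)]


def operation_address(base: int, op: Operation) -> int:
    """Assumed rule: operation address = base + offset, wrapped to 32 bits."""
    return (base + op.offset) & 0xFFFFFFFF


# Hypothetical sample: two records with made-up offsets and arguments.
sample = struct.pack("<6I", 0x10, 0x30, 0, 0xFFFFFFF0, 1, 2)
ops = parse_operations(sample)
base = 0x00C23D9E  # illustrative base address only
addresses = [operation_address(base, op) for op in ops]
```

Wrapping to 32 bits lets a single unsigned DWORD express an offset in either direction from the base address.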
For the version of Xpaj.B I analysed, there are seven basic operations. To obtain a clean virtual machine and better understand its operation, it was necessary to patch the code. The following operators were derived from the clean virtual machine:
Call – call the address at the top of the stack and save the result on the top of the stack (as shown in Figure 2).
Get_PEB – push fs:[xxx] to the stack. xxx is the value at the top of the stack, which is always 0x30 in order to get the PEB and locate the address of the ZwProtectVirtualMemory API function (as shown in Figure 3).
Push_argument1 – push argument1 to the top of the stack (as shown in Figure 4).
Load – load one DWORD onto the top of the stack from the memory location specified by vm_esp, argument1 and argument2 (as shown in Figure 5).
Store – store the DWORD from the top of the stack to the memory location which is specified by vm_esp, argument1 and argument2 (as shown in Figure 6).
Add – add the two numbers at the top of the stack and push the result to the stack (as shown in Figure 7).
Je – compare the two values at the top of the stack. If they are not equal, continue with the next operation; if they are equal, add argument1 to the current position in the array in order to execute a different operation (as shown in Figure 8).
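To make the stack discipline concrete, here is a toy interpreter for three of the seven operations. It is a minimal sketch for reasoning about decoded sequences, not a reconstruction of the virus's handlers. It makes several assumptions that the text does not confirm: add and je pop both operands, and je's argument1 is a signed byte delta applied after the normal advance to the next record. Call, Get_PEB, Load and Store are left out because their exact addressing is only shown in the figures. All names are hypothetical.

```python
from typing import List, Tuple

RECORD_SIZE = 12  # three DWORDs per operation, as in the structure above
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret a DWORD as a signed 32-bit integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def run(program: List[Tuple[str, int, int]]) -> List[int]:
    """Run (operation name, argument1, argument2) records; return the stack.

    Assumptions: add and je pop their two operands, and je's argument1 is a
    signed byte delta added to the position after the normal advance.
    """
    stack: List[int] = []
    position = 0  # byte offset into the (already decrypted) array
    end = len(program) * RECORD_SIZE
    while 0 <= position < end:
        name, arg1, _arg2 = program[position // RECORD_SIZE]
        position += RECORD_SIZE
        if name == "push_argument1":
            stack.append(arg1 & MASK32)
        elif name == "add":
            right, left = stack.pop(), stack.pop()
            stack.append((left + right) & MASK32)
        elif name == "je":
            right, left = stack.pop(), stack.pop()
            if left == right:
                position += to_signed32(arg1)
        else:
            raise ValueError(f"operation not modelled: {name}")
    return stack


# 2 + 3 equals 5, so je skips one record (the push of 0xBAD).
demo = [
    ("push_argument1", 2, 0),
    ("push_argument1", 3, 0),
    ("add", 0, 0),
    ("push_argument1", 5, 0),
    ("je", RECORD_SIZE, 0),
    ("push_argument1", 0xBAD, 0),
    ("push_argument1", 0x600D, 0),
]
result = run(demo)
```

In the demo, the final stack holds only 0x600D, because the equal comparison moves the position past the record that would have pushed 0xBAD.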
I also let the virus build a polymorphic version of the virtual machine with the same size of stack frame and the same stack offsets as the clean virtual machine. Figure 9 shows the difference between the call operations from the two virtual machines.
Xpaj.B builds its polymorphic virtual machine in a very similar way to that in which the virtual machine itself works. The code implements a number of operations and an interpreter that is controlled by encrypted binary data stored inside the virus. The virus decrypts the binary data, and the sequence of operations encoded by it forms a program which builds the virtual machine. For the variant I analysed, the size of the binary data was 0x288. Figure 10 and Figure 11 show how Xpaj.B uses the binary data to build the main frame of the virtual machine.
I wrote an IDAPython decryption script that emulates the function named ‘decrypt_dword’ (shown in Figure 10 and Figure 11) to get the called addresses, and added some comments describing what the addresses do. (As shown in Figure 12 and Figure 13, xxx, nn and reg in the comments are specified by the binary data; nn is derived from the stack offsets list for the virtual machine.)
Now the key is to analyse the binary data. I created a Python script to do this, get the xxx operators and print the main procedure of building the virtual machine. The output is as follows:
zero flag_constructing_junk_code
set using_random_junk_ins as true
generate the size of stack frame and stack offsets for vm internal use
push ebp
mov ebp,esp
sub esp, xx
set flag_constructing_junk_code as true
push regs
zero flag_constructing_junk_code
save the following ins to ins_log
add the following ins to branch_ins_in_VM
call next_ins_va
set flag_constructing_junk_code as true
get one free reg for internal use
mov reg, dword ptr [ebp+nn]
zero flag_constructing_junk_code
save the following ins to ins_log
add reg, imm32
...
mov reg, dword ptr [ebp+nn]
add dword ptr [ebp+nn], reg
set specified reg as free
ins that jmp to the dispatcher to execute the next operation
save the following ins to ins_log
success
Note that the construction of junk instructions is not included in the log output. One subroutine, named ‘junk_code_construction’, is responsible for constructing the junk code. It is called in every iteration if flag_constructing_junk_code is true (as shown in Figure 10). It takes a seed as an argument, which controls the chance of constructing a junk instruction: the smaller the seed, the greater the chance. The virus tries to create as many junk instructions as possible, but the size of the overwritten areas is limited. If the space runs out, it enlarges the seed (thereby decreasing the number of junk instructions) and rebuilds the virtual machine, repeating this until it succeeds. The subroutine can construct five different types of junk instructions (some of which can be seen in Figure 9):
Mov/add/or/adc/sbb/and/sub/xor reg, dword ptr [ebp+nn]
Mov/add/or/adc/sbb/and/sub/xor reg1, reg2
Mov/add/or/adc/sbb/and/sub/xor reg, imm32(random)
Mov/add/or/adc/sbb/and/sub/xor dword ptr [ebp+nn], reg
Mov/add/or/adc/sbb/and/sub/xor dword ptr [ebp+nn], imm32(random)
If using_random_junk_ins is false, the virus either uses the mov instruction directly, or else it chooses one from: add, or, adc, sbb, and, sub and xor to construct the junk instruction.
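When reading a disassembly of an infected subroutine, it can help to flag instructions whose shape matches one of the five junk forms listed above. The following Python sketch does this with regular expressions. It assumes IDA-style operand text (for example `dword ptr [ebp+1Ch]`), and the names `JUNK_FORMS` and `classify` are hypothetical. A match only marks a candidate: genuine virtual machine instructions have the same shapes, so the result still has to be checked against the registers and stack offsets the virtual machine actually uses.

```python
import re
from typing import Optional

MNEMONIC = r"(?:mov|add|or|adc|sbb|and|sub|xor)"
REG = r"(?:eax|ecx|edx|ebx|esp|ebp|esi|edi)"
STACK = r"dword ptr \[ebp[+-][0-9a-f]+h?\]"   # assumed IDA-style operand text
IMM = r"(?:0x[0-9a-f]+|[0-9a-f]+h?)"

# One pattern per junk form from the list above.
JUNK_FORMS = {
    "reg, [ebp+nn]": re.compile(rf"{MNEMONIC} {REG},\s*{STACK}"),
    "reg1, reg2": re.compile(rf"{MNEMONIC} {REG},\s*{REG}"),
    "reg, imm32": re.compile(rf"{MNEMONIC} {REG},\s*{IMM}"),
    "[ebp+nn], reg": re.compile(rf"{MNEMONIC} {STACK},\s*{REG}"),
    "[ebp+nn], imm32": re.compile(rf"{MNEMONIC} {STACK},\s*{IMM}"),
}


def classify(instruction: str) -> Optional[str]:
    """Return the name of the junk form the text matches, or None."""
    text = " ".join(instruction.lower().split())  # normalise case and spacing
    for name, pattern in JUNK_FORMS.items():
        if pattern.fullmatch(text):
            return name
    return None


examples = [
    "xor ecx, dword ptr [ebp+1Ch]",
    "mov eax, ebx",
    "sbb dword ptr [ebp-8], 0DEADBEEFh",
    "push eax",
]
labels = [classify(text) for text in examples]
```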
When constructing the instructions that are used to jump to the dispatcher (instructions at the bottom of Figures 2–9), the virus tries to add junk instructions among them from the five types listed above. It randomly selects one of the following instruction pairs in order to jump to the dispatcher (nn is the stack offset that stores the dispatcher address):
Pair 1
push dword ptr [ebp+nn]
retn
Pair 2
jmp dword ptr [ebp+nn]
Pair 3
mov reg, ebp
jmp dword ptr [reg+nn]
Pair 4
mov reg, dword ptr [reg+nn]
jmp reg
Pair 5
mov reg, nn
add reg, ebp
jmp dword ptr [reg]
You might notice that the junk instructions are very similar to some of the virtual machine’s instructions (as shown in Figure 9). How does Xpaj.B construct the junk instructions? As can be seen in the first few lines of the output, it first creates a stack frame of the specified size (large enough for the virtual machine) and the list of stack offsets for the virtual machine’s internal use; the stack offsets in the stack frame store the local variables of the virtual machine. It also creates an array of eight items that records which registers are free or busy: array[0] represents eax, array[1] represents ecx, and so on. Each item can be 0, 1 or 2. A value of 0 indicates that the register is free and can be used; 1 indicates that the register is busy and can’t be used; 2 marks ebp and esp (they can’t be used to construct a junk instruction). The array is initialized to [0,0,0,0,2,2,0,0], which means that all registers except for ebp and esp are free at the beginning. Xpaj.B updates the array according to the context when building the instructions of the virtual machine. If it wants to use a register, it chooses one at random from the free registers and sets it as busy. If the register isn’t used in the following instructions, it sets it as free again. As a result, the virus can construct the junk instructions by using registers which are free and stack offsets that the virtual machine doesn’t use. Note that the busy registers, ebp and esp, can be used as the source operand of any junk instruction.
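The register bookkeeping described above can be modelled in a few lines, which is handy when checking whether an instruction in a sample could be junk at a given point. This is a minimal sketch under stated assumptions: the register order follows the usual x86 encoding (eax, ecx, edx, ebx, esp, ebp, esi, edi), which matches the initial array, and the final note is read as allowing any register as a source. The class and method names are hypothetical.

```python
import random
from typing import List

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
FREE, BUSY, RESERVED = 0, 1, 2


class RegisterState:
    """Model of the eight-item free/busy array described in the text."""

    def __init__(self) -> None:
        # Initial state from the text: everything free except esp and ebp.
        self.state: List[int] = [0, 0, 0, 0, 2, 2, 0, 0]

    def acquire(self, rng: random.Random) -> str:
        """Pick a free register at random and mark it busy."""
        free = [i for i, s in enumerate(self.state) if s == FREE]
        if not free:
            raise RuntimeError("no free register")
        index = rng.choice(free)
        self.state[index] = BUSY
        return REGS[index]

    def release(self, reg: str) -> None:
        """Mark a busy register as free again."""
        index = REGS.index(reg)
        if self.state[index] != BUSY:
            raise ValueError(f"{reg} is not busy")
        self.state[index] = FREE

    def junk_destinations(self) -> List[str]:
        """Registers a junk instruction may overwrite: only the free ones."""
        return [REGS[i] for i, s in enumerate(self.state) if s == FREE]

    def junk_sources(self) -> List[str]:
        """Assumed reading of the note: any register may be read."""
        return list(REGS)


regs = RegisterState()
taken = regs.acquire(random.Random(0))
while_busy = regs.junk_destinations()
regs.release(taken)
```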
From the output result log, we can see the main frame of the virtual machine. But the virtual machine is not ready yet – it needs to be fixed. The virus records some instruction information when building the virtual machine. The information will be used to fix the instructions. It uses the following structures to log the information:
struct branch_ins_in_VM {
    DWORD item_num;                       // +0
    branch_ins_info_in_VM info[item_num]; // +4
} branch_ins;

struct branch_ins_info_in_VM {
    DWORD operand_va; // +0 start address of the operand of the branch instruction
    DWORD index;      // +4 for indexing the destination address of the branch ins
} branch_ins_info_in_VM;

/*
For example:
0012D4F4 00000002 --> total items
0012D4F8 00C23D9A --> see Figure 12
0012D4FC 00000000 --> see Table 1
0012D500 00C23E81 --> see Figure 8 jnz dispatcher
0012D504 00000004 --> see Table 1
*/

struct ins_log_info {
    DWORD index; // +0
    DWORD va;    // +4 virtual address of the instruction
} ins_log_info;

struct ins_log {
    DWORD item_num;              // +0
    ins_log_info info[item_num]; // +4
} ins_log;
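Both logging structures share the same shape: a count followed by that many pairs of DWORDs. A dumped copy can therefore be decoded with one small helper. The Python sketch below rebuilds the five DWORDs from the example dump and parses them; the helper name `parse_counted_pairs` is hypothetical, and little-endian storage is assumed (as is normal for x86).

```python
import struct
from typing import List, Tuple


def parse_counted_pairs(blob: bytes) -> List[Tuple[int, int]]:
    """Parse item_num followed by item_num pairs of little-endian DWORDs."""
    (item_num,) = struct.unpack_from("<I", blob, 0)
    return [struct.unpack_from("<II", blob, 4 + 8 * i) for i in range(item_num)]


# The five DWORDs from the example dump at 0012D4F4.
dump = struct.pack(
    "<5I", 0x00000002, 0x00C23D9A, 0x00000000, 0x00C23E81, 0x00000004
)

# For branch_ins each pair is (operand_va, index);
# for ins_log the same helper yields (index, va).
branch_ins = parse_counted_pairs(dump)
```

Here `branch_ins` holds two entries: the operand at 00C23D9A resolved through index 0, and the operand at 00C23E81 resolved through index 4.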
Once the main frame of the virtual machine has been built successfully, the virus fixes the places shown in Figure 14 (this is clearer if you compare it with Figure 1). This is necessary because the destination addresses of branch instructions (including call, jmp, jcc etc.) are not always known up front. The virus initially uses the address of the next instruction as the operand, which makes the branch instruction easy to fix later (as shown in Figure 15). When fixing other places, the virus needs to search the ins_log structure to get the relevant instruction address for a given index. A subroutine named ‘get_va_from_ins_log_by_index’ does this: it iterates through the info field of the ins_log structure and returns the virtual address that matches the given index. If no match is found, it returns the virtual address of the last ins_log_info in the array (as shown in Figure 16). The information about ins_log for the variant I analysed is shown in Table 1.
Index | VA (from mapped image) | Description |
---|---|---|
0 | 00C23E1D | Destination address of call get_base |
1 | 00C23D99 | VA of instruction ‘call get_base’ |
2 | Not used | |
3 | 00C23DA1 | see Figure 17 |
4 | 00C23DBF | see Figure 17 |
5 | 00C23E45 | address of push_arg1 operation |
6 | 00C23E51 | address of load operation |
7 | 00C23E3D | address of get_peb operation |
8 | 00C23E62 | address of store operation |
9 | 00C23E73 | address of add operation |
0xA | 00C23E7B | address of je operation |
0xB | 00C23E90 | patched end address |
0xC | 00C23E24 | address of call operation |
0xD | 00C23E19 | part of jumper |
0xE | 00C23DB6 | see Figure 17 |
0xF | 00C23DC7 | see Figure 17 |
0x10 | 00C23DE8 | see Figure 17 |
0x11 | 00C23DFB | see Figure 17 |
Table 1. Information about ins_log.
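The lookup and the branch fix-up can be replayed against the values in Table 1. The Python sketch below emulates the described behaviour of ‘get_va_from_ins_log_by_index’, including the fall-back to the last entry, and then computes the displacement a fixed branch would need. Two assumptions apply: the in-memory order of the entries is taken to be ascending by index, which the article does not state, and the branches are taken to use the standard x86 near form with a 4-byte relative operand (rel32 = destination minus the address following the operand). The helper `rel32` is hypothetical; the actual fixing code is shown only in Figure 15.

```python
from typing import List, Tuple

# (index, va) pairs from Table 1; ascending order is an assumption.
INS_LOG: List[Tuple[int, int]] = [
    (0x0, 0x00C23E1D), (0x1, 0x00C23D99), (0x3, 0x00C23DA1),
    (0x4, 0x00C23DBF), (0x5, 0x00C23E45), (0x6, 0x00C23E51),
    (0x7, 0x00C23E3D), (0x8, 0x00C23E62), (0x9, 0x00C23E73),
    (0xA, 0x00C23E7B), (0xB, 0x00C23E90), (0xC, 0x00C23E24),
    (0xD, 0x00C23E19), (0xE, 0x00C23DB6), (0xF, 0x00C23DC7),
    (0x10, 0x00C23DE8), (0x11, 0x00C23DFB),
]


def get_va_from_ins_log_by_index(ins_log: List[Tuple[int, int]], index: int) -> int:
    """Return the VA logged for index, or the last entry's VA if not found."""
    for entry_index, va in ins_log:
        if entry_index == index:
            return va
    return ins_log[-1][1]


def rel32(operand_va: int, destination_va: int) -> int:
    """Standard x86 near-branch displacement for a 4-byte operand."""
    return (destination_va - (operand_va + 4)) & 0xFFFFFFFF


# The two branch_ins entries from the example dump: (operand_va, index).
call_fix = rel32(0x00C23D9A, get_va_from_ins_log_by_index(INS_LOG, 0))
jnz_fix = rel32(0x00C23E81, get_va_from_ins_log_by_index(INS_LOG, 4))
missing = get_va_from_ins_log_by_index(INS_LOG, 2)  # index 2 is not used
```

Under these assumptions, the operand of ‘call get_base’ becomes 0x0000007F and the operand of the jnz becomes 0xFFFFFF3A, a backward jump of 0xC6 bytes.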
Apart from the branch instructions, the virus fixes the following places:
Place 2 in Figure 14: fixes the instruction which is used to locate the encrypted array (as shown in Figure 17).
Place 3 in Figure 14: fixes the instruction that is used to get the address of the dispatcher (as shown in Figure 18).
Places 4, 5, 6 in Figure 14: fixes the instructions for initializing the key to decrypt the VM array (as shown in Figure 19).
After the instructions have been fixed, the virus starts to fix the operation structure array, including the offset and argument fields. It first decrypts the encrypted operation structure array. After decryption, the offset field of each operation structure holds an index (as shown in Figure 20), which the virus can use to get the operation addresses. Next, it fixes the offsets of the operation structure array (as shown in Figure 21). It fills the argument fields with random DWORDs, then fixes the argument fields that matter in order to ensure that the virtual machine executes correctly. After that, it encrypts the array and writes it to the inserted section (as shown in Figure 22), so that the virtual machine can decrypt the array correctly (as shown in Figure 1). At this point, the virtual machine is ready.
Finally, it updates the checksum in the PE header to complete the infection process.
Let’s look at the main execution route of Xpaj.B. Usually there are three routes (as shown in Figure 23). When the infected file is executed, once either the instruction that calls the first overwritten subroutine (Route 3 in Figure 23), the instruction that calls the other overwritten subroutines (Route 2), or the redirected call instruction (Route 1) is executed, the return address of the call is saved into the stack (for Route 2, the address is the return address of the instruction ‘call_start_address_of_first_overwritten_subroutine’) and the virtual machine starts to work. The virtual machine locates the address of the ZwProtectVirtualMemory API and calls this API to modify memory protection of the area that contains the encrypted virus body, then it constructs and executes the decryptor to decrypt the virus body, and constructs and executes the jumper to execute the virus code.
When the virus is started, it gets the return address from the stack and converts it to RVA. Then it iterates through the patch structure list and gets the proper patch structure for the call instruction. If it fails to get the relevant patch structure for the call instruction, this means the executed call instruction is the call to the first overwritten subroutine. Thus it uses the patch structure of the first overwritten subroutine as the matched patch structure. It decrypts the matched patch structure and executes the code from its code field. If the reloc_count field of the matched patch structure is not zero, it will fix the relocations, storing them in the reloc_offset field of the matched patch structure. This allows the infected executable to continue working.
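The matching step described above amounts to a keyed lookup with a default. The Python sketch below shows that logic only. The layout of the patch structure is defined in Part 1 and is not reproduced here, so the record type and its `match_rva` field are hypothetical stand-ins; `reloc_count` is the only field name taken from the text. The image base and addresses in the example are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatchRecord:
    match_rva: int    # hypothetical key compared with the return address RVA
    reloc_count: int  # named in the text; non-zero means relocations to fix


def select_patch(records: List[PatchRecord], first: PatchRecord,
                 return_va: int, image_base: int) -> PatchRecord:
    """Pick the record for a return address, defaulting to the first subroutine."""
    rva = (return_va - image_base) & 0xFFFFFFFF  # convert VA to RVA
    for record in records:
        if record.match_rva == rva:
            return record
    # No match: the call went to the first overwritten subroutine.
    return first


first_sub = PatchRecord(match_rva=0x1000, reloc_count=0)
others = [PatchRecord(match_rva=0x1234, reloc_count=2)]

hit = select_patch(others, first_sub, return_va=0x00401234, image_base=0x00400000)
fallback = select_patch(others, first_sub, return_va=0x00405678, image_base=0x00400000)
```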
Xpaj.B is not only one of the most sophisticated file infectors but also one of the stealthiest. It uses several techniques to prevent detection and remain under the radar. Those techniques demonstrate that the authors favour discretion over efficiency and want the virus to persist for as long as possible once the infection has occurred.
[1] Yuan, L. Inside W32.Xpaj.B’s infection – part 1. Virus Bulletin, January 2014, p.13. http://www.virusbtn.com/pdf/magazine/2014/201401.pdf.