Advanced Threat Research Center of Excellence, Office of the CTO, F5
Collector-stealer, a piece of malware of Russian origin, is heavily used on the Internet to exfiltrate sensitive data from end-user systems and store it in its C&C panels. The stolen information is consumed in the underground cyberspace for nefarious purposes. In this article, we present a 360 analysis of the Collector-stealer malware to unearth hidden artifacts covering binary analysis, its working, and the design of associated C&C panels. While carrying out our research on Collector-stealer we found enough indicators to lead us to believe that the malware author is based in Russia. The attacker primarily uses Collector-stealer to target European countries, but it also affects users in other countries such as the USA, China, Cambodia and others. Let’s start the analysis of Collector-stealer to unearth some insights. We will refer to the malicious code as ‘Collector-stealer’ throughout this article.
We began our research on the Collector-stealer malware by looking at how this malicious code has been distributed across the Internet. The Collector-stealer author uses multiple methods to launch infections, which include coercing users to visit phishing portals hosting free game downloads, Windows activation/crack software packages, etc., to trigger drive-by download attacks that install the malware on the fly. Drive-by download attacks can be executed by exploiting vulnerabilities in client software such as browsers or abusing the inherent design of browsers. For Collector-stealer distribution, the phishing emails contained messages that appeared to have come from legitimate authorized entities. With additional efforts, we analysed the domains (or subdomains) contained in the phishing emails to collect more intelligence. In the following sections we discuss a few different examples to explain how the attacker distributes the malware.
KMSAuto (see Figure 1) is a free Windows/Office activation utility. In fact, KMSAuto is an unauthorized piece of software that allows the user to circumvent the integrity of the operating system to install cracked tools and is also categorized as a hack tool or ‘riskware’, allowing end-users to use MS Office products illegally. Generally, hack tools like KMSAuto enable attackers to bundle their malicious code with the cracked utility so that legitimate applications can be used in conjunction with malicious code, a technique known as piggybacking.
Using the piggybacking technique, malicious code rides on the back of the KMSAuto utility and activates the legitimate application illegally, to inject additional malicious payloads. Collector-stealer is distributed in this manner. Upon execution of the utility, KMSAuto activates the application at the front, but in the backend it creates a sub-process that installs Collector-stealer on the victim’s machine and starts to communicate with the C&C server.
A number of other riskware utilities are used to serve the same purpose as KMSAuto. The majority of these packages relate to games or Discord VPN, such as GTAV, NewWay ModMenu, COD Manager, etc. The quote ‘Nothing comes for free on the Internet’ is worth mentioning here: if a user downloads any crack utility, software key generator or cracked games/software from an untrusted site, there is high chance that it will lead to the user’s machine being infected with malicious code.
KMSAuto is not the only method through which Collector-stealer is distributed. Phishing web portals mimicking content from legitimate software provider sites are also used to spread the malware. During our research, we discovered that the attacker hosted a phishing web portal that replicated the content from a cryptocurrency software provider portal. The phishing web portal hosted Collector-stealer packages, and when the user visited the web portal, Collector-stealer was downloaded on the fly. The phishing web portal tricked users into believing that they had downloaded legitimate miner software, which was not the case.
Figure 2 shows a comparison between the phishing portal and the legitimate portal.
The phishing web portal was ethhomework[.]online and the legitimate web portal was hostero[.]eu. The attacker mirrored the content from the hostero[.]eu site and also added socially engineered messages such as offering bitcoin after downloading, which is nothing but bait for users.
Now we have an understanding of the distribution mechanisms deployed by the attackers to spread Collector-stealer. Next, we present a technical analysis.
Public reports have already been published about this malware. The malware has been active since mid-2020 and is still active in the wild and actively compromising victim machines. Collector-stealer, a stealth stealer written in C++, infects the victim machine to steal valuable information such as stored passwords, cookies, web data and more, from the infected machine. Collector-stealer, as its name suggests, is collectively used by many bad guys to exfiltrate data from across the world.
Let’s start an in-depth analysis of the main program and have a look at the functionalities this malware is equipped with.
In this section we will present a basic structure of the Collector-stealer malware using static analysis. Figure 3 shows the portable executable (PE) file structure. Overall, PE file structure analysis helps us to understand the design of an executable. PE format is composed of Common Object File Format (COFF), object code, dynamic link libraries (DLLs), font files, and core dumps in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that provides the information to the Windows loaders required to load the executable code in the memory. This includes dynamic library references for linking, API export and import tables, resource management data, and other structures.
Now we have provided a basic overview of Collector-stealer, including the basic details of the PE structure. Let’s analyse the binary for more insights.
It is important to analyse the PE import address table (IAT) to understand the API functions being imported from the system DLLs and used during the execution of the binary. When a PE file is loaded into the system, the Windows loader is required to read the PE structure and load the image directly into the memory. During this process, the loader is also responsible for loading all the DLLs that the executable is going to use into the associated process address space. The mapping of DLLs and related API functions is managed through the import address table, which contains the function pointers that the Windows loader copies while DLLs are loaded. In simple terms, the IAT defines which API functions the executable is going to consume and performs related operations. We extracted the IAT table of the Collector-stealer malware and the most critical API functions are discussed in Table 1 to help to understand how Collector-stealer works in the system.
DLL | Critical API functions | Basic overview |
urlmon.dll | URLDownloadToFileA | Used to download files from the Internet. |
wininet.dll | InternetCloseHandle InternetOpenA InternetConnectA HttpSendRequestExA HttpEndRequestA InternetWriteFile HttpOpenRequestA |
The library contains modules that help applications to interact with FTP and HTTP protocols to access Internet resources. This library contains Internet-related methods. |
crypt32.dll | CryptUnprotectData | Microsoft Cryptographic library, which implements many of the certificate and cryptographic messaging functions in the CryptoAPI, such as CryptSignMessage. |
ADVAPI32.dll | RegGetValueA RegOpenKeyExA |
Advanced Application Programming Interface (AdvAPI32) is designed to support security calls and registry manipulation calls. |
Shell32.dll | SHGetFolderPathA | Designed to provide Windows Shell API functions, which are used to open web pages and files. |
User32.dll | CloseClipboard FindWindowA OpenClipboard keybd_event GetDesktopWindow ShowWindow GetClipboardData |
Contains modules that help to implement the Windows USER component, additionally provides functions by which we can simulate user behaviour. |
Kernel32.dll | OutputDebugStringW IsDebuggerPresent CreateFileW DeleteFileW LoadLibraryA TerminateProcess GetCurrentProcess FindFirstFileExW FindNextFileW GetFileAttributesEx VirtualProtect WriteFile ReadFile GetStartupInfoW |
Kernel32.dll is one of the major libraries in Windows machines. It provides functions for process and thread creation, memory management and more. It’s a user-mode library. |
gdiplus.dll | GdiplusShutdown GdipCreateBitmapFromHBITMAP GdipGetImageEncoders GdipCreateBitmapFromScan0 GdipSaveImageToStream GdipGetImageEncodersSize GdipDisposeImage GdiplusStartup |
Microsoft Graphics Device Interface Library, designed to handle graphics components for images. |
In this section we discuss the encryption and decryption procedures used by Collector-stealer. Listing 1 highlights the pseudocode that Collector-stealer uses to deobfuscate the Windows public directory path address and other obfuscated strings using the key (string) ‘1A’.
Step 1 : Encoded_String : "BB;YKXYBB6[HROIBB"
Step 2 : Deobfuscate_str : ""
Label 1:
Deobfuscate_str =Deobfuscate_str+HEX(String[i])+1A
If i<=Length(Encoded_String) :
i=i+1
Goto Label 1
On successful deobfuscation of the code above and executing in a controlled manner, a number of strings were deobfuscated to obtain the clear text that clarifies how exactly Collector-stealer interacts with various resources in the system. Table 2 shows a number of clear text strings obtained after deobfuscation of the inherent routine used by Collector-stealer.
Encrypted string | Decrypted strings |
2UMOT*GZG | Login data |
)UUQOKY | Cookies |
=KH*GZG | Web data |
] GRRKZJGZ | wallet.dat |
HXU]YKX | browser |
JKYQZUV0x14VTM | desktop.png |
0x15 0x3AKRKMXGS 0x6 0x2A KYQZUV 0x15ZJGZG | /Telegram Desktop/tdata |
9ZKGS6GZN | SteamPath |
95,:=’8+BB<GR\KBB9ZKGS | SOFTWARE\\Valve\\Steam |
0x14\JL | .vdf |
YZKGS | /con |
Whenever in the code the malware needs to use a string, it takes the encrypted string and passes it into functions to deobfuscate it.
To store all collected data into a file Collector-stealer performs additional operations to generate random file names by utilizing the power of Windows APIs to generate pseudo-random numbers, i.e. seed values to create random file names, determining the length of the calculated strings, etc. To create a random string file name, Collector-stealer calls GetSystemtimeAsFileTime, as shown in Figure 4.
The GetSystemtimeAsFileTime API is used to obtain the current date and time of the system. The API processes 64-bit values, representing LowDateTime (low-order part of the file time) and HighDateTime (high-order part of the file time) under the FILETIME structure. In general, Collector-stealer validates the condition by verifying the data and time limits to calculate the current time of the system. The calculated date and time value act as seed values to create a random string for generating the file name. Figure 5 shows an algorithm used by Collector-stealer to generate 15-word-long random strings, which are later used as file names.
Figure 6 shows Collector-stealer creating a new file by calling the CreateFileW [1] function utilizing the same algorithm as highlighted in Figure 5.
Collector-stealer also performs additional operations to extract the location and attempts to check the current time zone setting using the GetTimeZoneInformation function to check which time zone the victim machine belongs to, along with the computer name by calling the GetComputerName method.
In this section the focus is on analysing the interaction of Collector-stealer with the Windows clipboard and keyboard events to take screenshots. In general, clipboard operations cover the cut and copy functions that Windows provides. The clipboard history reflects the number of copy operations that are present in the clipboard. One of the main functionalities of Collector-stealer is to steal data via screenshot captures and store them in the clipboard. As shown in Figure 7, to capture the screenshot, the malware uses the Keybd_event function from user32.dll and simulates the user pressing ‘Take Screenshot’, bypassing ‘VK_SNAPSHOT’ as a key parameter in the keybd_event function.
Later, Collector-stealer accesses the captured image from the clipboard in bitmap format and converts it into stream (Figure 8).
Figure 9 is a byte stream of the captured screenshot, initial bytes ‘FF D8 FF E0’, which is the four-byte signature of JPEG IMAGE. The malware writes a byte stream from memory into a newly created file in ‘\\public\\directory’. After taking a screenshot, Collector-stealer starts to look for other data such as finding Telegram data in the ‘tdata’ folder located at ‘%Appdata%’. The ‘tdata’ folder is the directory in which Telegram Desktop keeps all media-related files, session details, etc.
In this section we will look at how Collector-stealer searches for sensitive information stored in specific directories. After getting basic information, Collector-stealer starts to iterate each folder and file in the location ‘C:\Users\alert-user\AppData\Roaming\’ and looks for file login data, cookie files, web data, and wallet.dat. Let’s understand the type of content stored in these files. The details are discussed below:
Once Collector-stealer has completed the scanning of the primary directory, it iterates the ‘\\AppData\\Local’ directory and once again looks for file login data, cookie files, web data, and wallet.dat. Collector-stealer also scans all desktop files and looks for files with specific extensions such as .txt, .log, .rdp and others. If any file is found with such an extension, it reads the content of the file and stores the data in a compressed RAR file.
Collector-stealer uses the URLDownloadToFile API to download sqlite3.dll from the C&C server, as shown in Figure 10, and stores it in the public directory.
The majority of modern browsers use SQLite databases to store data locally specific to different websites and applications. The data includes cookies, login data, autofill data, web data, etc. The parsing of an SQLite database requires the SQLite library. Collector-stealer leverages this functionality to obtain data in clear text from the SQLite database stored in the compromised end‑user systems. SQLite’s library allows Collector-stealer to run SQL queries for data extraction. After successfully reading the data, Collector-stealer compresses all the stolen data, including directory structure, into Zip format. Figure 11 shows how Collector-stealer structures the collected data into a single archive file.
Table 3 represents the complete directory structure compressed as a Zip file and sent by Collector-stealer to the C&C server.
Information.txt | Contains basic information such as launch time, username, etc. |
desktop.png | Desktop screenshot. |
/Browser/ | Contains collected cookies data, autofill data, stored passwords. |
/Files/ | Contains collected files from a desktop location whose file types are .txt, .log, .docx, .rdp. |
/Telegram/ | Contains collected data from the data directory, i.e. Telegram data. |
/Steam/ | Contains collected data from the steam directory as well as Steam registry entries including active user details. |
Let’s understand the network communication, i.e. how this malicious code interacts with the C&C server and transmits the stolen data through the network. After successfully collecting the sensitive data from the system, and before transmitting the data to the C&C server, as shown in Figure 12 Collector-stealer creates a new command line sub-process to check Internet connectivity on infected machines by pinging Cloudflare DNS resolver IP address 1.1.1.1. If the ping request fails, then it deletes the executable file along with all previously collected data and silently exits.
Let’s dissect the command: ‘cmd.exe /C ping 1.1.1.1 -n 1 -w 1000 > Nul & Del /f /q \”%s\”’ to understand the complete operation.
Upon successful connectivity checks, Collector-stealer sends collected data to the C&C server. Figure 13 shows the malware using HTTP POST requests to transmit stolen data using zip files.
If you see the ‘user-agent’ string, Collector-stealer does not use the standard browser identifier, rather a custom string ‘uploader’. The data exfiltration process utilizes the HTTP communication channel. HTTP POST requests are initiated by Collector-stealer to exfiltrate the data in a compressed format (zipped) to the C&C panel managed by the attacker. The unzipping of the data occurs in the C&C panel itself and stolen data is uploaded in the C&C portal for utilization.
In this section, you will gain an understanding of the design of Collector-stealer’s C&C panel. Collector-stealer C&C uses HTTP protocol for communication purposes. The data extracted from the compromised machine is sent to the C&C server in an archive format over HTTP. The Collector-stealer C&C server accepts the HTTP POST requests to receive data from infected machines. The Collector-stealer C&C panel is developed using PHP using openresty as a web server. Figure 14 shows the C&C login panel.
Figure 15 shows how attackers keep stolen data in a more structured way.
Information stealers actively target victims to exfiltrate sensitive information from the compromised systems. In this article we have discussed the analysis of the Russian origin Collector-stealer malware. The data stolen by Collector-stealer is used for nefarious purposes – it may be sold in the underground market, the information may be harnessed for launching additional attacks, etc. Due to its popularity in the wild, some adversarial groups have released cracked versions of Collector-stealer on the Internet and made it freely available. The intelligence presented in this research can be utilized to enhance detection and prevention solutions to combat the risks posed by the Collector-stealer malware.
[1] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew.
As Collector-stealer is bundled with numerous stealing features, it quickly gained popularity on underground forums. Due to its extensive list of services offered, as shown in Figure 16, many forum users were keen to buy this stealer and even some groups attempted to provide cracked versions.
Our research team started to investigate the malware and its author and discovered some interesting facts. Collector-stealer is sold by the ‘Hack_Jopi’ group (Figure 17), and after some more digging we found that the author sells this malware only to Russian users. The author has been active on the forum since October 2018, is still active and periodically releases updates. As per the forum’s updates, the last update version was v1.20 ‘aimed at fixing the collection of Telegram, Discord and Steam sessions’. But we also saw some samples with 1.21 version (malware mentioned version_id in drop file).
To cover maximum audiences for Collector-stealer, the author uses other platforms as such as Telegram, where authors create their own channel and publicize their stealer. In forums, they created a separate account with the name ‘CollectorSoft’. Most of the accounts were created between May and June 2020. So we can conclude that this malware started spreading in mid-2020 and is still active in the wild.
A number of malware hashes and list of C&C servers related to Collector-stealer are presented in Listings 2 & 3:
a9e3f9fb9cf5ae8dcbfd139ecadb961a
bd27acd9bc0ba05847dc0d8ea443e437
253ce038dd0e2a30165f24b18aaa34d3
eb8e99e82b6ed97f89292467aa8dc866
f0537213[.]xsph[.]ru
fata-collector[.]online
f0542175[.]xsph[.]ru
f0538564[.]xsph[.]ru
f0537214[.]xsph[.]ru
f0548561[.]xsph.ru
collector-node[.]us
collector-gate01[.]us
collector-steal[.]ga
a0556434[.]xsph[.]ru //Kpot Panel
Figure 18 shows an example of an online advertisement for Collector-stealer presented in the underground community. We can clearly see the advertisement is in Russian language.
In Table 4 we correlate malware activity with MITRE ATT&CK TTP mapping.
Tactic | Identifier | Name | Description |
Collection | T1074.001 | Data Staged: Local Data Staging | Store collected data at //Public// directory |
T1560 | Archive Collected Data | Archive collected data before sending | |
T1113 | Screen Capture | Take desktop screenshot | |
Discovery | T1083 | File and Directory Discovery | Look for web data/cookies/login/desktop files |
T1124 | System Time Discovery | Gather the system time and/or time zone | |
T1012 | Query Registry | Query HKCU for Steam entry | |
T1082 | System Information Discovery | Extract OS version | |
T1016.001 | System Network Configuration Discovery: Internet Connection Discovery | Ping 1.1.1.1 to check Internet connectivity | |
Execution | T1059.003 | Windows Command Shell | Ping 1.1.1.1 using cmd.exe |
Credential access | T1555 | Credentials from Password Stores | Steal browser passwords |
T1539 | Steal Web Session Cookie | Steal cookies | |
Command and control | T1071.001 | Application Layer Protocol: Web Protocols | Send data to the C&C server via POST requests |
T1105 | Ingress Tool Transfer | Download sqlite3.dll | |
Defence evasion | T1027 | Obfuscated Files or Information | Contain obfuscated strings |
T1036.001 | Invalid Code Signature | Invalid signed executables |
While carrying out research on this malware, we tried to determine if there is any correlation between it and any other malware family. There is a strong opinion that the Collector-stealer author utilizes the same components as those found in the code of Hunter stealer. Hunter stealer is an information stealer whose functionality is almost identical to Collector-stealer. Even the file creation and file drop pattern provided by the two stealers are the same. And most importantly, many buyers of this malware reported that logs go through the seller’s bot. Another possibility is that the author is the same for both malware families. This is not concrete information, but it’s a good starting point for collecting more intelligence for attribution.