The dark side of WebAssembly

Aishwarya Lonkar & Siddhesh Chandrayan

Symantec, India

Table of contents

Abstract
Introduction
JavaScript
Asm.js
WebAssembly
How is WebAssembly generated?
WebAssembly's date with malware
Case 1: Tech support scams
What is a tech support scam?
Tech support scam sources
Tech support scams on the rise
Tech support scams getting murkier
What's next: use of WebAssembly
Case 2: Website keyloggers
What are keyloggers?
WebAssembly – Exploring new frontiers
References

Abstract

The WebAssembly (Wasm) format is a way to run code written in languages such as C/C++ in web browsers. Code compiled to WebAssembly performs better than code compiled to JavaScript-based targets such as asm.js (Assembly JS). WebAssembly is often used in developing web games. Recent versions of all popular browsers, including Chrome, Firefox and Microsoft Edge, support WebAssembly execution.

Though Wasm has been around for a few years, it rose to prominence more recently when it was used for cryptocurrency mining in browsers. This opened a Pandora's box of potential malicious uses of Wasm.

In this paper we will walk through some of the instances in which Wasm can be used maliciously, such as:

Tech support scams: with the decline of exploit kits we have seen an uptick in tech support scams delivered in various ways including compromised websites, malvertisements (malicious advertisements), etc. These scams make extensive use of JavaScript with little or no obfuscation, making their detection relatively easy. In this paper we will describe how Wasm may be used in tech support scams to render them harder to detect by security products.
Browser exploits: browser exploits written in JavaScript can be tailored to use Wasm for browser exploitation and subsequent malware download.
Script-based keyloggers: Wasm can also be used to steal information entered into web forms. Currently, such information stealing is done via JavaScript.

To make matters worse, Wasm is delivered as a compiled file, which makes string-based detection almost impossible. We will also discuss some of the areas in which we expect the above methods to be used.

Introduction

JavaScript

JavaScript [1] is a general-purpose programming language. It's a simple language with a huge ecosystem, and it is tightly integrated in the web. There is no way of moving away from JavaScript without breaking all of the existing web applications, which is not a situation any browser vendor wants. Furthermore, all browser technologies and security constraints are designed specifically for JavaScript.

Current JavaScript is quite fast, but there are a few mechanisms in JavaScript engines that limit its speed [2]:

Boxing: floating-point numbers are boxed, that is, they have wrappers that allow them to co-exist with other values such as objects.
Just-in-time (JIT) compilation and runtime type checks: most JavaScript engines compile code in two stages. Initially, code is compiled to a format that is quick to produce but slow to run, and the execution of that format is observed. If a piece of code runs often, assumptions can be made about the types of its parameters etc., and it can be recompiled to a format that runs faster. If one of the assumptions turns out to be wrong, the faster format can no longer be used and the engine has to fall back to the slower format. Even the faster format is slowed down by having to check whether the assumptions still hold.
Automated garbage collection: this can be slow.
Flexible memory layout: JavaScript's data structures are very flexible, but they also make memory management slower.
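To see why the runtime type checks mentioned above are needed, consider the following minimal sketch (the function name add is only illustrative). The same JavaScript function has to cope with whatever types it is handed, so an engine that has optimized it for numbers must keep verifying that assumption:

```javascript
// Illustrative only: one function, several possible operand types.
function add(a, b) {
  return a + b;
}

console.log(add(1, 2));    // numeric addition
console.log(add("1", 2));  // string concatenation: same code, different behaviour
console.log(typeof add(1.5, 2), typeof add("1", 2));
```

A call site that has only ever seen numbers can be compiled to fast numeric code, but the first call with a string invalidates that assumption.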

Asm.js

Asm.js [3] is a subset of JavaScript, defined with the goal of being easily optimizable and used primarily as a compiler target for languages like C and C++. Programs compiled to asm.js exhibit none of the drawbacks listed above: they can be compiled 'ahead of time' and run faster than JIT-compiled code.
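The flavour of asm.js can be sketched with its integer coercion idiom. The snippet below is plain JavaScript written in that style, not a validated asm.js module (it omits the "use asm" directive and the module wrapper), and the function name addInt32 is an assumption made for illustration:

```javascript
// asm.js-style annotations: `x | 0` coerces a value to a 32-bit signed integer,
// so the types of the parameters and of the result are fixed up front.
function addInt32(a, b) {
  a = a | 0;
  b = b | 0;
  return (a + b) | 0;
}

console.log(addInt32(2, 3));
console.log(addInt32("3", 4.9));       // both operands are coerced to integers first
console.log(addInt32(2147483647, 1));  // wraps around like a C int
```

Because every value is pinned to a known type, an engine does not need to speculate about types and can compile such code in advance.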

The web is not controlled by any single vendor, so every change must be a joint effort. It was a group of hardcore developers at Mozilla that developed asm.js. Meanwhile, Google developers worked on Native Client (NaCl) and Portable Native Client (PNaCl), a binary format for the web based on the LLVM compiler project. Although each of these solutions worked to some degree, they did not provide a satisfactory answer to all the above problems. It was from this experience that WebAssembly was born: a joint effort aimed at providing a cross-browser compiler target.

WebAssembly is the continued evolution of asm.js [4]. It is intended to fill a role that JavaScript has been forced to occupy up to now: a low-level code representation that can serve as a compiler target.

WebAssembly provides a unified compilation target for languages such as C and C++ that do not map easily to JavaScript [5].

WebAssembly

WebAssembly (Wasm) is a new type of code that can be run in modern web browsers and provides new features and major gains in performance. It is considered a new binary format for the web [6, 7]. Generally, performance-critical functions can be implemented in Wasm and imported into JavaScript like a library.
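As a rough illustration of importing Wasm 'like a library', the sketch below instantiates a tiny hand-assembled module through the standard WebAssembly JavaScript API and calls its exported function. The module, its export name add and the variable names are assumptions made for this example; real modules are normally produced by a compiler rather than written byte by byte.

```javascript
// Hand-assembled Wasm binary, equivalent to this text-format module:
//   (module
//     (func (export "add") (param i32 i32) (result i32)
//       local.get 0
//       local.get 1
//       i32.add))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                                // magic "\0asm"
  0x01, 0x00, 0x00, 0x00,                                // version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,  // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                          // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                    // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is acceptable for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);
const wasmAdd = wasmInstance.exports.add;

console.log(wasmAdd(2, 3));
```

In a web page, larger modules would more typically be fetched and compiled asynchronously, for example with WebAssembly.instantiateStreaming().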

Wasm was not created as a replacement for JavaScript, but rather to complement and work alongside it. With the introduction of WebAssembly, the modern web browser's virtual machine is expected to run both JavaScript and Wasm.

All major browsers support Wasm. The benefits of WebAssembly include:

Fast, efficient and portable: WebAssembly code can be executed at near-native speed across different platforms
Readable and debuggable: WebAssembly is a low-level assembly language, but it has a human-readable text format
Secure: WebAssembly is specified to be run in a safe, sandboxed execution environment.

Figure 1: WebAssembly: a joint effort aimed at providing a cross-browser compiler target.

How is WebAssembly generated?

Tools like Emscripten [8, 9] can be used to compile code written in C/C++ into WebAssembly:

Take a copy of the following simple C example, and save it in a file called 'hello.c' in a new directory on your local drive:

Figure 2: Save a copy of this C example in a file called 'hello.c' in a new directory on your local drive.

Navigate to the same directory as your hello.c file, and run the following command:

emcc hello.c -s WASM=1 -o hello.html

The options in the command are as follows:

-s WASM=1 – specifies that we want Wasm output. If we don't specify this, Emscripten will just output asm.js, as it does by default.

-o hello.html – specifies that we want Emscripten to generate an HTML page in which to run our code (and a filename to use), as well as the Wasm module and the JavaScript 'glue' code to compile and instantiate the Wasm so it can be used in the web environment.

Figure 3: Compiling code into WebAssembly.

There are plans to get rid of the above JavaScript glue code in the future, allowing WebAssembly modules to be loaded like JavaScript modules (<script type='module'>).

WebAssembly's date with malware

With the performance benefits and features that WebAssembly provides, it was only a matter of time until malware authors took notice. WebAssembly found its place in browser-based miners, in which it was used to mine cryptocurrency using the victim's computer resources (essentially CPU cycles). The WebAssembly code used was developed from a C implementation of the CryptoNight mining algorithm. The mining process ran mostly without the victim's knowledge.

The flow of the mining process is shown in Figure 4.

Figure 4: Mining process.

With knowledge of the above-mentioned technique, which is already in the wild, let's discuss other ways in which WebAssembly can be used maliciously.

Case 1: Tech support scams

What is a tech support scam?

A technical support scam (often abbreviated to tech support scam) refers to telephone fraud in which scammers claim to be providing a legitimate technical support service. It may begin with a cold call, usually from a legitimate-sounding third party like 'Microsoft' or 'Windows'. Remote desktop software is used to connect to the victim's computer, and the scammer then uses a variety of confidence tricks that employ various Windows components and utilities (such as the Event Viewer), third-party utilities (such as rogue security software), and reference sites like Wikipedia or summaries written by security companies to make the victim believe that the computer has issues that need to be fixed, before asking the victim to pay for 'support'. These scams usually target users, such as senior citizens, who are unfamiliar with the tools used in the process, especially when taken by surprise by a cold call.

In other cases, the scam is initiated with a browser pop-up that 'alerts' the victim to an apparent infection on their machine and urges them to call a tech support number. An example of a tech support scam browser pop-up can be seen in Figure 5.

Figure 5: Tech support scam browser pop-up.

The attacker wants victims to see the alerts in the browser and continues to bombard them with pop-ups about the apparent infection. When the victim calls the tech support number, the scammers either ask for money to address the 'problem' or simply install some software/backdoor on the victim's machine.

Tech support scam sources

Sources of tech support scams may include the following:

Unsuspecting users searching for commercial technical support via a popular search engine such as Bing or Google.
Legitimate but compromised websites which redirect to these scams. Website compromise is usually achieved via exploiting vulnerabilities in CMS (Content Management Systems) such as WordPress, Joomla, Drupal, etc.
Malicious advertisements which redirect to these scams. This mechanism makes use of fingerprinting techniques such as geolocation checks, browser information, etc. to avoid detection and avoid showing the same scam to a single user.

Tech support scams on the rise

For a long time, exploit kits were the preferred delivery vehicle for malware authors. However, the lack of new browser and plug-in exploits, coupled with the hardening of operating systems, meant that exploit kits became increasingly less viable and malware authors were met with reduced infection rates. To keep the money flowing, redirection campaigns associated with exploit kits gradually shifted to delivering tech support scams to victims. This led to a heavy influx of tech support scams. Evidence of this can be found in reports presented by Microsoft [10] and the FBI's Internet Crime Complaint Center (IC3) [11].

Tech support scams getting murkier

When tech support scams first arrived on the scene, all the malicious and annoying web page behaviour was achieved through the use of JavaScript, which was unobfuscated and could easily be detected. However, as tech support scams began to emerge as a major force in the threat landscape, new anti-detection features were added. These started with the use of light obfuscation such as hex encoding, and went all the way to the use of packed encoding and even encryption algorithms like AES (Advanced Encryption Standard) [12, 13] (see Figure 6).

Figure 6: As tech support scams emerged as a major force in the threat landscape, new anti-detection features were added.
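To illustrate why even light obfuscation such as hex encoding is enough to defeat naive string matching, and why decoding restores visibility, here is a minimal sketch. The function names and the 'signature' string are hypothetical and are not taken from any particular security product:

```javascript
// Decode \xNN escape sequences that appear as literal text in a script.
function decodeHexEscapes(text) {
  return text.replace(/\\x([0-9a-fA-F]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Hypothetical scanner check: look for a plain-text signature.
function containsSignature(text, signature) {
  return text.includes(signature);
}

// The characters below are literal backslashes followed by hex digits,
// as they would appear in obfuscated page source.
const obfuscatedSource = "\\x61\\x6c\\x65\\x72\\x74('...')";

console.log(containsSignature(obfuscatedSource, "alert"));
console.log(containsSignature(decodeHexEscapes(obfuscatedSource), "alert"));
```

Heavier techniques such as packing or AES encryption cannot be undone with a single pattern replacement like this, which is what makes them more effective against static scanning.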

What's next: use of WebAssembly

Now that we have discussed both WebAssembly and tech support scams, let's look at how the two can be combined.

Tech support scams rely on JavaScript to achieve almost all of their objectives. Using Emscripten, JavaScript can be embedded in C code that is then compiled to WebAssembly, which leaves fewer detection avenues. Thus, a combination of the two achieves the underlying objective of scaring the victim by presenting a scam which is built entirely on WebAssembly, leaving few traces.

A proof of concept for this combination can be found in Figure 7, which shows a snippet of C code which executes JavaScript code.

Figure 7: Proof of concept: snippet of C code which executes JavaScript code.

The Emscripten compiler provides a way to call JavaScript from C using EM_ASM() [14].

Code within the EM_ASM() macro will run as if it appeared directly in the generated code. That is, the JavaScript code is executed like any normal piece of JavaScript found on the web.

Walking through the JavaScript code, a pop-up warning the user that the system is infected is shown first, along with an image, as shown in Figure 8.

Figure 8: A pop-up warns the user that the system is infected.

Moving forward, the scam checks for the following key presses:

Keycode	Key
13	ENTER
27	ESC
18	ALT
123	F12
85	u
9	TAB
115	F4
116	F5
112	F1
114	F3
17	CTRL

This is intended to prevent the user from escaping the scam by pressing keys such as ESC, combinations involving CTRL and ALT, or the other keys shown in the table.
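When a scam of this kind is delivered as plain JavaScript, the key checks themselves form a recognizable pattern. The sketch below is a hypothetical heuristic, not a technique described by any vendor: it lists which of the key codes from the table a script compares keyCode against. The names WATCHED_KEYS and findWatchedKeyChecks, and the sample script, are assumptions made for this example.

```javascript
// Key codes listed in the table above, mapped to key names.
const WATCHED_KEYS = new Map([
  [13, "ENTER"], [27, "ESC"], [18, "ALT"], [123, "F12"], [85, "u"], [9, "TAB"],
  [115, "F4"], [116, "F5"], [112, "F1"], [114, "F3"], [17, "CTRL"],
]);

// Hypothetical heuristic: report the watched keys that a script tests for.
function findWatchedKeyChecks(source) {
  const found = new Set();
  const pattern = /keyCode\s*={2,3}\s*(\d+)/g;
  for (const match of source.matchAll(pattern)) {
    const name = WATCHED_KEYS.get(Number(match[1]));
    if (name !== undefined) found.add(name);
  }
  return [...found];
}

// Made-up sample resembling a key-blocking handler.
const sampleScript =
  "if (e.keyCode == 27 || e.keyCode === 123 || e.keyCode == 65) { return false; }";

console.log(findWatchedKeyChecks(sampleScript));
```

A pattern like this only works while the script is visible as text, which is precisely what moving the logic behind a compiled file is meant to avoid.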

The code also monitors mouse clicks and pops up the malicious alert each time the mouse is clicked.

In this scenario, only the content passed to document.write() is rendered in the browser, while the JavaScript code is loaded on the fly. The only visible trace of the C code is a Wasm file in the browser cache, the content of which is shown in Figure 9. Security products will therefore see only the compiled Wasm file rather than the JavaScript source code. This is similar to viewing an executable file in a text editor, which makes detection difficult.

Figure 9: Content of the Wasm file, seen in the browser cache.
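As a starting point for handling such files, a scanner can at least recognize a Wasm binary by its header and pull out any readable strings, much as the Unix strings tool does for executables. The following is a minimal sketch under those assumptions; the function names and the sample bytes are made up for illustration.

```javascript
// Every Wasm binary starts with the four bytes 00 61 73 6d ("\0asm"),
// followed by a four-byte version field.
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

function looksLikeWasm(bytes) {
  return bytes.length >= 8 && WASM_MAGIC.every((b, i) => bytes[i] === b);
}

// Collect runs of printable ASCII that are at least minLength characters long.
function printableStrings(bytes, minLength = 4) {
  const results = [];
  let current = "";
  for (const byte of bytes) {
    if (byte >= 0x20 && byte <= 0x7e) {
      current += String.fromCharCode(byte);
    } else {
      if (current.length >= minLength) results.push(current);
      current = "";
    }
  }
  if (current.length >= minLength) results.push(current);
  return results;
}

// Made-up sample: a Wasm header followed by filler bytes and an embedded string.
const sampleBytes = Uint8Array.from([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x0b, 0x05, ...new TextEncoder().encode("hello wasm"), 0x00,
]);

console.log(looksLikeWasm(sampleBytes));
console.log(printableStrings(sampleBytes));
```

Whether anything useful turns up depends on how the module was built; the sketch only shows where an analyst might begin.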

Case 2: Website keyloggers

What are keyloggers?

Keystroke logging, often referred to as keylogging or keyboard capturing, is the action of logging the keys struck on a keyboard, typically covertly, so that the person using the keyboard is unaware that their actions are being monitored. Data can then be retrieved by the person operating the logging program, better known as the keylogger [15].

Keyloggers are most often used for stealing passwords and other confidential information.

Keyloggers come in various forms including executable files, script files, etc., but the end objective is always to steal confidential data such as passwords, credit card details, etc.

Executable keylogger files land on the system via a variety of sources such as spam emails, social engineering scams, vulnerability exploitation, etc. Executable keyloggers can monitor keystrokes regardless of the running application: keystrokes can be monitored whether the user is filling in a website form, typing in a Notepad file or performing any other action through the keyboard.

Script keyloggers are typically written in JavaScript, VBScript, etc. They are injected into compromised websites to steal passwords and other confidential information from website visitors. In the majority of cases, website owners and visitors are unaware of this keylogging activity. Script keyloggers are restricted to the website into which they are injected.

In this paper, we will discuss script keyloggers combined with WebAssembly. Since this kind of keylogger is written entirely in JavaScript, it is prone to string-based detection. With the following proof of concept, we will see how these detections can be bypassed.

Figure 10: Proof of concept code.

In the code shown in Figure 10, there are four main functions:

myFunction0() – stores the entered username.
myFunction1() – stores the entered password.
display() – displays the credentials captured by the two functions above.
onkeypress – this function listens to the keys pressed by the user and stores the result.

In lines 43 and 57, we can see the 'change' event listener being attached to the text fields for the username and password. This event is fired when the user has finished entering the username or password. When it fires, myFunction0() or myFunction1() is called respectively, thus capturing the credentials.
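The behaviour of a 'change' listener can be sketched without a browser by using a plain EventTarget as a stand-in for a text field. The object, its value property and the variable names below are assumptions made for illustration and are not the code from Figure 10:

```javascript
// Stand-in for a text field: EventTarget is available in browsers and in recent Node.js.
const field = new EventTarget();
field.value = "";

const observedValues = [];
field.addEventListener("change", (event) => {
  // Runs once per committed edit, not once per keystroke.
  observedValues.push(event.target.value);
});

// Simulate the user finishing their input and leaving the field.
field.value = "example-user";
field.dispatchEvent(new Event("change"));

console.log(observedValues);
```

The listener sees only the final value of the field, which is why the proof of concept pairs it with a separate key press handler.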

The rest of the code just builds the HTML front end for the user input form.

In this scenario, security products will only see the compiled Wasm file rather than the JavaScript source code, thus making detection difficult.

The output of the proof of concept can be seen in Figure 11.

Figure 11: Output of the proof of concept.

This example shows that WebAssembly can be used in phishing campaigns to capture confidential information without leaving many traces for detection purposes.

WebAssembly – Exploring new frontiers

As we have witnessed, WebAssembly can be used in a variety of ways to achieve nefarious goals. However, this is just the beginning. We firmly believe that, in the future, WebAssembly will leave its footprint in one or more of the following domains:

Browser exploits – Going through some of the publicly available recent browser exploits, we see that they involve JavaScript. Thus, WebAssembly can play an important role in browser exploitation by obfuscating the exploit code.
Malicious redirections – We usually encounter malicious redirections from compromised websites to tech support scams, browser miners, etc. Instead of doing redirection through JavaScript, the redirection can be achieved using WebAssembly. The code snippet below shows redirection to our keylogger POC.

Thus, we can build a long redirection chain using WebAssembly: the compromised website loads the above Wasm, which leads to the custom phishing page where we steal confidential information using WebAssembly.

References

[1] https://www.quora.com/in/Will-WebAssembly-make-JavaScript-skills-more-or-less-valuable-in-the-future-WebAssembly-will-allow-performance-critical-stuff-to-be-done-using-WASM-while-all-the-rest-will-still-make-sense-to-be-done-in-Javascript.

[2] http://2ality.com/2013/02/asm-js.html.

[3] https://medium.com/javascript-scene/why-we-need-webassembly-an-interview-with-brendan-eich-7fb2a60b0723.

[4] https://brendaneich.com/2015/06/from-asm-js-to-webassembly/.

[5] https://auth0.com/blog/7-things-you-should-know-about-web-assembly/.

[6] https://webassembly.org/.

[7] https://developer.mozilla.org/en-US/docs/WebAssembly.

[8] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten.

[9] http://kripken.github.io/emscripten-site/.

[10] https://cloudblogs.microsoft.com/microsoftsecure/2018/04/20/teaming-up-in-the-war-on-tech-support-scams/.

[11] https://www.ic3.gov/media/2018/180328.aspx.

[12] https://www.symantec.com/connect/blogs/tech-support-scams-increasing-complexity.

[13] https://www.symantec.com/blogs/threat-intelligence/tech-support-scams-aes.

[14] https://kripken.github.io/emscripten-site/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#interacting-with-code-call-javascript-from-native.

[15] https://en.wikipedia.org/wiki/Keystroke_logging.
