As shown in the pwntools ELF tutorial, pwntools has a host of functionality that allows you to really make your exploit dynamic. Simply setting elf.address
will automatically update all the function and symbols addresses for you, meaning you don't have to worry about using readelf
or other command line tools, but instead can receive it all dynamically.
Not to mention that the ROP capabilities are incredibly powerful as well.
Exploiting PIE with a given leak
Pretty simple - we print the address of main
, which we can read and calculate the base address from. Then, using this, we can calculate the address of win()
itself.
Let's just run the script to make sure it's the right one :D
Yup, and as we expected, it prints the location of main
.
First, let's set up the script. We create an ELF
object, which becomes very useful later on, and start the process.
Now we want to take in the main
function location. To do this we can simply receive up until it (and do nothing with that) and then read it.
Since we received the entire line except for the address, only the address will come up with p.recvline()
.
Now we'll use the ELF
object we created earlier and set its base address. The sym
dictionary returns the offsets of the functions from binary base until the base address is set, after which it returns the absolute address in memory.
In this case, elf.sym['main']
will return 0x11b9
; if we ran it again, it would return 0x11b9
+ the base address. So, essentially, we're subtracting the offset of main
from the address we leaked to get the base of the binary.
Now we know the base we can just call win()
.
By this point, I assume you know how to find the padding length and other stuff we've been mentioning for a while, so I won't be showing you every step of that.
And does it work?
Awesome!
From the leak address of main
, we were able to calculate the base address of the binary. From this we could then calculate the address of win
and call it.
And one thing I would like to point out is how simple this exploit is. Look - it's 10 lines of code, at least half of which is scaffolding and setup.
Try this for yourself first, then feel free to check the solution. Same source, same challenge.
Using format string
Unlike last time, we don't get given a function. We'll have to leak it with format strings.
Everything's as we expect.
As last time, first we set everything up.
Now we just need a leak. Let's try a few offsets.
3rd one looks like a binary address, let's check the difference between the 3rd leak and the base address in radare2. Set a breakpoint somewhere after the format string leak (doesn't really matter where).
We can see the base address is 0x565ef000
and the leaked value is 0x565f01d5
. Therefore, subtracting 0x1d5
from the leaked address should give us the binary. Let's leak the value and get the base address.
Now we just need to send the exploit payload.
Same deal, just 64-bit. Try it out :)
Position Independent Code
PIE stands for Position Independent Executable, which means that every time you run the file it gets loaded into a different memory address. This means you cannot hardcode values such as function addresses and gadget locations without finding out where they are.
Luckily, this does not mean it's impossible to exploit. PIE executables are based around relative rather than absolute addresses, meaning that while the locations in memory are fairly random the offsets between different parts of the binary remain constant. For example, if you know that the function main
is located 0x128
bytes in memory after the base address of the binary, and you somehow find the location of main
, you can simply subtract 0x128
from this to get the base address and from the addresses of everything else.
So, all we need to do is find a single address and PIE is bypassed. Where could we leak this address from?
The stack of course!
We know that the return pointer is located on the stack - and much like a canary, we can use format string (or other ways) to read the value off the stack. The value will always be a static offset away from the binary base, enabling us to completely bypass PIE!
Due to the way PIE randomisation works, the base address of a PIE executable will always end in the hexadecimal characters 000
. This is because pages are the things being randomised in memory, which have a standard size of 0x1000
. Operating Systems keep track of page tables which point to each section of memory and define the permissions for each section, similar to segmentation.
Checking the base address ends in 000
should probably be the first thing you do if your exploit is not working as you expected.