ROP exploitation in Linux

Stack-smashing exploitation techniques have evolved right alongside the security mitigations created to stop them. At one time, an attacker could craft an exploit that overwrote the saved return address so that it pointed to a stack location where shellcode had been injected. Various security mechanisms have foiled this technique over the years, but it was primarily DEP (data execution prevention) that made it necessary for an exploit to return somewhere other than the stack. The classic Phrack paper “Advanced return-into-lib(c) exploits” details the earliest forms of ROP (return-oriented programming) exploits used on Linux: returning directly into libc, or indirectly through the PLT (procedure linkage table). These techniques are still valid today and tend to be used in conjunction with code chunks called gadgets. A gadget is typically a short sequence of instructions that performs a small task, such as popping values from the stack into %rdi, %rsi, etc. Each gadget ends in a ret instruction or, in some cases, a jmp instruction (which is why the term JOP, jump-oriented programming, often appears alongside ROP). Gadgets are chained together by a ROP chain: a carefully crafted stack that links the gadgets through a series of return addresses. The ROP chain may also contain arguments to be passed to the functions the attacker aims to execute. A ROP chain is often termed a ‘data-only payload’ since it contains only data, usually a series of memory addresses and arguments.
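To make the idea of a gadget concrete, here is a minimal Python sketch that searches a blob of machine code for byte sequences ending in ret. The instruction encodings are standard x86-64; the code bytes and the 0x400000 load address are made up for illustration, and a real gadget finder would also disassemble backwards from each ret rather than match fixed patterns.

```python
# Standard x86-64 encodings for a few single-byte instructions.
POP_RDI = b"\x5f"  # pop %rdi
POP_RSI = b"\x5e"  # pop %rsi
RET = b"\xc3"      # ret (near return)


def find_gadgets(code, base, pattern):
    """Return the virtual address of every occurrence of `pattern` in `code`."""
    hits = []
    offset = code.find(pattern)
    while offset != -1:
        hits.append(base + offset)
        offset = code.find(pattern, offset + 1)
    return hits


# Hypothetical code bytes: nop; pop %rdi; ret; nop; pop %rsi; ret; pop %rdi; ret
code = b"\x90\x5f\xc3\x90\x5e\xc3\x5f\xc3"
base = 0x400000  # hypothetical load address

pop_rdi_ret = find_gadgets(code, base, POP_RDI + RET)
pop_rsi_ret = find_gadgets(code, base, POP_RSI + RET)
print([hex(a) for a in pop_rdi_ret])
print([hex(a) for a in pop_rsi_ret])
```

Each address found this way is a candidate return address that an attacker could place in a ROP chain.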

Why is detecting ROP important?

Virtually all stack-based overflow vulnerabilities in Linux userland applications will be exploited using some form of ROP. Vulnerabilities such as heap overflows, or even .bss overflows, are also targets for ROP: the attacker can perform a stack pivot that points %rsp at whatever data structure the attacker controls. In short, ROP is very prevalent and not limited to stack overflows.

ROP detection forensics

Over the years there have been a variety of shellcode detection engines, but only recently have we begun to see research into forensics capabilities that can detect ROP chain payloads in memory. Here at Backtrace we have a growing set of security features for detecting exploitation and malware artifacts within a given process image or memory core dump. The forensics feature for detecting ROP activity detects ROP chains and gadgets using a set of heuristics that will be discussed in depth in a follow-up blog post. Other researchers have produced very good work in the area of ROP forensics, such as the developers of ROPMEMU, a framework based on Volatility plugins and Unicorn (QEMU-based) hooks for emulation. It has been used to reliably capture and decompile some very sophisticated ROP chains, including one that serves as a ROP-chain-based kernel rootkit.

ROP forensics challenges

My own experience has led me into some interesting research on ROP exploitation detection and mitigation through techniques such as CFI (control-flow integrity) instrumentation. I have come to the opinion that designing a mechanism to prevent ROP from ever happening is in some ways more straightforward than designing one that forensically searches for evidence of a ROP attack that is in progress or has already finished. Consider a real crime scene: it is easier to catch a criminal if you are present before and while the crime takes place. If you do not arrive until minutes, or possibly even weeks, after the crime was committed, then you have only the evidence left at the scene to go on.

ROP prevention and ROP forensics are two very different problems. Software that aims to prevent ROP attacks may instrument every ret instruction, or monitor them in some other way, such as with the Intel LBR (last branch record) feature. ROP forensics software, by contrast, looks at a live process or a memory dump, and there is no telling which phase the ROP attack is in when the software acquires the memory or stops the process for analysis. ROP forensics can reliably detect a ROP chain on the stack, provided the chain still exists in memory and is mostly intact. Because the stack is constantly changing, a successful ROP exploit may lead to a function that overwrites the entire ROP chain payload with its own stack variables.

For forensic analysis, this means that the exploit may be in a state at the time of memory acquisition that favors the party performing the forensics, or in one that puts the defender at a disadvantage, such as when the entire ROP chain has been overwritten by new data. From my research, there are at least a few scenarios in which ROP chain forensics can succeed:

  • The exploit fails before the ROP chain has been wiped off the stack. This scenario is actually quite common, and has the added benefit of triggering our software to perform the ROP forensics and create a snapshot right after the exploit fails, resulting in an automated detection process. See our blog post on coresnap to see how the product can be configured for automatic crash dump analysis.

  • The exploit succeeds, but a sufficiently long ROP chain was used (at least 4 gadgets) and at least one or two addresses remain on the stack that can be identified as ROP chain entry points. Even a single stack address that points to a gadget is enough to warrant real suspicion. Every exploit is different, so there is no telling exactly how the stack will be affected, but after some testing and experimentation this scenario appears to be a probable one.

  • The exploit succeeds, and a snapshot is taken at some point in the middle of the execution of gadgets. If there were 6 gadgets, and the snapshot was taken during execution of the 3rd gadget, then it is very likely that the ROP chain for the remaining 3 gadgets is still intact.
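The last scenario can be modelled with a toy stack. The sketch below is a deliberately simplified model, not a description of any real tool: it ignores gadget arguments, uses made-up gadget addresses, and assumes the worst case in which everything below %rsp has already been overwritten.

```python
def snapshot_during_gadget(chain, k, clobber=0xDEADBEEF):
    """Model the stack contents while the k-th gadget (1-based) is running.

    Each ret pops one return address, so by the time gadget k executes,
    k chain entries lie below %rsp. Worst case assumed here: all of those
    have been overwritten with `clobber`.
    """
    if not 1 <= k <= len(chain):
        raise ValueError("k must be between 1 and len(chain)")
    return [clobber] * k + list(chain[k:])


# Six hypothetical gadget addresses, lowest stack address first.
chain = [0x400100 + 0x10 * i for i in range(6)]

snapshot = snapshot_during_gadget(chain, 3)
survivors = [word for word in snapshot if word in chain]
print([hex(a) for a in survivors])
```

Even under that pessimistic assumption, the three entries above %rsp survive, which is what gives a forensics pass something to find.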

In forensics we are often dealing with artifacts that have been corrupted to some degree. An example of this could be a ROP chain that is 90% wiped out but still contains at least one or two addresses that point to valid gadgets. This is why we feel it is important to have very adept heuristics that can detect a possible ROP attack by looking at only a fragment of a ROP chain, and still posit some degree of certainty that it isn’t a false positive. If an entire ROP chain is intact then there is much more to go on, but in the forensics world we should never assume that the entire ROP chain will be resident in a given memory dump.
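One widely known heuristic for judging a single stack word is the "call-preceded" check: a legitimate return address normally sits immediately after a call instruction, whereas a gadget address usually does not. The Python sketch below illustrates that general idea only. It is not the Backtrace heuristic set, the byte checks are deliberately loose and not exhaustive, and the text bytes, base address, and stack contents are all hypothetical.

```python
import struct


def is_call_preceded(text, base, addr):
    """True if the bytes just before `addr` look like an x86-64 call."""
    off = addr - base
    if not 0 < off <= len(text):
        return False
    # Direct call: e8 <rel32> is 5 bytes long.
    if off >= 5 and text[off - 5] == 0xE8:
        return True
    # Indirect call: opcode ff with ModRM reg field == 2 (ff /2).
    # Lengths 2, 3, 6 and 7 cover common register and memory operand forms.
    for length in (2, 3, 6, 7):
        if off >= length and text[off - length] == 0xFF:
            modrm = text[off - length + 1]
            if (modrm >> 3) & 7 == 2:
                return True
    return False


def suspicious_words(stack, text, base):
    """Stack words that point into `text` but are not call-preceded."""
    usable = stack[: len(stack) - len(stack) % 8]  # whole 8-byte words only
    hits = []
    for (word,) in struct.iter_unpack("<Q", usable):
        in_text = base <= word < base + len(text)
        if in_text and not is_call_preceded(text, base, word):
            hits.append(word)
    return hits


# Hypothetical code: call rel32; nop; pop %rdi; ret
text = b"\xe8\x00\x00\x00\x00" + b"\x90" + b"\x5f\xc3"
base = 0x400000
# Hypothetical stack: a data word, a real return address, a gadget address.
stack = struct.pack("<3Q", 0x7FFC00001000, base + 5, base + 6)

print([hex(w) for w in suspicious_words(stack, text, base)])
```

A check like this can flag a lone surviving chain entry, but on its own it produces both false positives and false negatives, so it would only be one signal among several.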

Example of ROP detection with controlled exploit settings

In this first example, let's use a controlled exploit that sets up a ROP chain to execute system("/bin/bash"); exit();. The following image illustrates what the ROP chain looks like on the stack.

ROP chain image
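The same layout can be written out as plain data, which shows why a ROP chain is called a data-only payload. In this Python sketch only the exit@PLT address (0x400480) comes from this post; the gadget, string, and system@PLT addresses are hypothetical placeholders.

```python
import struct

POP_RDI_RET = 0x400663   # hypothetical: gadget-0, "pop %rdi; ret"
BIN_BASH_STR = 0x601060  # hypothetical: address of the string "/bin/bash"
SYSTEM_PLT = 0x400460    # hypothetical: system@PLT
EXIT_PLT = 0x400480      # exit@PLT, as reported later in this post

# Lowest stack address first: the overwritten return address is word 0.
chain_words = [POP_RDI_RET, BIN_BASH_STR, SYSTEM_PLT, EXIT_PLT]

# x86-64 is little-endian and each stack slot is 8 bytes wide.
chain = b"".join(struct.pack("<Q", word) for word in chain_words)

for offset, word in enumerate(chain_words):
    print(f"rsp+{offset * 8:#04x}: {word:#018x}")
```

Nothing in the payload is executable code; it is four 8-byte values that steer the ret instructions already present in the process.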

As illustrated above, the gadget-0 code pops the address of the string “/bin/bash” into %rdi and then returns into system@PLT, effectively executing system("/bin/bash");. Once the attacker exits the bash shell, execution returns to the final gadget, exit@PLT, which allows the process to exit gracefully without segfaulting and causing a ruckus. Let's run this exploit and see how the Backtrace ROP forensics feature can be used to detect it.

elfmaster@backtrace:~/rop_test$ ./exploit

In another terminal we trace the process:

elfmaster@backtrace:~$ sudo ptrace `pidof exploit` --module=security:enable,true

View the snapshot's “Warning pane” in Hydra:

ROP exploit in hydra

This screenshot gives even a novice forensics analyst a decent view of what is happening in the exploit. In the backtrace pane we can see that do_system() was called, followed by waitpid(). In actuality, the address of system@PLT was wiped off the stack by local variables, but returning to it resulted in a call to do_system(), which in turn calls waitpid(); those top two functions in the backtrace are a residue of the ROP chain but not directly part of it. Normally we would also deduce from this pane that the code at 0x400480 (which happens to be exit@PLT) was called before do_system(). But remember that ROP chains work backwards: they do not call functions, they ret to functions. What we are really seeing is that do_system() will return to exit@PLT, which is why exit@PLT appears in the backtrace as though it were the caller, when in fact it is just part of a crafted ROP chain. The next address on the stack after exit@PLT is 0x4005a6, which resolves to main() and was pushed onto the stack when main() made its original call, before the exploit was triggered.

This means that, technically, the only real part of the ROP chain left at this point is the address pointing to exit@PLT. That by itself should be quite a red flag: PLT entries are not functions and do not call other functions. They are code stubs that perform indirect jumps to functions within shared libraries, and therefore will never be found in a real call stack. This is a good example of the scenario where only a fragment of the ROP chain is left over, leaving us just a single breadcrumb to work with. Notice the message in the warning pane:

Critical: Discovered possible ROP chain entry point at stack location 0x7ffcb9f8f240 into the .plt section: 0x400480

This demonstrates that even when only a fragment of a ROP chain is left over, we can still detect ROP activity fairly accurately.
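The PLT red flag described above boils down to a range check on a would-be return address. The following Python sketch illustrates that check and mimics the wording of the warning. It is not how the Backtrace feature is implemented: the function name and the .plt bounds are hypothetical (in practice the bounds would come from the ELF section headers), and the stack slot used for main()'s address is assumed to be the next 8-byte word.

```python
def check_return_address(stack_loc, value, plt_start, plt_end):
    """Return a warning string if a would-be return address lies in .plt."""
    if plt_start <= value < plt_end:
        return (
            "Critical: Discovered possible ROP chain entry point at stack "
            f"location {stack_loc:#x} into the .plt section: {value:#x}"
        )
    return None


# Hypothetical .plt bounds for the example binary.
PLT_START, PLT_END = 0x400440, 0x4004A0

# Stack location and target taken from the warning shown in this post.
warning = check_return_address(0x7FFCB9F8F240, 0x400480, PLT_START, PLT_END)
print(warning)

# The address that resolves to main() is outside .plt, so it is not flagged.
print(check_return_address(0x7FFCB9F8F248, 0x4005A6, PLT_START, PLT_END))
```

Because a legitimate return address never lands in the PLT, a single hit is a strong signal even when the rest of the chain is gone.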