ROP exploitation in Linux
The exploitation techniques used in stack-smashing have evolved right alongside the
security mitigations that were created to prevent them. At one point in time an
attacker was able to craft an exploit which overwrites the saved return address
so that it points into a stack location where the shellcode was injected. This
technique has been foiled by various security mechanisms over the years, but it
was primarily DEP (data execution prevention)
that created the necessity for an exploit to return somewhere other than the stack.
The classic Phrack paper “Advanced return-into-lib(c) exploits”
details the earliest forms of ROP exploits used in Linux, by returning directly into
libc, or even indirectly through the PLT (procedure linkage table). These techniques
are still valid today, and tend to be used in conjunction with code chunks
called gadgets. A gadget is typically a small sequence of instructions that performs
some task, such as popping values from the stack into %rdi, %rsi, etc.
Each gadget ends in a ret instruction of some type, or in some cases a jmp
instruction (which is why the term JOP is often used alongside ROP). The gadgets are chained
together by what is called a ROP chain. A ROP chain is a carefully crafted stack
which links the gadgets together through a series of return addresses. The ROP chain
may also contain arguments that will be passed to the functions that the attacker is
aiming to execute. A ROP chain is often termed a ‘data-only payload’ since it contains
only data; usually a series of memory addresses and arguments.
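To make the idea of a gadget concrete, here is a minimal Python sketch that scans a byte buffer for two tiny x86-64 gadgets by their raw encodings: pop %rdi; ret (bytes 5f c3) and pop %rsi; ret (bytes 5e c3). The buffer contents, the base address, and the function name find_gadgets are made up for illustration. Real gadget finders disassemble an ELF file's executable segments rather than matching fixed byte patterns.

```python
# Hypothetical illustration: locate tiny gadgets by their raw x86-64 encodings.
GADGETS = {
    b"\x5f\xc3": "pop %rdi; ret",
    b"\x5e\xc3": "pop %rsi; ret",
}

def find_gadgets(code: bytes, base: int) -> list[tuple[int, str]]:
    """Return (address, description) for every gadget pattern found in code."""
    hits = []
    for pattern, name in GADGETS.items():
        start = 0
        while (idx := code.find(pattern, start)) != -1:
            hits.append((base + idx, name))
            start = idx + 1
    return sorted(hits)

# Made-up code bytes: nop; pop %rdi; ret; nop; nop; pop %rsi; ret
code = bytes.fromhex("90" "5fc3" "9090" "5ec3")
BASE = 0x400600  # assumed load address of the buffer

for addr, name in find_gadgets(code, BASE):
    print(f"{addr:#x}: {name}")
```

The addresses this produces are the kind of values that end up in a ROP chain: each one is only a number on the stack until a ret instruction loads it into the instruction pointer.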
Why is detecting ROP important?
Virtually all stack-based overflow vulnerabilities in Linux userland applications
will be exploited using some form of ROP. Not only that, but vulnerabilities such
as heap or even .bss overflows are also targets for ROP, since the attacker can perform a
stack pivot which points %rsp to whatever data structure the attacker has control
over. This is to say that ROP is very prevalent, and not limited to stack overflows.
ROP detection forensics
Over the years there have been a variety of shellcode detection engines, but only more recently have we begun to see research into forensics capabilities that can detect ROP chain payloads in memory. Here at Backtrace we have a growing set of security features for detecting exploitation and malware artifacts within a given process image or memory core dump. The forensics feature for detecting ROP activity detects ROP chains and gadgets using a set of heuristics that will be discussed in depth in a follow-up blog post. Other researchers have produced very good work in the area of ROP forensics, such as the developers of ROPMEMU, a framework based on Volatility plugins and Unicorn (QEMU-based) hooks for emulation. It has been used to reliably capture and decompile some very sophisticated ROP chains, including one that serves as a ROP chain based kernel rootkit.
ROP forensics challenges
My own experience has led me into some pretty interesting research related
to ROP exploitation detection and mitigation through techniques such as CFI instrumentation.
I have personally come to the opinion that designing a mechanism to prevent ROP from ever happening is
in some ways more straightforward than designing one which forensically
searches for evidence of a ROP attack that is in progress or has already passed. This can be likened
to a real crime scene: it is more straightforward to catch a criminal
if you are present at the scene before and while the crime takes place. If you do not arrive
until minutes or possibly even weeks after the crime was committed, then you only have the evidence found
at the scene to go on. ROP prevention and ROP forensics are actually two very different
problems to solve. Software which aims to prevent ROP attacks may instrument every ret instruction,
or monitor them in some way such as with the Intel LBR (last branch record) feature.
The challenge that differentiates ROP forensics from ROP prevention is that the
forensics software will be looking at a live process or a memory dump,
and there is no telling which phase the ROP attack is in at the time the forensics software
acquires the memory or stops the process for analysis. ROP forensics can reliably detect a ROP
chain on the stack, provided that the ROP chain still exists in memory and is mostly intact.
Since the stack is constantly changing, it is possible that a successful ROP exploit leads
to a function that overwrites the entire ROP chain payload with its own stack variables.
What this means for forensic analysis is that some states or phases the exploit
may be in at the time of memory acquisition are more advantageous for the entity
performing the forensics. There are also states which put the defender at a disadvantage,
such as when the entire ROP chain has been overwritten by new data. From my research I am seeing that
there are at least a few scenarios that open up the possibility for ROP chain forensics
to be successful.
The exploit fails before the ROP chain has been wiped off the stack. This scenario is actually quite common, and has the added benefit of triggering our software to perform the ROP forensics and create a snapshot right after the exploit fails, resulting in an automated detection process. See our blog post on coresnap to see how the product can be configured for automatic crash dump analysis.
The exploit succeeds, but a sufficiently long ROP chain was used (at least 4 gadgets) and there are still at least one or two addresses left on the stack which could be identified as ROP chain entry points. Even a single memory address on the stack that points to a gadget is enough to warrant real suspicion. Every exploit is different, and there is no telling exactly how the stack will be affected, but after some testing and experimentation this scenario seems to be a probable one.
The exploit succeeds, and a snapshot is taken at some point in the middle of the execution of gadgets. If there were 6 gadgets, and the snapshot was taken during execution of the 3rd gadget, then it is very likely that the ROP chain for the next 3 gadgets is still intact.
In forensics we are often dealing with artifacts that have been corrupted to some degree. An example of this could be a ROP chain that is 90% wiped out, but contains at least one or two addresses that point to valid gadgets. This is why we feel it is important to have very adept heuristics that are able to detect a possible ROP attack by looking only at a fragment of a ROP chain and still posit some degree of certainty that this isn’t a false positive. If an entire ROP chain is intact then there is much more to go on, but in the forensics world we should never assume that the entire ROP chain will be resident in a given memory dump.
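One widely known heuristic for judging a lone stack word is to check whether it is call-preceded: a legitimate return address always follows a call instruction, while a gadget address usually does not. This is not necessarily what the Backtrace feature does, since its heuristics are not described here. The Python sketch below only recognizes the common x86-64 call encodings loosely (E8 rel32, and FF with a ModRM reg field of 2), and the text bytes, addresses, and function names are all hypothetical.

```python
import struct

def is_call_preceded(text: bytes, base: int, addr: int) -> bool:
    """Return True if the bytes just before addr look like a call instruction."""
    off = addr - base
    # call rel32: opcode E8 followed by a 4-byte displacement (5 bytes total)
    if off >= 5 and text[off - 5] == 0xE8:
        return True
    # call r/m64: opcode FF whose ModRM reg field is 2; try 2..7 byte forms
    for n in range(2, 8):
        if off >= n and text[off - n] == 0xFF and ((text[off - n + 1] >> 3) & 7) == 2:
            return True
    return False

def suspicious_words(stack: bytes, text: bytes, base: int) -> list[tuple[int, int]]:
    """Return (stack_offset, value) for words that point into text but do not follow a call."""
    found = []
    for off in range(0, len(stack) - 7, 8):
        (val,) = struct.unpack_from("<Q", stack, off)
        if base <= val < base + len(text) and not is_call_preceded(text, base, val):
            found.append((off, val))
    return found

# Made-up text segment: call rel32; nop; pop %rdi; ret
BASE = 0x400500
text = bytes.fromhex("e800000000" "90" "5fc3")

# Made-up stack: a real return address, an unrelated pointer, a gadget address
stack = struct.pack("<3Q", 0x400505, 0x7FFC00001000, 0x400506)

for off, val in suspicious_words(stack, text, BASE):
    print(f"stack+{off:#x}: {val:#x} points into code but is not call-preceded")
```

A check like this can produce false positives and false negatives on its own, which is why a single hit is better treated as grounds for suspicion than as proof.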
Example of ROP detection with controlled exploit settings
In this first example, let's use a controlled exploit which sets up a ROP chain
that executes system("/bin/bash"); exit();. The following image illustrates what the ROP
chain looks like on the stack.
As illustrated above, the gadget-0 code pops the address of “/bin/bash” into %rdi and then
returns into system@PLT, effectively executing system("/bin/bash");. Once the attacker
exits the bash shell, execution returns to the final gadget, exit@PLT, which allows the process
to exit gracefully without segfaulting and causing a ruckus. Let's run this exploit and see how the
Backtrace ROP forensics feature can be used to detect it.
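The stack layout described above can be sketched as a sequence of little-endian 64-bit words. Only the exit@PLT address 0x400480 comes from this example (it appears in the warning message further down). The gadget, string, and system@PLT addresses are placeholders invented for illustration, and the Python helper name pack_chain is hypothetical.

```python
import struct

# Placeholder addresses (assumptions), except EXIT_PLT which this article reports.
POP_RDI_RET = 0x400693  # hypothetical gadget-0: pop %rdi; ret
BIN_BASH = 0x400734     # hypothetical address of the "/bin/bash" string
SYSTEM_PLT = 0x400460   # hypothetical address of system@PLT
EXIT_PLT = 0x400480     # exit@PLT, as shown in the warning message

# Order on the stack, lowest address first: each ret consumes the next word.
chain = [POP_RDI_RET, BIN_BASH, SYSTEM_PLT, EXIT_PLT]

def pack_chain(words):
    """Pack the chain as consecutive little-endian 64-bit words."""
    return b"".join(struct.pack("<Q", w) for w in words)

payload = pack_chain(chain)

for i, word in enumerate(chain):
    print(f"rsp+{i * 8:#04x}: {word:#018x}")
```

Note that the result contains no instructions at all, only addresses and one argument, which is what makes the chain a data-only payload.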
elfmaster@backtrace:~/rop_test$ ./exploit
bash-4.3$
In another terminal we trace the process:
elfmaster@backtrace:~$ sudo ptrace `pidof exploit` --module=security:enable,true
/home/elfmaster/exploit.31054.1472839862.btt
View the snapshot’s “Warning pane” in Hydra
This screenshot gives even a novice forensic analyst a decent view of what’s happening in the exploit.
In the backtrace pane we can see that do_system() was called, followed by waitpid(). In actuality the address of
system@PLT was wiped off the stack by local variables, but it resulted in a call to do_system() which
in turn calls waitpid(), so those two top functions in the backtrace are residue of the ROP chain
but not directly part of it. Normally we would also deduce from this pane that the code found at
0x400480 (which happens to be exit@PLT) was called before do_system(); but do not forget that ROP chains
work backwards: they do not call functions, they ret to functions. So what we are seeing is that do_system()
will return to exit@PLT, which is why it shows up in the backtrace as if it was called by exit@PLT, when really it wasn’t and this
is just part of a crafted ROP chain. The next address on the stack after exit@PLT is 0x4005a6,
which is a return address into main(), pushed onto the stack when main()
originally called a function before the exploit was triggered.
This means that technically the only real part of the ROP chain left over at this point is the address pointing to exit@PLT,
which by itself should be quite a red flag since PLT entries are not functions and do not call other functions; they
are code stubs that perform indirect jumps to functions within shared libraries, and therefore will never be
found in a real call stack. This is a good example of the scenario where only a fragment of the ROP chain is left over, leaving
us just a single breadcrumb to work with. Notice the message in the warning pane:
Critical: Discovered possible ROP chain entry point at stack location 0x7ffcb9f8f240 into the .plt section: 0x400480
This demonstrates that even when there is only a fragmented ROP chain left over, we can still make some fairly accurate detections of ROP activity.
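A check of the kind behind this warning might look like the following Python sketch. It is not Backtrace's implementation: the .plt bounds, the first stack word, and the function name are assumptions. Only the stack location 0x7ffcb9f8f240, the value 0x400480, and the following word 0x4005a6 are taken from the example above.

```python
import struct

def plt_entry_points(stack: bytes, stack_base: int, plt_start: int, plt_end: int):
    """Yield (stack_address, value) for each 8-byte stack word that points into .plt."""
    for off in range(0, len(stack) - 7, 8):
        (val,) = struct.unpack_from("<Q", stack, off)
        if plt_start <= val < plt_end:
            yield stack_base + off, val

# Assumed .plt section bounds; a real tool would read them from the ELF section headers.
PLT_START, PLT_END = 0x400440, 0x4004A0

# Three stack words: an assumed filler word, exit@PLT, then the return address into main().
STACK_BASE = 0x7FFCB9F8F238  # assumed address of the first word below
stack = struct.pack("<3Q", 0x0, 0x400480, 0x4005A6)

hits = list(plt_entry_points(stack, STACK_BASE, PLT_START, PLT_END))
for where, val in hits:
    print(f"possible ROP chain entry point at stack location {where:#x} "
          f"into the .plt section: {val:#x}")
```

Because a legitimate call never leaves a PLT stub address on the stack as a return address, this particular check needs very little context to be useful, which suits the fragment-only scenario described above.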