Many languages provide the ability to manually allocate and deallocate memory. For some workloads, this level of control over memory management enables superior memory utilization and performance. However, manual memory management comes at the cost of enabling a wide-class of bugs involving memory safety. Languages such as C and C++ are infamous for allowing difficult-to-debug issues involving mis-use of dynamic memory allocation. High-level languages are still susceptible to these classes of bugs if they are interoperating with system libraries that are built on unmanaged languages.
The Backtrace platform has seen tens of thousands of memory management errors in production and testing environments. This post introduces readers to a myriad of common heap management errors that are seen in the wild.
Introduction
Languages such as C, C++ and Objective-C provide developers a dynamic memory
allocation interface so that blocks of unused memory can be requested and
returned at run-time. This is necessary when allocation sizes for structures
are only known at run-time. Generally speaking, this form of memory management
is implemented with the malloc(3)
interface as specified below.
Function | Description |
---|---|
malloc | allocates the requested number of bytes |
calloc | allocates the requested number of bytes and initializes it to zero |
realloc | resizes an existing allocation, creating a new allocation if necessary |
free | releases the provided allocation for re-use |
Languages such as C++ and Objective-C implement more sophisticated and
arguably safer interfaces for dynamic memory allocation on top these facilities
or variants of them. However, they are still susceptible to the same classes of
bugs.
With malloc
, the developer computes the required size of memory, allocates
it and then calls free
once they are done with it. When using malloc
,
the developer must make sure to initialize the data as it will
be filled with indeterminate values. When dynamic memory management functions
are misused, several bad things occur ranging from immediate crashes to
more difficult-to-debug transient errors. For example, many allocators will
not return a free region of memory to the operating system and the program
may appear to be working for some time before an obvious bugs occurs
such as data corruption or a crash.
Common Errors
Below are some examples of memory management bugs involving the manipulation of
a dynamically linked list of struct person
objects.
#include <assert.h>
#include <stdlib.h>
#include <string.h>
struct person {
char *name;
struct person *next;
struct person *previous;
};
struct people {
struct person *head;
};
static struct people people = { NULL };
struct person *
person_add(const char *name)
{
struct person *person;
/* Allocate memory for the person structure. */
person = malloc(sizeof *person);
assert(person != NULL);
/* Allocate a copy of the name. */
person->name = strdup(name);
assert(person->name != NULL);
person->previous = NULL;
person->next = people.head;
people.head = person;
return person;
}
bool
person_exists(const char *name)
{
struct person *cursor;
for (cursor = people.head; cursor != NULL; cursor = cursor->next) {
if (strcmp(cursor->name, name) == 0)
return true;
}
return false;
}
Memory Leaks
If a programmer fails to deallocate an unused region of memory, then the region will never be re-used. This wasted memory may eventually cause the application to exhaust memory resources on a system, eventually leading to failure or worse.
The example below demonstrates a memory leak. The implementation makes sure
to free
the people
object but does not call free
on the allocated
name
pointer.
void
person_destroy(struct person *person)
{
/* Unlink person from list. */
if (person->previous != NULL)
person->previous->next = person->next;
if (person->next != NULL)
person->next->previous = person->previous;
/* Release memory. */
free(person);
return;
}
The memory that was allocated for person->name
may not be re-used to
service other allocations and is just wasted memory. Memory leaks will
also occur in much more complex systems involving complicated reclamation
semantics (for example, garbage collection or reference counting).
Double-Free
If a programmer calls free on a deallocated region of memory more than once, a double-free condition occurs. If the programmer is lucky, this will result in a crash condition. If a programmer is unlucky, this may manifest as a crash or data corruption at a significantly later point in time.
The example below demonstrates some side-effects of the same double-free condition.
int
main(int argc, char **argv)
{
struct person *person = person_add("John Smith");
unsigned long *one, *two;
person_destroy(person);
person_destroy(person);
one = malloc(sizeof(*one) * 3);
one[0] = 1;
two = malloc(sizeof(*two) * 3);
two[0] = 2;
assert(one[0] == 1);
assert(two[0] == 2);
return 0;
}
Using the glibc
memory allocator, this program crashes inside of the
second free
call. An error message is provided with potentially useful
information if you are familiar with the internals of the memory allocator.
$ ./person
*** Error in `./person': double free or corruption (fasttop): 0x0000000001afe010 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f51278c6725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f51278cef4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f51278d2abc]
./person[0x4007d1]
./person[0x40085d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f512786f830]
./person[0x400609]
With the jemalloc
memory allocator, the double-free condition will not
manifest as an explicit abort condition. Instead, data corruption occurs due to
the fact that the one
and two
allocations are served from the same region
of memory. In this state, a double-free is difficult to disambiguate from
other classes of bugs without additional consistency checks.
$ ./person
person: person.c:85: main: Assertion `one[0] == 1' failed.
Dangling Pointers
References to a region of memory that has been destroyed and subsequently
re-used to service other allocations are referred to as dangling, stale or
wild pointers. This class of bug is notoriously difficult to debug as it
often manifests as a transient issue that is difficult to disambiguate from
other classes of bugs. A use-after-free
is a variation of a dangling pointer.
Take the example below, this version of person_destroy
fails to clear
the people
linked list references to the destroyed person
object.
void
person_destroy(struct person *person)
{
/* Release memory. */
free(person->name);
free(person);
return;
}
The program below crashes in person_exists
. The region of memory containing
the person->name
pointer is re-used for the strdup
allocation
and now contains an invalid pointer value, resulting in a crash when
person_exists
attempts to dereference it.
int
main(int argc, char **argv)
{
struct person *person;
char *test;
person = person_add("John Smith");
person_destroy(person);
test = strdup("http://backtrace.io");
person_exists("John Smith");
return 0;
}
Type Mismatch
Programmer error may also cause insufficient allocation requests to be made.
For example, in the following program person_add
allocates memory with
malloc(sizeof person)
instead of malloc(sizeof *person)
.
int
main(int argc, char **argv)
{
struct person *person;
char *test;
person = person_add("John Smith");
person_destroy(person);
test = strdup("apple");
person_exists("John Smith");
return 0;
}
This class of fault manifests itself in many of the same ways as a dangling pointer or use-after-free.
Uninitialized Memory
If a developer allocates uninitialized memory, and forgets to initialize it before use, undefined behavior will occur ranging from transient errors, faults to security violations.
Conclusion
The most common forms of memory allocation errors have been summarized. One of the biggest bottlenecks to solving these issues is determining which class of heap corruption is present. For example, it can be difficult or impossible to disambiguate the side-effects of a stale pointer from a double-free or type mismatch. Read our post on how Backtrace helps debug memory management errors.