~ Identifying memory leaks in C

Fig: Memory leaks.

The C programming language is deceptively simple. The syntax is straightforward, and C has a limited number of keywords and a small standard library. The first edition of the classic book ‘The C Programming Language’ is only about 200 pages. And yet, when programming in C, it is hard to avoid the many exciting footguns: integer type conversions, unchecked indexes and memory leaks can all cause subtle problems. This is part of the appeal of C: shooting yourself in the foot does make you feel alive. Here I want to focus on ways to check for memory leaks in C programs.

Memory leaks come about when memory is claimed but never released again. If this happens in a loop or during a long-running program, the claimed memory adds up and eventually the system may run out of memory. A memory leak is less of a problem if a program forgets to free a small amount of memory it only claims once: after program shutdown, the operating system reclaims all memory anyhow. However, it does feel very dirty to not clean up after oneself. And I, for one, am not a dirty boy.
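To make the loop case concrete, here is a minimal sketch. The names tracked_malloc, tracked_free, live_blocks, process_leaky and process_fixed are hypothetical helpers invented for this illustration; the counter only makes the leak visible from inside the program and is no substitute for a real leak checker.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 1024

/* Hypothetical bookkeeping: blocks that were claimed but not yet released. */
static long live_blocks = 0;

void *tracked_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) live_blocks++;
    return p;
}

void tracked_free(void *p) {
    if (p == NULL) return;
    assert(live_blocks > 0); /* sanity check: never free more than was claimed */
    live_blocks--;
    free(p);
}

/* Leaky: a fresh buffer is claimed on every iteration and never released. */
void process_leaky(int iterations) {
    for (int i = 0; i < iterations; i++) {
        char *buffer = tracked_malloc(BUFFER_SIZE);
        if (buffer == NULL) return;
        memset(buffer, 0, BUFFER_SIZE); /* stand-in for real work */
        /* no tracked_free(buffer): BUFFER_SIZE bytes are lost per iteration */
    }
}

/* Fixed: every claim is paired with a release before the next iteration. */
void process_fixed(int iterations) {
    for (int i = 0; i < iterations; i++) {
        char *buffer = tracked_malloc(BUFFER_SIZE);
        if (buffer == NULL) return;
        memset(buffer, 0, BUFFER_SIZE); /* stand-in for real work */
        tracked_free(buffer);
    }
}
```

After process_leaky(n) the counter has grown by n, while process_fixed(n) leaves it unchanged. With a loop that runs for hours, that difference is what eventually exhausts memory.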

Another reason to keep an eye on memory use and leaks is when you are programming for embedded devices. For these systems memory is very limited: in that world 500 kB of RAM is considered a massive amount of memory. I have been busy programming a scalable audio search system called Olaf which targets traditional computers, embedded systems and browsers (via WebAssembly). It is clear that memory use, and memory leaks in particular, need to be kept in check to pull this off.

Now, these memory leaks might not be easy to spot by inspecting the code. There are tools which help to spot memory management problems. One of these is valgrind, which is currently not easy to use on Apple systems with ARM processors. Luckily there is an alternative which is probably already installed on macOS via the Xcode Command Line Tools: a command line tool aptly called leaks. To quote the Apple documentation on leaks, leaks reports:

The most straightforward use of leaks is to run a program and generate a report after program shutdown. The command below runs a memory leak inspection, in this case for the bin/olaf_c program which indexes an audio file in a key-value store. For CI purposes it is practical to know that leaks has an exit status of zero only when no leaks have been found. The exit status can be used in an automated test script to break a build if a leak is detected. The --quiet option can be practical in such a setting.

leaks --atExit -- bin/olaf_c store audio.raw audio

In the case of Olaf I made a classic mistake: I had called free() on the hash table, but I needed to call the hash table destructor, hash_table_destroy(), which frees not only the hash table itself but also all memory associated with the hash table entries. After a quick fix the leaks command showed no more leaks!

leaks Report Version: 4.0, multi-line stacks
Process 35293: 2200395 nodes malloced for 135146 KB
Process 35293: 2200171 leaks for 138371200 total leaked bytes.

STACK OF 1 INSTANCE OF 'ROOT LEAK: <malloc in hash_table_new>':
5   dyld                                  0x1a16dbf28 ...
4   olaf_c                                0x100db145c main ...
3   olaf_c                                0x100db53c8 olaf_...
2   olaf_c                                0x100db4400 olaf_...
1   olaf_c                                0x100da5788 hash_...
0   libsystem_malloc.dylib                0x1a1874d88 _mall...
====
    2200171 (132M) ROOT LEAK: <malloc in hash_table_new 0x152704d00> [64]
       2200170 (132M) <calloc in hash_table_insert 0x153b00000> [50348032]
          2 (80 bytes) <malloc in hash_table_insert 0x13e8fffe0> [32]
             1 (48 bytes) <malloc in olaf_fp_matcher_match_single_fingerprint 0x12cf04080> [48]
Output of the leaks command which shows where a memory leak can be found.
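The free() versus destructor mistake described above can be sketched with a toy chained hash table. This is not Olaf's actual hash table: the toy_table_* names, the struct layout and the live_blocks counter are assumptions made up for this illustration. The point is only that free() on the top-level struct releases one block, while a destructor has to walk everything the table owns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy table for illustration, not the one used in Olaf. */
struct toy_entry {
    unsigned key;
    unsigned value;
    struct toy_entry *next; /* next entry in the same bucket */
};

struct toy_table {
    struct toy_entry **buckets;
    size_t bucket_count;
};

/* Hypothetical bookkeeping: blocks that were claimed but not yet released. */
static long live_blocks = 0;

static void *tracked_calloc(size_t count, size_t size) {
    void *p = calloc(count, size);
    if (p != NULL) live_blocks++;
    return p;
}

static void tracked_free(void *p) {
    if (p == NULL) return;
    live_blocks--;
    free(p);
}

struct toy_table *toy_table_new(size_t bucket_count) {
    if (bucket_count == 0) return NULL;
    struct toy_table *table = tracked_calloc(1, sizeof *table);
    if (table == NULL) return NULL;
    table->buckets = tracked_calloc(bucket_count, sizeof *table->buckets);
    if (table->buckets == NULL) {
        tracked_free(table);
        return NULL;
    }
    table->bucket_count = bucket_count;
    return table;
}

/* Returns 0 on success, -1 when the entry could not be allocated. */
int toy_table_insert(struct toy_table *table, unsigned key, unsigned value) {
    assert(table != NULL);
    struct toy_entry *entry = tracked_calloc(1, sizeof *entry);
    if (entry == NULL) return -1;
    size_t index = key % table->bucket_count;
    entry->key = key;
    entry->value = value;
    entry->next = table->buckets[index]; /* prepend to the bucket's chain */
    table->buckets[index] = entry;
    return 0;
}

/* The mistake: only the table struct is released, buckets and entries leak. */
void toy_table_free_shallow(struct toy_table *table) {
    tracked_free(table);
}

/* The destructor: releases every entry, then the bucket array, then the table. */
void toy_table_destroy(struct toy_table *table) {
    if (table == NULL) return;
    for (size_t i = 0; i < table->bucket_count; i++) {
        struct toy_entry *entry = table->buckets[i];
        while (entry != NULL) {
            struct toy_entry *next = entry->next; /* read before freeing */
            tracked_free(entry);
            entry = next;
        }
    }
    tracked_free(table->buckets);
    tracked_free(table);
}
```

With 20 entries in the table, toy_table_free_shallow() leaves 21 blocks behind (the bucket array plus the entries), whereas toy_table_destroy() releases all 22. The same shape shows up in the report above, where a single root leak drags millions of child allocations along with it.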


General takeaways