Gem #78: Where did my memory go? (Part 2)

Let's get started…


Unless your coding standard forbids any dynamic allocation, memory management is a constant concern during system development. You might want to limit the amount of memory that your application requires, or you might have memory leaks (allocated chunks that are never returned to the system). The latter is a critical concern for long-running applications.

Part II: System.Memory

In the previous Gem we discussed the use of GNAT.Debug_Pools to detect memory problems. Another approach that is somewhat easier to use, because it doesn't require changes to the application, is to override the System.Memory package. In GNAT and its run-time, all actual memory allocations are done via this package, which among other things does the low-level system calls to malloc() and free().

If you create your own version of System.Memory, you will in fact short-circuit all memory allocation and deallocation, and replace them with your own. To do so, copy the file s-memory.adb to one of your source directories, modify it as appropriate, and compile your application, passing the "-a" switch to gnatmake (gprbuild currently does not have an equivalent switch, although using project files should work as expected). This ensures that GNAT will recompile the library with your modified System.Memory.

Using this package does not provide the same safety as the debug pools, since it does not check that dereferences are valid, and so your code could still be accessing invalid memory. On the other hand, the use of System.Memory is much less intrusive in your code. System.Memory is best viewed as a performance analysis tool rather than a debugging tool, although it will allow you to monitor your code for memory leaks.

The GNATCOLL library (a recent addition to the GNAT technology, and part of the latest customer and public releases) provides such an implementation in the form of the GNATCOLL.Memory package. This package is not a direct replacement for System.Memory, but only a minimal amount of work is needed to make use of it: create a version of the s-memory.adb file that contains the following:

with GNATCOLL.Memory;
package body System.Memory is

   package M renames GNATCOLL.Memory;

   function Alloc (Size : size_t) return System.Address is
   begin
      return M.Alloc (M.size_t (Size));
   end Alloc;

   procedure Free (Ptr : System.Address) renames M.Free;

   function Realloc
      (Ptr  : System.Address;
       Size : size_t)
      return System.Address
   is
   begin
      return M.Realloc (Ptr, M.size_t (Size));
   end Realloc;

end System.Memory;

You then need to modify your code so that it properly initializes GNATCOLL.Memory, which is done via a call like the following:

   GNATCOLL.Memory.Configure (Activate_Monitor => True);

The monitoring provided by this GNATCOLL package is not enabled by default, to limit overhead on a running program. In your application, monitoring could be activated through a command-line switch, or by means of a specific environment variable. (This is all fully under your control, though, so you'll have to do the actual call to Getenv and then to Configure.)
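As a minimal sketch of the environment-variable approach just described, the main procedure could look like the following. The variable name MYAPP_MEMORY_MONITOR and the procedure name Main are hypothetical, chosen only for illustration, and the sketch assumes that the modified s-memory.adb shown earlier is part of the build.

```ada
with Ada.Environment_Variables;
with GNATCOLL.Memory;

procedure Main is
   --  Hypothetical variable name, chosen only for this sketch.
   Monitor_Variable : constant String := "MYAPP_MEMORY_MONITOR";
begin
   --  Activate monitoring only when the variable is defined, so that
   --  a normal run does not pay the monitoring overhead.
   if Ada.Environment_Variables.Exists (Monitor_Variable) then
      GNATCOLL.Memory.Configure (Activate_Monitor => True);
   end if;

   --  ... the rest of the application goes here ...
   null;
end Main;
```

Making this check one of the first actions of the program means that allocations performed during start-up are monitored as well.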

You can then instrument your code in one or more places to dump the memory usage at that point. This is done through a call such as the following:

   GNATCOLL.Memory.Dump (Size => 3, Report => Memory_Usage);

Such a call will print on the console three backtraces for the code that allocated the most memory (among the currently allocated memory). Variants exist to dump the backtraces that executed the greatest number of allocations (as opposed to the largest allocation size), or the total amount of memory, even if that memory has since been released.

Such a dump includes a backtrace with addresses, which you can convert to a symbolic backtrace by using the external tool addr2line as follows:

    addr2line -e <executable> <addresses>

This library has very light overhead (in particular when not activated) so that you can distribute your application with support for GNATCOLL.Memory built in, and then investigate any issues that arise in production code. In fact, our own GPS IDE now includes this support. Activation is controlled by an external variable, and dumping the state of memory can be done through a Python command at any point in time without the need to recompile GPS. (Note that GNATCOLL also provides an interface to Python.)

Another feature of GNATCOLL.Memory is the capability of resetting all counters to zero. For example, let's assume we want to investigate memory leaks while opening and closing editors in GPS. If we look at the places that allocate memory, the biggest allocations that are displayed in the console do not concern the editor itself, but rather memory allocated when GPS started (if you are curious, this is generally memory that is related to the cross-reference database). Therefore, we would do the following: start GPS, reset GNATCOLL.Memory counters to 0, open and close an editor, and dump memory usage. At that point, if Dump prints any information on the console, we know that this is memory that has been allocated since the call to Reset, and that wasn't freed when the editor was closed, and therefore is most likely a memory leak.

The appropriate use of Reset and Dump therefore allows the monitoring of memory usage in specific parts of the code.
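The Reset-then-Dump sequence described above might be sketched as follows. Open_And_Close_Editor is a hypothetical stand-in for the operation under investigation (here it simply leaks one buffer on purpose), and the sketch again assumes that the modified s-memory.adb is part of the build, since otherwise no allocation would be recorded.

```ada
with GNATCOLL.Memory;

procedure Check_Leaks is

   type Buffer is array (1 .. 1_024) of Character;
   type Buffer_Access is access Buffer;

   --  Hypothetical stand-in for the code being investigated.
   procedure Open_And_Close_Editor is
      Leaked : Buffer_Access;
   begin
      --  Deliberate leak: this buffer is never deallocated.
      Leaked := new Buffer'(others => ' ');
      Leaked (1) := 'x';
   end Open_And_Close_Editor;

begin
   GNATCOLL.Memory.Configure (Activate_Monitor => True);

   --  ... start-up allocations that we do not care about ...

   --  Forget everything allocated so far.
   GNATCOLL.Memory.Reset;

   Open_And_Close_Editor;

   --  Anything reported here was allocated after Reset and is
   --  still allocated, so it is a candidate leak.
   GNATCOLL.Memory.Dump
     (Size => 3, Report => GNATCOLL.Memory.Memory_Usage);
end Check_Leaks;
```

The addresses in the resulting backtraces can then be passed to addr2line to locate the allocation in Open_And_Close_Editor.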

A separate tool called gnatmem is also distributed with GNAT. When you link your application with the -lgmem switch, it will transparently instrument all calls to the standard malloc and free (from the Ada code), in a fashion similar to GNATCOLL.Memory. On program exit, a disk file is created that you can then analyze using gnatmem, to highlight the sources of memory leaks in your application. This tool requires no change to your code at all, but, on the other hand, does not provide a way to monitor specific sections of your code as GNATCOLL.Memory does.

Gnatmem provides a number of command-line switches to control the display of information. For instance, the switch "-m 0" lets you view all places in your code that ever allocated memory, even if that memory was properly deallocated afterwards. When one such place is doing millions of allocations, it might sometimes be more efficient to use a custom Storage_Pool and avoid the system call to malloc, by reusing memory. The "-s" switch allows you to sort the output in various ways.
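To illustrate the custom Storage_Pool idea mentioned above, here is a minimal sketch of a pool that keeps freed blocks in a small cache and hands them out again instead of calling malloc each time. The names Recycling, Recycling_Pool and Pool_Demo are hypothetical, the cache size of 64 is arbitrary, and the pool is neither task safe nor does it release its cached blocks when it is finalized, so treat it as a starting point rather than a finished implementation.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;
with Interfaces.C;
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;    use System.Storage_Pools;

procedure Pool_Demo is

   package Recycling is
      type Address_Array is array (1 .. 64) of System.Address;

      --  Caches up to 64 freed blocks of a single size.
      type Recycling_Pool is new Root_Storage_Pool with record
         Cached     : Address_Array;
         Count      : Natural := 0;
         Block_Size : Storage_Count := 0;
         Mallocs    : Natural := 0;  --  number of real malloc calls
      end record;

      overriding procedure Allocate
        (Pool      : in out Recycling_Pool;
         Addr      : out System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count);

      overriding procedure Deallocate
        (Pool      : in out Recycling_Pool;
         Addr      : System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count);

      overriding function Storage_Size
        (Pool : Recycling_Pool) return Storage_Count;
   end Recycling;

   package body Recycling is
      use type System.Address;

      function Malloc (Size : Interfaces.C.size_t) return System.Address
        with Import, Convention => C, External_Name => "malloc";
      procedure Free (Ptr : System.Address)
        with Import, Convention => C, External_Name => "free";

      overriding procedure Allocate
        (Pool      : in out Recycling_Pool;
         Addr      : out System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count) is
      begin
         if Pool.Count > 0 and then Size = Pool.Block_Size then
            --  Reuse a cached block: no call to malloc.
            Addr := Pool.Cached (Pool.Count);
            Pool.Count := Pool.Count - 1;
         else
            Addr := Malloc (Interfaces.C.size_t (Size));
            Pool.Mallocs := Pool.Mallocs + 1;
            if Addr = System.Null_Address then
               raise Storage_Error;
            end if;
         end if;
      end Allocate;

      overriding procedure Deallocate
        (Pool      : in out Recycling_Pool;
         Addr      : System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count) is
      begin
         if Pool.Count < Pool.Cached'Last
           and then (Pool.Count = 0 or else Size = Pool.Block_Size)
         then
            --  Keep the block for a later allocation of the same size.
            Pool.Block_Size := Size;
            Pool.Count := Pool.Count + 1;
            Pool.Cached (Pool.Count) := Addr;
         else
            Free (Addr);
         end if;
      end Deallocate;

      overriding function Storage_Size
        (Pool : Recycling_Pool) return Storage_Count is
      begin
         return Storage_Count'Last;
      end Storage_Size;
   end Recycling;

   Pool : Recycling.Recycling_Pool;

   type Node is record
      Value : Integer;
   end record;
   type Node_Access is access Node;
   for Node_Access'Storage_Pool use Pool;

   procedure Release is new Ada.Unchecked_Deallocation (Node, Node_Access);

   N : Node_Access;
begin
   for I in 1 .. 1_000 loop
      N := new Node'(Value => I);
      Release (N);
   end loop;
   Ada.Text_IO.Put_Line ("malloc calls:" & Natural'Image (Pool.Mallocs));
end Pool_Demo;
```

In this demo only the first allocation should reach malloc; the remaining 999 are served from the cache, which is the kind of saving that can matter when gnatmem shows a single place performing millions of allocations.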

In Part III of this series we will take a look at various commercially available system tools for monitoring and analyzing memory usage.