Gem #125: Detecting infinite recursion with GDB's Python API

by Jerome Guitton —AdaCore

Let's get started...

Suppose that you are printing a table of factorials from 9 down to 0, as follows:

with Ada.Text_IO; use Ada.Text_IO;
procedure Try is
   function Factorial (I : Integer) return Integer is
   begin
      if I = 0 then
         return 1;
      else
         return I * Factorial (I - 1);
      end if;
   end Factorial;
   N : Integer := 9;
   F : Integer;
begin
   loop
      F := Factorial (N);
      Put (Integer'Image (N));
      Put ("! = ");
      Put (Integer'Image (F));
      New_Line;
      exit when N < 0;
      N := N - 1;
   end loop;
end Try;

Now build the example:

gnatmake -g try.adb

When you execute this program, it will either crash with Storage_Error, or SIGSEGV, or hang, or corrupt the memory, depending on the target platform:

$ ./try
 9! =  362880
 8! =  40320
 7! =  5040
 6! =  720
 5! =  120
 4! =  24
 3! =  6
 2! =  2
 1! =  1
 0! =  1
raised STORAGE_ERROR : EXCEPTION_STACK_OVERFLOW

If you run the program in GDB, by using 'catch exception' you can detect infinite recursion in procedure Factorial:

(gdb) catch exception 
Catchpoint 1: all Ada exceptions
(gdb) run
Starting program: C:\home\guitton\GIT\GDB\builds\gems\1    ry.exe 
[New Thread 12736.0x33a4]
 9! =  362880
 8! =  40320
 7! =  5040
 6! =  720
 5! =  120
 4! =  24
 3! =  6
 2! =  2
 1! =  1
 0! =  1
Program received signal SIGSEGV, Segmentation fault.
0x00401797 in try.factorial (i=-698787) at try.adb:9
9                return I * factorial (I - 1);
(gdb) backtrace 10
#0  0x00401797 in try.factorial (i=-698787) at try.adb:9
#1  0x0040179c in try.factorial (i=-698786) at try.adb:9
#2  0x0040179c in try.factorial (i=-698785) at try.adb:9
#3  0x0040179c in try.factorial (i=-698784) at try.adb:9
#4  0x0040179c in try.factorial (i=-698783) at try.adb:9
#5  0x0040179c in try.factorial (i=-698782) at try.adb:9
#6  0x0040179c in try.factorial (i=-698781) at try.adb:9
#7  0x0040179c in try.factorial (i=-698780) at try.adb:9
#8  0x0040179c in try.factorial (i=-698779) at try.adb:9
#9  0x0040179c in try.factorial (i=-698778) at try.adb:9
(More stack frames follow...)

But now you would like to know the context of the call to Factorial that causes the recursion. This can't easily be done at the exception point, as the debugger has too many frames (698787, presumably) to unwind to get to the original call.

If you add a breakpoint in Factorial, you would do several continue commands without finding anything suspicious:

(gdb) b factorial
Breakpoint 1 at 0x40177f: file try.adb, line 6.
(gdb) run
Starting program: C:\home\guitton\GIT\GDB\builds\gems\1    ry.exe 
[New Thread 11856.0x23b4]
Breakpoint 1, try.factorial (i=9) at try.adb:6
6             if I = 0 then
(gdb) cont
Continuing.
Breakpoint 1, try.factorial (i=8) at try.adb:6
6             if I = 0 then
(gdb) cont
Continuing.
Breakpoint 1, try.factorial (i=7) at try.adb:6
6             if I = 0 then

There are unfortunately a large number of valid calls to Factorial before the bogus one. What we would like to do is set an upper bound on the possible calls to Factorial, and stop only when that bound is reached.

Fortunately, the Python API of GDB can help: with this API, you can go through the debugger structures and get more information about the context at a point of execution. For example, here is a Python function that counts the number of frames in a backtrace:

def frame_count():
    """Count the number of frames in the backtrace"""
    c = 0
    f = gdb.newest_frame()
    while f is not None:
        c = c + 1
        f = f.older()
    return c

The function uses the class Frame from the Python API of GDB, and its method older()/newer() to browse through the backtrace.

After creating a file frame_count.py that contains the implementation of this command, it can then be added to the commands of GDB and used as follows:

(gdb) source frame_count.py
(gdb) python help(frame_count)
Help on function frame_count in module __main__:
frame_count()
    Count the number of frames in the backtrace
(gdb) python print frame_count()
4

Now you can set a breakpoint in Factorial that will stop only when frame_count reaches a maximum bound. Edit frame_count.py to add the following breakpoint hook:

def factorial_hook():
    """Called whenever Try.Factorial is reached"""
    max_number_of_frames = 100
    fc = frame_count()
    if fc <= max_number_of_frames:
        gdb.execute("continue")
    else:
        print ""
        print ("warning: more than %d frames in backtrace."
               % max_number_of_frames)
        print "         To ease the investigation, the selected frame will be"
        print "         the first call to factorial."
        print ""
        f = gdb.newest_frame()
        while f is not None and f.name() == "try.factorial":
            gdb.Frame.select(f)
            f = f.older()
        # Finally, show the frame that this loop selected; it is the first
        # call to factorial
        gdb.execute("frame")

Then attach this command to a breakpoint in Try.Factorial:

(gdb) source frame_count.py
(gdb) break factorial
Breakpoint 1 at 0x40177f: file try.adb, line 6.
(gdb) commands
>python factorial_hook()
>end

You can then run, and the program will stop when the maximum bound is reached. At that point you'll be able to investigate the infinite recursion much more easily:

(gdb) run
Breakpoint 1, try.factorial (i=-100) at try.adb:6
6             if I = 0 then
warning: more than 100 frames in backtrace.
         To ease the investigation, the selected frame will be
         the first call to factorial.
#99 0x0040179c in try.factorial (i=-1) at try.adb:9
9                return I * Factorial (I - 1);

We can now see that Try calls Factorial with -1; there is a bug in the exit condition in the loop.

(gdb) up
#100 0x004017c1 in try () at try.adb:17
17            F := Factorial (N);
(gdb) print N
$2 = -1

This is just one of the advanced debugging capabilities that the Python API of GDB provides. For more information, have a look at the corresponding section in the GDB user's guide.

infinite_recursion.zip


About the Author

Jerome joined AdaCore in 2002 after completing his studies at the Ecole Nationale des Télécommunations (Paris, France), during which he had already worked with the company on one of its many reseach projects, namely PolyORB. His enthusiasm has remained undimmed during these six years and he has worked on a variety of projects, as well as acquiring expertise in debuggers and cross technologies. He has recently led the effort to port GNAT Pro to various cross targets (VxWorks 6, ELinOS).