Gem #97: Reference Counting in Ada - Part 1

Memory management is typically a complex issue to address when creating an application, and even more so when creating a library to be reused by third-party applications. It is necessary to document which part of the code allocates memory and which part is supposed to free that memory. As we have seen in a previous Gem, a number of tools exist for detecting memory leaks (gnatmem, GNATCOLL.Memory or valgrind). But of course, it would be more convenient if the memory were automatically managed.

Some languages include an automatic garbage collector. The Ada Reference Manual has an implementation permission allowing a conformant compiler to provide one, although none of the mainstream compilers do so. Ada's design allows implementations to use the stack in many situations where other languages use the heap; this reduces the need for a garbage collector.

An alternative implementation for getting automatic memory management is to use reference counting: every time some object is allocated, a counter is associated with it. This counter records how many references to that object exist. When that counter goes down to zero, it means the object is no longer referenced in the application and can therefore be safely deallocated.

The rest of this Gem will show how to implement such a mechanism in Ada. As we will see, there are a number of minor but delicate issues involved, so implementing such types is not as trivial as it first seems. The GNAT Components Collection (GNATcoll) now includes a reusable generic package that simplifies this, and we will discuss this briefly at the end of this Gem.

As stated above, we need to associate a counter with the objects of all types we want to monitor. The simplest is to create a tagged type hierarchy where the root type defines the counter:

   type Refcounted is abstract tagged private;
   procedure Free (Self : in out Refcounted) is null;

private

   type Refcounted is abstract tagged record
      Refcount : Integer := 0;
   end record;

This approach is mostly suitable when building a reusable library for reference-counted types, such as GNATcoll. If you just want to do this once or twice in your application, you can simply add a new Refcount field to your record type (which doesn't need to be tagged).

Next, we need to determine when to increment and decrement this counter. In some languages this counter needs to be manually modified by the application whenever a new reference is created, or when one is destroyed. This is, for instance, how the Python interpreter is written (in C). But we can do better in Ada, by taking advantage of controlled types. The compiler calls special primitive operations each time a value of such a type is created, copied, or destroyed.

If we wrap a component of a simple access type in a type derived from Ada.Finalization.Controlled, we can then have the compiler automatically increment or decrement the reference count of the designated entity each time a reference is established or removed. We thus create a smart pointer: a pointer that manages the life cycle of the block of memory it points to.

   type Refcounted_Access is access all Refcounted'Class;
   type Ref is tagged private;

   procedure Set (Self : in out Ref; Data : Refcounted'Class);
   function Get (Self : Ref) return Refcounted_Access;
   procedure Finalize (P : in out Ref);
   procedure Adjust   (P : in out Ref);

private

   type Ref is new Ada.Finalization.Controlled with record
       Data : Refcounted_Access;
   end record;

Let's first see how a user would use the type. Note that Get returns an access to the data. This might be dangerous, since the caller might want to free the data (which should remain under control of Ref). In practice, the gain in efficiency is worth it, since it avoids making a copy of a Refcounted'Class object. This is also essential if we want to allow the user to easily modify the designated entity. The user is ultimately responsible for ensuring that the lifetime of the returned value is compatible with the lifetime of the corresponding smart pointer.

declare
  type My_Data is new Refcounted with record
     Field1 : ...;
  end record;

  R1 : Ref;

begin
  Set (R1, My_Data'(Refcounted with Field1 => ...));
  --  R1 holds a reference to the data

  declare
     R2 : Ref;
  begin
     R2 := R1;
     --  R2 also holds a reference to the data (thus 2 references)

     ...
     --  We now exit the block. R2 is finalized, thus only 1 ref left

  end; 

  Put_Line (My_Data (Get (R1).all).Field1);
  --  In practice, the smart pointers would be implemented in a generic package,
  --  and Get would return an access to My_Data, so we could write the simpler:
  --
  --     Put_Line (Get (R1).Field1);

  --  We now leave R1's scope, thus refcount is 0, and the data is freed. 
end;

Now let's look at the details of the implementation. First consider the two subprograms for setting and getting the designated entity. Note that the default value for the reference count is zero in the Refcounted type. The implementation of Set is slightly tricky: it needs to decrement the reference count of the previously designated entity, and increment the reference count for the new data. Instead of calling Adjust and Finalize explicitly (which is not a recommended practice when it can be avoided), we use an aggregate and let the compiler generate the calls for us.

procedure Set (Self : in out Ref; Data : Refcounted'Class) is
  D : constant Refcounted_Access := new Refcounted'Class'(Data);
begin
  if Self.Data /= null then
      Finalize (Self); -- decrement old reference count
  end if;

  Self.Data := D;
  Adjust (Self);  -- increment reference count (set to 1) 
end Set;

function Get (P : Ref) return Refcounted_Access is
begin
   return P.Data;
end Get;

In GNATCOLL.Refcount, we provide a version of Set that receives an existing access to Refcount'Class, and takes responsibility for freeing it when it is no longer needed. The implementation is very similar to the above (although we need to be careful that we do not Finalize the old data if it happens to be the same as the new, since otherwise we might end up freeing the memory).

Adjust is called every time a new reference is created. Nothing special here:

overriding procedure Adjust (P : in out Ref) is
begin
   if P.Data /= null then
      P.Data.Refcount := P.Data.Refcount + 1;
   end if;
end Adjust;

The implementation of Finalize is slightly more complicated: the Ada reference manual indicates that a Finalize procedure should always be idempotent. An Ada compiler is free to call Finalize multiple times on the same object, in particular when exceptions occur. This means we must be careful not to decrement the reference counter every time Finalize is called, since a given object only owns one reference. Hence the following implementation:

overriding procedure Finalize (P : in out Ref) is
   Data : Refcounted_Access := P.Data;
begin
   --  Idempotence: the next call to Finalize will have no effect
   P.Data := null;

   if Data /= null then
      Data.Refcount := Data.Refcount - 1;
      if Data.Refcount = 0 then
         Free (Data.all);  --  Call to user-defined primitive
         Unchecked_Free (Data);
      end if;
   end if;

end Finalize;

That's it for the basic implementation. The next Gem in this series will discuss issues of task safety associated with reference-counted types.