Gem #143 : Return to the Sources

Let's get started...

It all starts from the source.

All large applications organize their source code into multiple separate directories, which we generally think of as modules. The source files themselves generally follow naming conventions so that we can easily find things. For instance, the traditional extensions for Ada files in GNAT are .adb and .ads, although other technologies use other extensions (.1.ada, for instance).

A lot of tools, in particular the compiler and the IDE, need to find source files in order to perform various actions on the code. Once they have found the sources, though, they also need to know how to manipulate them. For instance, the compiler might need to compile a specific file with style checks turned off, whereas all other files need style checks enabled, to ensure style consistency.

A typical application, nowadays, uses multiple languages, such as Ada and C. Each language has its own naming scheme, and perhaps its own set of tools.

Originally, GNAT used switches like -I to point to the various source directories, and expected all source files to use the .adb and .ads suffixes. Then we introduced project files, which serve as a convenient place to describe the organization of software projects. In contrast to a Makefile, they are purely descriptive and do not describe the set of actions to perform. This makes them ideally suited for sharing among multiple tools.
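To make the "purely descriptive" point concrete, here is a minimal sketch of how the style-check case mentioned earlier might be written in a project file. The file name generated.adb and the choice of switches are assumptions made for illustration only.

```gpr
project Root is
   for Source_Dirs use ("src");

   package Compiler is
      --  Style checks enabled for every Ada source...
      for Default_Switches ("Ada") use ("-gnaty");

      --  ...except this one (hypothetical) file, where they are turned off
      for Switches ("generated.adb") use ("-gnatyN");
   end Compiler;
end Root;
```

Nothing here says when or how to compile; any tool that reads the project can decide what to do with this information.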

Other Gems have already talked about various aspects of projects, so we will not go into the details here.

However, your own application might need the information stored in project files. Writing your own parser for them is tricky: we keep adding features to support various aspects of managing sources, and the parser would have to be kept up to date.

Instead, we recommend using the package GNATCOLL.Projects, found in the GNAT Components Collection, to manipulate project files. This Gem presents a brief introduction to the features of this package.

Here's a simple example of use:

   pragma Ada_05;
   with GNATCOLL.Projects;   use GNATCOLL.Projects;
   with GNATCOLL.VFS;        use GNATCOLL.VFS;  --  Gem 118
   procedure Main is
      Tree : Project_Tree;
   begin
      Tree.Load (GNATCOLL.VFS.Create ("root.gpr"));
   end Main;

There is often confusion among terms. Let's give a few definitions that are used within the GNATCOLL API. A project tree is a set of projects that might depend on each other. You can think of the tree as representing your whole application or source base. It is generally subdivided into modules, each of which contains a single project file.

In the example above, what we loaded is the tree rooted at root.gpr. Thus, we loaded the project root.gpr, but also perhaps the project child.gpr on which root.gpr depends.


In fact, the example above is often too simplistic. A project can be configured for multiple scenarios (by using the "external" keyword in the project file together with case statements, for instance to change the list of source directories depending on an environment variable).
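As an illustration, a scenario-dependent project might look like the following sketch. The variable name BUILD, its possible values, and the directory names are hypothetical.

```gpr
project Root is
   type Build_Type is ("debug", "production");

   --  Read from the environment (or from -XBUILD=... on the command
   --  line); "debug" is the default when BUILD is not set.
   Build : Build_Type := external ("BUILD", "debug");

   case Build is
      when "debug" =>
         for Source_Dirs use ("src", "src/debug");
      when "production" =>
         for Source_Dirs use ("src", "src/production");
   end case;
end Root;
```

Depending on the value of BUILD, the same project file yields a different list of source directories, and therefore a different list of source files.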

Likewise, your project might depend on some preinstalled projects. For instance, if you intend to use GNATCOLL.Projects, your project will likely depend on gnatcoll.gpr. To find these projects, GNATCOLL will by default ask gnatls where it thinks the predefined projects are, or what the run-time directory is. But you can also add your own.
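For reference, such a dependency is typically expressed with a with clause at the top of the project file. This is a minimal sketch; the source directory and main file names are assumptions.

```gpr
--  No directory is given: "gnatcoll" has to be found on the project path
with "gnatcoll";

project Root is
   for Source_Dirs use ("src");
   for Main use ("main.adb");
end Root;
```

Because the with clause names no directory, loading root.gpr only succeeds if the loader knows where to look for gnatcoll.gpr.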

To do this, you need to go through an instance of Project_Environment, as in the following code:

      Env : Project_Environment_Access;
      Initialize (Env);
      Env.Set_Predefined_Source_Path ((1 => Create ("/usr/local/prefix")));

      --  add a custom language
      Env.Register_Default_Language_Extension ("python", ".py", "");

      --  set up scenario variables
      Env.Change_Environment ("VARIABLE", "VALUE");

      Tree.Load (Create ("root.gpr"), Env => Env);

This time, the project is loaded in a specific, preinitialized context, which might affect the view the application has of it.

If you need to change the scenario during the lifetime of your application, you would do the following:

    Env.Change_Environment ("VARIABLE", "VALUE2");
    Tree.Recompute_View;

which recomputes the view of the same project in a different scenario. Now, for instance, the list of source files or compiler switches could be different.


Once we have the projects loaded in memory, we need to perform queries to extract information.

First of all, let's find the list of all source files in the application.

   pragma Ada_12;   --  convenient for iterators
      Src : File_Array_Access :=
              Tree.Root_Project.Source_Files (Recursive => True);
      for S of Src.all loop
         Put_Line (S.Display_Full_Name);   --  Put_Line from Ada.Text_IO
      end loop;
      Unchecked_Free (Src);

The source files are returned as instances of Virtual_File. As we saw in Gem 118, such an object provides a convenient cache for information that would otherwise need to be queried via system calls, which can be slow on some systems. Also, it makes no assumption about whether you will need full paths, base names, or other information. These objects are cached in the project tree, so every time you request source files, the same instances (and their caches) are returned.

A frequent need is to locate a source file on disk when you only have its base name (for instance a.adb). This can easily be achieved with:

      A_Adb : constant Virtual_File := Tree.Create ("a.adb");
      Put_Line (A_Adb.Display_Full_Name);

Again, this information is cached, so it is very fast to query.

Naming schemes

As much as possible, tools should be able to handle multiple source languages. From a source file, we therefore need to know its language, which can be done with:

   Put_Line (Tree.Info (A_Adb).Language);    --  "ada"

Ada, in particular, also has the notion of units. For instance, under GNAT's default naming scheme, the spec of the unit GNATCOLL.Projects is in the source file "gnatcoll-projects.ads". The mapping from one to the other is fully described in the project file, and is not something that each application should assume, or recompute on its own.
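As a sketch of where this mapping lives, a project can override the default naming scheme in its Naming package. The suffixes below mirror the .1.ada convention mentioned at the start of this Gem; the exception for unit A and its file name are purely hypothetical.

```gpr
project Root is
   package Naming is
      --  Non-default suffixes for all Ada specs and bodies
      for Spec_Suffix ("Ada") use ".1.ada";
      for Body_Suffix ("Ada") use ".2.ada";
      for Dot_Replacement use ".";

      --  One-off exception: the body of unit A lives in an unusual file
      for Body ("A") use "a_impl.ada";
   end Naming;
end Root;
```

A tool that hard-coded the .ads/.adb convention would miss every source file of such a project, which is why the unit-to-file mapping should always be queried from the project.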

From a source file, retrieving the name of the unit is done with:

   Put_Line (Tree.Info (A_Adb).Unit_Name);                     --  "A"
   Put_Line (Unit_Parts'Image (Tree.Info (A_Adb).Unit_Part));  --  UNIT_BODY

From the unit, finding the source file is done with:

   Put_Line
     (+Tree.Root_Project.File_From_Unit ("A", Unit_Body, "Ada"));  --  "a.adb"


The information in a project file is organized into packages (typically one for each tool: the compiler, the binder, the IDE, ...), and then into attributes. Users are free to add their own packages, so you could decide that your own tool's configuration goes into the package My_Tool, and choose which attributes can be used for that configuration. You should not, however, add new attributes to the predefined packages: the project parser will complain about them, to avoid possible future name clashes.

A typical attribute is Switches, which specifies the command-line switches to pass to your tool. So the user's project file could contain:

  project Root is
     package My_Tool is
        for Switches ("Ada") use ("-a", "-b");
     end My_Tool;
  end Root;

From Ada, you can retrieve the value of this attribute with the following piece of code.

   pragma Ada_12;
   with GNAT.Strings;   use GNAT.Strings;
      My_Tool_Switches : constant Attribute_Pkg_List :=
         Build (Package_Name => "My_Tool", Attribute_Name => "Switches");

      Switches : GNAT.Strings.String_List_Access :=
         Tree.Root_Project.Attribute_Value (My_Tool_Switches, Index => "Ada");
      for S of Switches.all loop
         Put_Line (S.all);   --  "-a", then "-b"
      end loop;

      Free (Switches);

Some constants for the predefined attributes are already declared in GNATCOLL.Projects.

GNATCOLL.Projects also provides an API to edit project files. Not surprisingly, it has the same limitation as GPS: when you edit a project, the changes might affect the whole project file, so handcrafted formatting or comments might disappear. As a rule, you should only edit projects that were created through the same API; otherwise you risk losing the user's changes.

GNATCOLL.Projects is able to parse all projects, even those it cannot edit later on, with the notable exception of aggregate projects. Support for those is on the roadmap, but hasn't been implemented yet.