Gem #152 : Defining a New Language in a Project File

by Vincent Celier —AdaCore

Let's get started...

Gprbuild, the multi-language builder, has knowledge of a number of toolchains for different languages, such as Ada, C, C++, Fortran, and Assembler. This knowledge is contained in a configuration project file that is usually created "on the fly" by gprbuild. The configuration project file that gprbuild creates and then uses is a project file that contains the characteristics of the languages used in the user project tree. The attributes that define these characteristics are inherited by all the user project files.

However, there are cases where gprbuild does not know a language, either because the language itself is new to it or because the language is compiled with a toolchain that gprbuild has no knowledge of. In such cases, it's possible to declare all the characteristics of the language in a project file.

There are many configuration attributes that define the characteristics of a programming language. The full list of attributes can be found in the GPRbuild User's Guide in section 1.9.10 (Attributes), in particular in section 1.9.10.6 (Package Compiler Attributes).

Characteristics of a Programming Language

The most important aspects of defining the characteristics of a programming language are:

  • Naming scheme: how gprbuild finds the sources of the language
  • Compiler driver and required switches: how gprbuild finds the compiler for the language and the minimal options to use when invoking the compiler
  • Dependencies: how gprbuild decides that a source is up to date or needs to be recompiled
  • Search directories: how gprbuild specifies to the compiler the list of directories to search for imported or included files

As a simplified example, consider a fictitious language "New_Lang". The compiler is called "nlang". The source suffix is ".nlng". The directories to search are specified with switch "-I". To produce a Makefile fragment that contains the dependencies, there is a switch "--dependencies=".

Assume that language "New_Lang" is file-based, not unit-based. In truth, Ada is the only unit-based language, and it requires special handling from gprbuild. All other languages, such as C, C++, and Fortran, are file-based.

Naming Scheme

We need to indicate to gprbuild how to find the sources of language New_Lang. As usual for file-based languages, we only need to specify the default suffix of the sources.

In package Naming, the source suffix is specified with attribute Body_Suffix (or the equivalent Implementation_Suffix):

   package Naming is
      for Body_Suffix ("New_Lang") use ".nlng";
   end Naming;
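
If the language also had header-like files that are included rather than compiled, the same package could declare a suffix for them with attribute Spec_Suffix (the attribute that holds ".h" for C, for instance). The following is only a sketch: the suffix ".nlh" is hypothetical and is not part of the New_Lang example used in this Gem.

```gpr
package Naming is
   --  Sources that are passed to the compiler
   for Body_Suffix ("New_Lang") use ".nlng";

   --  Hypothetical suffix for header-like files, assumed for illustration
   for Spec_Suffix ("New_Lang") use ".nlh";
end Naming;
```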

Compiler Driver and Required Switches

The compiler driver is defined with attribute Driver in package Compiler:

   for Driver ("New_Lang") use "nlang";

We assume here that executable "nlang" is in the path. If this is not the case, then we specify the compiler driver with its full path:

   for Driver ("New_Lang") use "/path/to/bin/nlang";
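
A hard-coded path ties the project file to one machine. One possible alternative, sketched here, is to read the driver from an external variable with the built-in function "external". The variable name NLANG_DRIVER is an assumption made for this example; the second argument is the default used when the variable is not set.

```gpr
package Compiler is
   --  NLANG_DRIVER is a hypothetical variable name chosen for this sketch.
   --  If it is not set, the driver "nlang" is looked up on the path.
   for Driver ("New_Lang") use external ("NLANG_DRIVER", "nlang");
end Compiler;
```

The variable could then be set in the environment or on the command line, for example with "gprbuild -XNLANG_DRIVER=/path/to/bin/nlang prj.gpr".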

The "required switches" are the options that must always be specified in the invocation of the compiler so that it compiles the source correctly and performs the compilation step only.

There are two kinds of required switches: leading and trailing.

Leading required switches are the first switches in the invocation of the compiler and are defined with attribute Leading_Required_Switches (or the equivalent Required_Switches). Trailing required switches are the last switches in the invocation of the compiler and are defined with attribute Trailing_Required_Switches.

Language "New_Lang" has only this single leading required switch: "-c".

   for Leading_Required_Switches ("New_Lang") use ("-c");
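
Had the compiler also needed a switch at the end of its options, the two attributes could be combined as in the sketch below. The switch "--strict" is hypothetical and is assumed only for illustration; "nlang" as described in this Gem needs no trailing switch.

```gpr
package Compiler is
   --  First switches in the invocation: compile only
   for Leading_Required_Switches ("New_Lang") use ("-c");

   --  Last switches in the invocation.
   --  "--strict" is a hypothetical switch, assumed for illustration only.
   for Trailing_Required_Switches ("New_Lang") use ("--strict");
end Compiler;
```

Any other switches, such as those given with attribute Default_Switches, are then placed between these two groups.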

Dependencies

There are two kinds of dependencies for file-based languages: "Makefile" and "None". "None" is the default.

When the dependency kind is "None", gprbuild recompiles a source only if the source file has been modified since it was last compiled, that is, only if the time stamp of the object file is earlier than the time stamp of the source.

When the dependency kind is "Makefile", gprbuild expects to find a dependency file in the object directory. The file name of this dependency file is derived from the source file name, with the extension ".d". For example, the dependency file for a C source "toto.c" is named "toto.d".

This dependency file contains a Makefile fragment to list all the files that should be checked by gprbuild in deciding whether the source needs to be recompiled. If the object file has a time stamp earlier than the time stamp of any of these files, then gprbuild will recompile the source.

Here are the contents of a dependency file toto.d for a C source toto.c:

toto.o: /path/to/project_dir/sources/toto.c \
 /path/to/project_dir/templates/toto.h

Language "New_Lang" has the dependency kind "Makefile". This means that for a source "file_name.nlng", gprbuild should find in the object directory a Makefile fragment "file_name.d". If any of the files listed in the dependency file is more recent than the object file, then gprbuild will recompile the source.

The dependency kind is defined with attribute Dependency_Kind:

   for Dependency_Kind ("New_Lang") use "Makefile";

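The switch for creating the dependency file when invoking the compiler for our fictitious language "New_Lang" is "--dependencies=<file_name>.d". The corresponding attribute is Dependency_Switches; its value contains only the part of the switch that precedes the file name, which gprbuild supplies: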

   for Dependency_Switches ("New_Lang") use ("--dependencies=");

Search Directories

There are several ways to indicate to the compiler the list of directories to be searched for files needed to compile a source of a file-based language. For example, C (gcc) uses an environment variable "CPATH". This is specified by:

   for Include_Path ("C") use "CPATH";

Our language "New_Lang" uses a switch for each directory to be searched:

   for Include_Switches ("New_Lang") use ("-I", "");
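
The empty string at the end of the list matters. As far as we understand the rule, the directory name is concatenated with the last element of the list, so ("-I", "") passes "-I" and the directory as two separate arguments. The sketch below shows two alternatives for a compiler with different conventions. Both are assumptions made for illustration: "nlang" as described here uses neither, and the environment variable name NLANG_INCLUDE is hypothetical.

```gpr
package Compiler is
   --  Variant 1: a single element, so the directory is appended directly
   --  to the switch, giving for example "-I/proj_dir/src1".
   for Include_Switches ("New_Lang") use ("-I");

   --  Variant 2 (instead of variant 1): pass the directories through an
   --  environment variable, as gcc does with CPATH.
   --  NLANG_INCLUDE is a hypothetical name.
   --  for Include_Path ("New_Lang") use "NLANG_INCLUDE";
end Compiler;
```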

Summary

In summary, a possible project file for our language "New_Lang" is the following:

project Prj is
   for Languages use ("New_Lang");
   for Source_Dirs use (".", "src1", "src2");
   for Object_Dir use "obj";

   package Naming is
      for Body_Suffix ("New_Lang") use ".nlng";
   end Naming;

   package Compiler is
      for Driver ("New_Lang") use "nlang";
      for Leading_Required_Switches ("New_Lang") use ("-c");
      for Dependency_Kind ("New_Lang") use "Makefile";
      for Dependency_Switches ("New_Lang") use ("--dependencies=");
      for Include_Switches ("New_Lang") use ("-I", "");
   end Compiler;
end Prj;

If the project directory "/proj_dir" contains a single source "toto.nlng" and the subdirectories "src1" and "src2" contain no file with extension ".nlng", then using the command:

   gprbuild -v prj.gpr

should result in a single invocation of the compiler:

/path/to/bin/nlang -c --dependencies=toto.d -I /proj_dir -I /proj_dir/src1 -I /proj_dir/src2 /proj_dir/toto.nlng

and the following files should be created in object directory "obj":

  • auto.cgpr (the automatically created configuration project file)
  • toto.o (the object file)
  • toto.d (the dependency file)

Of course, there are many other attributes that define the characteristics of a programming language. If you want to define a new programming language, we encourage you to study all the configuration attributes in the GPRbuild User's Guide.

Special Languages

For some languages, the "object files" are not linked into the executable. For example, this is the case for Java. For these languages, this is indicated with attribute Objects_Linked at the project level (not in any package in the project file):

   for Objects_Linked ("<language name>") use "False";

The "compiler" for some "languages" may not even produce an object file. This is indicated with attribute Object_Generated, also at the project level:

   for Object_Generated ("<language name>") use "False";

For such languages, the compiler is invoked every time gprbuild is invoked.
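
Put together, a project for such a language might look like the following sketch. The language name "Doc_Lang", the suffix ".dl" and the driver "doclang" are hypothetical names chosen for this example. Both attributes are declared at the project level, outside any package.

```gpr
project Docs is
   --  "Doc_Lang", ".dl" and "doclang" are hypothetical names
   for Languages use ("Doc_Lang");

   --  The tool produces no object file, so there is nothing to link either
   for Object_Generated ("Doc_Lang") use "False";
   for Objects_Linked ("Doc_Lang") use "False";

   package Naming is
      for Body_Suffix ("Doc_Lang") use ".dl";
   end Naming;

   package Compiler is
      for Driver ("Doc_Lang") use "doclang";
   end Compiler;
end Docs;
```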

There are even stranger "languages": languages with no compiler. This is indicated with a compiler driver specified as the empty string. One example is the language "project file", which gprbuild knows by default.

   package Compiler is
      for Driver ("project file") use "";
      ...
   end Compiler;

Languages such as "project file" are used by GPS. The "sources" are listed in the Project View, but as there is no compiler driver, no "compilation" is done by gprbuild.
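
The same technique can be applied to a language of your own. In the sketch below, "Notes" and the suffix ".note" are hypothetical names: the files are sources of the project, but since the driver is the empty string, gprbuild never compiles them.

```gpr
project Prj is
   --  "Notes" and ".note" are hypothetical names chosen for this sketch
   for Languages use ("New_Lang", "Notes");

   package Naming is
      for Body_Suffix ("New_Lang") use ".nlng";
      for Body_Suffix ("Notes") use ".note";
   end Naming;

   package Compiler is
      for Driver ("New_Lang") use "nlang";

      --  Empty driver: ".note" files are never compiled
      for Driver ("Notes") use "";
   end Compiler;
end Prj;
```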

More to come...

If you are using a new language extensively, then you will want to have its characteristics defined automatically in the configuration project file, without needing to put all the configuration attributes in each of the project files that contain sources in your language. This will be the subject of a later Gem.


About the Author

Vincent Celier spent twenty years in the French Navy, as a radar and computer officer. He retired in 1988 with the rank of commander and joined CR2A, a software house, where he was one of the authors of the ISO Technical Report ExtrA (Extensions Temps Reel en Ada). In 1994 he emigrated to Vancouver, Canada, to work on CAATS, the Canadian Automated Air Traffic System, a large system written in Ada. He joined AdaCore in 2000. He is the main implementer of the Project Manager and of gprbuild, the multi-language builder.