Gem #138 : Master the Command Line - Part 1

Applications can be configured in multiple ways. Among the most frequent are command-line options, configuration files, and graphical user interfaces. The GNAT technology provides various means to interface with those, respectively Ada.Command_Line and GNAT.Command_Line, GNATCOLL.Config, and GtkAda.

This Gem is concerned with parsing the command-line options provided by the user.

A command line may contain various pieces of information:

  • switches, which can be short switches (e.g., "-a") or long switches (e.g., "--long"). These switches can have optional parameters, which come right after the switch itself and can be separated from it by various characters, for instance "-a value" or "--long=value". Applications often allow short switches to be combined (so, for instance, "-ab" is the same as "-a -b"). This combination might involve finding a common prefix for switches (in gnatmake, "-gnatpQ" is equivalent to "-gnatp -gnatQ"). It is also possible to have aliases. In gnatmake, for instance, users can use "-gnaty" instead of the much longer "-gnaty3abcefhiklmnprst".
  • arguments, such as file names to manipulate, that are not associated with switches. One of the difficulties in parsing the command line is to distinguish between switch parameters and other arguments.
  • sections, which typically are used to provide switches for other applications spawned by the main one. A good example of this is the gnatmake command line, where it's possible to specify switches for the compiler (after "--cargs"), for the linker (after "--largs"), and so on. In such a switch section, the first application will typically not know what switches are valid, so it should allow any switch or argument, so that it can then pass them on to the other application.

This is a rich set of possibilities, and parsing the command line can lead to code that is hard to maintain and can make adding new switches difficult.

Users also expect a few standard switches to exist, among which are "-h" and "--help", which should provide a short description of all the possible switches. This does not substitute for proper documentation, but acts as a quick summary and reminder.

The Ada standard provides the package Ada.Command_Line. This package is a convenient way to access each of the elements on the command line, but it does not provide any help in describing their meaning. The application is responsible for guessing whether it has a switch, an argument to a switch seen previously, or an argument.
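To make this limitation concrete, here is a minimal sketch that uses only Ada.Command_Line. The procedure name Show_Args and the classification rule (anything longer than one character that starts with "-" is treated as a switch) are assumptions made for illustration only.

```ada
with Ada.Command_Line;
with Ada.Text_IO;

procedure Show_Args is
   use Ada.Text_IO;
begin
   --  Argument_Count does not include the program name itself
   for J in 1 .. Ada.Command_Line.Argument_Count loop
      declare
         Arg : constant String := Ada.Command_Line.Argument (J);
      begin
         --  Crude, hypothetical classification: it cannot tell whether
         --  "value" in "-b value" is a switch parameter or a file name.
         if Arg'Length > 1 and then Arg (Arg'First) = '-' then
            Put_Line ("Looks like a switch: " & Arg);
         else
            Put_Line ("Parameter or argument? " & Arg);
         end if;
      end;
   end loop;
end Show_Args;
```

Everything beyond this raw access (combined switches, parameters, sections) would have to be coded by hand, which is the work GNAT.Command_Line takes over.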

The package GNAT.Command_Line builds on top of this and provides a much richer API. This API can be used at several levels of abstraction, which we will now describe, starting with the lowest level and moving to the higher ones.

Simple parsing of the command line

Let's start with an example of code that parses the command line.

   with GNAT.Command_Line;   use GNAT.Command_Line;
   with Ada.Text_IO;         use Ada.Text_IO;

   procedure Main is
   begin
      loop   -- 1
         case Getopt ("a b: -long= -help") is    -- 2
            when 'a' =>
               Put_Line ("Seen -a");   -- 3
            when 'b' =>
               Put_Line ("Seen -b with arg=" & Parameter);  -- 4
            when '-' =>
               --  Full_Switch omits the leading switch character
               if Full_Switch = "-long" then   -- 5
                  Put_Line ("Seen --long with arg=" & Parameter);
               elsif Full_Switch = "-help" then
                  Put_Line ("Seen --help");
               end if;
            when others =>  -- 6
               exit;
         end case;
      end loop;

      Put_Line ("File argument was " & Get_Argument);  -- 7
   end Main;

The command line is parsed in a loop (step 1) that examines each element in turn to determine whether it is a switch, a switch parameter, or an argument.

The call to Getopt in step 2 is where most of the work is done. It looks at the next element on the command line and extracts information from it. The mandatory argument is a description of the list of valid switches. The syntax is fully described in g-comlin.ads (also accessible via the /Help/GNAT Runtime menu in GPS), but basically boils down to the following: switches are assumed to all start with the same character ('-' by default), which does not need to be repeated here. In this example, the application accepts four switches. The first one is "-a", which takes no parameter; the second is "-b", which requires a parameter (there can optionally be a space between the switch and its parameter, so "-bparam" and "-b param" are both valid); the third is "--long", which also requires a parameter, separated from the switch by either a space or an equal sign. Finally, the last switch is "--help", which takes no parameter.

For each switch it finds on the command line, Getopt returns its first character, to allow for a fast case statement rather than a series of if statements. This design works less well for long switches, which all start with "-" and thus require if statements on Full_Switch, as in step 5.

Step 4 prints the parameter that was passed for "-b". Parameter is a function that returns the parameter of the current switch.

When there are no more switches on the command line, Getopt returns ASCII.NUL, and step 6 exits the loop.

Finally, we assume the application requires one file name on the command line. That file name can be retrieved by a call to Get_Argument. This function should not be called until all switches have been found, since otherwise there could be ambiguities between switch parameters and arguments. Get_Argument provides one additional service: it can expand an argument that contains a meta character like "*", for instance "*.adb". On Unix systems the shell has already done this expansion, leading to multiple arguments. But on Windows, the application itself is responsible for the expansion, and Get_Argument can do it for you when called with Do_Expansion => True.
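If the application accepts several file names rather than one, Get_Argument can be called repeatedly until it returns an empty string. The following is a minimal sketch under that assumption; the procedure name List_Files is hypothetical, and the switch description is shortened to "a b:" to keep the example small.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with GNAT.Command_Line;   use GNAT.Command_Line;

procedure List_Files is
begin
   --  Consume (and, in this sketch, ignore) all switches first
   loop
      exit when Getopt ("a b:") = ASCII.NUL;
   end loop;

   --  Then read the remaining arguments, expanding patterns like "*.adb"
   loop
      declare
         File : constant String := Get_Argument (Do_Expansion => True);
      begin
         --  An empty string means there are no more arguments
         exit when File'Length = 0;
         Put_Line ("File argument: " & File);
      end;
   end loop;
end List_Files;
```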

As written, the code correctly handles a command line such as "-a -bvalue --long=value", or even "-abvalue". However, a command line such as "-c" or "-avalue" is rejected, and the exception Invalid_Switch is raised.
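An application will usually want to report such errors rather than let the exception propagate. The sketch below wraps the same loop in a handler for the two exceptions declared in GNAT.Command_Line, Invalid_Switch and Invalid_Parameter. The procedure name Checked_Main and the wording of the messages are assumptions made for this example.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with GNAT.Command_Line;   use GNAT.Command_Line;

procedure Checked_Main is
begin
   loop
      case Getopt ("a b: -long= -help") is
         when 'a' =>
            Put_Line ("Seen -a");
         when 'b' =>
            Put_Line ("Seen -b with arg=" & Parameter);
         when '-' =>
            --  Full_Switch does not include the leading switch character
            Put_Line ("Seen -" & Full_Switch);
         when others =>
            exit;
      end case;
   end loop;
exception
   when Invalid_Switch =>
      --  Raised for a switch that is not in the description string
      Put_Line (Standard_Error, "Invalid switch: -" & Full_Switch);
   when Invalid_Parameter =>
      --  Raised when a required parameter is missing
      Put_Line (Standard_Error, "Missing parameter for -" & Full_Switch);
end Checked_Main;
```

A real application would typically also print its usage summary and set a failure exit status at this point.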

Sections

As mentioned above, applications like gnatmake accept different sets of switches for the various applications they spawn. Such sections must be declared when initializing the parser:

    Initialize_Option_Scan (Section_Delimiters => "cargs bargs largs");

A separate loop around Getopt is then needed for each section, so the code would be similar to:

    Goto_Section ("cargs");
    --  a loop around Getopt as described above

    Goto_Section ("bargs");
    --  a loop around Getopt as described above

In one or more sections, the parameter to Getopt will likely start with "*", which indicates that every element found on the command line in that section should be returned as is, with no attempt to guess whether it is a switch or an argument. This suits gnatmake, which simply wants to pass these elements unchanged to the application it spawns.
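Putting these pieces together, a sketch of a complete program with one pass-through section might look like the following. The procedure name Sections_Demo and the choice of switches for the default section are assumptions for illustration. The section name "cargs" is borrowed from gnatmake.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with GNAT.Command_Line;   use GNAT.Command_Line;

procedure Sections_Demo is
begin
   --  Declare the sections before any call to Getopt
   Initialize_Option_Scan (Section_Delimiters => "cargs largs");

   --  Default section: switches understood by this application
   loop
      case Getopt ("a b:") is
         when 'a'    => Put_Line ("Seen -a");
         when 'b'    => Put_Line ("Seen -b with arg=" & Parameter);
         when others => exit;
      end case;
   end loop;

   --  "cargs" section: accept anything and pass it on unchanged
   Goto_Section ("cargs");
   loop
      case Getopt ("*") is
         when ASCII.NUL => exit;
         when others    =>
            --  With "*", Full_Switch holds the element exactly as typed
            Put_Line ("For the compiler: " & Full_Switch);
      end case;
   end loop;
end Sections_Demo;
```

Invoked with something like "-a --cargs -O2 -gnatp", the first loop would only see "-a", and the second loop would receive the elements that follow "--cargs".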

Simulating the command line

On some systems, most notably embedded systems, there is no notion of command line. So GNAT.Command_Line would appear useless on those systems.

But in fact, it doesn't have to read the elements from a real command line. It can also read them from an array of strings. So, using the same example as above, we could keep most of the code and simply simulate a command line. For instance:

   with GNAT.Command_Line;   use GNAT.Command_Line;
   with GNAT.OS_Lib;   use GNAT.OS_Lib;

   ...

   declare
      Parser : Opt_Parser;
      Command_Line : constant Argument_List_Access :=
         Argument_String_To_List ("-a -bvalue --long=value");
   begin
      Initialize_Option_Scan (Parser, Command_Line);

      loop
         case Getopt ("a b: -long= -help", Parser => Parser) is
           ... as before ...
         end case;
      end loop;

      Free (Parser);
   end;

Command_Line can be null when we want to parse the actual elements of the command line (provided that the system supports command lines).

The next Gem in this series will discuss aspects of the high-level API of GNAT.Command_Line that greatly simplify command-line processing.