Formal Methods To Improve Code Quality

Projects have been developing code for safety and security-critical systems for decades. And we have gotten very good at it. So yes, maybe your web browser or phone crashes occasionally, but your airplane rarely crashes, and you seldom hear about problems with the brake controller in your car.

That is because people spend a lot of time designing, writing, reviewing, and testing the code that runs the devices that our lives depend on. Writing a line of code for a flight management system is many times more expensive than writing a line of code for your home thermostat.

Still, testing is never exhaustive; it is really, really hard to test all the paths through a piece of code, even if the business logic in that code is not terribly complicated.

Testing code is done primarily for 2 reasons:

To make sure the code does what it is designed to do, also known as functional correctness
To make sure the code does not do what it is not supposed to do: crash or otherwise violate safety and security properties. To make sure it does not have runtime errors.

Especially with languages such as C or C++, this latter case is important. Runtime errors happen when code goes outside of boundaries, for example, reading outside the bounds of an array, or dereferencing a null pointer. The problem with testing for runtime errors is that C and C++ do not necessarily throw an error when you do something wrong. And this is something to think about, even though your program passes step 1 above, it could still corrupt memory without anybody noticing.

 int a[5];
  a[5]=0;

The code above will give you a warning in any reasonably modern C/C++ compiler, but it will still compile and will still run, and may or may not crash.

Memory Safe Languages

C and C++ are powerful languages, close to the hardware, and they allow a software developer to get the most performance out of the available hardware. There is a lot of chatter lately about languages that are memory safe. Memory safe languages will either statically prevent this type of code from compiling, or, if it compiles, generate a run-time error (sometimes called a panic) during execution.

Rust and Ada are two examples of memory-safe languages. Compilers for these languages will not accept code like the code above, the languages are memory safe, partially through compiler checks, partially through dynamic checks.

Rust and Ada also allow the programmer to get close to hardware, Ada even lets you control exact memory layout with its representation clauses. On top of that, Ada allows you to turn off runtime checks for runtime errors to improve performance. Which you can do safely once you have used SPARK to prove that the code is free of runtime errors. Several users of SPARK have compared it to C code and concluded that both languages, when used correctly, are of similar performance.

Static Memory Safety

The programming language SPARK is a subset of Ada that can statically (during compile time) guarantee that your code does not have runtime errors and even prove functional correctness and key safety or security properties. That means that errors of type 2 above are simply not possible. SPARK uses a formal proof engine to analyze the code after the compiler compiles it, and that proof engine can mathematically verify that there are no runtime errors, or, if it is not able to provide that guarantee, it will highlight problems to the developer to resolve.

SPARK is sound, which means that there are no false negatives. The developer can completely rely on the tool.

Proving The Absence Of Runtime Errors

To achieve this feat, the developer will have to put in a bit more work during development. You can see this additional work as ‘explaining the algorithm to a code reviewer’. A bit of additional effort during development to guarantee the absence of runtime errors is a valuable trade-off for many systems where safety and security are critical.

The proof system pushes the developer to think more about the algorithm, and especially corner cases of the algorithm. Take, for example, the code below that shows an implementation of a binary search on a sorted array of Integers. Even if you are not an Ada programmer, the algorithm will be fairly easy to follow.

procedure Binary_Search
    (Arr    : in Integer_Array;
     Target : in Integer;
     Found  : out Boolean;
     Index  : out Valid_Index)
  is
     Left  : Positive;
     Right : Positive;
     Mid   : Positive;
  begin
     Found := False;
     Index := No_Index;
     Left := Arr'First;
     Right := Arr'Last;
    
     while Left <= Right loop       
        -- Calculate middle index avoiding overflow
        Mid := Left + (Right - Left) / 2;
    
        if Arr(Mid) < Target then
           Left := Mid + 1;
        elsif Arr(Mid) > Target then
           Right := Mid - 1;
        else
           Found := True;
           Index := Mid;
           return;
        end if;
     end loop;
  end Binary_Search;

From a runtime error perspective, this code could have a bunch of different things go wrong with it:

The array could not be sorted
The array could be empty or only hold 1 entry
The number may not be in the array

All things you can test for, of course, and fix the code as you find problems. SPARK and Formal methods guarantee that you tackle it.

For the sorting of the array, the developer would add a precondition to the definition of the Binary_Search procedure, which would look like:

    Pre => Sorted (Arr),

Where

function Sorted (Arr : Integer_Array) return Boolean is
    (for all I1 in Arr'Range => (for all I2 in I1 .. Arr'Last => Arr (I1) <= Arr (I2)))
  with Ghost;

This function Sorted is not something that is called at runtime; it is only a specification-level concept, so we do not have to be concerned about performance.

When we run the formal method proof engine on the code, we see warnings such as:

binary_search_pkg.adb:53:17: medium: array index check might fail
   53 |         if Arr(Mid) < Target then
      |                ^~~
  reason for check: value must be a valid index into the array
  possible fix: loop at line 39 should mention Mid in a loop invariant
   39 |      while Left <= Right loop
      |                          ^ here

Which are type and array overruns. We could fix these by writing defensive code, but instead, if we help the proof engine understand the algorithm a little bit, we can help it guarantee the absence of overruns.

The type of help that the proof system needs is in the form of loop variants and invariants, for example:

        pragma Loop_Variant
           (Increases => Left, Decreases => Right);

Which indicates that every pass through the loop either the search space left diminishes. And:

         pragma Loop_Invariant
           (Left in Arr'Range and Right in Arr'Range);

Which helps the prover understand that Left, nor Right will ever overrun the array’s range. Not that this invariant still needs to be proven by the prover, but it helps it on its path.

People frequently comment on the fact that using SPARK forces them to think more critically about their algorithm. The algorithm above, for example, will continue to throw overflow and underflow errors because there are 2 corner cases that need handling: 1) the case where the array is empty; and 2) the case where the value is not within the bounds of the array, or in other words, if the target is to the left of the lowest index, or the right of the highest index.

The code that we have now is considered at the SPARK-Silver level, that is, we have proven the absence of runtime errors.

Getting back to our earlier comment on performance, since we know that the code is free of runtime errors, we can turn off the runtime error checking through a compiler switch. Think of this as removing your defensive code in C or C++.

Proving Functional Correctness

Once we have proven the absence of runtime errors, we can prove the correctness of the algorithm as an additional benefit. This is called SPARK-Gold level. Proving correctness means that we need to provide a postcondition for our Binary_Search function, something like the following:

 Post => (if Found then
               Index in Arr'Range and then
               Arr (Index) = Target
             else
               (for all I in Arr'Range => Arr (I) /= Target));

Again, SPARK is easy to read, so if the return parameter Found is True, then Index is in the range of the array and Arr (Index) is the Target we were looking for and if Found is False, then there is no such value in the entire array.

For the prover to be able to confirm that this post condition is true, we have to give a bit more explanation to the prover about our algorithm. We can state, for example that all values to the left of Left, are less than Target and all values to the Right, are more than Target:

     pragma Loop_Invariant
          (for all Index in Arr'First .. Left - 1 => Arr(Index) < Target);
        pragma Loop_Invariant
          (for all Index in Arr'Range => (if Index > Right then Target < Arr (Index)));

These types of explanations about our algorithm helps the prover out, and it can confirm that the program is free of runtime errors and that all Post-Conditions have been proven as well.

The full code for this example is in the appendix below.

Ada and SPARK In The Real World

Ada is by no means a new language; it has been around for four decades and in that time has proven itself as a highly performant and stable language with new features added over time as technology evolved. There are compilers and runtime environments for many operating systems, real-time operating systems as well as bare-metal environments for pretty much any instruction set architecture. From legacy instruction sets like PPC to Arm and Risc-V, from Linux and Windows to LynxOS, VxWorks, QNX and others.

SPARK has been around for a long time as well. It uses the same compilers as Ada, so it has the same broad ecosystem to build on and has been used in many fielded systems across Aviation, as well as automotive and robotics. NVIDIA has famously adopted SPARK for code in their Drive AGX (and other) layers that require high levels of safety and security.

Getting started with SPARK is fairly straightforward; there is a package manager called Alire that can be used to download not just crates (libraries and runtimes), but also the compiler and prover. All tools are available in the open source space, and commercial support is available from companies such as AdaCore.

Conclusion

Memory safety is a hot topic currently, for very valid reasons, and applying memory safety will help embedded developers deliver better products to their customers. This article highlights the difference between static and dynamic memory safety. SPARK is a language that delivers static memory safety and can prove functional correctness as well. The specification language used for this is the Ada programming language, makes pre- and post-conditions and variants/invariants easy to read and write.

Interested? There is a great blog post on the benefits of using SPARK, summarizing a DEFCON presentation by NVIDIA on the benefits of programming SPARK. Summary: It takes longer to write code in SPARK, but there are much less bugs and the resulting software is of much higher quality.

Appendix

binary_search_pkg.ads

package Binary_Search_Pkg
  with SPARK_MODE
is


  type Integer_Array is array (Positive range <>) of Integer;
 
  subtype Valid_Index is Positive;


  No_Index : constant Valid_Index := 1;


  -- Function to check if an array is sorted in ascending order
  function Sorted (Arr : Integer_Array) return Boolean is
    (for all I1 in Arr'Range => (for all I2 in I1 .. Arr'Last => Arr (I1) <= Arr (I2)))
  with Ghost;
 
  -- Binary search procedure that returns both success status and index
  procedure Binary_Search
    (Arr    : in Integer_Array;
     Target : in Integer;
     Found  : out Boolean;
     Index  : out Valid_Index)
  with
    Pre => Sorted (Arr),
    Post => (if Found then


               Index in Arr'Range and then
               Arr (Index) = Target
             else
               (for all I in Arr'Range => Arr (I) /= Target));


end Binary_Search_Pkg;

binary_search_pkg.adb

package body Binary_Search_Pkg
  with  SPARK_Mode
is

  procedure Binary_Search
    (Arr    : in Integer_Array;
     Target : in Integer;
     Found  : out Boolean;
     Index  : out Valid_Index)
  is
     Left  : Positive;
     Right : Positive;
     Mid   : Positive;
  begin
     Found := False;
     Index := No_Index;

     if Arr'First > Arr'Last then
        return;
     end if; 

     Left := Arr'First;
     Right := Arr'Last;

     Index := Arr'First;

     if Arr(Left) > Target or Arr(Right) < Target then
        return;
     end if;
    
     while Left <= Right loop
        pragma Loop_Variant
           (Increases => Left, Decreases => Right);
        pragma Loop_Invariant
           (Left in Arr'Range and Right in Arr'Range);

           pragma Loop_Invariant
           (for all Index in Arr'First .. Left - 1 => Arr(Index) < Target);
           pragma Loop_Invariant
           (for all Index in Arr'Range => (if Index > Right then Target < Arr (Index)));
       
        -- Calculate middle index avoiding overflow
        Mid := Left + (Right - Left) / 2;
    
        if Arr(Mid) < Target then
           Left := Mid + 1;
        elsif Arr(Mid) > Target then
           Right := Mid - 1;
        else
           Found := True;
           Index := Mid;
           return;
        end if;


     end loop;
  end Binary_Search;


end Binary_Search_Pkg;

Tools

Services

Get an Overview

Ready To Go?

Languages

Ready To Go?

Industries

Learn how NVIDIA Adopted SPARK for Security-Critical Software Development

Company

Careers

Explore Resources

The AdaCore Blog

Learn Ada & SPARK

Community

Get Started with Ada

Support

Product Roadmap

Formal Methods To Improve Code Quality

Memory Safe Languages

Static Memory Safety

Proving The Absence Of Runtime Errors

Proving Functional Correctness

Ada and SPARK In The Real World

Conclusion

Appendix

Author

Mark Hermeling

Latest Blog Posts

Proving asymptotic time complexity bounds with SPARK

Information Hiding and Context Management in SPARK

Unit Testing with Google Test and GNATcoverage