Gem #135 : Erroneous Execution - Part 4

by Bob Duff, AdaCore

Let's get started...

Many programmers believe that "optimizers should not change the behavior of the program". Many also believe that "optimizers do not change the behavior of the program". Both beliefs are false in the presence of erroneousness.

So if erroneousness is so bad, why does the Ada language design have it? Certainly, a language designer should try to minimize the amount of erroneousness. Java is an example of a language that eschews erroneousness, but that comes at a cost. It means that lots of useful things are impossible or infeasible in Java: device drivers, for example. There is also an efficiency cost. C is an example of a language that has way too much erroneousness. Every single array-indexing operation is potentially erroneous in C. (C calls it "undefined behavior".)

Ada is somewhere in between Java and C in this regard. You can write device drivers in Ada, and user-defined storage pools, and other things that require low-level access to the machine.

But for the most part, things that can cause erroneousness can be isolated in packages -- you don't have to scatter them all over the program as in C.

For example, to prevent dangling pointers, try to keep each allocator ("new") and the matching calls to instances of Unchecked_Deallocation together in the same package, so they can be reasoned about locally. A generic Doubly_Linked_List package might have dangling pointer bugs within itself, but it can be designed so that clients cannot cause dangling pointers.
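As a minimal sketch of this kind of isolation, consider a small stack package. The package Int_Stacks and all names in it are hypothetical, chosen only for illustration. The access type is hidden in the private part and the stack type is limited, so clients can neither copy a pointer nor free a node; every "new" and every call to Free lives in the package body.

```ada
package Int_Stacks is
   --  Limited private: clients cannot copy a Stack, so they cannot
   --  create a second reference to its nodes.
   type Stack is limited private;

   procedure Push (S : in out Stack; Value : Integer);
   procedure Pop (S : in out Stack; Value : out Integer);
   function Is_Empty (S : Stack) return Boolean;

private
   type Node;
   type Node_Access is access Node;

   type Node is record
      Value : Integer;
      Next  : Node_Access;
   end record;

   type Stack is limited record
      Top : Node_Access;  --  access objects default to null
   end record;
end Int_Stacks;

with Ada.Unchecked_Deallocation;

package body Int_Stacks is

   --  The only deallocation routine for Node; not visible to clients.
   procedure Free is new Ada.Unchecked_Deallocation (Node, Node_Access);

   procedure Push (S : in out Stack; Value : Integer) is
   begin
      S.Top := new Node'(Value => Value, Next => S.Top);
   end Push;

   procedure Pop (S : in out Stack; Value : out Integer) is
      Old : Node_Access := S.Top;
   begin
      --  With checks enabled, popping an empty stack raises
      --  Constraint_Error here (null dereference).
      Value := Old.Value;
      S.Top := Old.Next;  --  unlink first, so nothing designates the node
      Free (Old);         --  then free it; Old is set to null
   end Pop;

   function Is_Empty (S : Stack) return Boolean is
   begin
      return S.Top = null;
   end Is_Empty;

end Int_Stacks;
```

To convince yourself that this package never leaves a dangling pointer, you only need to read its body. A stack that goes out of scope while nonempty leaks its nodes, but a leak is not erroneous.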

Another way to prevent dangling pointers is to use user-defined storage pools that allow deallocation of the entire pool at once. Store heap objects with similar lifetimes in the same pool. It might seem that deallocating a whole bunch of objects is more likely to cause dangling pointers, but in fact just the opposite is true. For one thing, deallocating the whole pool is much simpler than walking complicated data structures deallocating individual records one by one. For another thing, if a dangling pointer does remain, deallocating en masse is likely to make it fail catastrophically and early, so the bug gets found and fixed sooner rather than later. Finally, a user-defined storage pool can be written to detect dangling pointers, for example by using operating system services to mark deallocated regions as inaccessible.
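The following is a rough sketch of such a pool, not a production implementation. The package Arena_Pools, the type Arena, and the procedure Release_All are hypothetical names; only Root_Storage_Pool and its Allocate, Deallocate, and Storage_Size operations come from the language. The sketch assumes a fixed-size buffer, does no dangling-pointer detection, and is not meant for objects that need finalization.

```ada
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;

package Arena_Pools is
   --  Objects are carved out of one fixed buffer and are only ever
   --  reclaimed all together, by Release_All.
   type Arena (Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with private;

   overriding procedure Allocate
     (Pool                     : in out Arena;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out Arena;
      Storage_Address          : System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding function Storage_Size (Pool : Arena) return Storage_Count;

   --  Reclaims every object in the pool at once.  Any access value
   --  that still designates one of them is dangling afterwards.
   procedure Release_All (Pool : in out Arena);

private
   type Arena (Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with record
      Data : Storage_Array (1 .. Size);
      Used : Storage_Count := 0;  --  offset of the first free element
   end record;
end Arena_Pools;

package body Arena_Pools is

   procedure Allocate
     (Pool                     : in out Arena;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count)
   is
      Align  : constant Storage_Count := Storage_Count'Max (Alignment, 1);
      Start  : Storage_Count := Pool.Used;
      Misfit : constant Storage_Offset :=
        (Pool.Data'Address + Start) mod Align;
   begin
      if Misfit /= 0 then
         Start := Start + (Align - Misfit);  --  round up to the alignment
      end if;
      if Start + Size_In_Storage_Elements > Pool.Size then
         raise Storage_Error;  --  the buffer is exhausted
      end if;
      Storage_Address := Pool.Data'Address + Start;
      Pool.Used       := Start + Size_In_Storage_Elements;
   end Allocate;

   procedure Deallocate
     (Pool                     : in out Arena;
      Storage_Address          : System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is
   begin
      null;  --  freeing a single object is deliberately a no-op
   end Deallocate;

   function Storage_Size (Pool : Arena) return Storage_Count is
   begin
      return Pool.Size;
   end Storage_Size;

   procedure Release_All (Pool : in out Arena) is
   begin
      Pool.Used := 0;  --  the whole buffer becomes available again
   end Release_All;

end Arena_Pools;
```

A client would declare a pool object, say Scratch : Arena_Pools.Arena (Size => 4096), and attach an access type to it with "for Node_Access'Storage_Pool use Scratch;". All deallocation then happens at the single call to Release_All, which is the one place that needs to be checked for pointers that outlive the pool's contents.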

Note that Ada 2012 adds storage subpools (package System.Storage_Pools.Subpools), which make user-defined storage pools more flexible.

A final point about erroneousness that might be surprising is that it can go backwards in time. For example:

if Count = 0 then
   Put_Line ("Zero");
end if;
Something := 1 / Count; -- could divide by zero

If checks are suppressed, the entire 'if' statement, including the Put_Line, can be removed by the optimizer. The reasoning is: If Count is nonzero, we don't want to print "Zero". If Count is zero, then the division is erroneous (its check has been suppressed), so anything can happen, including not printing "Zero". Either way, never printing "Zero" is a permitted outcome.

Even if the Put_Line is not removed by the compiler, it can appear to be, because the "Zero" might be stored in a buffer that never gets flushed because some later erroneousness caused the program to crash.
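To experiment with the fragment above, it can be wrapped in a complete program. This is only a sketch: the procedure name Time_Travel, the helper Read_Count, and the decision to read Count from standard input are assumptions made for illustration. No expected output is given for an input of 0, because with the check suppressed the language does not define one.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Time_Travel is

   --  Suppressing the check is what makes division by zero erroneous.
   --  Without this pragma, 1 / 0 raises Constraint_Error, which is
   --  well-defined behavior.
   pragma Suppress (Division_Check);

   --  Hypothetical helper: reads Count at run time so that the
   --  compiler cannot know its value in advance.
   function Read_Count return Integer is
   begin
      return Integer'Value (Get_Line);
   end Read_Count;

   Count     : constant Integer := Read_Count;
   Something : Integer;

begin
   if Count = 0 then
      Put_Line ("Zero");   --  may legitimately never be printed
   end if;
   Something := 1 / Count; --  erroneous if Count = 0
   Put_Line (Integer'Image (Something));
end Time_Travel;
```

Whether "Zero" appears for an input of 0 can vary with the compiler, the target, and the optimization level. Deleting the pragma Suppress makes the program well-defined for every integer input.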

Every statement about Ada must be understood to have ", unless execution is erroneous" after it. In this case, "Count = 0 returns True if Count is zero" is obviously true, but it really means "Count = 0 returns True if Count is zero, unless execution is erroneous, in which case anything can happen".

Moral: Take care to avoid writing erroneous programs.