Gem #83: Type-Based Security 2: Validating the Input

by Yannick Moy —AdaCore

Let's get started…

Input validation consists of checking a set of properties on the input which guarantee it is well-formed. This usually involves excluding a set of ill-formed inputs (black-list) or matching the input against an exhaustive set of well-formed patterns (white-list).

Here, we consider the task of validating an input for inclusion in an SQL command. This is a well-known defense against SQL injection attacks, where an attacker passes in a specially crafted string that is interpreted as a command rather than a plain string when executing the initial SQL command.

The basic idea is to define a new type SQL_Input derived from type String. Function Validate checks that the input is properly validated and fails if not. Function Valid_String returns the raw data inside a validated string, as follows:

package Inputs is

   type SQL_Input is new String;

   function Validate (Input : String) return SQL_Input;

   function Valid_String (Input : SQL_Input) return String;

end Inputs;

The implementation of Validate simply checks that the input string does not contain a dangerous character before returning it as an SQL_Input, while Valid_String is a simple type conversion:

with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.Strings.Maps;  use Ada.Strings.Maps;

package body Inputs is

   Dangerous_Characters : constant Character_Set := To_Set ("""*^';&><</"); 

   function Validate (Input : String) return SQL_Input is
   begin
      if Index (Input, Dangerous_Characters) /= 0 then
         raise Constraint_Error
           with "Invalid input " & Input & " for an SQL query ";
      else
         return SQL_Input (Input);
      end if;
   end Validate;

   function Valid_String (Input : SQL_Input) return String is
   begin
      return String (Input);
   end Valid_String;

end Inputs;

Now, this does not prevent future uses of such type conversions in the program, whether malicious or unintended. To guard against such possibilities, we must make type SQL_Input private. To make sure we do not ourselves inadvertently convert an input string into a valid one in the implementation of package Inputs, we use this opportunity to make SQL_Input a discriminated record parameterized by the validation status.

with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

package Inputs is

   type SQL_Input (<>) is private;

   function Validate (Input : String) return SQL_Input;

   function Valid_String (Input : SQL_Input) return String;

   function Is_Valid (Input : SQL_Input) return Boolean;

private

   type SQL_Input (Validated : Boolean) is
      record
         case Validated is
            when True =>
               Valid_Input : Unbounded_String;
            when False =>
               Raw_Input   : Unbounded_String;
         end case;
      end record;

end Inputs;

Each time we access field Valid_Input, a discriminant check will be performed to ensure that the operand of type SQL_Input has been validated. Observe the use of Unbounded_String for the type of the input component, which is more convenient and flexible than using a constrained string.

Note in the implementation of Validate, that instead of raising an exception when the string cannot be validated, as in the first implementation, here we create corresponding validated or invalid input values based on the result of the check against dangerous characters. Also, an Is_Valid function has been added to allow clients to query validity of an SQL_Input value.

with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.Strings.Maps;  use Ada.Strings.Maps;

package body Inputs is

   Dangerous_Characters : constant Character_Set := To_Set ("""*^';&><</"); 

   function Validate (Input : String) return SQL_Input is
      Local_Input : constant Unbounded_String := To_Unbounded_String (Input);
   begin
      if Index (Input, Dangerous_Characters) /= 0 then
         return (Validated   => False,
                 Raw_Input   => Local_Input);
      else
         return (Validated   => True,
                 Valid_Input => Local_Input);
      end if;
   end Validate;

   function Valid_String (Input : SQL_Input) return String is
   begin
      return To_String (Input.Valid_Input);
   end Valid_String;

   function Is_Valid (Input : SQL_Input) return Boolean is
   begin
      return Input.Validated;
   end Is_Valid;

end Inputs;

That's it! As long as this interface is used, no errors can result in improper input being interpreted as a command, while ensuring that future maintainers of the code won't inadvertently be able to insert inappropriate conversions.

Of course, this minimal interface does not really provide anything other than the validation of the input. Simply having an Is_Valid function to tell whether a string is valid input data would seem to give you much the same functionality. However, you can now safely extend this package with additional capabilities, such as transformations on valid SQL inputs (for example, to optimize queries before sending them to the database), or to resolve queries faster using a local cache, and so forth. By using the private encapsulation, you are guaranteed that no client package will tamper with the validity of the SQL inputs you are manipulating.

Incidentally, the similar but distinct problem of input sanitization, where possibly invalid data is transformed into something that is known valid prior to use, can be handled in the same way.


About the Author

Yannick Moy’s work focuses on software source code analysis, mostly to detect bugs or verify safety/security properties. Yannick previously worked for PolySpace (now The MathWorks) where he started the project C++ Verifier. He then joined INRIA Research Labs/Orange Labs in France to carry out a PhD on automatic modular static safety checking for C programs. Yannick joined AdaCore in 2009, after a short internship at Microsoft Research.

Yannick holds an engineering degree from the Ecole Polytechnique, an MSc from Stanford University and a PhD from Université Paris-Sud. He is a Siebel Scholar.