Abstract Interpretation vs Agentic AI

Large language models, especially their use through agents, have improved tremendously over the past six months. Their impact on software engineering has been profound, and I look forward to their continued contributions to the practice of building safe and secure software.

However you look at them, agents are still probabilistic. Strong and capable, but non-deterministic by nature and prone to hallucinating content or glossing over even simple features and problems.

Static analysis is another process to find problems in source code. How does static analysis compare to using an LLM to find defects in source code, and when would you use one over the other?

Analyzing software with an LLM

LLMs are, for most users, black boxes. You send content in the form of a program, combined with a prompt. The LLM will analyze all or parts of it and provide responses. Some of these responses are spot-on; some are not. How the prompt or the program text affects the result is unpredictable and can vary between LLMs, between versions of the same LLM, with the time of day (seemingly even with the position of the moon), and with the size of the prompt and text.

I ran a test on a small program of about 1,000 lines of code, and the results were good and fairly complete. That simply will not hold on a 1M LOC program: the LLMs themselves indicate that about 3k LOC in a single file is the sweet spot, and for larger code bases they suggest using a static analysis tool. Another area where LLMs are weak by their nature is any warning category that requires inter-file analysis, for example when a function in file A calls a function in file B.

This means that in software development processes, especially where high-integrity software is involved, you cannot rely on an LLM alone. That does not mean you cannot use it. You can and absolutely should use it as one input into your testing process. But you cannot build your process on it, and you cannot use it as your sole guidance.

On top of the LLM's unpredictable nature, its cost, at least at this point in time, prohibits frequent use on large-scale programs. You simply cannot afford to ask an LLM, for example, to evaluate every single line of code in your 200k LOC application for MISRA violations or undefined behavior on every code change.

Analyzing software with Static Analysis

Compared to an LLM, static analysis is more deterministic and trustworthy, and it fits into a well-defined process for developing high-integrity software, such as the software in cars, planes, medical devices, you name it. Static analysis tools can be qualified for use in these processes, meaning you understand how they work, what their failure cases are, and when you can and cannot trust their output.

There are two types of problems that a static analysis tool reports: coding style warnings and undefined behavior warnings. Coding style warnings are simpler and address issues such as MISRA-C 2025 rule 2.3, which states that a project should not contain unused type declarations. These types of warnings are easy to detect reliably. Most notably, they do not depend on the program's execution path.

For warnings that depend on possible paths through the source code, the situation changes drastically. For example, take MISRA-C 2025 rule 2.2 on dead code, which the standard lists as undecidable, meaning that no algorithm or static analysis tool can verify this rule automatically in all possible cases.

To find all violations of this rule, a static analysis tool must analyze every path through the source code and find the faults along each one. This runs into the problem of state explosion; as the saying goes, the number of paths through the Linux kernel is greater than the number of atoms in the universe.
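
To get a feel for how quickly the number of paths grows, here is a back-of-the-envelope sketch in Python. It rests on a deliberately simplified model that is my assumption, not a description of any tool: a piece of code with n independent if/else decisions in sequence has 2^n paths. The figure of 10^80 atoms is a commonly quoted order-of-magnitude estimate, and the function names are made up for illustration.

```python
# Commonly quoted order-of-magnitude estimate, used only for comparison.
ATOMS_IN_UNIVERSE = 10**80


def path_count(independent_branches: int) -> int:
    """Paths through code with n sequential, independent if/else decisions."""
    return 2 ** independent_branches


def branches_to_exceed(limit: int) -> int:
    """Smallest number of sequential branches whose path count exceeds limit."""
    n = 0
    while path_count(n) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(f"{n} branches -> {path_count(n)} paths")
    print("branches needed to exceed the atom estimate:",
          branches_to_exceed(ATOMS_IN_UNIVERSE))
```

Under this model, 266 sequential branches are already enough to pass the atom estimate. Real programs also contain loops and calls, which only make the count grow faster.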

Static analysis tools use abstract interpretation to reason about the possible state space of a program. For a sufficiently large program (e.g., anything over 100k LOC), they can no longer compute over the entire state space, so tools need to be smart and use summaries and related optimization techniques. As a result, they may miss problems (we call these false negatives), so you cannot take the absence of warnings to mean that there are no problems in the source code. On the other hand, they may overapproximate and report errors that are not real problems. We call these false positives.
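
The overapproximation that leads to false positives can be illustrated with a toy interval domain, one of the textbook abstract domains. This is a minimal sketch, not how any particular commercial tool is implemented; the class and function names are hypothetical. Instead of tracking the exact values of a variable, the analysis tracks a range, and merging two branches widens that range to cover both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every integer between lo and hi, inclusive."""
    lo: int
    hi: int

    def join(self, other: "Interval") -> "Interval":
        """Merge the values arriving from two branches into one interval."""
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def contains(self, value: int) -> bool:
        return self.lo <= value <= self.hi


def const(value: int) -> Interval:
    """Abstract value of an integer constant."""
    return Interval(value, value)


def may_divide_by_zero(divisor: Interval) -> bool:
    """Warn whenever the abstract divisor includes 0."""
    return divisor.contains(0)


# Program under analysis, written as pseudocode:
#     if flag: x = 1
#     else:    x = -1
#     y = 10 / x
x_then = const(1)
x_else = const(-1)
x_after_if = x_then.join(x_else)  # covers -1, 0 and 1, although 0 is never assigned

if __name__ == "__main__":
    print(x_after_if)
    print("division-by-zero warning:", may_divide_by_zero(x_after_if))
```

The concrete program can never divide by zero, yet the merged interval contains 0, so this toy analysis raises a warning. That warning is a false positive: the price paid for summarizing two paths as one range.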

False positives are not a safety issue, but they need to be assessed and documented. They are a problem from a workload perspective: if there are too many false positives, the (human) developers lose focus and stop trusting the tool, and that is not acceptable from a software development process perspective.

Static analysis is also much cheaper than running an LLM. Developers can run static analysis on every code change, either on their own desktop machines or in a CI/CD pipeline. Depending on the size of the codebase, this process takes minutes to tens of minutes and returns results to the developer.

Combining the best of both worlds

I have found that LLMs and static analysis tools work well together in a two-step process:

Step 1:

Use static analysis to analyze every single code change. Every time a developer changes code, run it through static analysis and grab the results.
Ask the LLM to interpret the results and suggest fixes.
The developer reviews the fixes and integrates them.
Rinse and repeat.
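
The hand-off from static analysis to the LLM in Step 1 could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Finding` fields, the prompt wording, and the sample finding are hypothetical, and the calls that actually run the analyzer and the LLM are left out on purpose because they depend on the tools you use.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One static analysis result. The field names are hypothetical."""
    file: str
    line: int
    rule: str
    message: str


def build_prompt(findings: list[Finding], diff: str) -> str:
    """Turn analyzer findings plus the code change into one LLM request."""
    lines = [
        "Explain each static analysis finding below and suggest a fix.",
        "Do not change code that is unrelated to a finding.",
        "",
        "Findings:",
    ]
    for finding in findings:
        lines.append(
            f"- {finding.file}:{finding.line} [{finding.rule}] {finding.message}"
        )
    lines += ["", "Code change:", diff]
    return "\n".join(lines)


# Hypothetical sample input; a real pipeline would parse the analyzer's report.
sample_findings = [
    Finding("src/motor.c", 42, "MISRA-2.3", "unused type declaration 'speed_t'"),
]
sample_prompt = build_prompt(sample_findings, "+typedef int speed_t;")

if __name__ == "__main__":
    print(sample_prompt)
```

The point of the design is that the LLM only sees findings the qualified tool has already produced, so the developer reviews suggested fixes for known warnings rather than an open-ended LLM review.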

Step 2:

At regular intervals, use the LLM to inspect specific parts of the application and suggest improvements, particularly in areas where static analysis is weak. Ask it if the application is susceptible to denial-of-service attacks. Ask it if the password input is properly encrypted before storage. Ask it whether the changes since last week violate your architectural definition.

This uses the strength of static analysis to find as many defects as possible, and the strength of LLMs to interpret large code bases at a high level and provide feedback.

This also allows you to use your existing software development process and your qualified static analysis tool on your path to safety certification.

Conclusion

If you are not yet using LLMs in your software development process, then I urge you to investigate them. They can provide useful input and can support your software development team. A short video on a workflow for using static analysis with an LLM is available here.

Author

Mark Hermeling
