Memory-Safety Verification of Open Programs With Angelic Assumptions
An \textit{open program} is one for which the complete source code is not available, a common situation in real-world program verification. Software verification tools tend to assume the worst about any unconstrained behavior, which can yield an enormous number of spurious warnings for open programs. For any serious verification effort, the engineer must first invest time in building a suitable model (or mock) of the missing code, which is time-consuming and error-prone, and inaccuracies in the mocks can lead to incorrect verification results.
In this paper, we demonstrate a technique that distinguishes, with high accuracy, between false positives and actual bugs among the potential memory-safety violations reported for an open program. Central to the technique is the ability to make \textit{angelic} assumptions about missing code. To accomplish this, we first mine a set of idiomatic patterns in buffer-manipulating programs using a large language model (LLM). This is complemented by a formal synthesis strategy that performs property-directed reasoning to select, adapt, and instantiate these idiomatic patterns into angelic assumptions on the target program. Overall, our system, \textit{Seeker}, guarantees that a program is deemed correct \textit{only} if it can be verified under a well-defined set of ``trusted'' idiomatic patterns. In our experiments over a set of benchmarks curated from popular open-source software, \textit{Seeker} identifies 79% of the false positives with zero false negatives.
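To make the notion of an angelic assumption concrete, the following minimal C sketch shows the kind of situation described above. It is an illustration only, not \textit{Seeker}'s implementation or output: the names lib_make_buffer and fill_buffer are hypothetical, and the body given for lib_make_buffer is a stand-in so that the example runs, whereas in a real open program only its declaration would be visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library routine whose source is assumed to be missing.
 * The body below is a stand-in so that the sketch can run; a verifier
 * analyzing an open program would see only the declaration. */
char *lib_make_buffer(size_t n) {
    return malloc(n);
}

/* Client code under verification (hypothetical).
 * Returns -1 if no buffer was obtained, 0 if the write succeeded
 * as expected, and 1 otherwise. */
int fill_buffer(size_t n) {
    char *buf = lib_make_buffer(n);
    if (buf == NULL) {
        return -1;
    }
    /* A worst-case analysis knows nothing about the size of buf, so it
     * must report this write as a potential out-of-bounds access.
     * Illustrative angelic assumption: a non-NULL result of
     * lib_make_buffer(n) points to at least n writable bytes.
     * Under that assumption the write is memory-safe. */
    memset(buf, 'x', n);
    int ok = (n == 0) || (buf[n - 1] == 'x');
    free(buf);
    return ok ? 0 : 1;
}
```

The assumption in the comment is an instance of a familiar allocation idiom. If the client can be verified only under assumptions of this kind, the warning on the memset call is a likely false positive; if no such assumption makes the write safe, the warning points to a likely bug.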