SPLASH 2025
Sun 12 - Sat 18 October 2025 Singapore
co-located with ICFP/SPLASH 2025

Binary lifting is a key component in binary analysis tools. In order to guarantee the correctness of binary lifting, researchers have proposed various formally verified lifters. However, such formally verified lifters have too strict requirements on binary, which do not sufficiently reflect real-world lifters. In addition, real-world lifters use heuristic-based assumptions to lift binary code, which makes it difficult to guarantee the correctness of the lifted code using formal methods. In this paper, we propose a new interpretation of the correctness of real-world binary lifting. We formalize the process of binary lifting with heuristic-based assumptions used in real-world lifters by dividing it into a series of transformations, where each transformation represents a lift with new abstraction features. We define the correctness of each transformation as \emph{filtered-simulation}, which is a variant of bi-simulation, between programs before and after transformation. We present three essential transformations in binary lifting and formalize them: (1) control flow graph reconstruction, (2) abstract stack reconstruction, and (3) function input/output identification. We implement our approach for x86-64 Linux binaries and demonstrate that it can correctly lift Coreutils and CGC datasets compiled with GCC.