Fuzzing C++ Compilers via Type-Driven Mutation
This program is tentative and subject to change.
C++ is a system-level programming language for modern software development, which supports multiple programming paradigms, including object-oriented, generic, and functional programming. The intrinsic complexity of these paradigms and their interactions grants C++ powerful expressiveness while posing significant challenges for compilers in correctly implementing its type system. A type system encompasses various aspects such as type inference, type checking, subtyping, type conversions, generics, scoping, and binding. However, systematic testing of the type systems of C++ compilers remains largely underexplored in existing studies.
In this work, we present TyMut, the first approach specifically designed to test the C++ type system. TyMut is a mutation-based compiler fuzzer equipped with advanced type-driven mutation operators, carefully crafted to target intricate type-related features such as template generics, type conversions, and inheritance. Beyond differential testing, TyMut introduces enhanced test oracles through a must analysis that partially confirms the validity of generated programs. Specifically, mutation operators are classified into well-formed and not-well-formed: Programs generated by well-formed mutation operators are valid and must be accepted by compilers.
Programs generated by not-well-formed operators are validated against a set of well-formedness rules. Any violation indicates the program is invalid and must be rejected. For programs that pass the rules but lack a definitive oracle, TyMut applies differential testing to identify behavioral inconsistencies across compilers.
The testing campaign took about 32 hours to generate and test 250584 programs. The must analysis provides definite test oracles for nearly 80% of all generated programs. TyMut uncovered 102 bugs in the recent versions of GCC and Clang, with 56 confirmed as new bugs by compiler developers. Among the confirmed bugs, 26 of them cause compiler crashes, and more than 50% cause miscompilation. Additionally, 7 of them had remained hidden for over 20 years, 22 for over 10 years, and 39 for over 5 years. One long-standing bug discovered by TyMut was later confirmed as the root cause of a real-world issue in TensorFlow. Before submitting this paper, 13 bugs were fixed, most of which were fixed within 60 days. Notably, some unconfirmed bugs have led to in-depth discussions among developers. For instance, one bug led a compiler developer to submit a new issue to the C++ language standard, showing that we uncovered ambiguities in the language specification.
This program is tentative and subject to change.
Fri 17 OctDisplayed time zone: Perth change
13:45 - 15:30 | |||
13:45 15mTalk | An Empirical Evaluation of Property-Based Testing in Python OOPSLA | ||
14:00 15mTalk | Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM OOPSLA Ao Li Carnegie Mellon University, Byeongjee Kang Carnegie Mellon University, Vasudev Vikram Carnegie Mellon University, Isabella Laybourn Carnegie Mellon University, Samvid Dharanikota Efficient Computer, Shrey Tiwari Carnegie Mellon University, Rohan Padhye Carnegie Mellon University Pre-print Media Attached | ||
14:15 15mTalk | Fuzzing C++ Compilers via Type-Driven Mutation OOPSLA Bo Wang Beijing Jiaotong University, Chong Chen Beijing Jiaotong University, Ming Deng Beijing Jiaotong University, Junjie Chen Tianjin University, Xing Zhang Peking University, Youfang Lin Beijing Jiaotong University, Dan Hao Peking University, Jun Sun Singapore Management University | ||
14:30 15mTalk | Interleaving Large Language Models for Compiler Testing OOPSLA | ||
14:45 15mTalk | Model-guided Fuzzing of Distributed Systems OOPSLA Ege Berkay Gulcan Delft University of Technology, Burcu Kulahcioglu Ozkan Delft University of Technology, Rupak Majumdar MPI-SWS, Srinidhi Nagendra IRIF, Chennai Mathematical Institute | ||
15:00 15mTalk | Tuning Random Generators: Property-Based Testing as Probabilistic Programming OOPSLA Ryan Tjoa University of Washington; Jane Street, Poorva Garg University of California, Los Angeles, Harrison Goldstein University at Buffalo, the State University of New York at Buffalo, Todd Millstein University of California at Los Angeles, Benjamin C. Pierce University of Pennsylvania, Guy Van den Broeck University of California at Los Angeles Pre-print | ||
15:15 15mTalk | Understanding and Improving Flaky Test Classification OOPSLA Shanto Rahman The University of Texas at Austin, Saikat Dutta Cornell University, August Shi The University of Texas at Austin |