Interleaving Large Language Models for Compiler Testing (SPLASH 2025 - OOPSLA) - SPLASH 2025

Sun 12 - Sat 18 October 2025 Singapore

co-located with ICFP/SPLASH 2025

Who

Yunbo Ni, Shaohua Li

Track

SPLASH 2025 OOPSLA

Time Zone

The program is currently displayed in (GMT+08:00) Perth.

Use conference time zone: (GMT+08:00) PerthSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

When

Fri 17 Oct 2025 14:30 - 14:45 at Orchid Plenary Ballroom - Testing 1 Chair(s): Karine Even-Mendoza

Abstract

Testing compilers with AI models, especially large language models (LLMs), has shown great promise. However, current approaches struggle with two key problems: The generated programs for testing compilers are often too simple, and extensive testing with the LLMs is computationally expensive. In this paper, we propose a novel compiler testing framework that decouples the testing process into two distinct phases: an offline phase and an online phase. In the offline phase, we use LLMs to generate a collection of small but feature-rich code pieces. In the online phase, we reuse these code pieces by strategically combining them to build high-quality and valid test programs, which are then used to test compilers.

We implement this idea in a tool, LegoFuzz, for testing C compilers. The results are striking: we found 66 bugs in GCC and LLVM, the most widely used C compilers. Almost half of the bugs are miscompilation bugs, which are serious and hard-to-find bugs that none of the existing LLM-based tools could find. We believe this efficient design opens up new possibilities for using AI models in software testing beyond just C compilers.

Yunbo Ni

The Chinese University of Hong Kong

Hong Kong SAR China

Shaohua Li

The Chinese University of Hong Kong

Hong Kong SAR China

Time Zone

The program is currently displayed in (GMT+08:00) Perth.

Use conference time zone: (GMT+08:00) PerthSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Session Program

Fri 17 Oct
Displayed time zone: Perth change

	13:45 - 15:30	Testing 1OOPSLA at Orchid Plenary Ballroom Chair(s): Karine Even-Mendoza King’s College London

	13:45 15m Talk		An Empirical Evaluation of Property-Based Testing in Python OOPSLA Savitha Ravi UC San Diego, Michael Coblenz University of California, San Diego Link to publication
	14:00 15m Talk		Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM OOPSLA Ao Li Carnegie Mellon University, Byeongjee Kang Carnegie Mellon University, Vasudev Vikram Carnegie Mellon University, Isabella Laybourn Carnegie Mellon University, Samvid Dharanikota Efficient Computer, Shrey Tiwari Carnegie Mellon University, Rohan Padhye Carnegie Mellon University Pre-print Media Attached
	14:15 15m Talk		Fuzzing C++ Compilers via Type-Driven Mutation OOPSLA Bo Wang Beijing Jiaotong University, Chong Chen Beijing Jiaotong University, Ming Deng Beijing Jiaotong University, Junjie Chen Tianjin University, Xing Zhang Peking University, Youfang Lin Beijing Jiaotong University, Dan Hao Peking University, Jun Sun Singapore Management University
	14:30 15m Talk		Interleaving Large Language Models for Compiler Testing OOPSLA Yunbo Ni The Chinese University of Hong Kong, Shaohua Li The Chinese University of Hong Kong
	14:45 15m Talk		Model-guided Fuzzing of Distributed Systems OOPSLA Ege Berkay Gulcan Delft University of Technology, Burcu Kulahcioglu Ozkan Delft University of Technology, Rupak Majumdar MPI-SWS, Srinidhi Nagendra IRIF, Chennai Mathematical Institute
	15:00 15m Talk		Tuning Random Generators: Property-Based Testing as Probabilistic Programming OOPSLA Ryan Tjoa University of Washington; Jane Street, Poorva Garg University of California, Los Angeles, Harrison Goldstein University at Buffalo, the State University of New York at Buffalo, Todd Millstein University of California at Los Angeles, Benjamin C. Pierce University of Pennsylvania, Guy Van den Broeck University of California at Los Angeles DOI Pre-print
	15:15 15m Talk		Understanding and Improving Flaky Test Classification OOPSLA Shanto Rahman The University of Texas at Austin, Saikat Dutta Cornell University, August Shi The University of Texas at Austin