Work Packets: A New Abstraction for GC Software Engineering, Optimization, and Innovation (SPLASH 2025 - OOPSLA)

Sun 12 - Sat 18 October 2025 Singapore

co-located with ICFP/SPLASH 2025

Who

Wenyu Zhao, Stephen M. Blackburn, Kathryn S McKinley

Track

SPLASH 2025 OOPSLA

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+08:00) Perth.

Use conference time zone: (GMT+08:00) PerthSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Sat 18 Oct 2025 17:15 - 17:30 at Orchid East - Abstraction

Abstract

Garbage collection (GC) implementations must meet efficiency and maintainability requirements, which are often perceived to be at odds. Moreover, the desire for efficiency is often at odds with agility, undermining rapid development and innovation, which harms longer-term performance aspirations. Prior GC implementations struggle to: i) maximize efficiency, parallelism, and hardware utilization, while ii) correctly and elegantly implementing complex algorithm-specific optimizations and scheduling constraints. This struggle is reflected in today’s monolithic implementations with simple phase-based synchronization.

This paper presents a new design for GC implementations that emphasizes agility and efficiency. The design simplifies and unifies all GC tasks into work packets which define: i) work items, ii) kernels that process them, and iii) scheduling constraints.

Our simple insights are that GC implementations are high-level algorithms that orchestrate vast numbers of performance-critical work items, and that execution is dominated by a few very small, heavily executed kernels. Work packets comprise groups of like work items, such as the scanning of a thread’s stack or the tracing of a single object in a multi-million object heap. The kernel attached to a packet specifies how to process items within the packet, such as how to scan a stack, or how to trace an object. The scheduling constraints express dependencies, e.g. all mutators must stop before copying any objects. Fully parallel activities, such as scanning roots and performing a transitive closure, proceed with little synchronization. The implementation of a GC algorithm reduces to declaring required work packets, their kernels, and dependencies. The execution model operates transparently of GC algorithms and work packet type. We broaden the scope of work-stealing, applying it to any type of GC work and introduce a novel two-tier work-stealing algorithm to further optimize parallelism at fine granularity.

We show the software engineering benefits of this design via eight collectors that use 23 common work packet implementations in the MMTk GC framework. We use the LXR collector to show that the work packet abstraction supports innovation and high performance, outperforming the latest (OpenJDK 24) instance of the highly-tuned, state-of-the-art G1 garbage collector. We thus demonstrate that work packets achieve high performance, while simplifying GC implementation, making them inherently easier to optimize and verify.

Wenyu Zhao

Australian National University

Australia

Stephen M. Blackburn

Google; Australian National University

Australia

Kathryn S McKinley

Google