SPLASH 2025
Sun 12 - Sat 18 October 2025 Singapore
co-located with ICFP/SPLASH 2025

This program is tentative and subject to change.

Fri 17 Oct 2025 11:15 - 11:30 at Orchid Plenary Ballroom - Compilation 3

The R programming language is primarily designed for statistical computing and mostly used by researchers without a background in computer science. R provides a wide range of dynamic features and peculiarities that are difficult to analyze statically like dynamic scoping and lazy evaluation with dynamic side-effects. At the same time, the R ecosystem lacks sophisticated analysis tools that support researchers in understanding and improving their code. In this paper, we present a novel static dataflow analysis framework for the R programming language that is capable of handling the dynamic nature of R programs and produces the dataflow graph of given R programs. This graph can be essential in a range of analyses, including program slicing, which we implement as a proof of concept. The core analysis works as a stateful fold over a normalized version of the abstract syntax tree of the R program, which tracks (re-)definitions, values, function calls, side effects, external files, and a dynamic control flow to produce one dataflow graph per program. We evaluate the correctness of our analysis using output equivalence testing on a manually curated dataset of 779 sensible slicing points from executable real-world R scripts. Additionally, we use a set of systematic test cases based on the capabilities of the R language and the implementation of the R interpreter and measure the runtime as well as the memory consumption on a set of 4,230 real-world R scripts and 20,815 packages available on R’s package manager CRAN. Furthermore, we evaluate the recall of our program slicer, its accuracy using shrinking, and its improvement over the state of the art. We correctly analyze almost all programs in our equivalence test suite, preserving the identical output for 99.7 % of the manually curated slicing points. On average, we require 153.2 ms to analyze the dataflow of a given file and around 211.6 kB for the dataflow graph. This shows that our analysis is capable of analyzing real-world sources quickly and correctly. Our slicer achieves an average reduction of 84.8 % of tokens indicating its potential to improve program comprehension.

This program is tentative and subject to change.

Fri 17 Oct

Displayed time zone: Perth change

10:30 - 12:15
10:30
15m
Talk
Efficient Algorithms for the Uniform Tokenization Problem
OOPSLA
Wu Angela Li Rice University, Konstantinos Mamouras Rice University
10:45
15m
Talk
REPTILE: Performant Tiling of Recurrences
OOPSLA
Muhammad Usman Tariq Stanford University, Shiv Sundram Stanford University, Fredrik Kjolstad Stanford University
11:00
15m
Talk
SPLAT: A framework for optimised GPU code-generation for SParse reguLar ATtention
OOPSLA
Ahan Gupta University of Illinois at Urbana-Champaign, Yueming Yuan University of Illinois Urbana-Champaign, Devansh Jain University of Illinois at Urbana-Champaign, Yuhao Ge University of Illinois at Urbana-Champaign, David Aponte Microsoft, Yanqi Zhou Google, Charith Mendis University of Illinois at Urbana-Champaign
11:15
15m
Talk
Statically Analyzing the Dataflow of R Programs
OOPSLA
Florian Sihler Ulm University, Matthias Tichy Ulm University
11:30
15m
Talk
Static Inference of Regular Grammars for Ad Hoc Parsers
OOPSLA
Pre-print
11:45
15m
Talk
Syntactic Completions with Material Obligations
OOPSLA
David Moon University of Michigan, Andrew Blinn University of Michigan, Thomas J. Porter University of Michigan, Cyrus Omar University of Michigan
DOI Pre-print