HieraSynth: A Parallel Framework for Complete Super-Optimization with Hierarchical Space Decomposition
Modern optimizing compilers generate efficient code but rarely achieve theoretical optimality, often necessitating manual fine-tuning for performance-critical applications. This challenge is amplified on modern processors with complex vector instruction sets such as the RISC-V Vector extension (RVV), where writing optimal code requires deep hardware-specific knowledge. Super-optimizers address this gap by automatically synthesizing high-performance code, but they face a fundamental scalability constraint: as the instruction set grows, the maximum synthesizable program length decreases inversely. We introduce HieraSynth, a parallel framework for complete super-optimization that overcomes this constraint through hierarchical decomposition on instruction selection rather than the conventional peephole-style approach of decomposing on program length. Unlike non-exhaustive approaches that cannot guarantee optimality, HieraSynth preserves completeness, ensuring that a solution matching the specification will be found if one exists. Our approach systematically partitions the program space into manageable subspaces, aggressively prunes unrealizable branches, and achieves near-linear speedup through independent parallel exploration of the subspaces. We implement HieraSynth as a library and demonstrate its effectiveness with an RVV super-optimizer that handles instruction sets of up to 700 instructions while synthesizing programs of 7-8 instructions, a significant advance over previous approaches limited to 1-3 instructions at similar instruction-set sizes. Compared to existing systems, HieraSynth can handle an instruction set up to 10.66× larger for a given program length, or synthesize programs up to 4.75× longer for a fixed instruction set. Evaluations show that HieraSynth discovers optimizations surpassing human-designed code and significantly reduces synthesis time, making super-optimization more practical for modern vector architectures.
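The abstract's core idea, splitting the candidate-program space by which instructions may appear rather than by program length, can be illustrated with a small, self-contained sketch. Everything below (the toy instruction set, the example spec, the enumeration strategy, and all names) is an illustrative assumption, not HieraSynth's API or implementation; pruning of unrealizable subspaces and parallel dispatch of subspaces are only indicated in comments.

# Minimal sketch (not the HieraSynth implementation): decompose the program
# space by instruction selection, then run a complete search inside each
# small subspace. The toy instruction set and spec are illustrative only.
from itertools import combinations, product

# Toy "instruction set": unary operations on a single integer accumulator.
INSTRUCTIONS = {
    "inc":    lambda x: x + 1,
    "dec":    lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}

def spec(x):
    # Target behaviour we want to synthesize: 2 * (x + 1).
    return 2 * (x + 1)

TEST_INPUTS = [0, 1, 3, 7]

def matches_spec(program):
    """A program is a tuple of instruction names applied left to right."""
    def run(x):
        for name in program:
            x = INSTRUCTIONS[name](x)
        return x
    return all(run(x) == spec(x) for x in TEST_INPUTS)

def search_subspace(allowed, max_len):
    """Complete enumeration restricted to the `allowed` instruction subset."""
    for length in range(1, max_len + 1):
        for program in product(sorted(allowed), repeat=length):
            if matches_spec(program):
                return program
    return None

def hierarchical_synthesize(max_len=3):
    # Decompose on instruction selection: a length-L program uses at most L
    # distinct instructions, so enumerating all subsets of size max_len still
    # covers every program of length <= max_len, preserving completeness.
    subspaces = [set(s) for s in combinations(INSTRUCTIONS, max_len)]
    # A real system would first prune subspaces that provably cannot realize
    # the spec, then dispatch the survivors to independent parallel workers;
    # here we simply search them one after another.
    for allowed in subspaces:
        program = search_subspace(allowed, max_len)
        if program is not None:
            return program
    return None  # no program of length <= max_len exists over this toy ISA

if __name__ == "__main__":
    print(hierarchical_synthesize())  # e.g. ('inc', 'double')

Because each subspace contains only a handful of instructions, its exhaustive search stays tractable even when the full instruction set is large, and the subspaces can be explored independently, which is where the near-linear parallel speedup in the abstract would come from.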
Thu 16 Oct, 16:00 - 17:30 (times shown in the Perth time zone)

16:00 (15m, Talk) Compressed and Parallelized Structured Tensor Algebra. OOPSLA.
Mahdi Ghorbani (University of Edinburgh), Emilien Bauer (University of Edinburgh), Tobias Grosser (University of Cambridge), Amir Shaikhha (University of Edinburgh)

16:15 (15m, Talk) Exploring the Theory and Practice of Concurrency in the Entity-Component-System Pattern. OOPSLA. Pre-print available.
Patrick Redmond (University of California, Santa Cruz), Jonathan Castello (University of California, Santa Cruz), Jose Calderon (Galois, Inc.), Lindsey Kuper (University of California, Santa Cruz)

16:30 (15m, Talk) HieraSynth: A Parallel Framework for Complete Super-Optimization with Hierarchical Space Decomposition. OOPSLA.

16:45 (15m, Talk) Lilo: A Higher-Order, Relational Concurrent Separation Logic for Liveness. OOPSLA.
Dongjae Lee (Massachusetts Institute of Technology), Janggun Lee (KAIST), Taeyoung Yoon (Seoul National University), Minki Cho (Seoul National University), Jeehoon Kang (FuriosaAI), Chung-Kil Hur (Seoul National University)

17:00 (15m, Talk) Opportunistically Parallel Lambda Calculus. OOPSLA.
Stephen Mell (University of Pennsylvania), Konstantinos Kallas (University of California, Los Angeles), Steve Zdancewic (University of Pennsylvania), Osbert Bastani (University of Pennsylvania)

17:15 (15m, Talk) Soundness of Predictive Concurrency Analyses. OOPSLA.
Shuyang Liu, Doug Lea (State University of New York (SUNY) Oswego), Jens Palsberg (University of California, Los Angeles (UCLA))