Sound and Modular Activity Analysis for Automatic Differentiation in MLIR
Computing derivatives is essential in domains ranging from training neural networks to precise climate simulation. While derivatives can be generated by Automatic Differentiation (AD) tools, the generated code often requires aggressive optimization to avoid compromising program performance. One of the central optimizations consists of identifying inactive operations that do not contribute to the partial derivatives of interest.
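As an illustrative sketch (not an example from the paper), consider differentiating the following Python function with respect to x only. The statement that depends solely on y is inactive: its partial derivative in x is zero, so an AD tool need not generate adjoint code for it, even though it still contributes to the primal value.

```python
def f(x, y):
    a = x * x     # active: depends on x, so da/dx = 2x feeds the derivative
    b = y + 3.0   # inactive w.r.t. x: db/dx = 0, no adjoint code needed
    return a + b  # df/dx = da/dx; the b term contributes nothing

# f still needs b for its primal value, but activity analysis lets the
# AD tool skip generating derivative code for it.
print(f(2.0, 5.0))  # primal: 9.0; df/dx at x=2.0 is 4.0
```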
Multiple tools provide activity analyses for a variety of input languages, though often with only informal correctness guarantees. This paper formally defines activity analysis for AD as an abstract interpretation, proves its soundness, and implements it within the MLIR compiler infrastructure. To account for MLIR’s genericity, a subset of MLIR’s internal representation amenable to AD is formalized for the first time. Furthermore, the paper proposes a sound intraprocedural approximation of the whole-program activity analysis via function summaries, along with a mechanism to automatically derive these summaries from function definitions.
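The following is a minimal, hypothetical sketch of how a function summary could be consulted at a call site; the encoding and names (Summary, apply_summary, g_summary) are illustrative assumptions, not the paper's actual representation. The idea is that a summary records which callee results may depend on which callee arguments, so a caller can propagate activity intraprocedurally without re-analyzing the callee's body.

```python
# Hypothetical summary encoding: for each callee result, the set of
# argument indices it may (transitively) depend on. Deriving such
# summaries automatically from function definitions is what makes the
# analysis modular.
Summary = list[set[int]]

def apply_summary(summary: Summary, active_args: set[int]) -> list[bool]:
    """Mark each call result active iff it may depend on an active
    argument: a sound over-approximation of the whole-program result."""
    return [bool(deps & active_args) for deps in summary]

# For a callee g(x, y) returning (x * x, y + 3.0): result 0 depends on
# argument 0, result 1 on argument 1.
g_summary: Summary = [{0}, {1}]
print(apply_summary(g_summary, active_args={0}))  # [True, False]
```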
The implementation is evaluated on a differentiation-specific benchmark suite, where the generated programs achieve geometric mean speedups of $1.24\times$ on CPU and $1.7\times$ on GPU over a baseline that does not use activity analysis. The evaluation also demonstrates that the intraprocedural analysis with function summaries proves inactive 100% of the instructions that the whole-program analysis proves inactive.