SPLASH 2025
Sun 12 - Sat 18 October 2025 Singapore
co-located with ICFP/SPLASH 2025
Thu 16 Oct 2025, 17:00 - 17:15, at Orchid West (Neural Network session). Chair(s): Jiasi Shen

Despite the wide success of neural networks, their computational cost is very high. Quantization techniques reduce this cost but can change the classifications of the original floating-point network, even when training is quantization-aware. In this work, we rely on verification to design correction systems that detect classification inconsistencies at inference time and eliminate them. The key idea is to overapproximate the space of inconsistent inputs by their maximal classification confidence. The main challenge in computing this confidence is that it requires analyzing a quantized network, which introduces a very high degree of nonlinearity, over all possible inputs. We propose CoMPAQt, an algorithm for computing this confidence. CoMPAQt relies on a novel encoding of quantization in mixed-integer linear programming (MILP) along with customized linear relaxations to reduce the high complexity. To prune the search space, it ties the computations of the quantized network and its floating-point counterpart. Given this confidence, we propose two correction mechanisms. The first mechanism guarantees to return the classification of the floating-point network and relies on networks with increasing bit precisions. The second mechanism mitigates classification inconsistencies with an ensemble of quantized networks. We evaluate our approach on MNIST, ACAS-Xu, and tabular datasets over fully connected and convolutional networks. Results show that our first correction mechanism guarantees 100% consistency with the floating-point network’s classifications while reducing its computational cost by 3.8x, on average. Our second mechanism reaches an almost perfect consistency guarantee in our experiments while reducing the computational cost by 4.1x. Our work is the first to provide a formal guarantee on the classification consistency of a quantized network.
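The first correction mechanism described above can be pictured as a confidence-guarded cascade: run a low-precision network, accept its answer only if its confidence exceeds the precomputed maximal confidence of any inconsistent input, and otherwise escalate to a higher precision. The sketch below is a hedged illustration of that idea, not the authors' implementation; the network callables, the softmax read-out, and the `thresholds` values (standing in for the bounds CoMPAQt computes offline via MILP) are all assumptions introduced for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def cascade_classify(x, quantized_nets, float_net, thresholds):
    """Illustrative sketch: try quantized networks of increasing bit
    precision and fall back to the floating-point network only when no
    quantized prediction clears its precomputed confidence bound.

    quantized_nets : list of callables mapping an input to logits,
                     ordered by increasing bit precision (assumed).
    thresholds     : per-network placeholder bounds standing in for the
                     maximal classification confidence over inconsistent
                     inputs, computed offline in the paper's approach.
    """
    for net, tau in zip(quantized_nets, thresholds):
        probs = softmax(net(x))
        label, confidence = int(probs.argmax()), float(probs.max())
        if confidence > tau:
            # Confidence exceeds the bound for inputs on which this
            # quantized network can disagree with the float network,
            # so the classification is accepted.
            return label
    # No quantized network was confident enough: use the float network.
    return int(softmax(float_net(x)).argmax())
```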

Thu 16 Oct

Displayed time zone: Perth

16:00 - 17:30: Neural Network (OOPSLA) at Orchid West
Chair(s): Jiasi Shen (The Hong Kong University of Science and Technology)
16:00  15m  Talk (OOPSLA)
Convex Hull Approximation for Activation Functions
Zhongkui Ma (The University of Queensland), Zihan Wang (The University of Queensland and CSIRO's Data61), Guangdong Bai (University of Queensland)

16:15  15m  Talk (OOPSLA)
Cost of Soundness in Mixed-Precision Tuning
Anastasia Isychev (TU Wien), Debasmita Lohar (Karlsruhe Institute of Technology)
Pre-print

16:30  15m  Talk (OOPSLA)
Finch: Sparse and Structured Tensor Programming with Control Flow
Willow Ahrens (Massachusetts Institute of Technology), Teodoro F. Collin (MIT CSAIL), Radha Patel (MIT CSAIL), Kyle Deeds (University of Washington), Changwan Hong (Massachusetts Institute of Technology), Saman Amarasinghe (Massachusetts Institute of Technology)

16:45  15m  Talk (OOPSLA)
MetaKernel: Enabling Efficient Encrypted Neural Network Inference Through Unified MVM and Convolution
Peng Yuan (Ant Group), Yan Liu (Ant Group), Jianxin Lai (Ant Group), Long Li (Ant Group), Tianxiang Sui (Ant Group), Linjie Xiao (Ant Group), Xiaojing Zhang (Ant Group), Qing Zhu (Ant Group), Jingling Xue (University of New South Wales)

17:00  15m  Talk (OOPSLA)
Quantization with Guaranteed Floating-Point Neural Network Classifications
Anan Kabaha (Technion, Israel Institute of Technology), Dana Drachsler Cohen (Technion)

17:15  15m  Talk (OOPSLA)
The Continuous Tensor Abstraction: Where Indices are Real
Jaeyeon Won (MIT), Willow Ahrens (Massachusetts Institute of Technology), Teodoro F. Collin (MIT CSAIL), Joel S Emer (MIT/NVIDIA), Saman Amarasinghe (Massachusetts Institute of Technology)