MetaKernel: Enabling Efficient Encrypted Neural Network Inference Through Unified MVM and Convolution
Practical encrypted neural network inference under the CKKS fully homomorphic encryption (FHE) scheme relies heavily on accelerating two key kernel operations: Matrix-Vector Multiplication (MVM) and Convolution (Conv). However, existing solutions—such as expert-tuned libraries and domain-specific languages—are designed in an ad hoc manner, leading to significant inefficiencies caused by excessive rotations.
We introduce MKR, a novel composition-based compiler approach that optimizes MVM and Conv kernel operations for DNN models under CKKS within a unified framework. MKR decomposes each kernel into composable units, called MetaKernels, to enhance SIMD parallelism within ciphertexts (via horizontal batching) and computational parallelism across them (via vertical batching). Our approach tackles previously unaddressed challenges, including reducing rotation overhead through a rotation-aware cost model for data packing, while also ensuring high slot utilization, uniform handling of inputs with arbitrary sizes, and compatibility with the output tensor layout. Implemented in a production-quality FHE compiler, MKR achieves inference time speedups of $10.08\times$-$185.60\times$ for individual MVM and Conv kernels and $1.75\times$-$11.84\times$ for end-to-end inference compared to a state-of-the-art FHE compiler. Moreover, MKR enables homomorphic execution of large DNN models for the first time, where prior methods fail, significantly advancing the practicality of FHE compilers.
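The rotation overhead the abstract refers to can be seen in the classic diagonal (Halevi-Shoup) method for homomorphic MVM, where every matrix diagonal costs one ciphertext rotation. The sketch below is a plaintext simulation only (NumPy `roll` standing in for a CKKS slot rotation); the function name `diagonal_mvm` and the setup are illustrative assumptions, not MetaKernel's actual implementation.

```python
import numpy as np

def rotate(v, k):
    # Stand-in for a CKKS slot rotation: left-rotate the slot vector by k.
    return np.roll(v, -k)

def diagonal_mvm(M, x):
    """Diagonal-method y = M @ x: one rotation per diagonal (n rotations
    for an n x n matrix). This per-diagonal rotation cost is the kind of
    overhead that rotation-aware packing strategies aim to reduce."""
    n = M.shape[0]
    y = np.zeros(n)
    for k in range(n):
        # k-th generalized diagonal: diag_k[i] = M[i, (i + k) mod n]
        diag = np.array([M[i, (i + k) % n] for i in range(n)])
        y += diag * rotate(x, k)  # elementwise SIMD multiply + accumulate
    return y

M = np.arange(16, dtype=float).reshape(4, 4)
x = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(diagonal_mvm(M, x), M @ x)
```

Because each rotation is expensive under CKKS, packing choices that batch several small MVMs into one ciphertext (and thus share or eliminate rotations) can dominate end-to-end cost, which is the design space the cost model above targets.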
Thu 16 Oct (displayed time zone: Perth)
16:00 - 17:30 | Neural Network (OOPSLA) at Orchid West | Chair(s): Jiasi Shen (The Hong Kong University of Science and Technology)
16:00 (15m) Talk | Convex Hull Approximation for Activation Functions | OOPSLA | Zhongkui Ma (The University of Queensland), Zihan Wang (The University of Queensland and CSIRO's Data61), Guangdong Bai (University of Queensland)
16:15 (15m) Talk | Cost of Soundness in Mixed-Precision Tuning | OOPSLA | Pre-print
16:30 (15m) Talk | Finch: Sparse and Structured Tensor Programming with Control Flow | OOPSLA | Willow Ahrens (Massachusetts Institute of Technology), Teodoro F. Collin (MIT CSAIL), Radha Patel (MIT CSAIL), Kyle Deeds (University of Washington), Changwan Hong (Massachusetts Institute of Technology), Saman Amarasinghe (Massachusetts Institute of Technology)
16:45 (15m) Talk | MetaKernel: Enabling Efficient Encrypted Neural Network Inference Through Unified MVM and Convolution | OOPSLA | Peng Yuan (Ant Group), Yan Liu (Ant Group), Jianxin Lai (Ant Group), Long Li (Ant Group), Tianxiang Sui (Ant Group), Linjie Xiao (Ant Group), Xiaojing Zhang (Ant Group), Qing Zhu (Ant Group), Jingling Xue (University of New South Wales)
17:00 (15m) Talk | Quantization with Guaranteed Floating-Point Neural Network Classifications | OOPSLA
17:15 (15m) Talk | The Continuous Tensor Abstraction: Where Indices are Real | OOPSLA | Jaeyeon Won (MIT), Willow Ahrens (Massachusetts Institute of Technology), Teodoro F. Collin (MIT CSAIL), Joel S Emer (MIT/NVIDIA), Saman Amarasinghe (Massachusetts Institute of Technology)