MPC-Pipe: An Efficient Pipeline Scheme for Semi-honest MPC Machine Learning
Published in the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
Recommended citation: Yongqin Wang, Rachit Rajat, Murali Annavaram, "MPC-Pipe: An Efficient Pipeline Scheme for Semi-honest MPC Machine Learning," Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024. https://arxiv.org/abs/2209.13643
Multi-party computation (MPC) has been gaining popularity as a secure computing model over the past few years. However, prior works have demonstrated that MPC protocols still incur substantial performance penalties compared to plaintext execution, particularly when applied to ML algorithms. The overhead comes from added computation and communication costs. Prior studies, as well as our own analysis, found that most MPC protocols today perform communication and computation sequentially: the participating parties must first compute on their shares and then communicate to distribute new secret shares before proceeding to the next computation step. In this work, we show that this serialization is unnecessary, particularly in the context of ML computations (both in convolutional neural networks and in Transformer-based models). We demonstrate that the computation and communication steps can be carefully orchestrated to overlap. We propose MPC-Pipe, an efficient MPC system for both training and inference of ML workloads that pipelines computation and communication during the online phase of an MPC protocol. MPC-Pipe introduces three pipeline schemes to optimize the online phase of ML in the semi-honest majority adversary setting: (1) the inter-linear pipeline, which targets linear layers; (2) the inner-layer pipeline, which targets non-linear layers; and (3) the inter-batch pipeline, which overlaps communication and computation across different input batches. We implement MPC-Pipe by augmenting a modified version of CrypTen that separates the online and offline phases. We evaluate the end-to-end online-phase performance benefits using deep neural networks (VGG16, ResNet50) and Transformers under different network settings, and show that MPC-Pipe improves both the throughput and latency of ML workloads.
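As a rough illustration of the overlap idea behind these pipeline schemes (and not MPC-Pipe's actual implementation), the minimal sketch below overlaps the communication of one tile's intermediate result with the local computation on the next tile by pushing sends onto a background thread. The helpers `local_compute` and `send_shares`, as well as the tile granularity, are hypothetical stand-ins for the protocol-specific steps each party would perform on its secret shares.

```python
# Minimal sketch: overlapping per-party local computation with share communication.
# NOTE: local_compute() and send_shares() are hypothetical placeholders for
# protocol-specific steps; this is an illustration, not MPC-Pipe's implementation.
import threading
import queue
import numpy as np

def local_compute(tile):
    # Stand-in for the local work a party performs on its secret shares,
    # e.g., one tile of a matrix multiplication in a linear layer.
    return tile @ tile.T

def send_shares(result, sendq):
    # Stand-in for broadcasting intermediate shares to the other parties;
    # a real system would perform a network send here.
    sendq.put(result)

def pipelined_layer(tiles):
    sendq = queue.Queue()
    sender = None
    for tile in tiles:
        partial = local_compute(tile)          # compute on the current tile
        if sender is not None:
            sender.join()                      # wait for the previous send to finish
        sender = threading.Thread(target=send_shares, args=(partial, sendq))
        sender.start()                         # this send overlaps with the next compute
    if sender is not None:
        sender.join()
    return [sendq.get() for _ in range(sendq.qsize())]

if __name__ == "__main__":
    tiles = [np.random.rand(64, 64) for _ in range(8)]
    outputs = pipelined_layer(tiles)
    print(f"sent {len(outputs)} tiles while overlapping compute and communication")
```

In this toy version, the send of tile i proceeds while tile i+1 is being computed, so communication latency is hidden behind useful work; the paper's schemes apply the same principle at the granularity of linear layers, non-linear layers, and input batches.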