Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

This repository contains the source code for the SOSP '24 paper Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving.

Please note that the arXiv version is not up to date with our SOSP submission. We will update the arXiv paper once the camera-ready version is finalized.

Getting Started

Apparate is implemented in Python. We have tested Apparate on Ubuntu 22.04 with Python 3.8.13.

Detailed instructions for reproducing the main results from our SOSP paper are in EXPERIMENTS.md.

References

@article{dai2023apparate,
  title={Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving},
  author={Dai, Yinwei and Pan, Rui and Iyer, Anand and Li, Kai and Netravali, Ravi},
  journal={arXiv preprint arXiv:2312.05385},
  year={2023}
}