Biography
I am a second-year Ph.D. candidate in Computer Science at the University of Virginia, working under the supervision of Prof. Yue Cheng in the DS² Lab. My research interests span machine-learning systems, storage systems, and distributed systems.
Prior to UVA, I received my M.S. in Computer Science from Boston University and my B.S. from Hangzhou Dianzi University. I am dedicated to building efficient and scalable systems for next-generation data-intensive applications.
My research addresses challenges in real-world storage systems that arise from the complexity of modern data-intensive computing. I am particularly interested in serverless computing, serverless AI, and storage systems for AI, and I take an end-to-end approach that spans applications, middleware, platforms, and the operating system.
Education
-
2024 – present
Ph.D. in Computer Science, University of Virginia
-
2022 – 2024
M.S. in Computer Science, Boston University
-
2018 – 2022
B.S. in Computer Science, Hangzhou Dianzi University
News
-
May 2026
Excited to start my summer internship at ByteDance Seed (San Jose, CA)!
-
May 2026
Passed my Ph.D. qualifying exam — officially a Ph.D. Candidate now!
-
Apr 2026
Honored to serve on the Artifact Evaluation Committee for OSDI 2026.
-
Jan 2026
Our paper MorphServe: Efficient and Workload-Aware LLM Serving via Runtime Quantized Layer Swapping and KV Cache Resizing has been accepted to MLSys 2026!
-
Jan 2026
Our paper λScale: Enabling Fast Scaling for Serverless Large Language Model Inference has been accepted to MLSys 2026!
-
Dec 2025
I am honored to serve on the Artifact Evaluation Committee for NSDI'26 Fall.
-
Oct 2025
I joined the Shadow PC for EuroSys 2026.
-
Oct 2025
Our paper ZipLLM (NSDI'26) received all three Artifact Evaluation badges (Available, Functional, Reproduced).
- Jul 2025
-
Jun 2025
Our new preprint MorphServe is now available on arXiv.
Publications
-
MorphServe: Efficient and Workload-Aware LLM Serving via Runtime Quantized Layer Swapping and KV Cache Resizing
Zhaoyuan Su, Zeyu Zhang, Tingfeng Lan, Zirui Wang, Haiying Shen, Juncheng Yang, Yue Cheng
Ninth Annual Conference on Machine Learning and Systems (MLSys 2026)
-
λScale: Enabling Fast Scaling for Serverless Large Language Model Inference
Minchen Yu, Rui Yang, Chaobo Jia, Zhaoyuan Su, Sheng Yao, Tingfeng Lan, Yuchen Yang, Zirui Wang, Yue Cheng, Wei Wang, Ao Wang, Ruichuan Chen
Ninth Annual Conference on Machine Learning and Systems (MLSys 2026)
-
Towards Efficient LLM Storage Reduction via Tensor Deduplication and Delta Compression
Zirui Wang, Tingfeng Lan, Zhaoyuan Su, Juncheng Yang, Yue Cheng
23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 2026)
-
Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask
Zhaoyuan Su, Ammar Ahmed, Zirui Wang, Ali Anwar, Yue Cheng
50th International Conference on Very Large Data Bases (VLDB 2024)
-
Temporal Cue Guided Video Highlight Detection with Low-Rank Audio-Visual Fusion
Qinghao Ye*, Xiyue Shen*, Yuan Gao*, Zirui Wang*, Qi Bi, Ping Li, Guang Yang
International Conference on Computer Vision (ICCV 2021)
Academic Service
- 2026: Artifact Evaluation Committee, USENIX OSDI 2026
- 2026: Artifact Evaluation Committee, USENIX NSDI 2026 (Fall)
- 2026: Shadow Program Committee, EuroSys 2026
- 2026: Reviewer, ACM Transactions on Storage (TOS)
Outside Research
- Playing basketball
- Watching F1
- Watching CS2 esports
- Age of Empires IV (World Top 1000)
- Playing League of Legends