About

I am a third-year Ph.D. student at MMLab, CUHK, advised by Prof. Dahua Lin. My research interests lie in the broad area of MLSys, especially efficient large-scale DNN training and inference. Before joining CUHK, I received my Bachelor’s degree in Computer Science from the University of Chinese Academy of Sciences, where I was advised by Prof. Shiguang Shan.

Currently, I am working on efficient LLM serving systems. Feel free to drop me an email if you are interested in my research.

My detailed CV can be found here.

News


  • [Apr. 2024] I will attend NSDI ’24 in person in Santa Clara, CA. See you there!

Education




The Chinese University of Hong Kong
Aug. 2021 - July 2025 (Expected)
Ph.D. Candidate in the Department of Information Engineering



University of Chinese Academy of Sciences
Sep. 2016 - July 2020
B.E. in Computer Science and Technology

Experience


Catalyst, CMU
Research Intern, Apr. 2022 - May 2023
Advisors: Prof. Zhihao Jia, Dr. Minjia Zhang, Dr. Xupeng Miao
Cost-efficient DNN training and inference.

MMLab, CUHK
Research Assistant, Aug. 2020 - Apr. 2022
Advisors: Prof. Dahua Lin, Prof. Shengen Yan, Prof. Xiuhong Li
Automatic parallelization of DNN training.

MMLab, CUHK
Research Assistant, July 2019 - July 2020
Mentors: Prof. Dahua Lin, Xingcheng Zhang
Optimized the performance of large-scale data-parallel training. With sparse communication and system-level optimizations, we trained AlexNet in 1 minute on a cluster of 1,000 V100 GPUs with Parrots (a deep learning framework similar to PyTorch).

Publications


  • MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving
    Jiangfei Duan, Runyu Lu, Haojie Duanmu, Xiuhong Li, Xingcheng Zhang, Dahua Lin, Ion Stoica, and Hao Zhang
    arXiv Preprint, 2024
    [Paper]

  • Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning
    Chang Chen, Xiuhong Li, Qianchao Zhu, Jiangfei Duan, Peng Sun, Xingcheng Zhang, and Chao Yang
    In Proceedings of the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2024.

  • SpotServe: Serving Generative Large Language Models on Preemptible Instances
    Xupeng Miao*, Chunan Shi*, Jiangfei Duan, Xiaoli Xi, Dahua Lin, Bin Cui, and Zhihao Jia
    In Proceedings of the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2024.
    Distinguished Artifact Award
    [Paper], [Code]

  • Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances
    Jiangfei Duan*, Ziang Song*, Xupeng Miao*, Xiaoli Xi, Dahua Lin, Harry Xu, Minjia Zhang, and Zhihao Jia
    In Proceedings of the Symposium on Networked Systems Design and Implementation (NSDI), April 2024.
    [Paper], [Code]

  • Simulating the Performance of Distributed DNN Training
    Jiangfei Duan, Xiuhong Li, Ping Xu, Xingcheng Zhang, Shengen Yan, Yun Liang, and Dahua Lin
    arXiv Preprint, 2023
    [Paper], [Code]

Teaching


TA, IERG3050: Simulation and Statistical Analysis, Fall 2021, CUHK
TA, CSCI2100: Data Structures, Spring 2022, CUHK

Services


AEC Member: MLSys 2023

Awards


First-class Academic Scholarship, UCAS (top 5%), 2017, 2018
Tang Lixin Scholarship, 2019
Outstanding Graduate of Beijing, 2020
Outstanding Graduate of University of Chinese Academy of Sciences, 2020