Xin Jin

Associate Professor
School of Computer Science
Peking University

Email: xinjinpku (at) pku (dot) edu (dot) cn

I am an Associate Professor in the School of Computer Science at Peking University. I work on computer systems and networking. My research has received the USENIX NSDI Best Paper Award (2018) and the USENIX FAST Best Paper Award (2019).

Research

I am broadly interested in computer systems and networking. My research currently focuses on designing and building systems for cloud computing and large language models.

Current Projects

Serverless Computing

Training and Serving Large Language Models

Network Testing and Verification

Disaggregated Storage with RDMA and DPUs

Recent Publications

[SOSP 24] LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism

[OSDI 24] DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving

[OSDI 24] dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving

[OSDI 24] Burstable Cloud Block Storage with Data Processing Units

[NSDI 24] Jolteon: Unleashing the Promise of Serverless for Serverless Workflows

[NSDI 24] Fast Vector Query Processing for Large Datasets Beyond GPU Memory with Reordered Pipelining

[NSDI 24] MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

[SOSP 23] Halfmoon: Log-Optimal Fault-Tolerant Stateful Serverless Computing

[SOSP 23] Automated Verification of an In-Production DNS Authoritative Engine

[SOSP 23] Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates

[SIGCOMM 23] Ditto: Efficient Serverless Analytics with Elastic Parallelism

[SIGCOMM 23] Klotski: Efficient and Safe Network Migration of Large Production Datacenters

[SIGCOMM 23] Understanding the Micro-Behaviors of Hardware Offloaded Network Stacks with Lumina

[SIGCOMM 23] XRON: A Hybrid Elastic Cloud Overlay Network for Video Conferencing at Planetary Scale

[OSDI 23] AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

[NSDI 23] Transparent GPU Sharing in Container Clouds for Deep Learning Workloads

[NSDI 23] Fast, Approximate Vector Queries on Very Large Unstructured Datasets