(at Lake Louise, Alberta, Canada)

Hanshi Sun   孙寒石

I am an M.S. student in ECE at Carnegie Mellon University, currently working on LLM efficiency with Prof. Beidi Chen. I also work closely with Prof. Andrea Zanette. I obtained my bachelor's degree at Southeast University, where I was a member of the PAttern Learning and Mining (PALM) Lab and was fortunate to be advised by Prof. Yi Zhou. Before that, I was a research intern at the University of Alberta, supervised by Prof. Xingyu Li.

hanshis [at] andrew [dot] cmu [dot] edu

@preminstrel
GitHub (preminstrel)
Google Scholar
LinkedIn
Blog

Publications

MLSys papers are highlighted.

   
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, and Beidi Chen

arXiv, 2024


arXiv / website / code / bibtex


High-Throughput Long-Context LLM Inference System

   
Fast Best-of-N Decoding via Speculative Rejection

Hanshi Sun*, Momin Haider*, Ruiqi Zhang*, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, and Andrea Zanette* (* for core authors)

Conference on Neural Information Processing Systems (NeurIPS), 2024


arXiv / website / code / bibtex


Fast Inference-time Alignment Algorithm

   
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Hanshi Sun, Zhuoming Chen, Xinyu Yang, Yuandong Tian, and Beidi Chen

Conference on Language Modeling (COLM), 2024


arXiv / website / code / demo / bibtex


Training-free Lossless Long Sequence Generation Acceleration

   
BMAD: Benchmarks for Medical Anomaly Detection

Jinan Bao, Hanshi Sun, Hanqiu Deng, Zhaoxiang Zhang, and Xingyu Li

Computer Vision and Pattern Recognition (CVPR) Workshop, 2024


arXiv / code / bibtex


This benchmark comprises six reorganized datasets from five medical domains (i.e., brain MRI, liver CT, retinal OCT, chest X-ray, and digital histopathology), three key evaluation metrics, and fourteen state-of-the-art anomaly detection (AD) algorithms.

Education

Carnegie Mellon University, Pittsburgh, United States (Aug 2023 - Dec 2024)

M.S. in Electrical & Computer Engineering
  • Overall GPA: 4.0/4.0
  • Teaching Assistant @ Introduction to Deep Learning, Introduction to Machine Learning

Southeast University, Nanjing, China (Sep 2019 - Jul 2023)

B.E. in Electronic Science and Technology
  • Overall GPA: 3.96/4.0, 93.69/100
  • 2021 & 2022 China National Scholarship

Professional Experience

ByteDance Inc., Seattle, United States (Jun 2024 - Present)

Machine Learning System Research Intern in Seed-Foundation Team
  • Introduced ShadowKV, a high-throughput long-context LLM inference system
  • ShadowKV supports up to 6x larger batch sizes and boosts throughput by up to 3.04x while maintaining accuracy on downstream tasks

Apple Inc., Shenzhen, China (Nov 2022 - Jun 2023)

R&D Intern in iPad System EE
  • Built a Python automated test framework that runs on multiple units and collects and analyzes logs
  • Handled issue reproduction, symptom capture, and hands-on debugging for coexistence testing
  • Created Flask web pages with a range of visualizations of the analyzed data

© Hanshi Sun 2024