Publications

Publications by category in reverse chronological order.

peer reviewed

2024

  1. CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
    Yuhan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, Yuyang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, and Junchen Jiang
    In Proceedings of the ACM SIGCOMM 2024 Conference 2024
  2. Eloquent: A More Robust Transmission Scheme for LLM Token Streaming
    Hanchen Li, Yuhan Liu, Yihua Cheng, Siddhant Ray, Kuntai Du, and Junchen Jiang
    In Proceedings of the 2024 SIGCOMM Workshop on Networks for AI Computing 2024

2022

  1. A New Hope for Network Model Generalization
    Alexander Dietmüller, Siddhant Ray, Romain Jacob, and Laurent Vanbever
    In Proceedings of the 21st ACM Workshop on Hot Topics in Networks 2022

2020

  1. Machine learning based cell association for mMTC 5G communication networks
    Siddhant Ray and Budhaditya Bhattacharyya
    International Journal of Mobile Network Design and Innovation 2020

preprints

  1. SwiftQueue: Optimizing Low-Latency Applications with Swift Packet Queuing
    Siddhant Ray, Xi Jiang, Jack Luo, Nick Feamster, and Junchen Jiang
    2024
  2. CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
    Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, and Junchen Jiang
    2024
  3. A Constraint Based K-Shortest Path Searching Algorithm for Software Defined Networking
    Siddhant Ray
    2019
  4. A Comparative Analysis and Testing of Supervised Machine Learning Algorithms
    Siddhant Ray
    2018

posters

  1. Transformer-based Predictions for Sudden Network Changes (Poster)
    Siddhant Ray, Xi Jiang, Zhuohan Gu, Junchen Jiang, and Nick Feamster
    In 21st USENIX Symposium on Networked Systems Design and Implementation 2024