Judging LLM-as-a-judge with MT-Bench and Chatbot Arena L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ... Advances in Neural Information Processing Systems 36, 46595-46623, 2023 | 1862* | 2023 |
Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ... See https://vicuna.lmsys.org (accessed 14 April 2023) 2 (3), 6, 2023 | 1803* | 2023 |
Efficient memory management for large language model serving with PagedAttention W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ... Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023 | 665 | 2023 |
cvc5: A versatile and industrial-strength SMT solver H Barbosa, C Barrett, M Brain, G Kremer, H Lachnitt, M Mann, ... International Conference on Tools and Algorithms for the Construction and …, 2022 | 439 | 2022 |
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ... International Conference on Machine Learning, 2023 | 227 | 2023 |
Chatbot Arena: An open platform for evaluating LLMs by human preference WL Chiang, L Zheng, Y Sheng, AN Angelopoulos, T Li, D Li, H Zhang, ... arXiv preprint arXiv:2403.04132, 2024 | 143 | 2024 |
H2O: Heavy-hitter oracle for efficient generative inference of large language models Z Zhang, Y Sheng, T Zhou, T Chen, L Zheng, R Cai, Z Song, Y Tian, C Ré, ... Advances in Neural Information Processing Systems 36, 2024 | 136 | 2024 |
How Long Can Context Length of Open-Source LLMs truly Promise? D Li, R Shao, A Xie, Y Sheng, L Zheng, J Gonzalez, I Stoica, X Ma, ... NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023 | 102* | 2023 |
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ... 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023 | 94 | 2023 |
LMSYS-Chat-1M: A large-scale real-world LLM conversation dataset L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ... arXiv preprint arXiv:2309.11998, 2023 | 68 | 2023 |
SLoRA: Scalable Serving of Thousands of LoRA Adapters Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ... Proceedings of Machine Learning and Systems 6, 296-311, 2024 | 46* | 2024 |
Subspace embedding and linear regression with Orlicz norm A Andoni, C Lin, Y Sheng, P Zhong, R Zhong International Conference on Machine Learning, 224-233, 2018 | 38 | 2018 |
Efficiently programming large language models using SGLang L Zheng, L Yin, Z Xie, J Huang, C Sun, CH Yu, S Cao, C Kozyrakis, ... arXiv preprint arXiv:2312.07104, 2023 | 35 | 2023 |
Distribution-free junta testing X Chen, Z Liu, RA Servedio, Y Sheng, J Xie STOC 2018, 2018 | 30* | 2018 |
Fairness in serving large language models Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo, JE Gonzalez, I Stoica 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024 | 17 | 2024 |
Clover: Closed-loop verifiable code generation C Sun, Y Sheng, O Padon, C Barrett AI Verification: First International Symposium, SAIV 2024, Montreal, QC …, 2024 | 14 | 2024 |
Towards Optimal Caching and Model Selection for Large Model Inference B Zhu, Y Sheng, L Zheng, C Barrett, M Jordan, J Jiao Advances in Neural Information Processing Systems 36, 2024 | 13* | 2024 |
Politeness for the theory of algebraic datatypes Y Sheng, Y Zohar, C Ringeissen, J Lange, P Fontaine, C Barrett International Joint Conference on Automated Reasoning, 238-255, 2020 | 12* | 2020 |
On the approximation of Nash equilibria in sparse win-lose games Z Liu, Y Sheng Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 9 | 2018 |
Reasoning about vectors using an SMT theory of sequences Y Sheng, A Nötzli, A Reynolds, Y Zohar, D Dill, W Grieskamp, J Park, ... International Joint Conference on Automated Reasoning, 125-143, 2022 | 8* | 2022 |