Minjia Zhang
Title
Cited by
Year
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by: 1625 · Year: 2023
Zero-offload: Democratizing billion-scale model training
J Ren, S Rajbhandari, RY Aminabadi, O Ruwase, S Yang, M Zhang, D Li, ...
2021 USENIX Annual Technical Conference (USENIX ATC 21), 551-564, 2021
Cited by: 371 · Year: 2021
Zeroquant: Efficient and affordable post-training quantization for large-scale transformers
Z Yao, R Yazdani Aminabadi, M Zhang, X Wu, C Li, Y He
Advances in Neural Information Processing Systems 35, 27168-27183, 2022
Cited by: 338 · Year: 2022
Deepspeed-inference: enabling efficient inference of transformer models at unprecedented scale
RY Aminabadi, S Rajbhandari, AA Awan, C Li, D Li, E Zheng, O Ruwase, ...
SC22: International Conference for High Performance Computing, Networking …, 2022
Cited by: 278 · Year: 2022
Memcached design on high performance RDMA capable interconnects
J Jose, H Subramoni, M Luo, M Zhang, J Huang, M Wasi-ur-Rahman, ...
2011 International Conference on Parallel Processing, 743-752, 2011
Cited by: 267 · Year: 2011
DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale
S Rajbhandari, C Li, Z Yao, M Zhang, RY Aminabadi, AA Awan, J Rasley, ...
International conference on machine learning, 18332-18346, 2022
Cited by: 228 · Year: 2022
OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
G Ahdritz, N Bouatta, C Floristean, S Kadyan, Q Xia, W Gerecke, ...
Nature Methods, 1-11, 2024
Cited by: 190 · Year: 2024
Learning intrinsic sparse structures within long short-term memory
W Wen, Y He, S Rajbhandari, M Zhang, W Wang, F Liu, B Hu, Y Chen, ...
arXiv preprint arXiv:1709.05027, 2017
Cited by: 156 · Year: 2017
DeepCPU: Serving RNN-based Deep Learning Models 10x Faster
M Zhang, S Rajbhandari, W Wang, Y He
2018 USENIX Annual Technical Conference (USENIX ATC 18), 951-965, 2018
Cited by: 125 · Year: 2018
Model tells you what to discard: Adaptive KV cache compression for LLMs
S Ge, Y Zhang, L Liu, M Zhang, J Han, J Gao
arXiv preprint arXiv:2310.01801, 2023
Cited by: 107 · Year: 2023
Accelerating training of transformer-based language models with progressive layer dropping
M Zhang, Y He
Advances in neural information processing systems 33, 14011-14023, 2020
Cited by: 107 · Year: 2020
Valor: Efficient, software-only region conflict exceptions
S Biswas, M Zhang, MD Bond, B Lucia
ACM SIGPLAN Notices 50 (10), 241-259, 2015
Cited by: 74 · Year: 2015
Sentinel: Efficient tensor migration and allocation on heterogeneous memory systems for deep learning
J Ren, J Luo, K Wu, M Zhang, H Jeon, D Li
2021 IEEE International Symposium on High-Performance Computer Architecture …, 2021
Cited by: 67 · Year: 2021
Improving approximate nearest neighbor search through learned adaptive early termination
C Li, M Zhang, DG Andersen, Y He
Proceedings of the 2020 ACM SIGMOD International Conference on Management of …, 2020
Cited by: 59 · Year: 2020
Hm-ann: Efficient billion-point nearest neighbor search on heterogeneous memory
J Ren, M Zhang, D Li
Advances in Neural Information Processing Systems 33, 10672-10684, 2020
Cited by: 59 · Year: 2020
Octet: Capturing and controlling cross-thread dependences efficiently
MD Bond, M Kulkarni, M Cao, M Zhang, M Fathi Salmi, S Biswas, ...
ACM SIGPLAN Notices 48 (10), 693-712, 2013
Cited by: 58 · Year: 2013
Bamboo: Making preemptible instances resilient for affordable training of large DNNs
J Thorpe, P Zhao, J Eyolfson, Y Qiao, Z Jia, M Zhang, R Netravali, GH Xu
20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023
Cited by: 55 · Year: 2023
DeepSpeed-Chat: Easy, fast and affordable RLHF training of ChatGPT-like models at all scales
Z Yao, RY Aminabadi, O Ruwase, S Rajbhandari, X Wu, AA Awan, ...
arXiv preprint arXiv:2308.01320, 2023
Cited by: 53 · Year: 2023
Navigating with graph representations for fast and scalable decoding of neural language models
M Zhang, W Wang, X Liu, J Gao, Y He
Advances in neural information processing systems 31, 2018
Cited by: 53 · Year: 2018
Deepspeed ulysses: System optimizations for enabling training of extreme long sequence transformer models
SA Jacobs, M Tanaka, C Zhang, M Zhang, SL Song, S Rajbhandari, Y He
arXiv preprint arXiv:2309.14509, 2023
Cited by: 46 · Year: 2023