Shigang Li

Cited by

	All	Since 2019
Citations	1223	1055
h-index	17	17
i10-index	28	26

Citations per year

Year	Citations
2013	13
2014	20
2015	15
2016	16
2017	39
2018	56
2019	95
2020	71
2021	140
2022	255
2023	368
2024	124

Public access

28 articles

12 articles

available

not available

Based on funding mandates

Co-authors

Torsten Hoefler	Professor of Computer Science at ETH Zurich	Verified email at inf.ethz.ch
Yunquan Zhang	Professor of Institute of Computing Technology, CAS	Verified email at ict.ac.cn
Tal Ben-Nun	Lawrence Livermore National Laboratory	Verified email at llnl.gov
Nikoli Dryden	Lawrence Livermore National Laboratory	Verified email at llnl.gov
Salvatore Di Girolamo	NVIDIA	Verified email at inf.ethz.ch
Dan Alistarh	IST Austria	Verified email at ist.ac.at
Liang Yuan	Institute of Computing Technology	Verified email at ict.ac.cn
Marc Snir	University of Illinois at Urbana Champaign	Verified email at illinois.edu
Daniele De Sensi	Tenure-Track Assistant Professor, Sapienza University of Rome	Verified email at di.uniroma1.it
Kazuki Osawa	Google DeepMind	Verified email at google.com

Shigang Li

Professor, Beijing University of Posts and Telecommunications

Verified email at bupt.edu.cn

Parallel Computing, Deep Learning Systems, Heterogeneous Computing, High Performance Computing, Computer Architecture


Title	Authors	Venue	Cited by	Year
Deep learning for post-processing ensemble weather forecasts	P Grönquist, C Yao, T Ben-Nun, N Dryden, P Dueben, S Li, T Hoefler	Philosophical Transactions of the Royal Society A 379 (2194), 20200092, 2021	143	2021
NUMA-aware shared-memory collective communication for MPI	S Li, T Hoefler, M Snir	Proceedings of the 22nd international symposium on High-performance parallel …, 2013	120	2013
Parallel processing systems for big data: a survey	Y Zhang, T Cao, S Li, X Tian, L Yuan, H Jia, AV Vasilakos	Proceedings of the IEEE 104 (11), 2114-2136, 2016	108	2016
Data Movement Is All You Need: A Case Study on Optimizing Transformers	A Ivanov, N Dryden, T Ben-Nun, S Li, T Hoefler	Proceedings of Machine Learning and Systems 3, 2021	107	2021
Chimera: efficiently training large-scale neural networks with bidirectional pipelines	S Li, T Hoefler	Proceedings of the International Conference for High Performance Computing …, 2021	73	2021
CAS‐ESM 2: Description and climate simulation performance of the Chinese Academy of Sciences (CAS) Earth System Model (ESM) version 2	H Zhang, M Zhang, J Jin, K Fei, D Ji, C Wu, J Zhu, J He, Z Chai, J Xie, ...	Journal of Advances in Modeling Earth Systems, e2020MS002210, 2020	69	2020
Taming unbalanced training workloads in deep learning with partial collective operations	S Li, T Ben-Nun, SD Girolamo, D Alistarh, T Hoefler	Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of …, 2020	57	2020
Asynchronous Decentralized SGD with Quantized and Local Updates	G Nadiradze, A Sabour, P Davies, S Li, D Alistarh	Advances in Neural Information Processing Systems 34, 2021	41*	2021
Flare: flexible in-network allreduce	D De Sensi, S Di Girolamo, S Ashkboos, S Li, T Hoefler	Proceedings of the International Conference for High Performance Computing …, 2021	33	2021
Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods	T Yao, J Wang, H Wu, P Zhang, S Li, K Xu, X Liu, X Chi	IEEE Transactions on Sustainable Energy, 2021	33	2021
Improved MPI collectives for MPI processes in shared address spaces	S Li, T Hoefler, C Hu, M Snir	Cluster Computing 17 (4), 1139-1155, 2014	33	2014
Near-optimal sparse allreduce for distributed deep learning	S Li, T Hoefler	Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of …, 2022	30	2022
Cache-oblivious MPI all-to-all communications based on Morton order	S Li, Y Zhang, T Hoefler	IEEE Transactions on Parallel and Distributed Systems, 2018	28	2018
A photovoltaic power output dataset: Multi-source photovoltaic power output dataset with Python toolkit	T Yao, J Wang, H Wu, P Zhang, S Li, Y Wang, X Chi, M Shi	Solar Energy 230, 122-130, 2021	21	2021
Kernel optimization for short-range molecular dynamics	C Hu, X Wang, J Li, X He, S Li, Y Feng, S Yang, H Bai	Computer Physics Communications, 2016	21	2016
Efficient parallel optimizations of a high-performance SIFT on GPUs	Z Li, H Jia, Y Zhang, S Liu, S Li, X Wang, H Zhang	Journal of Parallel and Distributed Computing, 2018	20	2018
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer	S Li, B Wu, Y Zhang, X Wang, J Li, C Hu, J Wang, Y Feng, N Nie	Proceedings of the 47th International Conference on Parallel Processing, 47, 2018	18	2018
Why Dataset Properties Bound the Scalability of Parallel Machine Learning Training Algorithms	D Cheng, S Li, Z Hanping, F Xia, Y Zhang	IEEE Transactions on Parallel and Distributed Systems, 2021	16*	2021
OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight	K Li, H Shang, Y Zhang, S Li, B Wu, D Wang, L Zhang, F Li, D Chen, ...	Proceedings of the International Conference for High Performance Computing …, 2019	16	2019
A Cross-Platform SpMV Framework on Many-Core Architectures	Y Zhang, S Li, S Yan, H Zhou	ACM Transactions on Architecture and Code Optimization (TACO) 13 (4), 33, 2016	15	2016
