Lingqi Zhang
Title
Cited by
Year
High accuracy digital image correlation powered by GPU-based parallel computing
L Zhang, T Wang, Z Jiang, Q Kemao, Y Liu, Z Liu, L Tang, S Dong
Optics and Lasers in Engineering 69, 7-12, 2015
Cited by 104, 2015
Matrix engines for high performance computing: A paragon of performance or grasping at straws?
J Domke, E Vatai, A Drozd, P Chen, Y Oyama, L Zhang, S Salaria, ...
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021
Cited by 33, 2021
Heterogeneous parallel computing accelerated iterative subpixel digital image correlation
JW Huang, LQ Zhang, ZY Jiang, SB Dong, W Chen, YP Liu, ZJ Liu, ...
Science China Technological Sciences 61, 74-85, 2018
Cited by 26, 2018
A study of single and multi-device synchronization methods in Nvidia GPUs
L Zhang, M Wahib, H Zhang, S Matsuoka
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020
Cited by 21, 2020
Scaling distributed deep learning workloads beyond the memory capacity with KARMA
M Wahib, H Zhang, TT Nguyen, A Drozd, J Domke, L Zhang, R Takano, ...
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by 20, 2020
PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead
L Zhang, C Liu, S Dong
Genes 10 (11), 886, 2019
Cited by 11, 2019
Understanding the overheads of launching CUDA kernels
L Zhang, M Wahib, S Matsuoka
ICPP19, 5-8, 2019
Cited by 11, 2019
At the locus of performance: A case study in enhancing cpus with copious 3d-stacked cache
J Domke, E Vatai, B Gerofi, Y Kodama, M Wahib, A Podobas, S Mittal, ...
arXiv preprint arXiv:2204.02235, 2022
Cited by 5, 2022
Persistent Kernels for Iterative Memory-bound GPU Applications
L Zhang, M Wahib, P Chen, J Meng, X Wang, S Matsuoka
arXiv preprint arXiv:2204.02064, 2022
Cited by 4, 2022
PERKS: a Locality-Optimized Execution Model for Iterative Memory-bound GPU Applications
L Zhang, M Wahib, P Chen, J Meng, X Wang, T Endo, S Matsuoka
Proceedings of the 37th International Conference on Supercomputing, 167-179, 2023
Cited by 3, 2023
Revisiting Temporal Blocking Stencil Optimizations
L Zhang, M Wahib, P Chen, J Meng, X Wang, T Endo, S Matsuoka
Proceedings of the 37th International Conference on Supercomputing, 251-263, 2023
Cited by 2, 2023
At the locus of performance: Quantifying the effects of copious 3D-stacked cache on HPC workloads
J Domke, E Vatai, B Gerofi, Y Kodama, M Wahib, A Podobas, S Mittal, ...
ACM Transactions on Architecture and Code Optimization 20 (4), 1-26, 2023
Cited by 1, 2023
Exploiting Scratchpad Memory for Deep Temporal Blocking: A case study for 2D Jacobian 5-point iterative stencil kernel (j2d5pt)
L Zhang, M Wahib, P Chen, J Meng, X Wang, T Endo, S Matsuoka
Proceedings of the 15th Workshop on General Purpose Processing Using GPU, 34-35, 2023
2023
A Study of Synchronization Methods in Modern GPUs
L Zhang, M Wahib, H Zhang, S Matsuoka
2019
Breaking the limitation of GPU Memory for Deep Learning Workloads
H Zhang, M Wahib, L Zhang, Y Tsuji, S Matsuoka
2019
GPU Accelerated High Accuracy Digital Volume Correlation
T Wang, L Zhang, Z Jiang, K Qian
International Digital Imaging Correlation Society: Proceedings of the First …, 2017
2017