Mostafa Mahmoud
Title
Cited by
Year
Laconic deep learning inference acceleration
S Sharify, AD Lascorz, M Mahmoud, M Nikolic, K Siu, DM Stuart, Z Poulos, ...
2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture …, 2019
Cited by 44, 2019
Bit-tactical: A software/hardware approach to exploiting value and bit sparsity in neural networks
A Delmas Lascorz, P Judd, DM Stuart, Z Poulos, M Mahmoud, S Sharify, ...
Proceedings of the Twenty-Fourth International Conference on Architectural …, 2019
Cited by 40, 2019
Diffy: A Déjà vu-free differential deep neural network accelerator
M Mahmoud, K Siu, A Moshovos
2018 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018
Cited by 35, 2018
Memory requirements for convolutional neural network hardware accelerators
K Siu, DM Stuart, M Mahmoud, A Moshovos
2018 IEEE International Symposium on Workload Characterization (IISWC), 111-121, 2018
Cited by 32, 2018
Bit-tactical: Exploiting ineffectual computations in convolutional neural networks: Which, why, and how
A Delmas, P Judd, DM Stuart, Z Poulos, M Mahmoud, S Sharify, M Nikolic, ...
arXiv preprint arXiv:1803.03688, 2018
Cited by 19, 2018
Shapeshifter: Enabling fine-grain data width adaptation in deep learning
AD Lascorz, S Sharify, I Edo, DM Stuart, OM Awad, P Judd, M Mahmoud, ...
Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019
Cited by 15, 2019
IDEAL: Image denoising accelerator
M Mahmoud, B Zheng, AD Lascorz, F Heide, J Assouline, P Boucher, ...
2017 50th Annual IEEE/ACM International Symposium on Microarchitecture …, 2017
Cited by 12, 2017
Characterizing sources of ineffectual computations in deep learning networks
M Nikolić, M Mahmoud, A Moshovos, Y Zhao, R Mullins
2019 IEEE International Symposium on Performance Analysis of Systems and …, 2019
Cited by 11, 2019
Tensordash: Exploiting sparsity to accelerate deep neural network training
M Mahmoud, I Edo, AH Zadeh, OM Awad, G Pekhimenko, J Albericio, ...
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture …, 2020
Cited by 7, 2020
Laconic deep learning computing
S Sharify, M Mahmoud, AD Lascorz, M Nikolic, A Moshovos
arXiv preprint arXiv:1805.04513, 2018
Cited by 3, 2018
Hybrid limited-pointer linked-list cache directory and cache coherence protocol
M Mahmoud, A Wassal
JEC-ECC 2013, 77 - 82, 2013
Cited by 2, 2013
FPRaker: A Processing Element For Accelerating Neural Network Training
OM Awad, M Mahmoud, I Edo, AH Zadeh, C Bannon, A Jayarajan, ...
arXiv preprint arXiv:2010.08065, 2020
Cited by 1, 2020
Tensordash: Exploiting sparsity to accelerate deep neural network training and inference
M Mahmoud, I Edo, AH Zadeh, OM Awad, G Pekhimenko, J Albericio, ...
arXiv preprint arXiv:2009.00748, 2020
Cited by 1, 2020
Accelerating Image-Sensor-Based Deep Learning Applications
M Mahmoud, DM Stuart, Z Poulos, AD Lascorz, P Judd, S Sharify, ...
IEEE Micro 39 (5), 26-35, 2019
Cited by 1, 2019
Identifying and Exploiting Ineffectual Computations to Enable Hardware Acceleration of Deep Learning
A Moshovos, J Albericio, P Judd, A Delmas, S Sharify, M Mahmoud, ...
2018 16th IEEE International New Circuits and Systems Conference (NEWCAS …, 2018
Cited by 1, 2018
Memory controller design under cloud workloads
M Mahmoud, A Moshovos
2016 IEEE International Symposium on Workload Characterization (IISWC), 1-11, 2016
Cited by 1, 2016
Building an on-chip deep learning memory hierarchy brick by brick: late breaking results
IE Vivancos, S Sharify, M Nikolic, C Bannon, M Mahmoud, AD Lascorz, ...
Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference, 1-2, 2020
2020
Method and device with convolution neural network processing
M Mahmoud, A Moshovos
US Patent App. 16/549,002, 2020
2020
A novel 3D crossbar-based chip multiprocessor architecture
M Mahmoud, A Wassal
JEC-ECC 2013, 83 - 87, 2013
2013