Jaewoong Sim
Title
Cited by
Year
Can FPGAs beat GPUs in accelerating next-generation deep neural networks?
E Nurvitadhi, G Venkatesh, J Sim, D Marr, R Huang, J Ong Gee Hock, ...
Proceedings of the 2017 ACM/SIGDA International Symposium on Field …, 2017
Cited by: 349; Year: 2017
A performance analysis framework for identifying potential benefits in GPGPU applications
J Sim, A Dasgupta, H Kim, R Vuduc
ACM SIGPLAN Notices 47 (8), 11-22, 2012
Cited by: 234; Year: 2012
Accelerating binarized neural networks: Comparison of FPGA, CPU, GPU, and ASIC
E Nurvitadhi, D Sheffield, J Sim, A Mishra, G Venkatesh, D Marr
2016 International Conference on Field-Programmable Technology (FPT), 77-84, 2016
Cited by: 217; Year: 2016
GraphPIM: Enabling instruction-level PIM offloading in graph computing frameworks
L Nai, R Hadidi, J Sim, H Kim, P Kumar, H Kim
2017 IEEE International Symposium on High Performance Computer Architecture …, 2017
Cited by: 180; Year: 2017
Accelerating recurrent neural networks in analytics servers: Comparison of FPGA, CPU, GPU, and ASIC
E Nurvitadhi, J Sim, D Sheffield, A Mishra, S Krishnan, D Marr
2016 26th International Conference on Field Programmable Logic and …, 2016
Cited by: 134; Year: 2016
Transparent hardware management of stacked DRAM as part of memory
J Sim, AR Alameldeen, Z Chishti, C Wilkerson, H Kim
2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, 13-24, 2014
Cited by: 120; Year: 2014
A mostly-clean DRAM cache for effective hit speculation and self-balancing dispatch
J Sim, GH Loh, H Kim, M O'Connor, M Thottethodi
2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 247-257, 2012
Cited by: 115; Year: 2012
High performance binary neural networks on the Xeon+ FPGA™ platform
DJM Moss, E Nurvitadhi, J Sim, A Mishra, D Marr, S Subhaschandra, ...
2017 27th International Conference on Field Programmable Logic and …, 2017
Cited by: 66; Year: 2017
Resilient die-stacked DRAM caches
J Sim, GH Loh, V Sridharan, M O'Connor
ACM SIGARCH Computer Architecture News 41 (3), 416-427, 2013
Cited by: 66; Year: 2013
MacSim: A CPU-GPU heterogeneous simulation framework user guide
H Kim, J Lee, NB Lakshminarayana, J Sim, J Lim, T Pho
Georgia Institute of Technology, 2012
Cited by: 66; Year: 2012
A customizable matrix multiplication framework for the Intel HARPv2 Xeon+FPGA platform: A deep learning case study
DJM Moss, S Krishnan, E Nurvitadhi, P Ratuszniak, C Johnson, J Sim, ...
Proceedings of the 2018 ACM/SIGDA International Symposium on Field …, 2018
Cited by: 63; Year: 2018
FLEXclusion: Balancing cache capacity and on-chip bandwidth via flexible exclusion
J Sim, J Lee, MK Qureshi, H Kim
ACM SIGARCH Computer Architecture News 40 (3), 321-332, 2012
Cited by: 48; Year: 2012
BSSync: Processing near memory for machine learning workloads with bounded staleness consistency models
JH Lee, J Sim, H Kim
2015 International Conference on Parallel Architecture and Compilation (PACT …, 2015
Cited by: 42; Year: 2015
Why compete when you can work together: FPGA-ASIC integration for persistent RNNs
E Nurvitadhi, D Kwon, A Jafari, A Boutros, J Sim, P Tomson, H Sumbul, ...
2019 IEEE 27th Annual International Symposium on Field-Programmable Custom …, 2019
Cited by: 24; Year: 2019
Partitioning caches for sub-entities in computing devices
GH Loh, J Sim
US Patent 9,098,417, 2015
Cited by: 21; Year: 2015
Batch-aware unified memory management in GPUs for irregular workloads
H Kim, J Sim, P Gera, R Hadidi, H Kim
Proceedings of the Twenty-Fifth International Conference on Architectural …, 2020
Cited by: 14; Year: 2020
Method and apparatus for implementing a heterogeneous memory subsystem
CB Wilkerson, AR Alameldeen, ZA Chishti, J Sim
US Patent 9,472,248, 2016
Cited by: 14; Year: 2016
Dirty cacheline duplication
GH Loh, VK Sridharan, JM O'Connor, J Sim
US Patent 9,229,803, 2016
Cited by: 14; Year: 2016
CoolPIM: Thermal-aware source throttling for efficient PIM instruction offloading
L Nai, R Hadidi, H Xiao, H Kim, J Sim, H Kim
2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018
Cited by: 11; Year: 2018
Bypassing memory requests to a main memory
J Sim, GH Loh
US Patent 8,996,818, 2015
Cited by: 9; Year: 2015