Memcached design on high performance RDMA capable interconnects J Jose, H Subramoni, M Luo, M Zhang, J Huang, M Wasi-ur-Rahman, ... 2011 International Conference on Parallel Processing, 743-752, 2011 | 266 | 2011 |
MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters H Wang, S Potluri, M Luo, AK Singh, S Sur, DK Panda Computer Science-Research and Development 26 (3), 257-266, 2011 | 195 | 2011 |
High-performance design of HBase with RDMA over InfiniBand J Huang, X Ouyang, J Jose, M Wasi-ur-Rahman, H Wang, M Luo, ... 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 114 | 2012 |
Optimized non-contiguous MPI datatype communication for GPU clusters: Design, implementation and evaluation with MVAPICH2 H Wang, S Potluri, M Luo, AK Singh, X Ouyang, S Sur, DK Panda 2011 IEEE International Conference on Cluster Computing, 308-316, 2011 | 80 | 2011 |
RDMA over Ethernet - a preliminary study H Subramoni, P Lai, M Luo, DK Panda 2009 IEEE International Conference on Cluster Computing and Workshops, 1-9, 2009 | 69 | 2009 |
Unifying UPC and MPI runtimes: experience with MVAPICH J Jose, M Luo, S Sur, DK Panda Proceedings of the Fourth Conference on Partitioned Global Address Space …, 2010 | 66 | 2010 |
SSD-assisted hybrid memory to accelerate memcached over high performance networks X Ouyang, NS Islam, R Rajachandrasekar, J Jose, M Luo, H Wang, ... 2012 41st International Conference on Parallel Processing, 470-479, 2012 | 40 | 2012 |
Supporting hybrid MPI and OpenSHMEM over InfiniBand: Design and performance evaluation J Jose, K Kandalla, M Luo, DK Panda 2012 41st International Conference on Parallel Processing, 219-228, 2012 | 39 | 2012 |
Congestion avoidance on manycore high performance computing systems M Luo, DK Panda, KZ Ibrahim, C Iancu Proceedings of the 26th ACM international conference on Supercomputing, 121-132, 2012 | 35 | 2012 |
Reducing network contention with mixed workloads on modern multicore clusters MJ Koop, M Luo, DK Panda 2009 IEEE International Conference on Cluster Computing and Workshops, 1-10, 2009 | 27 | 2009 |
UPC on MIC: Early experiences with native and symmetric modes M Luo, M Li, A Venkatesh, X Lu, DK Panda 7th International Conference on PGAS Programming Models, 198, 2013 | 23 | 2013 |
UPC Queues for scalable graph traversals: Design and evaluation on InfiniBand clusters J Jose, S Potluri, M Luo, S Sur, D Panda Conference on PGAS Programming Models, 2011 | 17 | 2011 |
Multi-threaded UPC runtime with network endpoints: Design alternatives and evaluation on multi-core architectures M Luo, J Jose, S Sur, DK Panda 2011 18th International Conference on High Performance Computing, 1-10, 2011 | 16 | 2011 |
Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems M Luo, X Lu, K Hamidouche, K Kandalla, DK Panda Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of …, 2014 | 13 | 2014 |
High performance alltoall and allgather designs for InfiniBand MIC clusters A Venkatesh, S Potluri, R Rajachandrasekar, M Luo, K Hamidouche, ... 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 11 | 2014 |
Multi-Threaded UPC Runtime for GPU to GPU communication over InfiniBand M Luo, H Wang, DK Panda The 6th Conference on Partitioned Global Address Space Programming Models, 2012 | 11 | 2012 |
High performance design and implementation of Nemesis communication layer for two-sided and one-sided MPI semantics in MVAPICH2 M Luo, S Potluri, P Lai, EP Mancini, H Subramoni, K Kandalla, S Sur, ... 2010 39th International Conference on Parallel Processing Workshops, 377-386, 2010 | 10 | 2010 |
Early evaluation of scalable fabric interface for PGAS programming models M Luo, K Seager, KS Murthy, CJ Archer, S Sur, S Hefty Proceedings of the 8th International Conference on Partitioned Global …, 2014 | 9 | 2014 |
Redesigning MPI shared memory communication for large multi-core architecture M Luo, H Wang, J Vienne, DK Panda Computer Science-Research and Development 28 (2), 137-146, 2013 | 6 | 2013 |
Multi-threaded UPC Runtime with Network Endpoints: Design Alternatives and Evaluation on InfiniBand Clusters M Luo, J Jose, S Sur, DK Panda International Conference on High Performance Computing (HiPC), 2011 | 4 | 2011 |