Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA G Bosilca, A Bouteiller, A Danalis, M Faverge, A Haidar, T Herault, ... 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 196* | 2011 |

Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers A Haidar, S Tomov, J Dongarra, NJ Higham SC18: International Conference for High Performance Computing, Networking …, 2018 | 145 | 2018 |

Accelerating numerical dense linear algebra calculations with GPUs J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ... Numerical Computations with GPUs, 3-28, 2014 | 107 | 2014 |

Performance, design, and autotuning of batched GEMM for GPUs A Abdelfattah, A Haidar, S Tomov, J Dongarra International Conference on High Performance Computing, 21-38, 2016 | 106 | 2016 |

Seismic wave modeling for seismic imaging J Virieux, S Operto, H Ben-Hadj-Ali, R Brossier, V Etienne, F Sourbier, ... The Leading Edge 28 (5), 538-544, 2009 | 101 | 2009 |

Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels A Haidar, H Ltaief, J Dongarra Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 72 | 2011 |

Batched matrix computations on hardware accelerators based on GPUs A Haidar, T Dong, P Luszczek, S Tomov, J Dongarra The International Journal of High Performance Computing Applications 29 (2 …, 2015 | 65 | 2015 |

High-performance tensor contractions for GPUs A Abdelfattah, M Baboulin, V Dobrev, J Dongarra, C Earl, J Falcou, ... Procedia Computer Science 80, 108-118, 2016 | 61 | 2016 |

Parallel programming models for dense linear algebra on heterogeneous systems J Dongarra, M Abalenkovs, A Abdelfattah, M Gates, A Haidar, J Kurzak, ... Supercomputing Frontiers and Innovations 2 (4), 67-86, 2015 | 58 | 2015 |

Investigating half precision arithmetic to accelerate dense linear system solvers A Haidar, P Wu, S Tomov, J Dongarra Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms …, 2017 | 56 | 2017 |

High-performance matrix-matrix multiplications of very small matrices I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ... European Conference on Parallel Processing, 659-671, 2016 | 56 | 2016 |

LU factorization of small matrices: Accelerating batched DGETRF on the GPU T Dong, A Haidar, P Luszczek, JA Harris, S Tomov, J Dongarra 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 …, 2014 | 50 | 2014 |

Parallel scalability study of hybrid preconditioners in three dimensions L Giraud, A Haidar, LT Watson Parallel Computing 34 (6-8), 363-379, 2008 | 48 | 2008 |

An improved parallel singular value algorithm and its implementation for multicore hardware A Haidar, J Kurzak, P Luszczek Proceedings of the International Conference on High Performance Computing …, 2013 | 46 | 2013 |

Sparse approximations of the Schur complement for parallel algebraic hybrid linear solvers in 3D L Giraud, A Haidar, Y Saad INRIA, 2010 | 45 | 2010 |

The singular value decomposition: Anatomy of optimizing an algorithm for extreme scale J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ... SIAM Review 60 (4), 808-865, 2018 | 44 | 2018 |

A framework for batched and GPU-resident factorization algorithms applied to block Householder transformations A Haidar, TT Dong, S Tomov, P Luszczek, J Dongarra International Conference on High Performance Computing, 31-47, 2015 | 43 | 2015 |

The design of fast and energy-efficient linear solvers: On the potential of half-precision arithmetic and iterative refinement techniques A Haidar, A Abdelfattah, M Zounon, P Wu, S Pranesh, S Tomov, ... International Conference on Computational Science, 586-600, 2018 | 42 | 2018 |

Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures A Haidar, H Ltaief, A YarKhan, J Dongarra Concurrency and Computation: Practice and Experience 24 (3), 305-321, 2012 | 42 | 2012 |

Three‐dimensional parallel frequency‐domain visco‐acoustic wave modelling based on a hybrid direct/iterative solver F Sourbier, A Haidar, L Giraud, H Ben‐Hadj‐Ali, S Operto, J Virieux Geophysical Prospecting 59 (Modelling Methods for Geophysical Imaging …, 2011 | 39 | 2011 |