XLNet: Generalized autoregressive pretraining for language understanding. Z Yang, Z Dai, Y Yang, J Carbonell, RR Salakhutdinov, QV Le. Advances in Neural Information Processing Systems 32, 2019. Cited by 8168.
Transformer-XL: Attentive language models beyond a fixed-length context. Z Dai, Z Yang, Y Yang, J Carbonell, QV Le, R Salakhutdinov. arXiv preprint arXiv:1901.02860, 2019. Cited by 3426.
Unsupervised data augmentation for consistency training. Q Xie, Z Dai, E Hovy, T Luong, Q Le. Advances in Neural Information Processing Systems 33, 6256-6268, 2020. Cited by 1924.
CoAtNet: Marrying convolution and attention for all data sizes. Z Dai, H Liu, QV Le, M Tan. Advances in Neural Information Processing Systems 34, 3965-3977, 2021. Cited by 806.
Meta pseudo labels. H Pham, Z Dai, Q Xie, QV Le. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021. Cited by 609.
Good semi-supervised learning that requires a bad GAN. Z Dai, Z Yang, F Yang, WW Cohen, RR Salakhutdinov. Advances in Neural Information Processing Systems 30, 2017. Cited by 527.
SimVLM: Simple visual language model pretraining with weak supervision. Z Wang, J Yu, AW Yu, Z Dai, Y Tsvetkov, Y Cao. arXiv preprint arXiv:2108.10904, 2021. Cited by 522.
Characterizing and avoiding negative transfer. Z Wang, Z Dai, B Póczos, J Carbonell. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019. Cited by 418.
Breaking the softmax bottleneck: A high-rank RNN language model. Z Yang, Z Dai, R Salakhutdinov, WW Cohen. arXiv preprint arXiv:1711.03953, 2017. Cited by 368.
Pay attention to MLPs. H Liu, Z Dai, D So, QV Le. Advances in Neural Information Processing Systems 34, 9204-9215, 2021. Cited by 361.
Controllable invariance through adversarial feature learning. Q Xie, Z Dai, Y Du, E Hovy, G Neubig. Advances in Neural Information Processing Systems 30, 2017. Cited by 271.
Unsupervised data augmentation. Q Xie, Z Dai, E Hovy, MT Luong, QV Le. arXiv preprint arXiv:1904.12848, 2019. Cited by 249.
SwitchOut: An efficient data augmentation algorithm for neural machine translation. X Wang, H Pham, Z Dai, G Neubig. arXiv preprint arXiv:1808.07512, 2018. Cited by 204.
CFO: Conditional focused neural question answering with large-scale knowledge bases. Z Dai, L Li, W Xu. arXiv preprint arXiv:1606.01994, 2016. Cited by 179.
Funnel-Transformer: Filtering out sequential redundancy for efficient language processing. Z Dai, G Lai, Y Yang, Q Le. Advances in Neural Information Processing Systems 33, 4271-4282, 2020. Cited by 160.
An interpretable knowledge transfer model for knowledge base completion. Q Xie, X Ma, Z Dai, E Hovy. arXiv preprint arXiv:1704.05908, 2017. Cited by 117.
Calibrating energy-based generative adversarial networks. Z Dai, A Almahairi, P Bachman, E Hovy, A Courville. arXiv preprint arXiv:1702.01691, 2017. Cited by 105.
Transformer quality in linear time. W Hua, Z Dai, H Liu, Q Le. International Conference on Machine Learning, 9099-9117, 2022. Cited by 99.
Combined scaling for zero-shot transfer learning. H Pham, Z Dai, G Ghiasi, K Kawaguchi, H Liu, AW Yu, J Yu, YT Chen, … Neurocomputing 555, 126658, 2023. Cited by 98.