GPipe: Efficient training of giant neural networks using pipeline parallelism Y Huang, Y Cheng, A Bapna, O Firat, D Chen, M Chen, HJ Lee, J Ngiam, ... Advances in neural information processing systems 32, 2019 | 1280 | 2019 |
The best of both worlds: Combining recent advances in neural machine translation MX Chen, O Firat, A Bapna, M Johnson, W Macherey, G Foster, L Jones, ... arXiv preprint arXiv:1804.09849, 2018 | 498 | 2018 |
Massively multilingual neural machine translation in the wild: Findings and challenges N Arivazhagan, A Bapna, O Firat, D Lepikhin, M Johnson, M Krikun, ... arXiv preprint arXiv:1907.05019, 2019 | 338 | 2019 |
Gmail Smart Compose: Real-Time Assisted Writing MX Chen, BN Lee, G Bansal, Y Cao, S Zhang, J Lu, J Tsay, Y Wang, ... Proceedings of the 25th ACM SIGKDD International Conference on Knowledge …, 2019 | 194 | 2019 |
Lingvo: a modular and scalable framework for sequence-to-sequence modeling J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ... arXiv preprint arXiv:1902.08295, 2019 | 185 | 2019 |
Training deeper neural machine translation models with transparent attention A Bapna, MX Chen, O Firat, Y Cao, Y Wu arXiv preprint arXiv:1808.07561, 2018 | 116 | 2018 |
Leveraging monolingual data with self-supervision for multilingual neural machine translation A Siddhant, A Bapna, Y Cao, O Firat, M Chen, S Kudugunta, ... arXiv preprint arXiv:2005.04816, 2020 | 76 | 2020 |
Unsupervised deep Haar scattering on graphs X Chen, X Cheng, S Mallat Advances in Neural Information Processing Systems 27, 2014 | 65 | 2014 |
Predicting a user's next cell with supervised learning based on channel states X Chen, F Mériaux, S Valentin 2013 IEEE 14th workshop on signal processing advances in wireless …, 2013 | 53 | 2013 |
Deep Haar scattering networks X Cheng, X Chen, S Mallat Information and Inference: A Journal of the IMA 5 (2), 105-133, 2016 | 43 | 2016 |
Building machine translation systems for the next thousand languages A Bapna, I Caswell, J Kreutzer, O Firat, D van Esch, A Siddhant, M Niu, ... arXiv preprint arXiv:2205.03983, 2022 | 39 | 2022 |
Music genre classification using multiscale scattering and sparse representations X Chen, PJ Ramadge 2013 47th Annual Conference on Information Sciences and Systems (CISS), 1-6, 2013 | 29 | 2013 |
Towards the next 1000 languages in multilingual machine translation: Exploring the synergy between supervised and self-supervised learning A Siddhant, A Bapna, O Firat, Y Cao, MX Chen, I Caswell, X Garcia arXiv preprint arXiv:2201.03110, 2022 | 26 | 2022 |
Towards end-to-end in-image neural machine translation E Mansimov, M Stern, M Chen, O Firat, J Uszkoreit, P Jain arXiv preprint arXiv:2010.10648, 2020 | 13 | 2020 |
Collaborative representation, sparsity or nonlinearity: What is key to dictionary based classification? X Chen, PJ Ramadge 2014 IEEE International Conference on Acoustics, Speech and Signal …, 2014 | 13 | 2014 |
Faster transformer decoding: N-gram masked self-attention C Chelba, M Chen, A Bapna, N Shazeer arXiv preprint arXiv:2001.04589, 2020 | 12 | 2020 |
Rapid domain adaptation for machine translation with monolingual data M Mahdieh, MX Chen, Y Cao, O Firat arXiv preprint arXiv:2010.12652, 2020 | 8 | 2020 |
Feedback-controlled sequential lasso screening Y Wang, X Chen, PJ Ramadge arXiv preprint arXiv:1608.06010, 2016 | 6 | 2016 |
Sparse representation classification via sequential lasso screening Y Wang, X Chen, PJ Ramadge 2013 IEEE Global Conference on Signal and Information Processing, 1001-1004, 2013 | 6 | 2013 |
Learning with sparsity and scattering networks MX Chen Princeton University, 2015 | 1 | 2015 |