Harsh Mehta
Staff Engineer, Google DeepMind
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 1344; Year: 2023
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
arXiv preprint arXiv:2206.04615, 2022
Cited by: 971; Year: 2022
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by: 292; Year: 2024
Transformer memory as a differentiable search index
Y Tay, VQ Tran, M Dehghani, J Ni, D Bahri, H Mehta, Z Qin, K Hui, Z Zhao, ...
Advances in Neural Information Processing Systems, 2022
Cited by: 200; Year: 2022
Long range language modeling via gated state spaces
H Mehta, A Gupta, A Cutkosky, B Neyshabur
International Conference on Learning Representations, 2022
Cited by: 156; Year: 2022
Momentum Improves Normalized SGD
A Cutkosky, H Mehta
International Conference on Machine Learning, 2020
Cited by: 116; Year: 2020
Transferable representation learning in vision-and-language navigation
H Huang, V Jain, H Mehta, A Ku, G Magalhaes, J Baldridge, E Ie
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by: 94; Year: 2019
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
A Cutkosky, H Mehta
Advances in Neural Information Processing Systems, 2021
Cited by: 48; Year: 2021
Large scale transfer learning for differentially private image classification
H Mehta, A Thakurta, A Kurakin, A Cutkosky
Transactions on Machine Learning Research, 2022
Cited by: 41; Year: 2022
Retouchdown: Adding touchdown to streetlearn as a shareable resource for language grounding tasks in street view
H Mehta, Y Artzi, J Baldridge, E Ie, P Mirowski
arXiv preprint arXiv:2001.03671, 2020
Cited by: 35; Year: 2020
Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
A Cutkosky, H Mehta, F Orabona
International Conference on Machine Learning, 2023
Cited by: 33; Year: 2023
Multi-modal discriminative model for vision-and-language navigation
H Huang, V Jain, H Mehta, J Baldridge, E Ie
arXiv preprint arXiv:1905.13358, 2019
Cited by: 29; Year: 2019
Extreme Memorization via Scale of Initialization
H Mehta, A Cutkosky, B Neyshabur
International Conference on Learning Representations, 2021
Cited by: 17; Year: 2021
Simplifying and understanding state space models with diagonal linear rnns
A Gupta, H Mehta, J Berant
arXiv preprint arXiv:2212.00768, 2022
Cited by: 15; Year: 2022
When, why and how much? adaptive learning rate scheduling by refinement
A Defazio, A Cutkosky, H Mehta, K Mishchenko
arXiv preprint arXiv:2310.07831, 2023
Cited by: 9; Year: 2023
Mechanic: A Learning Rate Tuner
A Cutkosky, A Defazio, H Mehta
Advances in Neural Information Processing Systems, 2023
Cited by: 9; Year: 2023
Towards large scale transfer learning for differentially private image classification
H Mehta, AG Thakurta, A Kurakin, A Cutkosky
Transactions on Machine Learning Research, 2023
Cited by: 8; Year: 2023
VALAN: vision and language agent navigation
L Lansing, V Jain, H Mehta, H Huang, E Ie
arXiv preprint arXiv:1912.03241, 2019
Cited by: 8; Year: 2019
Differentially Private Image Classification from Features
H Mehta, W Krichene, A Thakurta, A Kurakin, A Cutkosky
Transactions on Machine Learning Research, 2022
Cited by: 7; Year: 2022
The Road Less Scheduled
A Defazio, H Mehta, K Mishchenko, A Khaled, A Cutkosky
arXiv preprint arXiv:2405.15682, 2024
Cited by: 6; Year: 2024