Xiao Wang
Google DeepMind
Verified email at google.com
Title
Cited by
Year
PaLI: A jointly-scaled multilingual language-image model
X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ...
ICLR 2023 (Oral), 2022
Cited by 549, 2022
LiT: Zero-Shot Transfer with Locked-image Text Tuning
X Zhai, X Wang, B Mustafa, A Steiner, D Keysers, A Kolesnikov, L Beyer
CVPR 2022, 2021
Cited by 503, 2021
Scaling vision transformers to 22 billion parameters
M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ...
ICML 2023 (Oral), 2023
Cited by 401, 2023
Simple Open-Vocabulary Object Detection with Vision Transformers
M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ...
ECCV 2022, 2022
Cited by 391*, 2022
Measuring compositional generalization: A comprehensive method on realistic data
D Keysers, N Schärli, N Scales, H Buisman, D Furrer, S Kashubin, ...
ICLR 2020, 2019
Cited by 368, 2019
PaLI-X: On scaling up a multilingual vision and language model
X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ...
CVPR 2024, 2023
Cited by 128, 2023
PaLI-3 vision language models: Smaller, faster, stronger
X Chen, X Wang, L Beyer, A Kolesnikov, J Wu, P Voigtlaender, B Mustafa, ...
arXiv preprint arXiv:2310.09199, 2023
Cited by 50, 2023
PaliGemma: A versatile 3B VLM for transfer
L Beyer, A Steiner, AS Pinto, A Kolesnikov, X Wang, D Salz, M Neumann, ...
arXiv preprint arXiv:2407.07726, 2024
Cited by 26, 2024
Three Towers: Flexible Contrastive Learning with Pretrained Image Models
J Kossen, M Collier, B Mustafa, X Wang, X Zhai, L Beyer, A Steiner, ...
NeurIPS 2023, 2023
Cited by 7, 2023
No filter: Cultural and socioeconomic diversity in contrastive vision-language models
A Pouget, L Beyer, E Bugliarello, X Wang, AP Steiner, X Zhai, ...
NeurIPS 2024, 2024
Cited by 5, 2024
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?
I Alabdulmohsin, X Wang, A Steiner, P Goyal, A D'Amour, X Zhai
ICLR 2024, 2024
Cited by 5, 2024
A study of autoregressive decoders for multi-tasking in computer vision
L Beyer, B Wan, G Madan, F Pavetic, A Steiner, A Kolesnikov, AS Pinto, ...
arXiv preprint arXiv:2303.17376, 2023
Cited by 5, 2023
LocCa: Visual Pretraining with Location-aware Captioners
B Wan, M Tschannen, Y Xian, F Pavetic, I Alabdulmohsin, X Wang, ...
NeurIPS 2024, 2024
Cited by 2, 2024
Locked-Model Multimodal Contrastive Tuning
D Keysers, X Zhai, X Wang, L Beyer, B Mustafa, A Steiner, A Kolesnikov
US Patent App. 18/051,106, 2024
2024