Hassan Akbari
Senior Research Scientist, Google
Verified email at google.com
Title
Cited by
Year
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
H Akbari, L Yuan, R Qian, WH Chuang, SF Chang, Y Cui, B Gong
Advances in Neural Information Processing Systems, 2021
Cited by: 523; Year: 2021
PaLI: A jointly-scaled multilingual language-image model
X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ...
arXiv preprint arXiv:2209.06794, 2022
Cited by: 398; Year: 2022
Towards reconstructing intelligible speech from the human auditory cortex
H Akbari, B Khalighinejad, JL Herrero, AD Mehta, N Mesgarani
Scientific Reports 9 (1), 874, 2019
Cited by: 229; Year: 2019
Lip2AudSpec: Speech reconstruction from silent lip movements video
H Akbari, H Arora, L Cao, N Mesgarani
2018 IEEE international conference on acoustics, speech and signal …, 2018
Cited by: 106; Year: 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
H Akbari, S Karaman, S Bhargava, B Chen, C Vondrick, SF Chang
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Cited by: 82; Year: 2019
Estimating and interpreting nonlinear receptive field of sensory neural responses with deep neural network models
H Akbari, M Keshishian, B Khalighinejad, JL Herrero, AD Mehta, ...
eLife 9, e53445, 2020
Cited by: 55; Year: 2020
VideoPoet: A large language model for zero-shot video generation
D Kondratyuk, L Yu, X Gu, J Lezama, J Huang, R Hornung, H Adam, ...
arXiv preprint arXiv:2312.14125, 2023
Cited by: 28; Year: 2023
Fetal ECG extraction using πTucker decomposition
H Akbari, MB Shamsollahi, R Phlypo
2015 International Conference on Systems, Signals and Image Processing …, 2015
Cited by: 20; Year: 2015
GAIA-A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System.
T Zhang, A Subburathinam, G Shi, L Huang, D Lu, X Pan, M Li, B Zhang, ...
TAC, 2018
Cited by: 14; Year: 2018
GAIA at SM-KBP 2019-A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System
M Li, Y Lin, A Subburathinam, S Whitehead, X Pan, D Lu, Q Wang, ...
Cited by: 6; Year: 2019
A robust FCM algorithm for image segmentation based on spatial information and Total Variation
H Akbari, HM Kalkhoran, E Fatemizadeh
2015 9th Iranian Conference on Machine Vision and Image Processing (MVIP …, 2015
Cited by: 6; Year: 2015
Alternating gradient descent and mixture-of-experts for integrated multimodal perception
H Akbari, D Kondratyuk, Y Cui, R Hornung, H Wang, H Adam
Advances in Neural Information Processing Systems 36, 2024
Cited by: 4; Year: 2024
Scaling multimodal pre-training via cross-modality gradient harmonization
J Wu, Y Liang, H Akbari, Z Wang, C Yu
Advances in Neural Information Processing Systems 35, 36161-36173, 2022
Cited by: 4; Year: 2022
Face-speech bridging by cycle video/audio reconstruction
HV Joze, H Akbari
US Patent 10,931,976, 2021
Cited by: 4; Year: 2021
Neuro-symbolic representations for video captioning: A case for leveraging inductive biases for vision and language
H Akbari, H Palangi, J Yang, S Rao, A Celikyilmaz, R Fernandez, ...
arXiv preprint arXiv:2011.09530, 2020
Cited by: 3; Year: 2020
Modality Bridging and Unified Multimodal Understanding
H Akbari
Columbia University, 2022
Cited by: 1; Year: 2022
Time marking chapters in media items at a platform using machine-learning
C Gu, WH Chuang, MH Tsai, J Yang, J Zhang, H Zhou, H Akbari
US Patent App. 18/244,625, 2023
Year: 2023
Time marking chapters in media items at a platform using machine-learning
C Gu, WH Chuang, MH Tsai, J Yang, J Zhang, H Zhou, H Akbari
US Patent 11,758,233, 2023
Year: 2023