Hassan Akbari

Cited by

	All	Since 2019
Citations	1483	1465
h-index	9	9
i10-index	9	9

Citations per year

Year	Citations
2017	6
2018	11
2019	59
2020	81
2021	149
2022	305
2023	586
2024	279

Public access

8 articles available, 0 articles not available (based on funding mandates)

Co-authors

Shih-Fu Chang - Professor of Electrical Engineering and Computer Science, Columbia University - Verified email at columbia.edu
Nima Mesgarani - Associate Professor, Columbia University - Verified email at ee.columbia.edu
Jose L Herrero - Assistant Professor, Feinstein Institutes for Medical Research, New York - Verified email at northwell.edu
Bahar Khalighinejad - Columbia University - Verified email at columbia.edu
Carl Vondrick - Associate Professor, Columbia University - Verified email at columbia.edu
Svebor Karaman - Research Manager at Dataminr - Verified email at dataminr.com
Mohammad B. SHAMSOLLAHI - Sharif University of Technology, Tehran, Iran - Verified email at sharif.edu
Ronald Phlypo - Associate professor, ViBS @ GIPSA-Lab, Grenoble INP - Verified email at grenoble-inp.fr

Hassan Akbari

Senior Research Scientist, Google

Verified email at google.com - Homepage

Multimodal Understanding, Self-Supervised Learning, Deep Learning, Computer Vision


Title	Authors	Venue	Cited by	Year
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text	H Akbari, L Yuan, R Qian, WH Chuang, SF Chang, Y Cui, B Gong	Advances in Neural Information Processing Systems, 2021, 2021	523	2021
Pali: A jointly-scaled multilingual language-image model	X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ...	arXiv preprint arXiv:2209.06794, 2022	398	2022
Towards reconstructing intelligible speech from the human auditory cortex	H Akbari, B Khalighinejad, JL Herrero, AD Mehta, N Mesgarani	Scientific reports 9 (1), 874, 2019	229	2019
Lip2audspec: Speech reconstruction from silent lip movements video	H Akbari, H Arora, L Cao, N Mesgarani	2018 IEEE international conference on acoustics, speech and signal …, 2018	106	2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding	H Akbari, S Karaman, S Bhargava, B Chen, C Vondrick, SF Chang	Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019	82	2019
Estimating and interpreting nonlinear receptive field of sensory neural responses with deep neural network models	H Akbari, M Keshishian, B Khalighinejad, JL Herrero, AD Mehta, ...	Elife 9, e53445, 2020	55	2020
Videopoet: A large language model for zero-shot video generation	D Kondratyuk, L Yu, X Gu, J Lezama, J Huang, R Hornung, H Adam, ...	arXiv preprint arXiv:2312.14125, 2023	28	2023
Fetal ECG extraction using πTucker decomposition	H Akbari, MB Shamsollahi, R Phlypo	2015 International Conference on Systems, Signals and Image Processing …, 2015	20	2015
GAIA-A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System.	T Zhang, A Subburathinam, G Shi, L Huang, D Lu, X Pan, M Li, B Zhang, ...	TAC, 2018	14	2018
GAIA at SM-KBP 2019-A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System	M Li, Y Lin, A Subburathinam, S Whitehead, X Pan, D Lu, Q Wang, ...		6	2019
A robust FCM algorithm for image segmentation based on spatial information and Total Variation	H Akbari, HM Kalkhoran, E Fatemizadeh	2015 9th Iranian Conference on Machine Vision and Image Processing (MVIP …, 2015	6	2015
Alternating gradient descent and mixture-of-experts for integrated multimodal perception	H Akbari, D Kondratyuk, Y Cui, R Hornung, H Wang, H Adam	Advances in Neural Information Processing Systems 36, 2024	4	2024
Scaling multimodal pre-training via cross-modality gradient harmonization	J Wu, Y Liang, H Akbari, Z Wang, C Yu	Advances in Neural Information Processing Systems 35, 36161-36173, 2022	4	2022
Face-speech bridging by cycle video/audio reconstruction	HV Joze, H Akbari	US Patent 10,931,976, 2021	4	2021
Neuro-symbolic representations for video captioning: A case for leveraging inductive biases for vision and language	H Akbari, H Palangi, J Yang, S Rao, A Celikyilmaz, R Fernandez, ...	arXiv preprint arXiv:2011.09530, 2020	3	2020
Modality Bridging and Unified Multimodal Understanding	H Akbari	Columbia University, 2022	1	2022
Time marking chapters in media items at a platform using machine-learning	C Gu, WH Chuang, MH Tsai, J Yang, J Zhang, H Zhou, H Akbari	US Patent App. 18/244,625, 2023		2023
Time marking chapters in media items at a platform using machine-learning	C Gu, WH Chuang, MH Tsai, J Yang, J Zhang, H Zhou, H Akbari	US Patent 11,758,233, 2023		2023

Articles 1–18