Is someone speaking? Exploring long-term temporal features for audio-visual active speaker detection R Tao, Z Pan, RK Das, X Qian, MZ Shou, H Li Proceedings of the 29th ACM International Conference on Multimedia, 3927-3935, 2021 | 77 | 2021 |
Multi-speaker tracking from an audio–visual sensing device X Qian, A Brutti, O Lanz, M Omologo, A Cavallaro IEEE Transactions on Multimedia 21 (10), 2576-2588, 2019 | 38 | 2019 |
3D audio-visual speaker tracking with an adaptive particle filter X Qian, A Brutti, M Omologo, A Cavallaro 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017 | 32 | 2017 |
3D mouth tracking from a compact microphone array co-located with a camera X Qian, A Xompero, A Cavallaro, A Brutti, O Lanz, M Omologo 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 18 | 2018 |
Multi-target DoA estimation with an audio-visual fusion mechanism X Qian, M Madhavi, Z Pan, J Wang, H Li ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 16 | 2021 |
Audio-visual tracking of concurrent speakers X Qian, A Brutti, O Lanz, M Omologo, A Cavallaro IEEE Transactions on Multimedia 24, 942-954, 2021 | 13 | 2021 |
GCC-PHAT with speech-oriented attention for robotic sound source localization J Wang, X Qian, Z Pan, M Zhang, H Li 2021 IEEE International Conference on Robotics and Automation (ICRA), 5876-5883, 2021 | 7 | 2021 |
Speaker extraction with co-speech gestures cue Z Pan, X Qian, H Li IEEE Signal Processing Letters 29, 1467-1471, 2022 | 6 | 2022 |
LOCATA challenge: speaker localization with a planar array X Qian, A Cavallaro, A Brutti, M Omologo arXiv preprint arXiv:1901.08983, 2019 | 6 | 2019 |
Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception J Wang, X Qian, H Li arXiv preprint arXiv:2209.01768, 2022 | 4 | 2022 |
Speech-Oriented Sparse Attention Denoising for Voice User Interface Toward Industry 5.0 H Zhu, Q Zhang, P Gao, X Qian IEEE Transactions on Industrial Informatics 19 (2), 2151-2160, 2022 | 3 | 2022 |
Audio-Visual Multi-Speaker Tracking Based on the GLMB Framework S Lin, X Qian INTERSPEECH, 3082-3086, 2020 | 3 | 2020 |
Deep Audio-Visual Beamforming for Speaker Localization X Qian, Q Zhang, G Guan, W Xue IEEE Signal Processing Letters 29, 1132-1136, 2022 | 2 | 2022 |
Profile driven dataflow optimisation of mean shift visual tracking D Bhowmik, A Wallace, R Stewart, X Qian, G Michaelson 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP …, 2014 | 2 | 2014 |
A Time-Frequency Attention Module for Neural Speech Enhancement Q Zhang, X Qian, Z Ni, A Nicolson, E Ambikairajah, H Li IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 462-475, 2022 | 1 | 2022 |
Three-Dimensional Speaker Localization: Audio-Refined Visual Scaling Factor Estimation X Qian, Q Liu, J Wang, H Li IEEE Signal Processing Letters 28, 1405-1409, 2021 | 1 | 2021 |
Accurate target annotation in 3D from multimodal streams O Lanz, A Brutti, A Xompero, X Qian, M Omologo, A Cavallaro ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 1 | 2019 |
Stream Attention Based U-Net for L3DAS23 Challenge H Wang, Y Fu, J Li, M Ge, L Wang, X Qian ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |
Ripple sparse self-attention for monaural speech enhancement Q Zhang, H Zhu, Q Song, X Qian, Z Ni, H Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |
Device Features Based on Linear Transformation With Parallel Training Data for Replay Speech Detection L Xu, J Yang, CH You, X Qian, D Huang IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | | 2023 |