Yuqing Song
Title
Cited by
Year
WenLan: Bridging vision and language by large-scale multi-modal pre-training
Y Huo, M Zhang, G Liu, H Lu, Y Gao, G Yang, J Wen, H Zhang, B Xu, ...
arXiv preprint arXiv:2103.06561, 2021
Cited by: 140 | Year: 2021
Unpaired cross-lingual image caption generation with self-supervised rewards
Y Song, S Chen, Y Zhao, Q Jin
Proceedings of the 27th ACM international conference on multimedia, 784-792, 2019
Cited by: 45 | Year: 2019
Towards diverse paragraph captioning for untrimmed videos
Y Song, S Chen, Q Jin
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by: 44 | Year: 2021
Unifying event detection and captioning as sequence generation via pre-training
Q Zhang, Y Song, Q Jin
European Conference on Computer Vision, 363-379, 2022
Cited by: 30 | Year: 2022
Progressive learning for image retrieval with hybrid-modality queries
Y Zhao, Y Song, Q Jin
Proceedings of the 45th International ACM SIGIR Conference on Research and …, 2022
Cited by: 28 | Year: 2022
Product-oriented machine translation with cross-modal cross-lingual pre-training
Y Song, S Chen, Q Jin, W Luo, J Xie, F Huang
Proceedings of the 29th ACM International Conference on Multimedia, 2843-2852, 2021
Cited by: 16 | Year: 2021
Activitynet 2019 task 3: Exploring contexts for dense captioning events in videos
S Chen, Y Song, Y Zhao, Q Jin, Z Zeng, B Liu, J Fu, A Hauptmann
arXiv preprint arXiv:1907.05092, 2019
Cited by: 12 | Year: 2019
Enhancing neural machine translation with dual-side multimodal awareness
Y Song, S Chen, Q Jin, W Luo, J Xie, F Huang
IEEE Transactions on Multimedia 24, 3013-3024, 2021
Cited by: 10 | Year: 2021
Accommodating audio modality in CLIP for multimodal processing
L Ruan, A Hu, Y Song, L Zhang, S Zheng, Q Jin
Proceedings of the AAAI Conference on Artificial Intelligence 37 (8), 9641-9649, 2023
Cited by: 8 | Year: 2023
RUC_AIM3 at TRECVID 2020: Ad-hoc Video Search & Video to Text Description.
Y Zhao, Y Song, S Chen, Q Jin
TRECVID 1, 2, 2020
Cited by: 7 | Year: 2020
RUC+CMU: system report for dense captioning events in videos
S Chen, Y Song, Y Zhao, J Qiu, Q Jin, A Hauptmann
arXiv preprint arXiv:1806.08854, 2018
Cited by: 7 | Year: 2018
Team ruc_aim3 technical report at activitynet 2020 task 2: Exploring sequential events detection for dense video captioning
Y Song, S Chen, Y Zhao, Q Jin
arXiv preprint arXiv:2006.07896, 2020
Cited by: 4 | Year: 2020
RUC_AIM3 at TRECVID 2019: Video to Text.
Y Song, Y Zhao, S Chen, Q Jin
TRECVID, 2019
Cited by: 2 | Year: 2019
Team RUC_AIM3 technical report at activityNet 2021: Entities object localization
L Ruan, J Chen, Y Song, S Chen, Q Jin
arXiv preprint arXiv:2106.06138, 2021
Cited by: 1 | Year: 2021
iMakeup: Makeup Instructional Video Dataset for Fine-Grained Dense Video Captioning
X Lin, Q Jin, S Chen, Y Song, Y Zhao
Advances in Multimedia Information Processing–PCM 2018: 19th Pacific-Rim …, 2018
Cited by: 1 | Year: 2018
Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019
S Chen, Y Zhao, Y Song, Q Jin, Q Wu
arXiv preprint arXiv:1910.06737, 2019
Year: 2019
Supplementary Material for “Unifying Event Detection and Captioning as Sequence Generation via Pre-Training”
Q Zhang, Y Song, Q Jin
RUC_AIM3 at TRECVID 2021: Video to Text
L Zhang, Y Song, Q Jin
Team RUC_AIM3 Technical Report at VMT Challenge 2020: Enhancing Neural Machine Translation with Multimodal Rewards
Y Song, S Chen, Q Jin
Supplementary Material for “Towards Diverse Paragraph Captioning for Untrimmed Videos”
Y Song, S Chen, Q Jin