Filip: Fine-grained interactive language-image pre-training L Yao, R Huang, L Hou, G Lu, M Niu, H Xu, X Liang, Z Li, X Jiang, C Xu arXiv preprint arXiv:2111.07783, 2021 | 567 | 2021 |
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis J Chen, J Yu, C Ge, L Yao, E Xie, Y Wu, Z Wang, J Kwok, P Luo, H Lu, Z Li arXiv preprint arXiv:2310.00426, 2023 | 291 | 2023 |
Auto-fpn: Automatic network architecture adaptation for object detection beyond classification H Xu, L Yao, W Zhang, X Liang, Z Li Proceedings of the IEEE/CVF international conference on computer vision …, 2019 | 238 | 2019 |
Detclip: Dictionary-enriched visual-concept paralleled pre-training for open-world detection L Yao, J Han, Y Wen, X Liang, D Xu, W Zhang, Z Li, C Xu, H Xu Advances in Neural Information Processing Systems 35, 9125-9138, 2022 | 132 | 2022 |
Wukong: A 100 million large-scale chinese cross-modal pre-training benchmark J Gu, X Meng, G Lu, L Hou, N Minzhe, X Liang, L Yao, R Huang, W Zhang, ... Advances in Neural Information Processing Systems 35, 26418-26431, 2022 | 109 | 2022 |
SM-NAS: Structural-to-modular neural architecture search for object detection L Yao, H Xu, W Zhang, X Liang, Z Li Proceedings of the AAAI conference on artificial intelligence 34 (07), 12661 …, 2020 | 84 | 2020 |
Detclipv2: Scalable open-vocabulary object detection pre-training via word-region alignment L Yao, J Han, X Liang, D Xu, W Zhang, Z Li, H Xu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 73 | 2023 |
PixArt-Σ: Weak-to-strong training of diffusion transformer for 4k text-to-image generation J Chen, C Ge, E Xie, Y Wu, L Yao, X Ren, Z Wang, P Luo, H Lu, Z Li arXiv preprint arXiv:2403.04692, 2024 | 72 | 2024 |
Detgpt: Detect what you need via reasoning R Pi, J Gao, S Diao, R Pan, H Dong, J Zhang, L Yao, J Han, H Xu, L Kong, ... arXiv preprint arXiv:2305.14167, 2023 | 66 | 2023 |
Difffit: Unlocking transferability of large diffusion models via simple parameter-efficient fine-tuning E Xie, L Yao, H Shi, Z Liu, D Zhou, Z Liu, J Li, Z Li Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 55 | 2023 |
Dit-3d: Exploring plain diffusion transformers for 3d shape generation S Mo, E Xie, R Chu, L Hong, M Niessner, Z Li Advances in neural information processing systems 36, 67960-67971, 2023 | 43 | 2023 |
Joint-detnas: Upgrade your detector with nas, pruning and dynamic distillation L Yao, R Pi, H Xu, W Zhang, Z Li, T Zhang Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021 | 39 | 2021 |
G-detkd: Towards general distillation framework for object detectors via contrastive and semantic-guided feature imitation L Yao, R Pi, H Xu, W Zhang, Z Li, T Zhang Proceedings of the IEEE/CVF international conference on computer vision …, 2021 | 37 | 2021 |
Perceptiongpt: Effectively fusing visual perception into llm R Pi, L Yao, J Gao, J Zhang, T Zhang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 16 | 2024 |
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection L Yao, R Pi, J Han, X Liang, H Xu, W Zhang, Z Li, D Xu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 4 | 2024 |
Ins-DetCLIP: Aligning Detection Model to Follow Human-Language Instruction R Pi, L Yao, J Han, X Liang, W Zhang, H Xu The Twelfth International Conference on Learning Representations, 2024 | 2 | 2024 |
System and method for cross-modal interaction based on pre-trained model H Xu, HOU Lu, LU Guansong, NIU Minzhe, Z Li, R Huang, YAO Lewei, ... US Patent App. 17/900,592, 2024 | | 2024 |