I am a final-year Ph.D. student in the School of Computer Science at Nanjing University, where I am fortunate to be advised by Professor Qing Gu (顾庆) and Assistant Professor Zhiwei Jiang (蒋智威). I received my B.Sc. in Network Engineering from Nantong University in 2019 and began my M.Sc. in Computer Technology at Nanjing University later that year. In 2021, I passed the examination to transfer to the doctoral program, and I am now pursuing a Ph.D. in Software Engineering. My research focuses on multimodal video understanding and analysis.
👋 If you have opportunities in industry or academia related to my research, please email me and I would be delighted to connect and explore potential collaborations!
") does not match the recommended repository name for your site ("
").
", so that your site can be accessed directly at "http://
".
However, if the current repository name is intended, you can ignore this message by removing "{% include widgets/debug_repo_name.html %}
" in index.html
.
",
which does not match the baseurl
("
") configured in _config.yml
.
baseurl
in _config.yml
to "
".
Shiping Ge, Qiang Chen, Zhiwei Jiang, Yafeng Yin, Qin Liu, Ziyao Chen, Qing Gu
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI Oral, CCF-A) 2025
We propose a novel implicit location-caption alignment paradigm based on complementary masking, which addresses the lack of supervision for event localization in the weakly supervised dense video captioning (WSDVC) task.
Shiping Ge, Zhiwei Jiang, Yafeng Yin, Cong Wang, Zifeng Cheng, Qing Gu
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM, CCF-B) 2025
We propose FGAN, a novel end-to-end zero-shot cross-modal retrieval (ZS-CMR) framework that learns fine-grained, alignment-aware representations for data from different modalities.
Shiping Ge, Qiang Chen, Zhiwei Jiang, Yafeng Yin, Ziyao Chen, Qing Gu
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR, CCF-A) 2024
We introduce a novel Short Video Ordering (SVO) task, curate a dedicated multimodal dataset for it, and report the performance of several benchmark methods.
Shiping Ge, Zhiwei Jiang, Yafeng Yin, Cong Wang, Zifeng Cheng, Qing Gu
Proceedings of the 31st ACM International Conference on Multimedia (ACMMM, CCF-A) 2023
We propose a new event-aware double-branch localization paradigm that utilizes event preferences for more accurate audio-visual event localization.
Shiping Ge, Zhiwei Jiang, Cong Wang, Zifeng Cheng, Yafeng Yin, Qing Gu
Proceedings of the ACM Web Conference (WWW, CCF-A) 2023
We design a simple encoder-decoder-style multimodal emotion recognition model and combine it with specially designed adversarial training strategies to learn more robust multimodal representations for multi-label emotion recognition.