{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:11:18Z","timestamp":1753884678660,"version":"3.41.2"},"reference-count":40,"publisher":"World Scientific Pub Co Pte Ltd","issue":"08","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1612442"],"award-info":[{"award-number":["U1612442"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51661005"],"award-info":[{"award-number":["51661005"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"GHfund A","award":["2022020119853"],"award-info":[{"award-number":["2022020119853"]}]},{"name":"the National Key R&D Program of China","award":["2021YFF0901001"],"award-info":[{"award-number":["2021YFF0901001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Patt. Recogn. Artif. Intell."],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:p> The video moment retrieval task aims to fetch a target moment in an untrimmed video, which best matches the semantics of a sentence query. Existing methods mainly focus on utilizing two separate modules: one learns intra-modal relations to understand video and query contents, and the other explores inter-modal interactions to build a semantic bridge between video and language. However, intra-modal relations information can be easily overlooked when capturing inter-modal interactions. In fact, intra-modal relations and inter-modal interactions can be learned simultaneously within a unified module to make video and sentence guide each other. Towards this end, we propose a Cross-Modal Interaction Network (CMIN) for video moment retrieval by jointly exploring the intra-modal relations and inter-modal interactions between video frames and query words. In CMIN, a query-guided channel attention module is designed to suppress query-irrelevant visual features and enhance crucial contents; then a cross-attention module simultaneously considers intra-modal relations within each modality and fine-grained inter-modal interactions between frames and words, to enhance the semantic relevance between video and sentence query. Compared to the state-of-the-art methods, the experiments on two public datasets (Charades-STA and TACoS) demonstrate the superiority of our method. <\/jats:p>","DOI":"10.1142\/s0218001423550108","type":"journal-article","created":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T05:51:01Z","timestamp":1686376261000},"source":"Crossref","is-referenced-by-count":3,"title":["Cross-Modal Interaction Network for Video Moment Retrieval"],"prefix":"10.1142","volume":"37","author":[{"given":"Shen","family":"Ping","sequence":"first","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiao","family":"Jiang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7037-9939","authenticated-orcid":false,"given":"Zean","family":"Tian","sequence":"additional","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ronghui","family":"Cao","sequence":"additional","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weiming","family":"Chi","sequence":"additional","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shenghong","family":"Yang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2023,7,6]]},"reference":[{"key":"S0218001423550108BIB001","first-page":"5803","volume-title":"Proc. IEEE Int. Conf. Computer Vision","author":"Anne Hendricks L.","year":"2017"},{"key":"S0218001423550108BIB003","first-page":"6299","volume-title":"Proc. IEEE Conf. Computer Vision and Pattern Recognition","author":"Carreira J.","year":"2017"},{"key":"S0218001423550108BIB005","doi-asserted-by":"crossref","first-page":"162","DOI":"10.18653\/v1\/D18-1015","volume-title":"Proc. 2018 Conf. Empirical Methods in Natural Language Processing","author":"Chen J.","year":"2018"},{"issue":"07","key":"S0218001423550108BIB006","first-page":"10551","volume-title":"Proc. AAAI Conf. on Artificial Intelligence","volume":"34","author":"Chen L.","year":"2020"},{"key":"S0218001423550108BIB007","first-page":"10638","volume-title":"Proc. IEEE\/CVF Conf. Computer Vision and Pattern Recognition","author":"Chen S.","year":"2020"},{"key":"S0218001423550108BIB008","first-page":"4070","volume-title":"Proc. IEEE\/CVF Winter Conf. Applications of Computer Vision","author":"Deb T.","year":"2022"},{"key":"S0218001423550108BIB010","first-page":"5267","volume-title":"Proc. IEEE Int. Conf. Computer Vision","author":"Gao J.","year":"2017"},{"key":"S0218001423550108BIB011","first-page":"3628","volume-title":"Proc. IEEE Int. Conf. Computer Vision","author":"Gao J.","year":"2017"},{"key":"S0218001423550108BIB012","first-page":"245","volume-title":"2019 IEEE Winter Conf. Applications of Computer Vision WACV","author":"Ge R.","year":"2019"},{"issue":"8","key":"S0218001423550108BIB014","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"Hochreiter S.","year":"1997","journal-title":"Neural Comput."},{"key":"S0218001423550108BIB015","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1016\/j.neucom.2021.11.019","volume":"471","author":"Jia Z.","year":"2022","journal-title":"Neurocomputing"},{"key":"S0218001423550108BIB016","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.neucom.2019.08.042","volume":"370","author":"Jin T.","year":"2019","journal-title":"Neurocomputing"},{"key":"S0218001423550108BIB017","first-page":"1114","volume-title":"Proc. 44th Int. ACM SIGIR Conf. Research and Development in Information Retrieval","author":"Jin W.","year":"2021"},{"issue":"3","key":"S0218001423550108BIB018","first-page":"1902","volume-title":"Proc. AAAI Conf. Artificial Intelligence","volume":"35","author":"Li K.","year":"2021"},{"key":"S0218001423550108BIB019","doi-asserted-by":"crossref","first-page":"988","DOI":"10.1145\/3123266.3123343","volume-title":"Proc. 25th ACM Int. Conf. Multimedia","author":"Lin T.","year":"2017"},{"key":"S0218001423550108BIB020","first-page":"15","volume-title":"The 41st Int. ACM SIGIR Conf. Research & Development in Information Retrieval","author":"Liu M.","year":"2018"},{"key":"S0218001423550108BIB021","first-page":"21","volume-title":"European Conf. Computer Vision","author":"Liu W.","year":"2016"},{"key":"S0218001423550108BIB022","first-page":"5144","volume-title":"Proc. 2019 Conf. Empirical Methods in Natural Language Processing and the 9th Int. Joint Conf. Natural Language Processing EMNLP-IJCNLP","author":"Lu C.","year":"2019"},{"key":"S0218001423550108BIB023","first-page":"1","volume-title":"ACM Multimedia Asia","author":"Ma Z.","year":"2021"},{"key":"S0218001423550108BIB024","doi-asserted-by":"crossref","first-page":"55","DOI":"10.3115\/v1\/P14-5010","volume-title":"Proc. 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations","author":"Manning C. D.","year":"2014"},{"key":"S0218001423550108BIB025","first-page":"10810","volume-title":"Proc. IEEE\/CVF Conf. Computer Vision and Pattern Recognition","author":"Mun J.","year":"2020"},{"key":"S0218001423550108BIB026","first-page":"1532","volume-title":"Proc. 2014 Conf. Empirical Methods in Natural Language Processing EMNLP","author":"Pennington J.","year":"2014"},{"key":"S0218001423550108BIB027","first-page":"4280","volume-title":"Proc. 28th ACM Int. Conf. Multimedia","author":"Qu X.","year":"2020"},{"key":"S0218001423550108BIB028","first-page":"779","volume-title":"Proc. IEEE Conf. Computer Vision and Pattern Recognition","author":"Redmon J.","year":"2016"},{"key":"S0218001423550108BIB029","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1162\/tacl_a_00207","volume":"1","author":"Regneri M.","year":"2013","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"S0218001423550108BIB030","first-page":"2464","volume-title":"Proc. IEEE\/CVF Winter Conf. Applications of Computer Vision","author":"Rodriguez C.","year":"2020"},{"key":"S0218001423550108BIB031","first-page":"144","volume-title":"European Conf. Computer Vision","author":"Rohrbach M.","year":"2012"},{"issue":"11","key":"S0218001423550108BIB032","doi-asserted-by":"crossref","first-page":"2673","DOI":"10.1109\/78.650093","volume":"45","author":"Schuster M.","year":"1997","journal-title":"IEEE Trans. Signal Process."},{"key":"S0218001423550108BIB033","first-page":"17959","volume-title":"Proc. IEEE\/CVF Conf. Computer Vision and Pattern Recognition","author":"Seo P. H.","year":"2022"},{"key":"S0218001423550108BIB034","first-page":"510","volume-title":"European Conf. Computer Vision","author":"Sigurdsson G. A.","year":"2016"},{"key":"S0218001423550108BIB035","doi-asserted-by":"crossref","first-page":"1338","DOI":"10.1109\/TMM.2021.3063631","volume":"24","author":"Tang H.","year":"2022","journal-title":"IEEE Trans. Multimedia"},{"key":"S0218001423550108BIB036","first-page":"4489","volume-title":"Proc. IEEE Int. Conf. Computer Vision","author":"Tran D.","year":"2015"},{"key":"S0218001423550108BIB037","first-page":"6000","volume-title":"Advances in Neural Information Processing Systems","volume":"30","author":"Vaswani A.","year":"2017"},{"issue":"07","key":"S0218001423550108BIB038","first-page":"12168","volume-title":"Proc. AAAI Conf. Artificial Intelligence","volume":"34","author":"Wang J.","year":"2020"},{"key":"S0218001423550108BIB040","doi-asserted-by":"crossref","first-page":"3518","DOI":"10.1145\/3474085.3475515","volume-title":"Proc. 29th ACM Int. Conf. Multimedia","author":"Wu P.","year":"2021"},{"issue":"4","key":"S0218001423550108BIB042","first-page":"2986","volume-title":"Proc. AAAI Conf. Artificial Intelligence","volume":"35","author":"Xiao S.","year":"2021"},{"key":"S0218001423550108BIB044","first-page":"536","volume-title":"Advances in Neural Information Processing Systems","volume":"32","author":"Yuan Y.","year":"2019"},{"key":"S0218001423550108BIB045","first-page":"10287","volume-title":"Proc. IEEE\/CVF Conf. Computer Vision and Pattern Recognition","author":"Zeng R.","year":"2020"},{"issue":"07","key":"S0218001423550108BIB048","first-page":"12870","volume-title":"Proc. AAAI Conf. Artificial Intelligence","volume":"34","author":"Zhang S.","year":"2020"},{"key":"S0218001423550108BIB049","first-page":"13516","volume-title":"Proc. IEEE\/CVF Int. Conf. Computer Vision","author":"Zhu Z.","year":"2021"}],"container-title":["International Journal of Pattern Recognition and Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218001423550108","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,26]],"date-time":"2023-07-26T06:38:13Z","timestamp":1690353493000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0218001423550108"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,30]]},"references-count":40,"journal-issue":{"issue":"08","published-print":{"date-parts":[[2023,6,30]]}},"alternative-id":["10.1142\/S0218001423550108"],"URL":"https:\/\/doi.org\/10.1142\/s0218001423550108","relation":{},"ISSN":["0218-0014","1793-6381"],"issn-type":[{"type":"print","value":"0218-0014"},{"type":"electronic","value":"1793-6381"}],"subject":[],"published":{"date-parts":[[2023,6,30]]},"article-number":"2355010"}}