{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T09:19:31Z","timestamp":1769159971312,"version":"3.49.0"},"reference-count":27,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,10,14]],"date-time":"2021-10-14T00:00:00Z","timestamp":1634169600000},"content-version":"vor","delay-in-days":286,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003453","name":"Natural Science Foundation of Guangdong Province","doi-asserted-by":"publisher","award":["2014A030310430"],"award-info":[{"award-number":["2014A030310430"]}],"id":[{"id":"10.13039\/501100003453","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Computational Intelligence and Neuroscience"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>With the development of computer technology, video description, which combines the key technologies in the field of natural language processing and computer vision, has attracted more and more researchers\u2019 attention. Among them, how to objectively and efficiently describe high\u2010speed and detailed sports videos is the key to the development of the video description field. In view of the problems of sentence errors and loss of visual information in the generation of the video description text due to the lack of language learning information in the existing video description methods, a multihead model combining the long\u2010term and short\u2010term memory network and attention mechanism is proposed for the intelligent description of the volleyball video. Through the introduction of the attention mechanism, the model pays much attention to the significant areas in the video when generating sentences. Through the comparative experiment with different models, the results show that the model with the attention mechanism can effectively solve the loss of visual information. Compared with the LSTM and base model, the multihead model proposed in this paper, which combines the long\u2010term and short\u2010term memory network and attention mechanism, has higher scores in all evaluation indexes and significantly improved the quality of the intelligent text description of the volleyball video.<\/jats:p>","DOI":"10.1155\/2021\/7088837","type":"journal-article","created":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T02:41:53Z","timestamp":1634265713000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Research on Volleyball Video Intelligent Description Technology Combining the Long\u2010Term and Short\u2010Term Memory Network and Attention Mechanism"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9332-1596","authenticated-orcid":false,"given":"Yuhua","family":"Gao","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0051-1655","authenticated-orcid":false,"given":"Yong","family":"Mo","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0993-2608","authenticated-orcid":false,"given":"Heng","family":"Zhang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8644-263X","authenticated-orcid":false,"given":"Ruiyin","family":"Huang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2977-1824","authenticated-orcid":false,"given":"Zilong","family":"Chen","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2021,10,14]]},"reference":[{"key":"e_1_2_9_1_2","first-page":"20","article-title":"Review of video description methods based on deep learning","volume":"36","author":"Chang Z.","year":"2020","journal-title":"Journal of Tianjin University of Technology"},{"key":"e_1_2_9_2_2","volume-title":"Video Description Generation Based on Visual Semantic Enhancement","author":"Ye J.","year":"2019"},{"key":"e_1_2_9_3_2","volume-title":"Key Technologies of Surveillance Video Moving Target Detection and Pedestrian Structured Description Based on Deep Learning","author":"Xu J.","year":"2019"},{"key":"e_1_2_9_4_2","first-page":"2508","article-title":"Review of convolutional neural networks","volume":"36","author":"Li Y.","year":"2016","journal-title":"Computer Applications"},{"key":"e_1_2_9_5_2","volume-title":"Research on Sports Video Content Analysis Method Based on Team Member Behavior Information","author":"Zhu G.","year":"2011"},{"key":"e_1_2_9_6_2","article-title":"Agricultural machinery movement navigation system based on binocular vision detection technology","volume":"62","author":"Wang C.","year":"2020","journal-title":"Electrotehnica, Electronica, Automatica"},{"key":"e_1_2_9_7_2","doi-asserted-by":"publisher","DOI":"10.1023\/a:1020346032608"},{"key":"e_1_2_9_8_2","doi-asserted-by":"crossref","unstructured":"RohrbachM. QiuW. andTitovI. Translating video content to natural language descriptions Proceedings of the 2013 IEEE International Conference on Computer Vision December 2013 Sydney Australia IEEE.","DOI":"10.1109\/ICCV.2013.61"},{"key":"e_1_2_9_9_2","unstructured":"ThomasonJ. VenugopalanS. andGuadarramaS. Integrating language and vision to generate natural language descriptions of videos in the wild Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014) December 2014 Dublin Ireland."},{"key":"e_1_2_9_10_2","volume-title":"Long-term Recurrent Convolutional Networks for Visual Recognition and Description","author":"Donahue J.","year":"2015"},{"key":"e_1_2_9_11_2","article-title":"Translating videos to natural language using deep recurrent neural networks","volume":"3","author":"Venugopalan S.","year":"2014","journal-title":"Computer Science"},{"key":"e_1_2_9_12_2","doi-asserted-by":"crossref","unstructured":"VenugopalanS. RohrbachM. andDonahueJ. Sequence to sequence\u2014video to text Proceedings of the IEEE 2015 IEEE International Conference on Computer Vision (ICCV) September 2015 Santiago Chile 4534\u20134542.","DOI":"10.1109\/ICCV.2015.515"},{"key":"e_1_2_9_13_2","doi-asserted-by":"crossref","unstructured":"ZhangC.andTianY. Automatic video captioning via multi-channel sequential encoding Proceedings of the European Conference on Computer Vision October 2016 Amsterdam The Netherlands Springer International Publishing.","DOI":"10.1007\/978-3-319-48881-3_11"},{"key":"e_1_2_9_14_2","doi-asserted-by":"crossref","unstructured":"PanY. MeiT. andYaoT. Jointly modeling embedding and translation to bridge video and language Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2016 Las Vegas NV USA.","DOI":"10.1109\/CVPR.2016.497"},{"key":"e_1_2_9_15_2","doi-asserted-by":"crossref","unstructured":"YaoL. TorabiA. andChoK. Describing videos by exploiting temporal structure Proceedings of the IEEE 2015 IEEE International Conference on Computer Vision (ICCV) December 2015 Santiago Chile 4507\u20134515.","DOI":"10.1109\/ICCV.2015.512"},{"key":"e_1_2_9_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2582924"},{"key":"e_1_2_9_17_2","first-page":"155150","article-title":"Mongolian Chinese neural machine translation based on encoder decoder reconstruction framework","volume":"37","author":"Sun X.","year":"2020","journal-title":"Computer applications and software"},{"key":"e_1_2_9_18_2","first-page":"143","article-title":"Semantic relation extraction of LSTM based on attention mechanism","volume":"35","author":"Wang H.","year":"2018","journal-title":"Computer application research"},{"key":"e_1_2_9_19_2","volume-title":"Analysis and Comparison of Image Salient Region Extraction Algorithms Based on Attention Mechanism","author":"Li M.","year":"2020"},{"key":"e_1_2_9_20_2","doi-asserted-by":"publisher","DOI":"10.34028\/iajit\/17\/5\/4"},{"key":"e_1_2_9_21_2","first-page":"130","article-title":"Chinese classification method integrating multi head self attention mechanism","volume":"43","author":"Xiong X.","year":"2020","journal-title":"Electronic measurement technology"},{"key":"e_1_2_9_22_2","doi-asserted-by":"crossref","unstructured":"GuadarramaS. KrishnamoorthyN. andMalkarnenkarG. YouTube2Text: recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition Proceedings of the IEEE International Conference on Computer Vision June 2014 Columbus OH USA IEEE.","DOI":"10.1109\/ICCV.2013.337"},{"key":"e_1_2_9_23_2","doi-asserted-by":"crossref","unstructured":"XuJ. TaoM. andYaoT. MSR-VTT: a large video description dataset for bridging video and language Proceedings of the Conference on Computer Vision And Pattern Recognition (CVPR) June 2016 Las Vegas NV USA IEEE.","DOI":"10.1109\/CVPR.2016.571"},{"key":"e_1_2_9_24_2","doi-asserted-by":"crossref","unstructured":"PapineniS. Blue; A Method for Automatic Evaluation of Machine Translation Proceedings of the Meeting of the Association for Computational Linguistics June 2002 College Park MA USA Association for Computational Linguistics.","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_2_9_25_2","doi-asserted-by":"crossref","unstructured":"NgJ. P.andAbrechtV. Better summarization evaluation with word embeddings for ROUGE 2015 https:\/\/arxiv.org\/abs\/1508.06034.","DOI":"10.18653\/v1\/D15-1222"},{"key":"e_1_2_9_26_2","first-page":"228","article-title":"METEOR: an automatic metric for MT evaluation with improved correlation with human judgments","volume":"7","author":"Satanjeev B.","year":"2005","journal-title":"ACL"},{"key":"e_1_2_9_27_2","unstructured":"CIDEr Consensus-based image description evaluation Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2015 Boston MA USA IEEE."}],"container-title":["Computational Intelligence and Neuroscience"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2021\/7088837.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2021\/7088837.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/7088837","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T12:06:43Z","timestamp":1722946003000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/7088837"}},"subtitle":[],"editor":[{"given":"Bai Yuan","family":"Ding","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/7088837"],"URL":"https:\/\/doi.org\/10.1155\/2021\/7088837","archive":["Portico"],"relation":{},"ISSN":["1687-5265","1687-5273"],"issn-type":[{"value":"1687-5265","type":"print"},{"value":"1687-5273","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2021-08-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-09-27","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-10-14","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"7088837"}}