{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T17:33:54Z","timestamp":1761845634563,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":50,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,8,19]],"date-time":"2022-08-19T00:00:00Z","timestamp":1660867200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,8,19]]},"DOI":"10.1145\/3561613.3561641","type":"proceedings-article","created":{"date-parts":[[2022,11,9]],"date-time":"2022-11-09T18:18:15Z","timestamp":1668017895000},"page":"179-186","source":"Crossref","is-referenced-by-count":5,"title":["Image-Based Storytelling Using Deep Learning"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7443-3285","authenticated-orcid":false,"given":"Yulin","family":"Zhu","sequence":"first","affiliation":[{"name":"Department of Computer Science, Auckland University of Technology, New Zealand"}]},{"given":"Wei Qi","family":"Yan","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Auckland University of Technology, New Zealand"}]}],"member":"320","published-online":{"date-parts":[[2022,11,9]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"159","volume-title":"International Conference on Telecommunications","author":"Smilevski Marko","year":"2018","unstructured":"Marko Smilevski , Ilija Lalkovski , Gjorgji Madjarov . 2018 . Stories for images-in-sequence by using visual and narrative components . In International Conference on Telecommunications , pages 148\u2013 159 . Springer. Marko Smilevski, Ilija Lalkovski, Gjorgji Madjarov. 2018. Stories for images-in-sequence by using visual and narrative components. 
In International Conference on Telecommunications, pages 148\u2013159. Springer."},{"key":"e_1_3_2_1_2_1","volume-title":"Artificial Intelligence: A Philosophical Introduction","author":"Copeland Jack","year":"2015","unstructured":"Jack Copeland . 2015 . Artificial Intelligence: A Philosophical Introduction . John Wiley & Sons Jack Copeland. 2015. Artificial Intelligence: A Philosophical Introduction. John Wiley & Sons"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","DOI":"10.1201\/b22400","volume-title":"Artificial Intelligence: With An Introduction to Machine Learning","author":"Neapolitan Richard E","year":"2018","unstructured":"Richard E Neapolitan and Xia Jiang . 2018 . Artificial Intelligence: With An Introduction to Machine Learning . CRC Press Richard E Neapolitan and Xia Jiang. 2018. Artificial Intelligence: With An Introduction to Machine Learning. CRC Press"},{"volume-title":"Machine Learning","author":"Mitchell Tom","key":"e_1_3_2_1_4_1","unstructured":"Tom Mitchell . 1997. Machine Learning . McGraw Hill Burr Ridge Tom Mitchell. 1997. Machine Learning. McGraw Hill Burr Ridge"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11831-019-09344-w"},{"key":"e_1_3_2_1_6_1","first-page":"168","volume-title":"International Conference on Machine Learning","author":"Caruana Rich","year":"2006","unstructured":"Rich Caruana , Alexandru Niculescu-Mizil . 2006 . An empirical comparison of supervised learning algorithms . In International Conference on Machine Learning , pages 161\u2013 168 Rich Caruana, Alexandru Niculescu-Mizil. 2006. An empirical comparison of supervised learning algorithms. In International Conference on Machine Learning, pages 161\u2013168"},{"key":"e_1_3_2_1_7_1","volume-title":"Terrence Joseph Sejnowski","author":"Hinton Geoffrey E","year":"1999","unstructured":"Geoffrey E Hinton , Terrence Joseph Sejnowski , 1999 . Unsupervised Learning : Foundations of Neural Computation. MIT Press . 
Geoffrey E Hinton, Terrence Joseph Sejnowski, 1999. Unsupervised Learning: Foundations of Neural Computation. MIT Press."},{"key":"e_1_3_2_1_8_1","first-page":"619","volume-title":"Neural networks: Tricks of the Trade","author":"Hinton Geoffrey E","unstructured":"Geoffrey E Hinton . 2012. A practical guide to training restricted Boltzmann machines . In Neural networks: Tricks of the Trade , pages 599\u2013 619 . Springer. Geoffrey E Hinton. 2012. A practical guide to training restricted Boltzmann machines. In Neural networks: Tricks of the Trade, pages 599\u2013619. Springer."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.drudis.2017.08.010"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2020.2993291"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_2_1_12_1","unstructured":"Keiron O'Shea and Ryan Nash. 2015. An introduction to convolutional neural networks. arXiv:1511.08458.  Keiron O'Shea and Ryan Nash. 2015. An introduction to convolutional neural networks. arXiv:1511.08458."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICEngTechnol.2017.8308186"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.2200\/S00822ED1V01Y201712COV015"},{"key":"e_1_3_2_1_15_1","first-page":"4708","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"Huang Gao","year":"2017","unstructured":"Gao Huang , Zhuang Liu , Laurens Van Der Maaten , Kilian Q Weinberger . 2017 . Densely connected convolutional networks . In IEEE Conference on Computer Vision and Pattern Recognition , pages 4700\u2013 4708 . Gao Huang, Zhuang Liu, Laurens Van Der Maaten, Kilian Q Weinberger. 2017. Densely connected convolutional networks. 
In IEEE Conference on Computer Vision and Pattern Recognition, pages 4700\u20134708."},{"key":"e_1_3_2_1_16_1","first-page":"282","volume-title":"International Conference on Secure Cyber Computing and Communication (ICSCCC)","author":"Chauhan Rahul","year":"2018","unstructured":"Rahul Chauhan , Kamal Kumar Ghanshala , RC Joshi . 2018 . Convolutional neural network (CNN) for image detection and recognition . In International Conference on Secure Cyber Computing and Communication (ICSCCC) , pages 278\u2013 282 . IEEE. Rahul Chauhan, Kamal Kumar Ghanshala, RC Joshi. 2018. Convolutional neural network (CNN) for image detection and recognition. In International Conference on Secure Cyber Computing and Communication (ICSCCC), pages 278\u2013282. IEEE."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2014.09.003"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Chris Dyer Adhiguna Kuncoro Miguel Ballesteros Noah A Smith. 2016. Recurrent neural network grammars. arXiv:1602.07776.  Chris Dyer Adhiguna Kuncoro Miguel Ballesteros Noah A Smith. 2016. Recurrent neural network grammars. arXiv:1602.07776.","DOI":"10.18653\/v1\/N16-1024"},{"volume-title":"Deep Learning","author":"Goodfellow Ian","key":"e_1_3_2_1_19_1","unstructured":"Ian Goodfellow , Yoshua Bengio , Aaron Courville . 2016. Deep Learning . MIT Press . Ian Goodfellow, Yoshua Bengio, Aaron Courville. 2016. Deep Learning. MIT Press."},{"key":"e_1_3_2_1_20_1","first-page":"817","article-title":"An intelligent chatbot using deep learning with bidirectional RNN and attention model","volume":"34","author":"Dhyani Manyu","year":"2021","unstructured":"Manyu Dhyani , Rajiv Kumar . 2021 . An intelligent chatbot using deep learning with bidirectional RNN and attention model . Materials Today , 34 : 817 \u2013 824 . Manyu Dhyani, Rajiv Kumar. 2021. An intelligent chatbot using deep learning with bidirectional RNN and attention model. 
Materials Today, 34:817\u2013824.","journal-title":"Materials Today"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/78.650093"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2582924"},{"key":"e_1_3_2_1_23_1","volume-title":"Annual Conference of The International Speech Communication Association.","author":"Sundermeyer Martin","year":"2012","unstructured":"Martin Sundermeyer , Ralf Schl\u00fcter , Hermann Ney . 2012 . LSTM neural networks for language modeling . In Annual Conference of The International Speech Communication Association. Martin Sundermeyer, Ralf Schl\u00fcter, Hermann Ney. 2012. LSTM neural networks for language modeling. In Annual Conference of The International Speech Communication Association."},{"key":"e_1_3_2_1_24_1","volume-title":"AAAI Conference on Artificial Intelligence.","author":"Lai Siwei","year":"2015","unstructured":"Siwei Lai , Liheng Xu , Kang Liu , Jun Zhao . 2015 . Recurrent convolutional neural networks for text classification . In AAAI Conference on Artificial Intelligence. Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao. 2015. Recurrent convolutional neural networks for text classification. In AAAI Conference on Artificial Intelligence."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.physd.2019.132306"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.2140\/pjm.1968.27.211"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.physa.2018.11.037"},{"key":"e_1_3_2_1_28_1","first-page":"29","volume-title":"European Conference on Computer Vision","author":"Gupta Abhinav","year":"2008","unstructured":"Abhinav Gupta , Larry S Davis . 2008 . Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers . In European Conference on Computer Vision , pages 16\u2013 29 . Springer. Abhinav Gupta, Larry S Davis. 2008. 
Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers. In European Conference on Computer Vision, pages 16\u201329. Springer."},{"key":"e_1_3_2_1_29_1","first-page":"2415","volume-title":"IEEE International Conference on Computer Vision","author":"Jia Xu","year":"2015","unstructured":"Xu Jia , Efstratios Gavves , Basura Fernando , Tinne Tuytelaars . 2015 . Guiding the long-short term memory model for image caption generation . In IEEE International Conference on Computer Vision , pages 2407\u2013 2415 . Xu Jia, Efstratios Gavves, Basura Fernando, Tinne Tuytelaars. 2015. Guiding the long-short term memory model for image caption generation. In IEEE International Conference on Computer Vision, pages 2407\u20132415."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2017.2741510"},{"key":"e_1_3_2_1_31_1","volume-title":"Computational Intelligence and Neuroscience","author":"Wang Haoran","year":"2020","unstructured":"Haoran Wang , Yue Zhang , Xiaosheng Yu . 2020 . An overview of image caption generation methods . Computational Intelligence and Neuroscience , 2020. Haoran Wang, Yue Zhang, Xiaosheng Yu. 2020. An overview of image caption generation methods. Computational Intelligence and Neuroscience, 2020."},{"key":"e_1_3_2_1_32_1","volume-title":"Samir Kumar Borgohain","author":"Katiyar Sulabh","year":"2021","unstructured":"Sulabh Katiyar , Samir Kumar Borgohain . 2021 . Comparative evaluation of CNN architectures for image caption generation. arXiv:2102.11506. Sulabh Katiyar, Samir Kumar Borgohain. 2021. Comparative evaluation of CNN architectures for image caption generation. arXiv:2102.11506."},{"key":"e_1_3_2_1_33_1","first-page":"6","volume-title":"International Conference on Automation and Computing (ICAC)","author":"Ding Xintao","year":"2017","unstructured":"Xintao Ding , Yonglong Luo , Qingying Yu , Qingde Li , Yongqiang Cheng , Robert Munnoch , Dongfei Xue , Guorong Cai . 2017 . 
Indoor object recognition using pre-trained convolutional neural network . In International Conference on Automation and Computing (ICAC) , pages 1\u2013 6 . Xintao Ding, Yonglong Luo, Qingying Yu, Qingde Li, Yongqiang Cheng, Robert Munnoch, Dongfei Xue, Guorong Cai. 2017. Indoor object recognition using pre-trained convolutional neural network. In International Conference on Automation and Computing (ICAC), pages 1\u20136."},{"key":"e_1_3_2_1_34_1","volume-title":"AAAI Conference on Artificial Intelligence.","author":"Li Linghui","year":"2017","unstructured":"Linghui Li , Sheng Tang , Lixi Deng , Yongdong Zhang , Qi Tian 2017 . Image caption with global-local attention . In AAAI Conference on Artificial Intelligence. Linghui Li, Sheng Tang, Lixi Deng, Yongdong Zhang, Qi Tian 2017. Image caption with global-local attention. In AAAI Conference on Artificial Intelligence."},{"key":"e_1_3_2_1_35_1","first-page":"12475","volume-title":"IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Biten Ali Furkan","year":"2019","unstructured":"Ali Furkan Biten , Lluis Gomez , Mar\u00e7al Rusinol , Dimosthenis Karatzas . 2019 . Good news, everyone! context driven entity-aware captioning for news images . In IEEE\/CVF Conference on Computer Vision and Pattern Recognition , pages 12466\u2013 12475 . Ali Furkan Biten, Lluis Gomez, Mar\u00e7al Rusinol, Dimosthenis Karatzas. 2019. Good news, everyone! context driven entity-aware captioning for news images. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pages 12466\u201312475."},{"key":"e_1_3_2_1_36_1","volume-title":"International Conference on Image and Vision Computing New Zealand.","author":"Liu Xiaoxu","year":"2020","unstructured":"Xiaoxu Liu , Wei Qi Yan . 2020 . Vehicle-related scene segmentation using CapsNets . In International Conference on Image and Vision Computing New Zealand. Xiaoxu Liu, Wei Qi Yan. 2020. Vehicle-related scene segmentation using CapsNets. 
In International Conference on Image and Vision Computing New Zealand."},{"key":"e_1_3_2_1_37_1","first-page":"1448","volume-title":"Fast R-CNN. In IEEE International Conference on Computer Vision","author":"Girshick Ross","year":"2015","unstructured":"Ross Girshick . 2015 . Fast R-CNN. In IEEE International Conference on Computer Vision , pages 1440\u2013 1448 . Ross Girshick. 2015. Fast R-CNN. In IEEE International Conference on Computer Vision, pages 1440\u20131448."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2389824"},{"key":"e_1_3_2_1_39_1","first-page":"91","article-title":"Faster R-CNN: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren , Kaiming He , Ross Girshick , Jian Sun . 2015 . Faster R-CNN: Towards real-time object detection with region proposal networks . Advances in Neural Information Processing Systems , 28 : 91 \u2013 99 . Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 28:91\u201399.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_40_1","first-page":"788","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"Redmon Joseph","year":"2016","unstructured":"Joseph Redmon , Santosh Divvala , Ross Girshick , Ali Farhadi . 2016 . You only look once: Unified, real-time object detection . In IEEE Conference on Computer Vision and Pattern Recognition , pages 779\u2013 788 Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi. 2016. You only look once: Unified, real-time object detection. 
In IEEE Conference on Computer Vision and Pattern Recognition, pages 779\u2013788"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.18178\/joig.9.4.122-134"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.18178\/joig.6.2.127-136"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.18178\/joig.4.1.46-50"},{"volume-title":"Traffic Sign Recognition Based on Deep Learning. Multimedia Tools and Applications","author":"Zhu Yanzhao","key":"e_1_3_2_1_44_1","unstructured":"Yanzhao Zhu and Wei Qi Yan . 2022. Traffic Sign Recognition Based on Deep Learning. Multimedia Tools and Applications , Springer . Yanzhao Zhu and Wei Qi Yan. 2022. Traffic Sign Recognition Based on Deep Learning. Multimedia Tools and Applications, Springer."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.18178\/joig.8.3.59-66"},{"volume-title":"Image-Based Storytelling for Tourist Using Deep Learning. Research Project Report","author":"Zhu Yulin","key":"e_1_3_2_1_46_1","unstructured":"Yulin Zhu . 2022. Image-Based Storytelling for Tourist Using Deep Learning. Research Project Report , Auckland University of Technology , New Zealand . Yulin Zhu. 2022. Image-Based Storytelling for Tourist Using Deep Learning. Research Project Report, Auckland University of Technology, New Zealand."},{"volume-title":"Computational Methods for Deep Learning: Theoretic, Practice and Applications","author":"Yan Wei Qi","key":"e_1_3_2_1_47_1","unstructured":"Wei Qi Yan . 2021. Computational Methods for Deep Learning: Theoretic, Practice and Applications . Springer . Wei Qi Yan. 2021. Computational Methods for Deep Learning: Theoretic, Practice and Applications. Springer."},{"volume-title":"Introduction to Intelligent Surveillance Surveillance Data Capture, Transmission, and Analytics","author":"Yan Wei Qi","key":"e_1_3_2_1_48_1","unstructured":"Wei Qi Yan . 2019. Introduction to Intelligent Surveillance Surveillance Data Capture, Transmission, and Analytics . 
Springer . Wei Qi Yan. 2019. Introduction to Intelligent Surveillance Surveillance Data Capture, Transmission, and Analytics. Springer."},{"key":"e_1_3_2_1_49_1","article-title":"Salient object detection based on visual perceptual saturation and two-stream hybrid networks","author":"Pan Chen","year":"2021","unstructured":"Chen Pan , Jianfeng Liu , Wei Qi Yan . 2021 . Salient object detection based on visual perceptual saturation and two-stream hybrid networks . IEEE Transactions on Image Processing. Chen Pan, Jianfeng Liu, Wei Qi Yan. 2021. Salient object detection based on visual perceptual saturation and two-stream hybrid networks. IEEE Transactions on Image Processing.","journal-title":"IEEE Transactions on Image Processing."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-020-08866-x"}],"event":{"name":"ICCCV 2022: 2022 The 5th International Conference on Control and Computer Vision","acronym":"ICCCV 2022","location":"Xiamen China"},"container-title":["2022 The 5th International Conference on Control and Computer Vision"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3561613.3561641","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3561613.3561641","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:35Z","timestamp":1750186835000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3561613.3561641"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,19]]},"references-count":50,"alternative-id":["10.1145\/3561613.3561641","10.1145\/3561613"],"URL":"https:\/\/doi.org\/10.1145\/3561613.3561641","relation":{},"subject":[],"published":{"date-parts":[[2022,8,19]]}}}