{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:15:59Z","timestamp":1750220159026,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":21,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,7,29]],"date-time":"2022-07-29T00:00:00Z","timestamp":1659052800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7,29]]},"DOI":"10.1145\/3549179.3549182","type":"proceedings-article","created":{"date-parts":[[2022,8,20]],"date-time":"2022-08-20T22:07:10Z","timestamp":1661033230000},"page":"14-22","source":"Crossref","is-referenced-by-count":0,"title":["A Machine-Learning Pipeline for Semantic-Aware and Contexts-Rich Video Description Method"],"prefix":"10.1145","author":[{"given":"Yichiet","family":"Aun","sequence":"first","affiliation":[{"name":"Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Malaysia"}]},{"given":"Jasmina Yen Min","family":"Khaw","sequence":"additional","affiliation":[{"name":"Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Malaysia"}]},{"given":"Ming-Lee","family":"Gan","sequence":"additional","affiliation":[{"name":"Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Malaysia"}]},{"given":"Ley-Ter","family":"Tin","sequence":"additional","affiliation":[{"name":"Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Malaysia"}]}],"member":"320","published-online":{"date-parts":[[2022,8,20]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Yolov3: An Incremental Improvement,\" in arXiv preprint","author":"Redmon J.","year":"2018","unstructured":"J. Redmon and A. 
Farhadi , \" Yolov3: An Incremental Improvement,\" in arXiv preprint , 2018 . J. Redmon and A. Farhadi, \"Yolov3: An Incremental Improvement,\" in arXiv preprint, 2018."},{"key":"e_1_3_2_1_2_1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV Workshops)","author":"Khan M. U. G.","year":"2011","unstructured":"M. U. G. Khan , L. Zhang and G. Y. , \" Human Focused Video Description ,\" in Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV Workshops) , 2011 . M. U. G. Khan, L. Zhang and G. Y., \"Human Focused Video Description,\" in Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011."},{"key":"e_1_3_2_1_3_1","volume-title":"USA","author":"Gatt A.","year":"2009","unstructured":"A. Gatt and E. Reiter , \" SimpleNLG: A Realisation Engine for Practical Applications.,\" in Proceedings of the 12th European Workshop on Natural Language Generation, ENLG \u201909, Stroudsburg, PA , USA , 2009 . A. Gatt and E. Reiter, \"SimpleNLG: A Realisation Engine for Practical Applications.,\" in Proceedings of the 12th European Workshop on Natural Language Generation, ENLG \u201909, Stroudsburg, PA, USA, 2009."},{"key":"e_1_3_2_1_4_1","volume-title":"Sequence to Sequence Video to Text.,\" in Proceeding ICCV","author":"Venugopalan S.","year":"2015","unstructured":"S. Venugopalan , M. Rohrbach , R. Mooney , T. Darrell and K. Saenko , \" Sequence to Sequence Video to Text.,\" in Proceeding ICCV , 2015 . S. Venugopalan, M. Rohrbach, R. Mooney, T. Darrell and K. Saenko, \"Sequence to Sequence Video to Text.,\" in Proceeding ICCV, 2015."},{"key":"e_1_3_2_1_5_1","volume-title":"Dense-captioning Events in Videos,\" in arXiv preprint arXiv:1705.00754","author":"Krishna R.","year":"2017","unstructured":"R. Krishna , K. Hata , F. Ren , L. Fei-Fei and J. C. Niebles , \" Dense-captioning Events in Videos,\" in arXiv preprint arXiv:1705.00754 , 2017 . R. Krishna, K. 
Hata, F. Ren, L. Fei-Fei and J. C. Niebles, \"Dense-captioning Events in Videos,\" in arXiv preprint arXiv:1705.00754, 2017."},{"key":"e_1_3_2_1_6_1","volume-title":"Automated textual descriptions for a wide range of video events with 48 human actions,\" in Proceedings of the European Conference on Computer Vision Workshops and Demonstrations","author":"Hanckmann P.","year":"2012","unstructured":"P. Hanckmann , K. Schutte and G. J. Burghouts , \" Automated textual descriptions for a wide range of video events with 48 human actions,\" in Proceedings of the European Conference on Computer Vision Workshops and Demonstrations , 2012 . P. Hanckmann, K. Schutte and G. J. Burghouts, \"Automated textual descriptions for a wide range of video events with 48 human actions,\" in Proceedings of the European Conference on Computer Vision Workshops and Demonstrations, 2012."},{"key":"e_1_3_2_1_7_1","volume-title":"Places: A 10 million image database for scene recognition,\" in IEEE TPAMI","author":"Zhou B.","year":"2017","unstructured":"B. Zhou , A. Lapedriza , A. Khosla , A. . Oliva and A. Torralba , \" Places: A 10 million image database for scene recognition,\" in IEEE TPAMI , 2017 . B. Zhou, A. Lapedriza, A. Khosla, A. .Oliva and A. Torralba, \"Places: A 10 million image database for scene recognition,\" in IEEE TPAMI, 2017."},{"key":"e_1_3_2_1_8_1","volume-title":"Quo vadis, action recognition? a new model and the kinetics dataset,\" in Computer Vision and Pattern Recognition (CVPR)","author":"Carreira J.","year":"2017","unstructured":"J. Carreira and A. Zisserman , \" Quo vadis, action recognition? a new model and the kinetics dataset,\" in Computer Vision and Pattern Recognition (CVPR) , 2017 . J. Carreira and A. Zisserman, \"Quo vadis, action recognition? 
a new model and the kinetics dataset,\" in Computer Vision and Pattern Recognition (CVPR), 2017."},{"key":"e_1_3_2_1_9_1","volume-title":"Simple online and realtime tracking with a deep association metric,\" in CoRR abs\/1703.07402","author":"Wojke N.","year":"2017","unstructured":"N. Wojke , A. Bewley and D. Paulus , \" Simple online and realtime tracking with a deep association metric,\" in CoRR abs\/1703.07402 , 2017 . N. Wojke, A. Bewley and D. Paulus, \"Simple online and realtime tracking with a deep association metric,\" in CoRR abs\/1703.07402, 2017."},{"key":"e_1_3_2_1_10_1","volume-title":"Available: https:\/\/pypi.org\/project\/face-recognition\/. [Accessed","author":"Geitgey A.","year":"2020","unstructured":"A. Geitgey , \"face-recognition 1.3.0,\" 2 0 February 2020. [Online]. Available: https:\/\/pypi.org\/project\/face-recognition\/. [Accessed 20 November 2020 ]. A. Geitgey, \"face-recognition 1.3.0,\" 20 February 2020. [Online]. Available: https:\/\/pypi.org\/project\/face-recognition\/. [Accessed 20 November 2020]."},{"key":"e_1_3_2_1_11_1","volume-title":"Age and gender classification using convolutional neural networks,\" in IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW)","author":"Levi G.","year":"2015","unstructured":"G. Levi and T. Hassner , \" Age and gender classification using convolutional neural networks,\" in IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW) , 2015 . G. Levi and T. Hassner, \"Age and gender classification using convolutional neural networks,\" in IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2015."},{"key":"e_1_3_2_1_12_1","volume-title":"Modanet: A large-scale street fashion dataset with polygon annotations,\" in ACM Multimedia","author":"Zheng S.","year":"2018","unstructured":"S. Zheng , F. Yang , M. H. Kiapour and R. Piramuthu , \" Modanet: A large-scale street fashion dataset with polygon annotations,\" in ACM Multimedia , 2018 . S. 
Zheng, F. Yang, M. H. Kiapour and R. Piramuthu, \"Modanet: A large-scale street fashion dataset with polygon annotations,\" in ACM Multimedia, 2018."},{"volume-title":"Faster RCNN in ModaNet and DeepFashion2 dataset","year":"2020","key":"e_1_3_2_1_13_1","unstructured":"Simaiden, \"Clothing detection using YOLOv3, RetinaNet , Faster RCNN in ModaNet and DeepFashion2 dataset ,\" 12 January 2020 . [Online]. Available: https:\/\/github.com\/simaiden\/Clothing-Detection. [Accessed 20 November 2020]. Simaiden, \"Clothing detection using YOLOv3, RetinaNet, Faster RCNN in ModaNet and DeepFashion2 dataset,\" 12 January 2020. [Online]. Available: https:\/\/github.com\/simaiden\/Clothing-Detection. [Accessed 20 November 2020]."},{"key":"e_1_3_2_1_14_1","volume-title":"MMFashion: An Open-Source Toolbox for Visual Fashion Analysis,\" arXiv preprint arXiv:2005.08847","author":"Liu X.","year":"2020","unstructured":"X. Liu , J. Li , J. Wang and Z. Liu , \" MMFashion: An Open-Source Toolbox for Visual Fashion Analysis,\" arXiv preprint arXiv:2005.08847 , 2020 . X. Liu, J. Li, J. Wang and Z. Liu, \"MMFashion: An Open-Source Toolbox for Visual Fashion Analysis,\" arXiv preprint arXiv:2005.08847, 2020."},{"key":"e_1_3_2_1_15_1","volume-title":"Deepfashion: Powering robust clothes recognition and retrieval with rich annotations,\" in CVPR","author":"Liu Z.","year":"2016","unstructured":"Z. Liu , P. Luo , S. Qiu , X. Wang and X. Tang , \" Deepfashion: Powering robust clothes recognition and retrieval with rich annotations,\" in CVPR , 2016 . Z. Liu, P. Luo, S. Qiu, X. Wang and X. Tang, \"Deepfashion: Powering robust clothes recognition and retrieval with rich annotations,\" in CVPR, 2016."},{"key":"e_1_3_2_1_16_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language under- standing,\" arXiv preprint arXiv:1810.04805","author":"Devlin J.","year":"2018","unstructured":"J. Devlin , M. W. Chang , K. Lee and K. B. 
Toutanova , \" Bert: Pre-training of deep bidirectional transformers for language understanding,\" arXiv preprint arXiv:1810.04805 , 2018 . J. Devlin, M. W. Chang, K. Lee and K. B. Toutanova, \"Bert: Pre-training of deep bidirectional transformers for language understanding,\" arXiv preprint arXiv:1810.04805, 2018."},{"key":"e_1_3_2_1_17_1","volume-title":"fitbert 0.9.0","author":"Biljana R.","year":"2020","unstructured":"R. Biljana , S. Aneta , Q. Jenkins , J. and S. Havens , \" fitbert 0.9.0 ,\" 22 May 2020 . [Online]. Available: https:\/\/pypi.org\/project\/fitbert\/. [Accessed 20 November 2020]. R. Biljana, S. Aneta, Q. Jenkins, J. and S. Havens, \"fitbert 0.9.0,\" 22 May 2020. [Online]. Available: https:\/\/pypi.org\/project\/fitbert\/. [Accessed 20 November 2020]."},{"key":"e_1_3_2_1_18_1","volume-title":"Available: https:\/\/github.com\/stewartmcgown\/grammarly-api. [Accessed","author":"Client API","year":"2020","unstructured":"Stewartmcgown, \"Unofficial Grammarly API Client ,\" 9 September 2019. [Online]. Available: https:\/\/github.com\/stewartmcgown\/grammarly-api. [Accessed 20 November 2020 ]. Stewartmcgown, \"Unofficial Grammarly API Client,\" 9 September 2019. [Online]. Available: https:\/\/github.com\/stewartmcgown\/grammarly-api. [Accessed 20 November 2020]."},{"key":"e_1_3_2_1_19_1","volume-title":"Grammarly","author":"Lytvyn M.","year":"2020","unstructured":"M. Lytvyn , A. Shevchenko and D. Lider , \" Grammarly ,\" 2020 . [Online]. Available: https:\/\/www.grammarly.com. [Accessed 20 November 2020]. M. Lytvyn, A. Shevchenko and D. Lider, \"Grammarly,\" 2020. [Online]. Available: https:\/\/www.grammarly.com. [Accessed 20 November 2020]."},{"key":"e_1_3_2_1_21_1","unstructured":"S. Banerjee and A. Lavie \"METEOR: An automatic metric for MT evaluation with improved correlation with human judgments \" in ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization 2005.  S. Banerjee and A. 
Lavie \"METEOR: An automatic metric for MT evaluation with improved correlation with human judgments \" in ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization 2005."},{"key":"e_1_3_2_1_22_1","volume-title":"Video description: A survey of methods, datasets and evaluation metrics,\" arXiv preprint arXiv:1806.00186","author":"Aafaq N.","year":"2018","unstructured":"N. Aafaq , A. Mian , W. Liu , S. Z. Gilani and M. Shah , \" Video description: A survey of methods, datasets and evaluation metrics,\" arXiv preprint arXiv:1806.00186 , 2018 . N. Aafaq, A. Mian, W. Liu, S. Z. Gilani and M. Shah, \"Video description: A survey of methods, datasets and evaluation metrics,\" arXiv preprint arXiv:1806.00186, 2018."}],"event":{"name":"PRIS 2022: 2022 4th International Conference on Pattern Recognition and Intelligent Systems","acronym":"PRIS 2022","location":"Wuhan China"},"container-title":["2022 4th International Conference on Pattern Recognition and Intelligent Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3549179.3549182","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3549179.3549182","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:22Z","timestamp":1750186822000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3549179.3549182"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,29]]},"references-count":21,"alternative-id":["10.1145\/3549179.3549182","10.1145\/3549179"],"URL":"https:\/\/doi.org\/10.1145\/3549179.3549182","relation":{},"subject":[],"published":{"date-parts":[[2022,7,29]]}}}