{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:14:10Z","timestamp":1753884850598,"version":"3.41.2"},"reference-count":23,"publisher":"World Scientific Pub Co Pte Ltd","issue":"12","funder":[{"name":"ISI-UTS Joint Research Cluster","award":["GPF096A-2020","GPF096B-2020","GPF096C-2020"],"award-info":[{"award-number":["GPF096A-2020","GPF096B-2020","GPF096C-2020"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Patt. Recogn. Artif. Intell."],"published-print":{"date-parts":[[2021,9,30]]},"abstract":"<jats:p> Achieving a better recognition rate for text in action video images is challenging due to multiple types of text with unpredictable actions in the background. In this paper, we propose a new method for the classification of caption (which is edited text) and scene text (text that is a part of the video) in video images. This work considers five action classes, namely, Yoga, Concert, Teleshopping, Craft, and Recipes, where it is expected that both types of text play a vital role in understanding the video content. The proposed method introduces a new fusion criterion based on Discrete Cosine Transform (DCT) and Fourier coefficients to obtain the reconstructed images for caption and scene text. The fusion criterion involves computing the variances for coefficients of corresponding pixels of DCT and Fourier images, and the same variances are considered as the respective weights. This step results in Reconstructed image-1. Inspired by the special property of Chebyshev-Harmonic-Fourier-Moments (CHFM) that has the ability to reconstruct a redundancy-free image, we explore CHFM for obtaining the Reconstructed image-2. The reconstructed images along with the input image are passed to a Deep Convolutional Neural Network (DCNN) for classification of caption\/scene text. 
Experimental results on five action classes and a comparative study with existing methods demonstrate that the proposed method is effective. In addition, recognition results obtained from different methods before and after classification show that recognition performance improves significantly after classification. <\/jats:p>","DOI":"10.1142\/s0218001421600090","type":"journal-article","created":{"date-parts":[[2021,9,7]],"date-time":"2021-09-07T06:15:10Z","timestamp":1630995310000},"source":"Crossref","is-referenced-by-count":3,"title":["A New Hybrid Method for Caption and Scene Text Classification in Action Video Images"],"prefix":"10.1142","volume":"35","author":[{"given":"Lokesh","family":"Nandanwar","sequence":"first","affiliation":[{"name":"Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia"}]},{"given":"Palaiahnakote","family":"Shivakumara","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia"}]},{"given":"Umapada","family":"Pal","sequence":"additional","affiliation":[{"name":"Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India"}]},{"given":"Tong","family":"Lu","sequence":"additional","affiliation":[{"name":"National Key Lab for Novel Software Technology, Nanjing University, Nanjing, P. R. 
China"}]},{"given":"Michael","family":"Blumenstein","sequence":"additional","affiliation":[{"name":"University of Technology Sydney, Australia"}]}],"member":"219","published-online":{"date-parts":[[2021,9,6]]},"reference":[{"key":"S0218001421600090BIB001","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2016.2581311"},{"key":"S0218001421600090BIB002","doi-asserted-by":"publisher","DOI":"10.1109\/ICDARW.2019.00020"},{"key":"S0218001421600090BIB003","first-page":"49","volume":"25","author":"Janocha K.","year":"2017","journal-title":"Schedae Inform."},{"key":"S0218001421600090BIB004","first-page":"1","volume-title":"Proc. ICLR","author":"Kingma P. D.","year":"2015"},{"key":"S0218001421600090BIB005","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2465169"},{"key":"S0218001421600090BIB006","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.01.020"},{"key":"S0218001421600090BIB007","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-59830-3_7"},{"key":"S0218001421600090BIB008","doi-asserted-by":"publisher","DOI":"10.1016\/S0020-0255(96)00200-9"},{"key":"S0218001421600090BIB009","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.06.041"},{"key":"S0218001421600090BIB010","doi-asserted-by":"publisher","DOI":"10.1109\/DAS.2016.18"},{"key":"S0218001421600090BIB011","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2018.02.014"},{"key":"S0218001421600090BIB012","doi-asserted-by":"publisher","DOI":"10.1109\/ICFHR.2016.0020"},{"key":"S0218001421600090BIB013","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2017.65"},{"key":"S0218001421600090BIB014","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2848939"},{"key":"S0218001421600090BIB015","doi-asserted-by":"publisher","DOI":"10.1109\/DAS.2014.20"},{"key":"S0218001421600090BIB016","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2017.58"},{"key":"S0218001421600090BIB018","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2016.01.002"},{"key":"S0218001421600090BIB019","doi-asserte
d-by":"publisher","DOI":"10.1109\/ACCESS.2017.2778011"},{"issue":"2","key":"S0218001421600090BIB020","first-page":"60","volume":"3","author":"Upneja R.","year":"2015","journal-title":"Lecture Notes Inf. Theory"},{"key":"S0218001421600090BIB021","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2017.184"},{"key":"S0218001421600090BIB022","first-page":"12216","volume-title":"Proc. AAAI","author":"Wang T.","year":"2020"},{"key":"S0218001421600090BIB023","doi-asserted-by":"publisher","DOI":"10.1016\/j.bspc.2016.02.008"},{"key":"S0218001421600090BIB024","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2017.03.010"}],"container-title":["International Journal of Pattern Recognition and Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218001421600090","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,11,3]],"date-time":"2021-11-03T04:06:21Z","timestamp":1635912381000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0218001421600090"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,6]]},"references-count":23,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2021,9,30]]}},"alternative-id":["10.1142\/S0218001421600090"],"URL":"https:\/\/doi.org\/10.1142\/s0218001421600090","relation":{},"ISSN":["0218-0014","1793-6381"],"issn-type":[{"type":"print","value":"0218-0014"},{"type":"electronic","value":"1793-6381"}],"subject":[],"published":{"date-parts":[[2021,9,6]]},"article-number":"2160009"}}