{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,2]],"date-time":"2025-04-02T10:06:14Z","timestamp":1743588374404,"version":"3.37.3"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2023,3,7]],"date-time":"2023-03-07T00:00:00Z","timestamp":1678147200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,7]],"date-time":"2023-03-07T00:00:00Z","timestamp":1678147200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["SN COMPUT. SCI."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Margin, in typography, is described as the space between the text content and the document edges and is often essential information for the consumer of the document, digital or physical. In the present age of digital disruption, it is customary to store and retrieve documents digitally and retrieve information automatically from the documents when necessary. Margin is one such non-textual information that becomes important for some business processes, and the demand for computing margins algorithmically mounts to facilitate RPA. We propose a computer vision-based text localization model, utilizing classical DIP techniques such as smoothing, thresholding, and morphological transformation to programmatically compute the top, left, right, and bottom margins within a digital document image. The proposed model has been experimented with different noise filters and structural elements of various shapes and size to finalize the bilateral filter and lines and structural elements for the removal of noises most commonly occurring due to scans. 
The proposed model is targeted towards text document images and not the natural scene images. Hence, the existing benchmark models developed for text localization in natural scene images have not performed with the expected accuracy. The model is validated with 485 document images of a real-time business process of a reputed TI company. The results show that <jats:inline-formula><jats:alternatives><jats:tex-math>$$91.34\\%$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>91.34<\/mml:mn>\n                    <mml:mo>%<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> of the document images have conferred more than <jats:inline-formula><jats:alternatives><jats:tex-math>$$90\\%$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>90<\/mml:mn>\n                    <mml:mo>%<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> IoU value which is well beyond the accuracy range determined by the company for that specific process.<\/jats:p>","DOI":"10.1007\/s42979-023-01693-5","type":"journal-article","created":{"date-parts":[[2023,3,7]],"date-time":"2023-03-07T10:02:57Z","timestamp":1678183377000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Computer Vision Based Automatic Margin Computation Model for Digital Document 
Images"],"prefix":"10.1007","volume":"4","author":[{"given":"Abhijit","family":"Guha","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Debabrata","family":"Samanta","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2171-9332","authenticated-orcid":false,"given":"Sandeep Singh","family":"Sengar","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,3,7]]},"reference":[{"key":"1693_CR1","unstructured":"Dutta A, Gupta A, Zisserman A, VGG image annotator (VIA). http:\/\/www.robots.ox.ac.uk\/vgg\/software\/via;2016."},{"key":"1693_CR2","doi-asserted-by":"crossref","unstructured":"Dutta A, Zisserman A. The VIA annotation software for images, audio and video. Proceedings of the 27th ACM International Conference on Multimedia. 2019;pp. 2276\u20132279.","DOI":"10.1145\/3343031.3350535"},{"key":"1693_CR3","doi-asserted-by":"crossref","unstructured":"Pizenberg M, Carlier A, Faure E, Charvillat V. Web-based configurable image annotations. Proceedings of the 26th ACM international conference on Multimedia. 2018;1368\u20131371.","DOI":"10.1145\/3240508.3243656"},{"key":"1693_CR4","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1016\/j.procs.2015.03.147","volume":"45","author":"TA Jundale","year":"2015","unstructured":"Jundale TA, Hegadi RS. Skew detection and correction of Devanagari script using Hough transform. Proc Comput Sci. 2015;45:305\u201311.","journal-title":"Proc Comput Sci"},{"issue":"1","key":"1693_CR5","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0029740","volume":"7","author":"C Kanan","year":"2012","unstructured":"Kanan C, Cottrell GW. Color-to-grayscale: does the method matter in image recognition? PLoS ONE. 
2012;7(1): e29740.","journal-title":"PLoS ONE"},{"issue":"5","key":"1693_CR6","doi-asserted-by":"publisher","first-page":"853","DOI":"10.1007\/s11760-015-0828-7","volume":"10","author":"A G\u00fcne\u015f","year":"2016","unstructured":"G\u00fcne\u015f A, Kalkan H, Durmu\u015f E. Optimizing the color-to-grayscale conversion for image classification. SIViP. 2016;10(5):853\u201360.","journal-title":"SIViP"},{"issue":"3","key":"1693_CR7","first-page":"2033","volume":"6","author":"AM Hambal","year":"2017","unstructured":"Hambal AM, Pei Z, Ishabailu FL. Image noise reduction and filtering techniques. IJSR. 2017;6(3):2033\u20138.","journal-title":"IJSR"},{"issue":"8","key":"1693_CR8","first-page":"816","volume":"9","author":"N Win","year":"2019","unstructured":"Win N, Kyaw K, Win T, Aung P. Image noise reduction using linear and non-linear filtering technique. Int J Sci Res Publ. 2019;9(8):816\u201321.","journal-title":"Int J Sci Res Publ"},{"key":"1693_CR9","doi-asserted-by":"publisher","first-page":"130","DOI":"10.1016\/j.ijleo.2017.07.040","volume":"145","author":"SS Sengar","year":"2017","unstructured":"Sengar SS, Mukhopadhyay S. \u2019Detection of moving objects based on enhancement of optical flow. Optik. 2017;145:130\u201341.","journal-title":"Optik"},{"issue":"2","key":"1693_CR10","doi-asserted-by":"publisher","first-page":"779","DOI":"10.1109\/TIP.2018.2871597","volume":"28","author":"RG Gavaskar","year":"2018","unstructured":"Gavaskar RG, Chaudhury KN. Fast adaptive bilateral filtering. IEEE Trans Image Process. 2018;28(2):779\u201390.","journal-title":"IEEE Trans Image Process"},{"issue":"11","key":"1693_CR11","doi-asserted-by":"publisher","first-page":"3357","DOI":"10.1109\/TIP.2015.2442916","volume":"24","author":"K Sugimoto","year":"2015","unstructured":"Sugimoto K, Kamata S-I. Compressive bilateral filtering. IEEE Trans Image Process. 
2015;24(11):3357\u201369.","journal-title":"IEEE Trans Image Process"},{"key":"1693_CR12","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1016\/j.measurement.2017.09.052","volume":"114","author":"TY Goh","year":"2018","unstructured":"Goh TY, Basah SN, Yazid H, Safar MJA, Saad FSA. Performance analysis of image thresholding: Otsu technique. Measurement. 2018;114:298\u2013307.","journal-title":"Measurement"},{"key":"1693_CR13","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1016\/j.sigpro.2016.11.004","volume":"134","author":"F Nie","year":"2017","unstructured":"Nie F, Zhang P, Li J, Ding D. A novel generalized entropy and its application in image thresholding. Signal Process. 2017;134:23\u201334.","journal-title":"Signal Process"},{"issue":"16","key":"1693_CR14","doi-asserted-by":"publisher","first-page":"6258","DOI":"10.1016\/j.ijleo.2016.03.061","volume":"127","author":"SS Sengar","year":"2016","unstructured":"Sengar SS, Mukhopadhyay S. Moving object area detection using normalized self-adaptive optical flow. Optik. 2016;127(16):6258\u201367.","journal-title":"Optik"},{"issue":"4","key":"1693_CR15","doi-asserted-by":"publisher","first-page":"2136","DOI":"10.1016\/j.eswa.2014.09.043","volume":"42","author":"HVH Ayala","year":"2015","unstructured":"Ayala HVH, dos Santos FM, Mariani VC, dos Santos Coelho L. Image thresholding segmentation based on a novel beta differential evolution approach. Expert Syst Appl. 2015;42(4):2136\u201342.","journal-title":"Expert Syst Appl"},{"key":"1693_CR16","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1016\/j.eswa.2016.08.046","volume":"65","author":"U Mlakar","year":"2016","unstructured":"Mlakar U, Poto\u010dnik B, Brest J. A hybrid differential evolution for optimal multilevel image thresholding. Expert Syst Appl. 2016;65:221\u201332.","journal-title":"Expert Syst Appl"},{"key":"1693_CR17","unstructured":"Sreedhar K, Panlal B. Enhancement of images using morphological transformation. 
2012; 2514 arXiv preprint arXiv:1203.2012"},{"issue":"3","key":"1693_CR18","doi-asserted-by":"publisher","first-page":"2933","DOI":"10.1093\/mnras\/stv1007","volume":"451","author":"R Brennan","year":"2015","unstructured":"Brennan R. Quenching and morphological transformation in semi-analytic models and CANDELS. Mon Not R Astron Soc. 2015;451(3):2933\u201356.","journal-title":"Mon Not R Astron Soc"},{"key":"1693_CR19","unstructured":"Ashwitha K, Srikanth R. morphological background detection for enhancement of images. LAP LAMBERT Academic Publishing.2018"},{"issue":"3","key":"1693_CR20","doi-asserted-by":"publisher","first-page":"613","DOI":"10.1109\/TIP.2008.2010152","volume":"18","author":"AR Jim\u00e9nez-S\u00e1nchez","year":"2009","unstructured":"Jim\u00e9nez-S\u00e1nchez AR. Morphological background detection and enhancement of images with poor lighting. IEEE Trans Image Process. 2009;18(3):613\u201323.","journal-title":"IEEE Trans Image Process"},{"issue":"2","key":"1693_CR21","first-page":"2231","volume":"11","author":"G Bhatia","year":"2011","unstructured":"Bhatia G, Chahar V. An enhanced approach to improve the contrast of images having bad light by detecting and extracting their background. Int J Comput Sci Manag Stud. 2011;11(2):2231\u20135268.","journal-title":"Int J Comput Sci Manag Stud"},{"issue":"2","key":"1693_CR22","first-page":"17","volume":"25","author":"K Narasimhan","year":"2011","unstructured":"Narasimhan K, Sudarshan CR, Raju N. A comparison of contrast enhancement techniques in poor illuminated gray level and color images. Int J Comput Appl. 2011;25(2):17\u201325.","journal-title":"Int J Comput Appl"},{"issue":"7","key":"1693_CR23","doi-asserted-by":"publisher","first-page":"1480","DOI":"10.1109\/TPAMI.2014.2366765","volume":"37","author":"Q Ye","year":"2014","unstructured":"Ye Q, Doermann D. Text detection and recognition in imagery: a survey. IEEE Trans Pattern Anal Mach Intell. 
2014;37(7):1480\u2013500.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"1","key":"1693_CR24","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1007\/s11704-015-4488-0","volume":"10","author":"Y Zhu","year":"2016","unstructured":"Zhu Y, Yao C, Bai X. Scene text detection and recognition: recent advances and future trends. Front Comp Sci. 2016;10(1):19\u201336.","journal-title":"Front Comp Sci"},{"issue":"5","key":"1693_CR25","first-page":"970","volume":"36","author":"X-C Yin","year":"2013","unstructured":"Yin X-C, Yin X, Huang K, Hao H-W. Robust text detection in natural scene images. IEEE Trans Pattern Anal Mach Intell. 2013;36(5):970\u201383.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1693_CR26","doi-asserted-by":"crossref","unstructured":"Karatzas D. \u2018ICDAR 2013 robust reading competition\u2019, 2013 12th International Conference on Document Analysis and Recognition.2013; 1484\u20131493.","DOI":"10.1109\/ICDAR.2013.221"},{"key":"1693_CR27","doi-asserted-by":"crossref","unstructured":"Karatzas D. 
\u2018ICDAR 2015 competition on robust reading\u2019, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).2015;1156\u20131160.","DOI":"10.1109\/ICDAR.2015.7333942"},{"key":"1693_CR28","unstructured":"Yao C, Bai X, Liu W, Ma Y, Tu Z.\u2018Detecting texts of arbitrary orientations in natural images\u2019, 2012 IEEE conference on computer vision and pattern recognition.2012; 1083\u20131090."},{"key":"1693_CR29","doi-asserted-by":"crossref","unstructured":"Khan T, Mollah AF.\u2018A novel text localization scheme for camera captured document images\u2019, Proceedings of 2nd International Conference on Computer Vision & Image Processing.2018; 253\u2013264.","DOI":"10.1007\/978-981-10-7895-8_20"},{"key":"1693_CR30","doi-asserted-by":"crossref","unstructured":"Nikitin F, Dokholyan V, Zharikov I, Strijov V.\u2018U-net based architectures for document text detection and binarization\u2019, International Symposium on Visual Computing.2019; 79\u201388.","DOI":"10.1007\/978-3-030-33723-0_7"},{"key":"1693_CR31","doi-asserted-by":"crossref","unstructured":"Nagaoka Y, Miyazaki T, Sugaya Y, Omachi S. \u2018Text detection by faster R-CNN with multiple region proposal networks\u2019, 2017 14th IAPR international conference on document analysis and recognition (ICDAR).2017; 6, 15\u201320.","DOI":"10.1109\/ICDAR.2017.343"},{"key":"1693_CR32","doi-asserted-by":"crossref","unstructured":"Risnumawan A, Shivakumara P, Chan CS, Tan CL.\u2018A robust arbitrary text detection system for natural scene images\u2019, Expert Systems with Applications.2014; 41(18), 8027\u20138048.","DOI":"10.1016\/j.eswa.2014.07.008"},{"issue":"9","key":"1693_CR33","doi-asserted-by":"publisher","first-page":"2906","DOI":"10.1016\/j.patcog.2015.04.002","volume":"48","author":"L Sun","year":"2015","unstructured":"Sun L, Huo Q, Jia W, Chen K. A robust approach for text detection from natural scene images. Pattern Recogn. 
2015;48(9):2906\u201320.","journal-title":"Pattern Recogn"},{"key":"1693_CR34","doi-asserted-by":"crossref","unstructured":"Yi C, Tian Y.\u2018Text detection in natural scene images by stroke gabor words\u2019, 2011 international conference on document analysis and recognition.2011;177\u2013181.","DOI":"10.1109\/ICDAR.2011.44"},{"issue":"11","key":"1693_CR35","doi-asserted-by":"publisher","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","volume":"20","author":"J Ma","year":"2018","unstructured":"Ma J. Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimedia. 2018;20(11):3111\u201322.","journal-title":"IEEE Trans Multimedia"},{"key":"1693_CR36","doi-asserted-by":"crossref","unstructured":"Cho H, Sung M, Jun B.\u2018Canny text detector: Fast and robust scene text localization algorithm\u2019, Proceedings of the IEEE conference on computer vision and pattern recognition.2016; 3566\u20133573.","DOI":"10.1109\/CVPR.2016.388"},{"key":"1693_CR37","doi-asserted-by":"crossref","unstructured":"Zhu A, Gao R, Uchida S.\u2018Could scene context be beneficial for scene text detection?\u2019, Pattern Recognition. 2016; 58, 204\u2013215.","DOI":"10.1016\/j.patcog.2016.04.011"},{"issue":"15","key":"1693_CR38","doi-asserted-by":"publisher","first-page":"11443","DOI":"10.1007\/s00521-019-04635-6","volume":"32","author":"SS Sengar","year":"2020","unstructured":"Sengar SS, Mukhopadhyay S. Motion segmentation-based surveillance video compression using adaptive particle swarm optimization. Neural Comput Appl. 
2020;32(15):11443\u201357.","journal-title":"Neural Comput Appl"},{"key":"1693_CR39","doi-asserted-by":"crossref","unstructured":"Prasad S, Kong AWK,\u2018Using object information for spotting text\u2019, Proceedings of the European Conference on Computer Vision (ECCV).2018; 540\u2013557.","DOI":"10.1007\/978-3-030-01270-0_33"},{"key":"1693_CR40","doi-asserted-by":"crossref","unstructured":"Wu H, Zou B, Zhao Y-Q, Chen Z, Zhu C, Guo J.\u2018Natural scene text detection by multi-scale adaptive color clustering and non-text filtering\u2019, Neurocomputing. 2016; 214, 1011\u20131025","DOI":"10.1016\/j.neucom.2016.07.016"},{"key":"1693_CR41","doi-asserted-by":"crossref","unstructured":"Li H, Doermann D, Kia O.\u2018Automatic text detection and tracking in digital video\u2019, IEEE transactions on image processing. 2000; 9(1), 147\u2013156","DOI":"10.1109\/83.817607"},{"key":"1693_CR42","doi-asserted-by":"crossref","unstructured":"Sharma N, Shivakumara P, Pal U, Blumenstein M, Tan CL. \u2018A new method for arbitrarily-oriented text detection in video\u2019, 2012 10th IAPR International Workshop on Document Analysis Systems, 2012, 74\u201378.","DOI":"10.1109\/DAS.2012.6"},{"key":"1693_CR43","doi-asserted-by":"crossref","unstructured":"Sengar SS. \u2019Motion segmentation based on structure-texture decomposition and improved three frame differencing,\u2019 In IFIP International Conference on Artificial Intelligence Applications and Innovations, 609-622, 2019. Springer, Cham.","DOI":"10.1007\/978-3-030-19823-7_51"},{"key":"1693_CR44","doi-asserted-by":"crossref","unstructured":"Carbonell M, Mas J, Villegas M, Forn\u00e9s A, Llad\u00f3s J. 
\u2018End-to-end handwritten text detection and transcription in full pages\u2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).2019; 5, 29\u201334.","DOI":"10.1109\/ICDARW.2019.40077"},{"key":"1693_CR45","doi-asserted-by":"crossref","unstructured":"Guha A, Samanta D.\u2018Real-time application of document classification based on machine learning\u2019, International Conference on Information, Communication and Computing Technology.2019; 366\u2013379.","DOI":"10.1007\/978-3-030-38501-9_37"},{"key":"1693_CR46","doi-asserted-by":"crossref","unstructured":"Guha A, Samanta D, Banerjee A, Agarwal D. \u2018A deep learning model for Information Loss Prevention from multi-page digital documents\u2019, IEEE Access.2021.","DOI":"10.1109\/ACCESS.2021.3084841"},{"key":"1693_CR47","doi-asserted-by":"crossref","unstructured":"Sengar SS, Hariharan U, Rajkumar K. \u2019Multimodal biometric authentication system using deep learning method,\u2019 In 2020 International Conference on Emerging Smart Computing and Informatics (ESCI), pp. 309-312, 2020 IEEE.","DOI":"10.1109\/ESCI48226.2020.9167512"},{"key":"1693_CR48","doi-asserted-by":"crossref","unstructured":"Guha A, Samanta D.\u2018Hybrid Approach to Document Anomaly Detection: An Application to Facilitate RPA in Title Insurance\u2019, International Journal of Automation and Computing. 2021;18(1), 55\u201372","DOI":"10.1007\/s11633-020-1247-y"},{"key":"1693_CR49","doi-asserted-by":"crossref","unstructured":"Neumann L, Matas J.\u2018Efficient scene text localization and recognition with local character refinement\u2019, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).2015; 746\u2013750.","DOI":"10.1109\/ICDAR.2015.7333861"},{"key":"1693_CR50","doi-asserted-by":"crossref","unstructured":"Neumann L, Matas J. 
\u2018A method for text localization and recognition in real-world images\u2019, Asian conference on computer vision, 2010; 770\u2013783.","DOI":"10.1007\/978-3-642-19318-7_60"}],"container-title":["SN Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-023-01693-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s42979-023-01693-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-023-01693-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,30]],"date-time":"2023-04-30T10:15:42Z","timestamp":1682849742000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s42979-023-01693-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,7]]},"references-count":50,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["1693"],"URL":"https:\/\/doi.org\/10.1007\/s42979-023-01693-5","relation":{},"ISSN":["2661-8907"],"issn-type":[{"type":"electronic","value":"2661-8907"}],"subject":[],"published":{"date-parts":[[2023,3,7]]},"assertion":[{"value":"11 October 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 January 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 March 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest. 
The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}},{"value":"The evaluation data that support the findings of this study are available on request from the corresponding author.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Data availability"}}],"article-number":"253"}}