{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,8]],"date-time":"2026-06-08T22:19:03Z","timestamp":1780957143799,"version":"3.54.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"31","license":[{"start":{"date-parts":[[2025,9,13]],"date-time":"2025-09-13T00:00:00Z","timestamp":1757721600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,9,13]],"date-time":"2025-09-13T00:00:00Z","timestamp":1757721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000289","name":"Cancer Research UK","doi-asserted-by":"publisher","award":["RadNet Cambridge [C17918\/A28870]"],"award-info":[{"award-number":["RadNet Cambridge [C17918\/A28870]"]}],"id":[{"id":"10.13039\/501100000289","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput & Applic"],"published-print":{"date-parts":[[2025,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Pattern recognition models, particularly neural networks, often focus on maximising classification accuracy. However, in practice, the types of errors made (misclassification between different classes) can have varying associated costs. Current methods overlook varying misclassification error types. Misclassification labels can either be available from expert knowledge or derived from semantics of textual descriptions of class labels. Exploiting such misclassification costs can have significant implications when deploying machine learning systems. 
Here, using five examples from image and tabular domains, we show how a deep neural architecture trained in a nested layer-wise fashion (cascade learning) in which early layers solve easier problems than later ones could exploit such hierarchical aspects of class labels. We employ a measure of performance called \u201cseverity\u201d of errors and show how emphasis could be placed on classes that are deeper in the hierarchy, ignoring errors that arise between semantic neighbours.<\/jats:p>","DOI":"10.1007\/s00521-025-11613-8","type":"journal-article","created":{"date-parts":[[2025,9,13]],"date-time":"2025-09-13T13:01:41Z","timestamp":1757768501000},"page":"26021-26036","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Balancing misclassification errors in image-based inference using problem domain semantics and a nested cascade architecture"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7948-4081","authenticated-orcid":false,"given":"Xin","family":"Du","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Rajesh","family":"Jena","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Katayoun","family":"Farrahi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mahesan","family":"Niranjan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,9,13]]},"reference":[{"key":"11613_CR1","first-page":"531","volume-title":"Pattern recognition and machine learning","author":"C Bishop","year":"2006","unstructured":"Bishop C (2006) Pattern recognition and machine learning, vol 2, 1st edn. 
Springer, Berlin, pp 531\u2013537","edition":"1"},{"issue":"11","key":"11613_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TNNLS.2018.2805098","volume":"29","author":"ES Marquez","year":"2018","unstructured":"Marquez ES, Hare JS, Niranjan M (2018) Deep cascade learning. IEEE Trans Neural Netw Learn Syst 29(11):1\u201311","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"11613_CR3","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781"},{"issue":"1","key":"11613_CR4","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1038\/s41597-019-0055-0","volume":"6","author":"Y Zhang","year":"2019","unstructured":"Zhang Y, Chen Q, Yang Z, Lin H, Lu Z (2019) Biowordvec, improving biomedical word embeddings with subword information and mesh. Sci Data 6(1):52","journal-title":"Sci Data"},{"key":"11613_CR5","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199549221.003.02","volume-title":"What are categories and concepts?","author":"GL Murphy","year":"2010","unstructured":"Murphy GL (2010) What are categories and concepts? Oxford University Press, Oxford"},{"key":"11613_CR6","unstructured":"Fahlman SE, Lebiere C (1990) The cascade-correlation learning architecture. In: Proceedings of the advances in neural information processing systems (NeurIPS), pp 524\u2013532"},{"key":"11613_CR7","unstructured":"Belilovsky E, Eickenberg M, Oyallon E (2019) Greedy layerwise learning can scale to imagenet. In: Proceedings of the international conference on machine learning (ICML). PMLR, pp 583\u2013593"},{"key":"11613_CR8","doi-asserted-by":"crossref","unstructured":"Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR). 
IEEE, pp 248\u2013255","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"11613_CR9","unstructured":"Shwartz-Ziv R, Tishby N (2017) Opening the black box of deep neural networks via information. arXiv preprint arXiv:1703.00810"},{"issue":"10","key":"11613_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3390\/e23101360","volume":"23","author":"X Du","year":"2021","unstructured":"Du X, Farrahi K, Niranjan M (2021) Information bottleneck theory based exploration of cascade learning. Entropy 23(10):1\u201316","journal-title":"Entropy"},{"key":"11613_CR11","unstructured":"Wang J, Du X, Farrahi K, Niranjan M (2022) Deep cascade learning for optimal medical image feature representation. In: Proceedings of the 7th machine learning for healthcare conference. Proceedings of machine learning research, vol 182, pp 54\u201378"},{"key":"11613_CR12","doi-asserted-by":"crossref","unstructured":"Du X, Farrahi K, Niranjan M (2019) Transfer learning across human activities using a cascade neural network architecture. In: Proceedings of the 23rd international symposium on wearable computers (ISWC), pp 35\u201344","DOI":"10.1145\/3341163.3347730"},{"key":"11613_CR13","doi-asserted-by":"crossref","unstructured":"Irvin J, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, Marklund H, Haghgoo B, Ball R, Shpanskaya K (2019) CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the association for the advancement of artificial intelligence (AAAI) conference, vol 33, pp 590\u2013597","DOI":"10.1609\/aaai.v33i01.3301590"},{"issue":"1","key":"11613_CR14","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1038\/s41746-020-0273-z","volume":"3","author":"Y-X Tang","year":"2020","unstructured":"Tang Y-X, Tang Y-B, Peng Y, Yan K, Bagheri M, Redd BA, Brandon CJ, Lu Z, Han M, Xiao J (2020) Automated abnormality classification of chest radiographs using deep convolutional neural networks. 
NPJ Digit Med 3(1):70","journal-title":"NPJ Digit Med"},{"issue":"10","key":"11613_CR15","doi-asserted-by":"publisher","first-page":"867","DOI":"10.1038\/s42256-022-00536-x","volume":"4","author":"A Saporta","year":"2022","unstructured":"Saporta A, Gui X, Agrawal A, Pareek A, Truong SQ, Nguyen CD, Ngo V-D, Seekins J, Blankenberg FG, Ng AY (2022) Benchmarking saliency methods for chest x-ray interpretation. Nat Mach Intell 4(10):867\u2013878","journal-title":"Nat Mach Intell"},{"key":"11613_CR16","unstructured":"Raghu M, Zhang C, Kleinberg J, Bengio S (2019) Transfusion: understanding transfer learning for medical imaging. In: Proceedings of the advances in neural information processing systems (NeurIPS), pp 3347\u20133357"},{"key":"11613_CR17","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1007\/s10618-010-0175-9","volume":"22","author":"CN Silla","year":"2011","unstructured":"Silla CN, Freitas AA (2011) A survey of hierarchical classification across different application domains. Data Min Knowl Disc 22:31\u201372","journal-title":"Data Min Knowl Disc"},{"key":"11613_CR18","doi-asserted-by":"publisher","first-page":"186","DOI":"10.1016\/j.neucom.2020.03.127","volume":"437","author":"HH Pham","year":"2021","unstructured":"Pham HH, Le TT, Tran DQ, Ngo DT, Nguyen HQ (2021) Interpreting chest x-rays via cnns that exploit hierarchical disease dependencies and uncertainty labels. Neurocomputing 437:186\u2013194","journal-title":"Neurocomputing"},{"issue":"1","key":"11613_CR19","doi-asserted-by":"publisher","first-page":"21903","DOI":"10.1038\/s41598-023-49185-z","volume":"13","author":"S Srivastava","year":"2023","unstructured":"Srivastava S, Mishra D (2023) Severity of error in hierarchical datasets. Sci Rep 13(1):21903","journal-title":"Sci Rep"},{"key":"11613_CR20","doi-asserted-by":"crossref","unstructured":"Bertinetto L, Mueller R, Tertikas K, Samangooei S, Lord NA (2020) Making better mistakes: leveraging class hierarchies with deep networks. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 12506\u201312515","DOI":"10.1109\/CVPR42600.2020.01252"},{"key":"11613_CR21","unstructured":"Karthik S, Prabhu A, Dokania PK, Gandhi V (2021) No cost likelihood manipulation at test time for making better mistakes in deep networks. In: International conference on learning representations (ICLR)"},{"key":"11613_CR22","unstructured":"Helus V, Vaska N, Abreu N (2022) Addressing mistake severity in neural networks with semantic knowledge. In: Progress and challenges in building trustworthy embodied AI"},{"key":"11613_CR23","unstructured":"Ben-Shaul I, Shwartz-Ziv R, Galanti T, Dekel S, LeCun Y (2023) Reverse engineering self-supervised learning. arXiv preprint arXiv:2305.15614"},{"key":"11613_CR24","doi-asserted-by":"crossref","unstructured":"Li L, Zhou T, Wang W, Li J, Yang Y (2022) Deep hierarchical semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 1246\u20131257","DOI":"10.1109\/CVPR52688.2022.00131"},{"key":"11613_CR25","doi-asserted-by":"crossref","unstructured":"Barz B, Denzler J (2019) Hierarchy-based image embeddings for semantic image retrieval. In: 2019 IEEE winter conference on applications of computer vision (WACV). IEEE, pp 638\u2013647","DOI":"10.1109\/WACV.2019.00073"},{"key":"11613_CR26","unstructured":"Simonyan K, Vedaldi A, Zisserman A (2014) Deep inside convolutional networks: Visualising image classification models and saliency maps. In: Proceedings of the workshop at international conference on learning representations (ICLR)"},{"key":"11613_CR27","doi-asserted-by":"crossref","unstructured":"Rahman MA, Wang Y (2016) Optimizing intersection-over-union in deep neural networks for image segmentation. In: International symposium on visual computing. 
Springer, pp 234\u2013244","DOI":"10.1007\/978-3-319-50835-1_22"},{"key":"11613_CR28","unstructured":"Krizhevsky A, Hinton G, Nair V (2009) Learning multiple layers of features from tiny images. Technical Report TR-2009. University of Toronto"},{"key":"11613_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.dib.2019.104344","volume":"25","author":"FM Palechor","year":"2019","unstructured":"Palechor FM, Hoz Manotas A (2019) Dataset for estimation of obesity levels based on eating habits and physical condition in individuals from colombia, peru and mexico. Data Brief 25:104344","journal-title":"Data Brief"},{"key":"11613_CR30","doi-asserted-by":"crossref","unstructured":"Wood RL, Jensen T, Wadsworth C, Clement M, Nagpal P, Pitt WG (2020) Analysis of identification method for bacterial species and antibiotic resistance genes using optical data from DNA oligomers. Front Microbiol 11","DOI":"10.3389\/fmicb.2020.00257"},{"key":"11613_CR31","unstructured":"Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980"},{"key":"11613_CR32","unstructured":"Bjorck N, Gomes CP, Selman B, Weinberger KQ (2018) Understanding batch normalization. In: Advances in neural information processing systems (NeurIPS), vol 31"},{"key":"11613_CR33","unstructured":"Krizhevsky A, Hinton G (2010) Convolutional deep belief networks on CIFAR-10. Technical Report UTML TR 2010-003. University of Toronto"},{"key":"11613_CR34","unstructured":"Malinowski M, Fritz M (2013) Learnable pooling regions for image classification. In: International conference on learning representations (ICLR)"},{"key":"11613_CR35","unstructured":"Van\u00a0Amersfoort J, Smith L, Teh YW, Gal Y (2020) Uncertainty estimation using a single deep deterministic neural network. In: Proceedings of the international conference on machine learning (ICML). 
PMLR, pp 9690\u20139700"},{"key":"11613_CR36","unstructured":"Yang Y, Li X, Alfarra M, Hammoud H, Bibi A, Torr P, Ghanem B (2024) Towards interpretable deep local learning with successive gradient reconciliation. arXiv preprint arXiv:2406.05222"},{"issue":"9","key":"11613_CR37","first-page":"4555","volume":"44","author":"X Wang","year":"2021","unstructured":"Wang X, Chen Y, Zhu W (2021) A survey on curriculum learning. IEEE Trans Pattern Anal Mach Intell 44(9):4555\u20134576","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"3","key":"11613_CR38","doi-asserted-by":"publisher","first-page":"159","DOI":"10.4314\/gmj.v49i3.6","volume":"49","author":"Y Mensah","year":"2015","unstructured":"Mensah Y, Mensah K, Asiamah S, Gbadamosi H, Idun E, Brakohiapa W, Oddoye A (2015) Establishing the cardiothoracic ratio using chest radiographs in an indigenous ghanaian population: a simple tool for cardiomegaly screening. Ghana Med J 49(3):159\u2013164","journal-title":"Ghana Med J"},{"issue":"29","key":"11613_CR39","doi-asserted-by":"publisher","first-page":"861","DOI":"10.21105\/joss.00861","volume":"3","author":"L McInnes","year":"2018","unstructured":"McInnes L, Healy J, Saul N, Gro\u00dfberger L (2018) Umap: uniform manifold approximation and projection. 
J Open Source Softw 3(29):861\u2013862","journal-title":"J Open Source Softw"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-025-11613-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-025-11613-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-025-11613-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,18]],"date-time":"2025-10-18T18:20:28Z","timestamp":1760811628000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-025-11613-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,13]]},"references-count":39,"journal-issue":{"issue":"31","published-print":{"date-parts":[[2025,11]]}},"alternative-id":["11613"],"URL":"https:\/\/doi.org\/10.1007\/s00521-025-11613-8","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,13]]},"assertion":[{"value":"29 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 August 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 September 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The author has no conflict of interest to 
declare.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}