{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T23:33:20Z","timestamp":1768520000034,"version":"3.49.0"},"reference-count":76,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2025,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Many real-world datasets, such as those used for failure and anomaly detection, are severely imbalanced, with a relatively small number of failed instances compared to the number of normal instances. To address these issues, this paper leverages the Backblaze hard disk drives (HDDs) data and makes several contributions to hard drive failure prediction. This research explores 1D convolutional neural networks (CNN) to utilize the sequential nature of hard drive sensor data. The performance of 1D CNN models is compared to traditional machine learning (ML) algorithms, such as the synthetic minority over-sampling technique (SMOTE) and weighted logistic regression (WLR), demonstrating superior results, suggesting the potential effectiveness of the proposed approaches. In addition to these efforts, this paper aims to provide a comprehensive understanding of hard drive longevity and the critical factors contributing to their eventual failure through survival analysis. The 1D CNN models employ weighted binary cross-entropy (WCE) loss and modified focal loss (MFL) functions to manage class imbalanced issues commonly observed in hard drive data. The findings suggest that 1D CNN models outperform traditional ML models, with regularization techniques like dropout and early stopping proving effective in controlling overfitting. Notably, the 1D CNN model with WCE loss demonstrated the best overall performance with a <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$G_{mean}$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msub>\n                    <mml:mi>G<\/mml:mi>\n                    <mml:mrow>\n                      <mml:mi>mean<\/mml:mi>\n                    <\/mml:mrow>\n                  <\/mml:msub>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> of 0.692, successfully maximizing the FDR without increasing the FAR. In parallel, the research also employs Cox regression to identify key SMART parameters influencing drive failure. The high concordance index (c-index) of the Cox model (0.958) adds confidence to the insights derived. The research thus sets a solid groundwork for data center management strategies, with a future focus on practical implementation and evaluation of these findings.<\/jats:p>","DOI":"10.1007\/s00521-024-10479-6","type":"journal-article","created":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T06:01:55Z","timestamp":1729058515000},"page":"1089-1104","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Leveraging survival analysis in cost-aware deepnet for efficient hard drive failure prediction"],"prefix":"10.1007","volume":"37","author":[{"given":"Jishan","family":"Ahmed","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3792-2725","authenticated-orcid":false,"given":"Robert C.","family":"Green\u00a0II","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,10,16]]},"reference":[{"key":"10479_CR1","unstructured":"Desjardins J (2019) How much data is generated each day? World Econ. Forum. Accessed 2-04-17"},{"key":"10479_CR2","unstructured":"Pinheiro E, Weber W-D, Barroso LA (2007) Failure trends in a large disk drive population. In: 5th USENIX conference on file and storage technologies (FAST 07). USENIX Association, San Jose, CA. https:\/\/www.usenix.org\/conference\/fast-07\/failure-trends-large-disk-drive-population"},{"key":"10479_CR3","doi-asserted-by":"publisher","unstructured":"Vishwanath K, Nagappan N (2010) Characterizing cloud computing hardware reliability, pp. 193\u2013204. https:\/\/doi.org\/10.1145\/1807128.1807161","DOI":"10.1145\/1807128.1807161"},{"issue":"117","key":"10479_CR4","first-page":"9","volume":"2004","author":"B Allen","year":"2004","unstructured":"Allen B (2004) Monitoring hard disks with smart. Linux J 2004(117):9","journal-title":"Linux J"},{"key":"10479_CR5","unstructured":"Lu S, Luo B, Patel T, Yao Y, Tiwari D, Shi W (2020) Making disk failure predictions smarter! In: 18th USENIX conference on file and storage technologies (FAST 20), pp. 151\u2013167. USENIX Association, Santa Clara, CA. https:\/\/www.usenix.org\/conference\/fast20\/presentation\/lu"},{"key":"10479_CR6","unstructured":"Murray JF, Hughes GF, Kreutz-Delgado K (2003) Hard drive failure prediction using non-parametric statistical methods. In: Proceedings of Icann\/Iconip"},{"key":"10479_CR7","unstructured":"Hamerly G, Elkan C et al (2001) Bayesian approaches to failure prediction for disk drives. In: ICML, vol. 1, pp. 202\u2013209"},{"key":"10479_CR8","unstructured":"Murray JF, Hughes GF, Kreutz-Delgado K, Schuurmans D (2005) Machine learning methods for predicting failures in hard drives: a multiple-instance application. Journal of Machine Learning Research 6(5)"},{"key":"10479_CR9","doi-asserted-by":"publisher","unstructured":"Pereira FLF, Teixeira DN, Gomes JPP, Machado JC (2019) Evaluating one-class classifiers for fault detection in hard disk drives. In: 2019 8th Brazilian conference on intelligent systems (BRACIS), pp. 586\u2013591. https:\/\/doi.org\/10.1109\/BRACIS.2019.00108","DOI":"10.1109\/BRACIS.2019.00108"},{"key":"10479_CR10","doi-asserted-by":"publisher","unstructured":"Aussel N, Jaulin S, Gandon G, Petetin Y, Fazli E, Chabridon S (2017) Predictive models of hard drive failures based on operational data. In: 2017 16th IEEE international conference on machine learning and applications (ICMLA), pp. 619\u2013625. https:\/\/doi.org\/10.1109\/ICMLA.2017.00-92","DOI":"10.1109\/ICMLA.2017.00-92"},{"issue":"2","key":"10479_CR11","doi-asserted-by":"publisher","first-page":"749","DOI":"10.1109\/TR.2020.2995724","volume":"70","author":"Q Yang","year":"2021","unstructured":"Yang Q, Jia X, Li X, Feng J, Li W, Lee J (2021) Evaluating feature selection and anomaly detection methods of hard drive failure prediction. IEEE Trans Reliab 70(2):749\u2013760. https:\/\/doi.org\/10.1109\/TR.2020.2995724","journal-title":"IEEE Trans Reliab"},{"issue":"1","key":"10479_CR12","doi-asserted-by":"publisher","first-page":"419","DOI":"10.1109\/TII.2013.2264060","volume":"10","author":"Y Wang","year":"2013","unstructured":"Wang Y, Ma EW, Chow TW, Tsui K-L (2013) A two-step parametric method for failure prediction in hard disk drives. IEEE Trans Indus Inform 10(1):419\u2013430","journal-title":"IEEE Trans Indus Inform"},{"key":"10479_CR13","doi-asserted-by":"crossref","unstructured":"Zhao Y, Liu X, Gan S, Zheng W (2010) Predicting disk failures with hmm-and hsmm-based approaches. In: Industrial conference on data mining, pp. 390\u2013404. Springer","DOI":"10.1007\/978-3-642-14400-4_30"},{"issue":"27","key":"10479_CR14","first-page":"783","volume":"6","author":"JF Murray","year":"2005","unstructured":"Murray JF, Hughes GF, Kreutz-Delgado K (2005) Machine learning methods for predicting failures in hard drives: a multiple-instance application. Journal of Machine Learning Research 6(27):783\u2013816","journal-title":"Journal of Machine Learning Research"},{"key":"10479_CR15","unstructured":"Backblaze: Hard drive data and stats (2021). https:\/\/www.backblaze.com\/b2\/hard-drive-test-data.html"},{"key":"10479_CR16","doi-asserted-by":"publisher","unstructured":"Kaur K, Kaur K (2019) Failure prediction and health status assessment of storage systems with decision trees. In: Advanced informatics for computing research, pp. 366\u2013376. Springer, Singapore. https:\/\/doi.org\/10.1007\/978-981-13-3140-4_33","DOI":"10.1007\/978-981-13-3140-4_33"},{"key":"10479_CR17","doi-asserted-by":"publisher","DOI":"10.1145\/2907070","author":"P Branco","year":"2016","unstructured":"Branco P, Torgo L, Ribeiro RP (2016) A survey of predictive modeling on imbalanced domains. ACM Comput Surv. https:\/\/doi.org\/10.1145\/2907070","journal-title":"ACM Comput Surv"},{"issue":"9","key":"10479_CR18","doi-asserted-by":"publisher","first-page":"2155","DOI":"10.1109\/TPDS.2020.2985346","volume":"31","author":"J Zhang","year":"2020","unstructured":"Zhang J, Zhou K, Huang P, He X, Xie M, Cheng B, Ji Y, Wang Y (2020) Minority disk failure prediction based on transfer learning in large data centers of heterogeneous disk systems. IEEE Trans Parallel Distrub Syst 31(9):2155\u20132169. https:\/\/doi.org\/10.1109\/TPDS.2020.2985346","journal-title":"IEEE Trans Parallel Distrub Syst"},{"issue":"11","key":"10479_CR19","doi-asserted-by":"publisher","first-page":"3502","DOI":"10.1109\/TC.2016.2538237","volume":"65","author":"C Xu","year":"2016","unstructured":"Xu C, Wang G, Liu X, Guo D, Liu T-Y (2016) Health status assessment and failure prediction for hard drives with recurrent neural networks. IEEE Trans Comput 65(11):3502\u20133508. https:\/\/doi.org\/10.1109\/TC.2016.2538237","journal-title":"IEEE Trans Comput"},{"key":"10479_CR20","doi-asserted-by":"publisher","unstructured":"Li X, Chen S, Hu X, Yang J (2019) Understanding the disharmony between dropout and batch normalization by variance shift. In: 2019 IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp. 2677\u20132685. https:\/\/doi.org\/10.1109\/CVPR.2019.00279","DOI":"10.1109\/CVPR.2019.00279"},{"key":"10479_CR21","doi-asserted-by":"publisher","unstructured":"Hu L, Han L, Xu Z, Jiang T, Qi H (2020) A disk failure prediction method based on lstm network due to its individual specificity. In: Procedia Computer Science 176, 791\u2013799 https:\/\/doi.org\/10.1016\/j.procs.2020.09.074. Knowledge-based and intelligent information & engineering systems: proceedings of the 24th international conference KES2020","DOI":"10.1016\/j.procs.2020.09.074"},{"issue":"11","key":"10479_CR22","doi-asserted-by":"publisher","first-page":"155014771880648","DOI":"10.1177\/1550147718806480","volume":"14","author":"J Shen","year":"2018","unstructured":"Shen J, Wan J, Lim S-J, Yu L (2018) Random-forest-based failure prediction for hard disk drives. Int J Distrib Sensor Netw 14(11):1550147718806480. https:\/\/doi.org\/10.1177\/1550147718806480","journal-title":"Int J Distrib Sensor Netw"},{"key":"10479_CR23","doi-asserted-by":"publisher","unstructured":"Zhu B, Wang G, Liu X, Hu D, Lin S, Ma J (2013) Proactive drive failure prediction for large scale storage systems. In: 2013 IEEE 29th symposium on mass storage systems and technologies (MSST), pp. 1\u20135. https:\/\/doi.org\/10.1109\/MSST.2013.6558427","DOI":"10.1109\/MSST.2013.6558427"},{"key":"10479_CR24","doi-asserted-by":"publisher","unstructured":"C.\u00a0Rinc\u00f3n CA, P\u00e2ris J-F, Vilalta R, Cheng AMK, Long DDE (2017) Disk failure prediction in heterogeneous environments. In: 2017 international symposium on performance evaluation of computer and telecommunication systems (SPECTS), pp. 1\u20137. https:\/\/doi.org\/10.23919\/SPECTS.2017.8046776","DOI":"10.23919\/SPECTS.2017.8046776"},{"issue":"9","key":"10479_CR25","doi-asserted-by":"publisher","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","volume":"21","author":"H He","year":"2009","unstructured":"He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263\u20131284. https:\/\/doi.org\/10.1109\/TKDE.2008.239","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"4","key":"10479_CR26","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1109\/MCI.2018.2866730","volume":"13","author":"MS Santos","year":"2018","unstructured":"Santos MS, Soares JP, Abreu PH, Araujo H, Santos J (2018) Cross-validation for imbalanced datasets: avoiding overoptimistic and overfitting approaches [research frontier]. IEEE Computat Intell Mag 13(4):59\u201376. https:\/\/doi.org\/10.1109\/MCI.2018.2866730","journal-title":"IEEE Computat Intell Mag"},{"issue":"1","key":"10479_CR27","first-page":"37","volume":"2","author":"DM Powers","year":"2011","unstructured":"Powers DM (2011) Evaluation: from precision, recall and f-measure to roc, informedness, markedness & correlation. J Mach Learn Technol 2(1):37\u201363","journal-title":"J Mach Learn Technol"},{"key":"10479_CR28","doi-asserted-by":"publisher","unstructured":"Pereira FLF, Castro\u00a0Chaves I, Gomes JPP, Machado JC (2020) Using autoencoders for anomaly detection in hard disk drives. In: 2020 international joint conference on neural networks (IJCNN), pp. 1\u20137. https:\/\/doi.org\/10.1109\/IJCNN48605.2020.9206689","DOI":"10.1109\/IJCNN48605.2020.9206689"},{"key":"10479_CR29","unstructured":"Pascanu R, Gulcehre C, Cho K, Bengio Y (2014) How to construct deep recurrent neural networks. In: Proceedings of the second international conference on learning representations (ICLR 2014)"},{"key":"10479_CR30","doi-asserted-by":"publisher","DOI":"10.1007\/s00542-019-04454-8","author":"C-J Su","year":"2019","unstructured":"Su C-J, Li Y (2019) Recurrent neural network based real-time failure detection of storage devices. Microsyst Technol. https:\/\/doi.org\/10.1007\/s00542-019-04454-8","journal-title":"Microsyst Technol"},{"key":"10479_CR31","doi-asserted-by":"publisher","unstructured":"Wang Z, Yan W, Oates T (2017) Time series classification from scratch with deep neural networks: a strong baseline. In: 2017 international joint conference on neural networks (IJCNN), pp. 1578\u20131585. https:\/\/doi.org\/10.1109\/IJCNN.2017.7966039","DOI":"10.1109\/IJCNN.2017.7966039"},{"key":"10479_CR32","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/j.ins.2017.05.008","volume":"409\u2013419","author":"W-C Lin","year":"2017","unstructured":"Lin W-C, Tsai C-F, Hu Y-H, Jhang J-S (2017) Clustering-based undersampling in class-imbalanced data. Information Sciences 409\u2013419:17\u201326. https:\/\/doi.org\/10.1016\/j.ins.2017.05.008","journal-title":"Information Sciences"},{"key":"10479_CR33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-019-0192-5","volume":"6","author":"JM Johnson","year":"2019","unstructured":"Johnson JM, Khoshgoftaar T (2019) Survey on deep learning with class imbalance. J Big Data 6:1\u201354","journal-title":"J Big Data"},{"key":"10479_CR34","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436\u2013444. https:\/\/doi.org\/10.1038\/nature14539","journal-title":"Nature"},{"key":"10479_CR35","doi-asserted-by":"publisher","DOI":"10.1145\/3065386","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky A, Sutskever I, Hinton G (2012) Imagenet classification with deep convolutional neural networks. Neural Inform Process Syst. https:\/\/doi.org\/10.1145\/3065386","journal-title":"Neural Inform Process Syst"},{"key":"10479_CR36","doi-asserted-by":"publisher","unstructured":"Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp. 1\u20139. https:\/\/doi.org\/10.1109\/CVPR.2015.7298594","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"10479_CR37","unstructured":"Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. ArXiv 1409"},{"issue":"3","key":"10479_CR38","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2015","unstructured":"Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) Imagenet large scale visual recognition challenge. Int J Comput Vision (IJCV) 115(3):211\u2013252. https:\/\/doi.org\/10.1007\/s11263-015-0816-y","journal-title":"Int J Comput Vision (IJCV)"},{"key":"10479_CR39","unstructured":"Zhou C, Sun C, Liu Z, Lau F (2015) A c-lstm neural network for text classification. arXiv preprint arXiv:1511.08630"},{"issue":"4","key":"10479_CR40","doi-asserted-by":"publisher","first-page":"531","DOI":"10.1016\/j.dcan.2022.03.023","volume":"8","author":"K Yan","year":"2022","unstructured":"Yan K, Zhou X (2022) Chiller faults detection and diagnosis with sensor network and adaptive 1d cnn. Digit Commun Netw 8(4):531\u2013539. https:\/\/doi.org\/10.1016\/j.dcan.2022.03.023","journal-title":"Digit Commun Netw"},{"key":"10479_CR41","volume-title":"Deep learning","author":"I Goodfellow","year":"2016","unstructured":"Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge, MA"},{"issue":"2","key":"10479_CR42","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1093\/oxfordjournals.pan.a004868","volume":"9","author":"G King","year":"2001","unstructured":"King G, Zeng L (2001) Logistic regression in rare events data. Political Anal 9(2):137\u2013163. https:\/\/doi.org\/10.1093\/oxfordjournals.pan.a004868","journal-title":"Political Anal"},{"issue":"02","key":"10479_CR43","doi-asserted-by":"publisher","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","volume":"42","author":"T Lin","year":"2020","unstructured":"Lin T, Goyal P, Girshick R, He K, Dollar P (2020) Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell 42(02):318\u2013327. https:\/\/doi.org\/10.1109\/TPAMI.2018.2858826","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"10479_CR44","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-6646-9","volume-title":"Survival analysis: a self-learning text","author":"DG Kleinbaum","year":"2012","unstructured":"Kleinbaum DG, Klein M (2012) Survival analysis: a self-learning text. Springer, New York, NY"},{"key":"10479_CR45","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"N Chawla","year":"2002","unstructured":"Chawla N, Bowyer K, Hall L, Kegelmeyer W (2002) Smote: synthetic minority over-sampling technique. J Artif Intell Res (JAIR) 16:321\u2013357. https:\/\/doi.org\/10.1613\/jair.953","journal-title":"J Artif Intell Res (JAIR)"},{"key":"10479_CR46","doi-asserted-by":"crossref","unstructured":"Z\u00fcfle M, Erhard F, Kounev S (2021) Machine learning model update strategies for hard disk drive failure prediction. In: 2021 20th IEEE international conference on machine learning and applications (ICMLA), pp. 1379\u20131386. IEEE","DOI":"10.1109\/ICMLA52953.2021.00223"},{"key":"10479_CR47","doi-asserted-by":"crossref","unstructured":"Xu R, Wang X, Wu J (2022) Classification based hard disk drive failure prediction: methodologies, performance evaluation and comparison. In: 2022 IEEE 18th international conference on automation science and engineering (CASE), pp. 189\u2013195. IEEE","DOI":"10.1109\/CASE49997.2022.9926720"},{"key":"10479_CR48","doi-asserted-by":"crossref","unstructured":"Sun X, Chakrabarty K, Huang R, Chen Y, Zhao B, Cao H, Han Y, Liang X, Jiang L (2019) System-level hardware failure prediction using deep learning. In: Proceedings of the 56th annual design automation conference 2019, pp. 1\u20136","DOI":"10.1145\/3316781.3317918"},{"key":"10479_CR49","doi-asserted-by":"crossref","unstructured":"Basak S, Sengupta S, Dubey A (2019) Mechanisms for integrated feature normalization and remaining useful life estimation using lstms applied to hard-disks. In: 2019 IEEE international conference on smart computing (SMARTCOMP), pp. 208\u2013216. IEEE","DOI":"10.1109\/SMARTCOMP.2019.00055"},{"issue":"1","key":"10479_CR50","first-page":"8878364","volume":"2021","author":"J Shen","year":"2021","unstructured":"Shen J, Ren Y, Wan J, Lan Y (2021) Hard disk drive failure prediction for mobile edge computing based on an lstm recurrent neural network. Mobile Inform Syst 2021(1):8878364","journal-title":"Mobile Inform Syst"},{"key":"10479_CR51","unstructured":"Ma Y, He H (2013) Imbalanced learning: foundations, algorithms, and applications. Wiley, Hoboken, NJ. https:\/\/books.google.com\/books?id=CVHx-Gp9jzUC"},{"key":"10479_CR52","first-page":"145","volume":"40","author":"MD Gordon","year":"1989","unstructured":"Gordon MD, Kochen M (1989) Recall-precision trade-off: a derivation. JASIS 40:145\u2013151","journal-title":"Recall-precision trade-off: a derivation. JASIS"},{"issue":"3","key":"10479_CR53","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0118432","volume":"10","author":"T Saito","year":"2015","unstructured":"Saito T, Rehmsmeier M (2015) The precision-recall plot is more informative than the roc plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 10(3):1\u201321. https:\/\/doi.org\/10.1371\/journal.pone.0118432","journal-title":"PLoS ONE"},{"key":"10479_CR54","doi-asserted-by":"crossref","unstructured":"Arp D, Quiring E, Pendlebury F, Warnecke A, Pierazzi F, Wressnegger C, Cavallaro L, Rieck K (2021) Dos and don\u2019ts of machine learning in computer security, 2022 edn. USENIX Security Symposium (USENIX). USENIX Association, Berkeley, CA","DOI":"10.1109\/MSEC.2023.3287207"},{"issue":"4","key":"10479_CR55","doi-asserted-by":"publisher","first-page":"853","DOI":"10.1162\/neco_a_01362","volume":"33","author":"CK Williams","year":"2021","unstructured":"Williams CK (2021) The effect of class imbalance on precision-recall curves. Neural Comput 33(4):853\u2013857","journal-title":"Neural Comput"},{"issue":"10","key":"10479_CR56","doi-asserted-by":"publisher","first-page":"12049","DOI":"10.1007\/s10489-021-03041-7","volume":"52","author":"IM De Diego","year":"2022","unstructured":"De Diego IM, Redondo AR, Fern\u00e1ndez RR, Navarro J, Moguerza JM (2022) General performance score for classification problems. Appl Intell 52(10):12049\u201312063. https:\/\/doi.org\/10.1007\/s10489-021-03041-7","journal-title":"Appl Intell"},{"issue":"8","key":"10479_CR57","doi-asserted-by":"publisher","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","volume":"27","author":"T Fawcett","year":"2006","unstructured":"Fawcett T (2006) An introduction to roc analysis. Pattern Recognit Lett 27(8):861\u2013874","journal-title":"Pattern Recognit Lett"},{"issue":"1","key":"10479_CR58","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1016\/j.aci.2018.08.003","volume":"17","author":"A Tharwat","year":"2021","unstructured":"Tharwat A (2021) Classification assessment methods. Appl Comput Inform 17(1):168\u2013192","journal-title":"Appl Comput Inform"},{"key":"10479_CR59","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1007\/s10618-012-0295-5","volume":"28","author":"G Menardi","year":"2012","unstructured":"Menardi G, Torelli N (2012) Training and assessing classification rules with imbalanced data. Data Mining Knowl Discov 28:92\u2013122","journal-title":"Data Mining Knowl Discov"},{"issue":"6","key":"10479_CR60","doi-asserted-by":"publisher","first-page":"786","DOI":"10.1109\/TKDE.2005.95","volume":"17","author":"G Wu","year":"2005","unstructured":"Wu G, Chang EY (2005) Kba: kernel boundary alignment considering imbalanced data distribution. IEEE Trans Knowl Data Eng 17(6):786\u2013795. https:\/\/doi.org\/10.1109\/TKDE.2005.95","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"2009","key":"10479_CR61","doi-asserted-by":"publisher","first-page":"1322","DOI":"10.5772\/7544","volume":"10","author":"GH Nguyen","year":"2009","unstructured":"Nguyen GH, Bouzerdoum A, Phung SL (2009) Learning pattern classification tasks with imbalanced data sets. Pattern Recognit 10(2009):1322\u20131328. https:\/\/doi.org\/10.5772\/7544","journal-title":"Pattern Recognit"},{"key":"10479_CR62","doi-asserted-by":"publisher","unstructured":"Karagiannopoulos MG, Anyfantis D, Kotsiantis S, Pintelas P (2007) Local cost sensitive learning for handling imbalanced data sets, pp. 1\u20136. https:\/\/doi.org\/10.1109\/MED.2007.4433808","DOI":"10.1109\/MED.2007.4433808"},{"issue":"2","key":"10479_CR63","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1109\/TKDE.2012.232","volume":"26","author":"S Barua","year":"2014","unstructured":"Barua S, Islam MM, Yao X, Murase K (2014) Mwmote-majority weighted minority oversampling technique for imbalanced data set learning. IEEE Trans Knowl Data Eng 26(2):405\u2013425. https:\/\/doi.org\/10.1109\/TKDE.2012.232","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"10479_CR64","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1007\/978-3-540-30115-8_7","volume-title":"Machine learning: ECML 2004","author":"R Akbani","year":"2004","unstructured":"Akbani R, Kwek S, Japkowicz N (2004) Applying support vector machines to imbalanced datasets. In: Boulicaut J-F, Esposito F, Giannotti F, Pedreschi D (eds) Machine learning: ECML 2004. Springer, Berlin, Heidelberg, pp 39\u201350"},{"key":"10479_CR65","doi-asserted-by":"publisher","unstructured":"Ertekin S, Huang J, Bottou L, Giles C (2007) Learning on the border: Active learning in imbalanced data classification, pp. 127\u2013136. https:\/\/doi.org\/10.1145\/1321440.1321461","DOI":"10.1145\/1321440.1321461"},{"issue":"3","key":"10479_CR66","doi-asserted-by":"publisher","first-page":"849","DOI":"10.1016\/S0031-3203(02)00257-1","volume":"36","author":"R Barandela","year":"2003","unstructured":"Barandela R, Sanchez JS, Garc\u0131aa V, Rangel E (2003) Strategies for learning in class imbalance problems. Pattern Recognit 36(3):849\u2013851. https:\/\/doi.org\/10.1016\/S0031-3203(02)00257-1","journal-title":"Pattern Recognit"},{"issue":"1","key":"10479_CR67","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1093\/bib\/bbs006","volume":"14","author":"W-J Lin","year":"2012","unstructured":"Lin W-J, Chen JJ (2012) Class-imbalanced classifiers for high-dimensional data. Brief Bioinform 14(1):13\u201326. https:\/\/doi.org\/10.1093\/bib\/bbs006","journal-title":"Brief Bioinform"},{"key":"10479_CR68","doi-asserted-by":"publisher","first-page":"102637","DOI":"10.1016\/j.dsp.2019.102637","volume":"98","author":"J Ri","year":"2020","unstructured":"Ri J, Kim H (2020) G-mean based extreme learning machine for imbalance learning. Digit Signal Process 98:102637. https:\/\/doi.org\/10.1016\/j.dsp.2019.102637","journal-title":"Digit Signal Process"},{"key":"10479_CR69","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1145\/1007730.1007736","volume":"6","author":"H Guo","year":"2004","unstructured":"Guo H, Viktor H (2004) Learning from imbalanced data sets with boosting and data generation: the databoost-im approach. SIGKDD Explor 6:30\u201339. https:\/\/doi.org\/10.1145\/1007730.1007736","journal-title":"SIGKDD Explor"},{"issue":"4","key":"10479_CR70","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1002\/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4","volume":"15","author":"FE Harrell Jr","year":"1996","unstructured":"Harrell FE Jr, Lee KL, Mark DB (1996) Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15(4):361\u2013387","journal-title":"Stat Med"},{"key":"10479_CR71","doi-asserted-by":"publisher","unstructured":"Chen S, Sun F-B, Yang JJ (1999) A new method of hard disk drive mttf projection using data from an early life test. In: Annual reliability and maintainability. Symposium. 1999 Proceedings (Cat. No.99CH36283), pp. 252\u2013257. https:\/\/doi.org\/10.1109\/RAMS.1999.744127","DOI":"10.1109\/RAMS.1999.744127"},{"key":"10479_CR72","unstructured":"Schroeder B, Gibson GA (2007) Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In: 5th USENIX conference on file and storage technologies (FAST 07). USENIX Association, San Jose, CA. https:\/\/www.usenix.org\/conference\/fast-07\/disk-failures-real-world-what-does-mttf-1000000-hours-mean-you"},{"key":"10479_CR73","unstructured":"Chollet F (2015) et al.: Keras. https:\/\/keras.io"},{"key":"10479_CR74","unstructured":"Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Man\u00e9 D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Vi\u00e9gas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, Zheng X (2015) TensorFlow: large-scale machine learning on heterogeneous systems. Software available from tensorflow.org. https:\/\/www.tensorflow.org\/"},{"key":"10479_CR75","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"10479_CR76","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1007\/s13748-016-0094-0","volume":"5","author":"B Krawczyk","year":"2016","unstructured":"Krawczyk B (2016) Learning from imbalanced data: open challenges and future directions. Progress Artif Intell 5:221\u2013232","journal-title":"Progress Artif Intell"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-024-10479-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-024-10479-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-024-10479-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,23]],"date-time":"2025-01-23T18:30:03Z","timestamp":1737657003000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-024-10479-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"references-count":76,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,1]]}},"alternative-id":["10479"],"URL":"https:\/\/doi.org\/10.1007\/s00521-024-10479-6","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,16]]},"assertion":[{"value":"30 January 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 September 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 October 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}