{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T08:20:06Z","timestamp":1773044406109,"version":"3.50.1"},"reference-count":59,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T00:00:00Z","timestamp":1738713600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T00:00:00Z","timestamp":1738713600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Research Affairs Office, UAE University","award":["31S445"],"award-info":[{"award-number":["31S445"]}]},{"name":"Research Affairs Office, UAE University","award":["31S445"],"award-info":[{"award-number":["31S445"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Groundwater is a vital global resource. However, mapping aquifers remains challenging, particularly in developing nations. This study proposes a novel methodology for aquifer delineation using time-series clustering of groundwater-level data. The modular clustering framework utilizes hierarchical agglomerative clustering and a custom hydrology-specific distance function. This accounts for the variability in the length, temporal position, and consistency of the time series, in addition to gaps in records, aligning them temporally before comparison. Advantages over traditional techniques such as dynamic time warping, and Euclidean distance are provided for analyzing real-world hydrological data. The algorithm was optimized on a synthetic Texas aquifer dataset to identify the minimum time series lengths required for accurate clustering (&gt;\u200990% accuracy). 
Applying this to real data from the Texas Groundwater Database GWDB with over one million readings and 60,000 wells, the modeling achieved\u2009~\u200973% accuracy, delineating the nine major Texan aquifers using a filtered number of 74 representative wells. The aquifer boundaries were geographically visualized using the GeoZ library. These findings suggest the effectiveness of groundwater characterization given the limited data. The optimized algorithm could provide inexpensive mapping capabilities in developing nations, requiring only historical data from existing wells over the decades. This technique is adaptive and can be improved through ongoing monitoring. The algorithm components are modular and upgradable thus future studies should optimize and test their generalizability using additional datasets.<\/jats:p>","DOI":"10.1186\/s40537-025-01060-6","type":"journal-article","created":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T07:23:17Z","timestamp":1738740197000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["GeoTemporal clustering for aquifer delineation: a big data approach to synchronizing and analyzing variable-length groundwater time series"],"prefix":"10.1186","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7564-5063","authenticated-orcid":false,"given":"Khalid","family":"ElHaj","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0565-6570","authenticated-orcid":false,"given":"Dalal","family":"Alshamsi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,2,5]]},"reference":[{"issue":"11","key":"1060_CR1","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1038\/nclimate2425","volume":"4","author":"JS Famiglietti","year":"2014","unstructured":"Famiglietti JS. The global groundwater crisis. Nat Clim Chang. 
2014;4(11):945\u20138.","journal-title":"Nat Clim Chang"},{"key":"1060_CR2","doi-asserted-by":"publisher","DOI":"10.1201\/9781439809013","volume-title":"Groundwater economics Elsevier Science","author":"CA Job","year":"2009","unstructured":"Job CA. Groundwater economics Elsevier Science. Paris: ISSN; 2009."},{"key":"1060_CR3","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1016\/j.is.2015.04.007","volume":"53","author":"S Aghabozorgi","year":"2015","unstructured":"Aghabozorgi S, Seyed Shirkhorshidi A, Ying Wah T. Time-series clustering \u2013 A decade review. Inf Syst. 2015;53:16\u201338.","journal-title":"Inf Syst"},{"key":"1060_CR4","doi-asserted-by":"crossref","unstructured":"AlMahamid F, Grolinger K. Agglomerative Hierarchical Clustering with Dynamic Time Warping for Household Load Curve Clustering. In: 2022 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE) [Internet]. IEEE; 2022. p. 241\u20137. Available from: https:\/\/ieeexplore.ieee.org\/document\/9918481\/","DOI":"10.1109\/CCECE49351.2022.9918481"},{"issue":"2","key":"1060_CR5","first-page":"63","volume":"13","author":"M Yohansa","year":"2022","unstructured":"Yohansa M, Notodiputro KA, Erfiani E. Dynamic time warping techniques for time series clustering of Covid-19 cases in DKI Jakarta. ComTech Comput Math Eng Appl. 2022;13(2):63\u201373.","journal-title":"ComTech Comput Math Eng Appl"},{"issue":"1\u20134","key":"1060_CR6","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1016\/j.jhydrol.2011.07.008","volume":"407","author":"M Corduas","year":"2011","unstructured":"Corduas M. Clustering streamflow time series for regional classification. J Hydrol. 2011;407(1\u20134):73\u201380.","journal-title":"J Hydrol"},{"issue":"9","key":"1060_CR7","doi-asserted-by":"publisher","first-page":"1198","DOI":"10.1002\/hyp.7583","volume":"24","author":"R Ouyang","year":"2010","unstructured":"Ouyang R, Ren L, Cheng W, Zhou C. 
Similarity search and pattern discovery in hydrological time series data mining. Hydrol Process. 2010;24(9):1198\u2013210. https:\/\/doi.org\/10.1002\/hyp.7583.","journal-title":"Hydrol Process"},{"issue":"13","key":"1060_CR8","doi-asserted-by":"publisher","first-page":"2095","DOI":"10.3390\/w14132095","volume":"14","author":"I Prakaisak","year":"2022","unstructured":"Prakaisak I, Wongchaisuwat P. Hydrological time series clustering: a case study of telemetry stations in Thailand. Water. 2022;14(13):2095.","journal-title":"Water"},{"key":"1060_CR9","doi-asserted-by":"publisher","first-page":"129409","DOI":"10.1016\/j.jhydrol.2023.129409","volume":"620","author":"M Yang","year":"2023","unstructured":"Yang M, Olivera F. Classification of watersheds in the conterminous United States using shape-based time-series clustering and Random Forests. J Hydrol. 2023;620:129409.","journal-title":"J Hydrol"},{"issue":"6","key":"1060_CR10","doi-asserted-by":"publisher","first-page":"826","DOI":"10.1111\/gwat.12927","volume":"57","author":"M Bakker","year":"2019","unstructured":"Bakker M, Schaars F. Solving groundwater flow problems with time series analysis: you may not even need another model. Groundwater. 2019;57(6):826\u201333. https:\/\/doi.org\/10.1111\/gwat.12927.","journal-title":"Groundwater"},{"issue":"6","key":"1060_CR11","doi-asserted-by":"publisher","first-page":"808","DOI":"10.1111\/gwat.13119","volume":"59","author":"JJ Butler","year":"2021","unstructured":"Butler JJ, Knobbe S, Reboulet EC, Whittemore DO, Wilson BB, Bohling GC. Water well hydrographs: an underutilized resource for characterizing subsurface conditions. Groundwater. 2021;59(6):808\u201318. https:\/\/doi.org\/10.1111\/gwat.13119.","journal-title":"Groundwater"},{"issue":"10","key":"1060_CR12","doi-asserted-by":"publisher","first-page":"1685","DOI":"10.1080\/02626667.2020.1762888","volume":"65","author":"M Giese","year":"2020","unstructured":"Giese M, Haaf E, Heudorfer B, Barthel R. 
Comparative hydrogeology \u2013 reference analysis of groundwater dynamics from neighbouring observation wells. Hydrol Sci J. 2020;65(10):1685\u2013706. https:\/\/doi.org\/10.1080\/02626667.2020.1762888.","journal-title":"Hydrol Sci J"},{"issue":"6","key":"1060_CR13","doi-asserted-by":"publisher","first-page":"1801","DOI":"10.1007\/s10040-022-02528-y","volume":"30","author":"BP Marchant","year":"2022","unstructured":"Marchant BP, Cuba D, Brauns B, Bloomfield JP. Temporal interpolation of groundwater level hydrographs for regional drought analysis using mixed models. Hydrogeol J. 2022;30(6):1801\u201317. https:\/\/doi.org\/10.1007\/s10040-022-02528-y.","journal-title":"Hydrogeol J"},{"key":"1060_CR14","doi-asserted-by":"publisher","first-page":"222","DOI":"10.1016\/j.jhydrol.2018.02.035","volume":"559","author":"E Haaf","year":"2018","unstructured":"Haaf E, Barthel R. An inter-comparison of similarity-based methods for organisation and classification of groundwater hydrographs. J Hydrol. 2018;559:222\u201337.","journal-title":"J Hydrol"},{"issue":"7","key":"1060_CR15","doi-asserted-by":"publisher","first-page":"5575","DOI":"10.1029\/2018WR024418","volume":"55","author":"B Heudorfer","year":"2019","unstructured":"Heudorfer B, Haaf E, Stahl K, Barthel R. Index-based characterization and quantification of groundwater dynamics. Water Resour Res. 2019;55(7):5575\u201392. https:\/\/doi.org\/10.1029\/2018WR024418.","journal-title":"Water Resour Res"},{"issue":"4","key":"1060_CR16","doi-asserted-by":"publisher","first-page":"1063","DOI":"10.3390\/w12041063","volume":"12","author":"N Naranjo-Fern\u00e1ndez","year":"2020","unstructured":"Naranjo-Fern\u00e1ndez N, Guardiola-Albert C, Aguilera H, Serrano-Hidalgo C, Montero-Gonz\u00e1lez E. Clustering groundwater level time series of the exploited almonte-marismas aquifer in southwest Spain. Water. 
2020;12(4):1063.","journal-title":"Water"},{"issue":"4","key":"1060_CR17","doi-asserted-by":"publisher","first-page":"1157","DOI":"10.1007\/s10040-022-02494-5","volume":"30","author":"D Sartirana","year":"2022","unstructured":"Sartirana D, Rotiroti M, Bonomi T, De Amicis M, Nava V, Fumagalli L, et al. Data-driven decision management of urban underground infrastructure through groundwater-level time-series cluster analysis: the case of Milan (Italy). Hydrogeol J. 2022;30(4):1157\u201377. https:\/\/doi.org\/10.1007\/s10040-022-02494-5.","journal-title":"Hydrogeol J"},{"issue":"1","key":"1060_CR18","doi-asserted-by":"publisher","first-page":"148","DOI":"10.3390\/w15010148","volume":"15","author":"C Zanotti","year":"2022","unstructured":"Zanotti C, Rotiroti M, Redaelli A, Caschetto M, Fumagalli L, Stano C, et al. Multivariate time series clustering of groundwater quality data to develop data-driven monitoring strategies in a historically contaminated urban area. Water. 2022;15(1):148.","journal-title":"Water"},{"issue":"5","key":"1060_CR19","doi-asserted-by":"publisher","first-page":"1693","DOI":"10.1007\/s10040-021-02358-4","volume":"29","author":"R Barthel","year":"2021","unstructured":"Barthel R, Haaf E, Giese M, Nygren M, Heudorfer B, Stahl K. Similarity-based approaches in hydrogeology: proposal of a new concept for data-scarce groundwater resource characterization and prediction. Hydrogeol J. 2021;29(5):1693\u2013709. https:\/\/doi.org\/10.1007\/s10040-021-02358-4.","journal-title":"Hydrogeol J"},{"key":"1060_CR20","unstructured":"Castellarin A, P C, PA T, T W, R W. Catchment classification and PUB. Hydrol Earth Syst Sci. 2011;Special is(136)."},{"key":"1060_CR21","doi-asserted-by":"publisher","first-page":"105295","DOI":"10.1016\/j.envsoft.2022.105295","volume":"149","author":"SR Clark","year":"2022","unstructured":"Clark SR. Unravelling groundwater time series patterns: Visual analytics-aided deep learning in the Namoi region of Australia. Environ Model Softw. 
2022;149:105295.","journal-title":"Environ Model Softw"},{"key":"1060_CR22","unstructured":"Lombardo R, Falcone M. Crime and Economic Performance. a Cluster Analysis of Panel Data on Italy\u2019S Nuts 3 Regions.. Available from: https:\/\/econpapers.repec.org\/RePEc:clb:wpaper:201112. Accessed 11 Jan 2023"},{"issue":"1","key":"1060_CR23","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1007\/s41651-023-00146-0","volume":"7","author":"K ElHaj","year":"2023","unstructured":"ElHaj K, Alshamsi D, Aldahan A. GeoZ: a region-based visualization of clustering algorithms. J Geovisualization Spat Anal. 2023;7(1):15. https:\/\/doi.org\/10.1007\/s41651-023-00146-0.","journal-title":"J Geovisualization Spat Anal"},{"key":"1060_CR24","unstructured":"George PG, Mace PGRE, Petrossian R. Aquifers of Texas. Vol. 380, Texas Water Development Board. Texas Water Development Board Austin, TX; 2011. 1\u2013182 p."},{"key":"1060_CR25","unstructured":"Ashworth JB, Flores RR. Delineation criteria for the major and minor aquifer maps of Texas: Texas Water Development Board Limited Publication LP-212. Austin; 1991."},{"key":"1060_CR26","unstructured":"Alvarez EC, Plocheck R. Texas Almanac 2016\u20132017. Texas State Historical Assn; 2016. Available from: https:\/\/books.google.ae\/books?id=E-QnDQAAQBAJ"},{"key":"1060_CR27","unstructured":"TWDB TWDB. Groundwater Database (GWDB) [Internet]. Texas Water Development Board; https:\/\/www.twdb.texas.gov\/groundwater\/data\/gwdbrpt.asp. Accessed 6 Oct 2022"},{"key":"1060_CR28","first-page":"118","volume":"21","author":"R Tavenard","year":"2020","unstructured":"Tavenard R, Faouzi J, Vandewiele G, Divo F, Androz G, Holtz C, et al. Tslearn, a machine learning toolkit for time series data. J Mach Learn Res. 
2020;21:118.","journal-title":"J Mach Learn Res"},{"issue":"5","key":"1060_CR29","doi-asserted-by":"publisher","first-page":"851","DOI":"10.1007\/s10040-015-1257-y","volume":"23","author":"Y Tremblay","year":"2015","unstructured":"Tremblay Y, Lemieux J-M, Fortier R, Molson J, Therrien R, Therrien P, et al. Semi-automated filtering of data outliers to improve spatial analysis of piezometric data. Hydrogeol J. 2015;23(5):851\u201368. https:\/\/doi.org\/10.1007\/s10040-015-1257-y.","journal-title":"Hydrogeol J"},{"issue":"1","key":"1060_CR30","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1080\/00031305.2017.1380080","volume":"72","author":"SJ Taylor","year":"2018","unstructured":"Taylor SJ, Letham B. Forecasting at scale. Am Stat. 2018;72(1):37\u201345.","journal-title":"Am Stat"},{"issue":"1","key":"1060_CR31","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2733381","volume":"10","author":"RJGB Campello","year":"2015","unstructured":"Campello RJGB, Moulavi D, Zimek A, Sander J. Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans Knowl Discov Data. 2015;10(1):1\u201351. https:\/\/doi.org\/10.1145\/2733381.","journal-title":"ACM Trans Knowl Discov Data"},{"key":"1060_CR32","doi-asserted-by":"crossref","unstructured":"B. Everitt, S. Landau, M. Leese DS. Cluster Analysis, 5th Edition. John Wiley Sons Ltd. 2011; 5(1):75\u2013100.","DOI":"10.1002\/9780470977811"},{"issue":"3","key":"1060_CR33","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","volume":"17","author":"P Virtanen","year":"2020","unstructured":"Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17(3):261\u201372.","journal-title":"Nat Methods"},{"key":"1060_CR34","unstructured":"Heath RC. Basic Ground-Water Hydrology. [Internet]. US Geological Survey Water Supply Paper. 1983. 
Available from: https:\/\/pubs.usgs.gov\/publication\/wsp2220"},{"issue":"2","key":"1060_CR35","doi-asserted-by":"publisher","first-page":"284","DOI":"10.1111\/j.1467-8306.2004.09402005.x","volume":"94","author":"HJ Miller","year":"2004","unstructured":"Miller HJ. Tobler\u2019s first law and spatial analysis. Ann Assoc Am Geogr. 2004;94(2):284\u20139. https:\/\/doi.org\/10.1111\/j.1467-8306.2004.09402005.x.","journal-title":"Ann Assoc Am Geogr"},{"key":"1060_CR36","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74048-3","volume-title":"Information retrieval for music and motion","author":"M M\u00fcller","year":"2007","unstructured":"M\u00fcller M. Dynamic Time Warping. In: Information retrieval for music and motion. Cham: Springer; 2007."},{"key":"1060_CR37","doi-asserted-by":"crossref","unstructured":"Chen D, Cheng X. Pattern recognition and string matching. Comb Optim v 13. 2002.","DOI":"10.1007\/978-1-4613-0231-5"},{"key":"1060_CR38","doi-asserted-by":"publisher","DOI":"10.1007\/s12530-019-09306-4","author":"A Dakhli","year":"2020","unstructured":"Dakhli A, Ben AC. Power spectrum and dynamic time warping for DNA sequences classification. Evol Syst. 2020. https:\/\/doi.org\/10.1007\/s12530-019-09306-4.","journal-title":"Evol Syst"},{"key":"1060_CR39","unstructured":"Bear J. Dynamics of fluids in porous media. Courier Corporation; 2013."},{"key":"1060_CR40","unstructured":"Balasubramanian V, Ho SS, Vovk V. Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications [Internet]. Elsevier Science; 2014. https:\/\/books.google.ae\/books?id=pgfUAgAAQBAJ"},{"key":"1060_CR41","unstructured":"Triebe O, Hewamalage H, Pilyugina P, Laptev N, Bergmeir C, Rajagopal R. NeuralProphet: Explainable Forecasting at Scale. 2021; http:\/\/arxiv.org\/abs\/2111.15397"},{"key":"1060_CR42","unstructured":"Angelopoulos AN, Bates S. A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. 
2021; http:\/\/arxiv.org\/abs\/2107.07511"},{"issue":"2","key":"1060_CR43","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1007\/s10040-017-1660-7","volume":"26","author":"TJ Peterson","year":"2018","unstructured":"Peterson TJ, Western AW, Cheng X. The good, the bad and the outliers: automated detection of errors and outliers from groundwater hydrographs. Hydrogeol J. 2018;26(2):371\u201380. https:\/\/doi.org\/10.1007\/s10040-017-1660-7.","journal-title":"Hydrogeol J"},{"key":"1060_CR44","first-page":"1","volume":"20","author":"Y Zhao","year":"2019","unstructured":"Zhao Y, Nasrullah Z, Li Z. PyOD: a python toolbox for scalable outlier detection. J Mach Learn Res. 2019;20:1.","journal-title":"J Mach Learn Res"},{"issue":"1","key":"1060_CR45","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1093\/bioinformatics\/btr597","volume":"28","author":"DJ Stekhoven","year":"2012","unstructured":"Stekhoven DJ, B\u00fchlmann P. MissForest\u2014non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112\u20138.","journal-title":"Bioinformatics"},{"key":"1060_CR46","doi-asserted-by":"publisher","DOI":"10.1863\/jss.v045.i03","author":"S van Buuren","year":"2011","unstructured":"Van Buuren S, Groothuis-Oudshoorn K. mice multivariate imputation by chained equations in R. J Stat Softw. 2011. https:\/\/doi.org\/10.1863\/jss.v045.i03.","journal-title":"J Stat Softw"},{"key":"1060_CR47","doi-asserted-by":"crossref","unstructured":"Manning CD, Raghavan P, Sch\u00fctze H. Introduction to Information Retrieval [Internet]. Cambridge University Press; 2008. https:\/\/www.cambridge.org\/core\/product\/identifier\/9780511809071\/type\/book","DOI":"10.1017\/CBO9780511809071"},{"key":"1060_CR48","doi-asserted-by":"publisher","first-page":"104981","DOI":"10.1016\/j.envsoft.2021.104981","volume":"138","author":"G Lakshmi","year":"2021","unstructured":"Lakshmi G, Sudheer KP. 
Parameterization in hydrological models through clustering of the simulation time period and multi-objective optimization based calibration. Environ Model Softw. 2021;138:104981.","journal-title":"Environ Model Softw"},{"key":"1060_CR49","doi-asserted-by":"publisher","unstructured":"Hong LJ, Nelson BL, Xu J. Discrete Optimization via Simulation. In 2015. p. 9\u201344. Available from: https:\/\/link.springer.com\/https:\/\/doi.org\/10.1007\/978-1-4939-1384-8_2","DOI":"10.1007\/978-1-4939-1384-8_2"},{"issue":"1","key":"1060_CR50","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1007\/BF02136830","volume":"53","author":"MC Fu","year":"1994","unstructured":"Fu MC. Optimization via simulation: a review. Ann Oper Res. 1994;53(1):199\u2013247. https:\/\/doi.org\/10.1007\/BF02136830.","journal-title":"Ann Oper Res"},{"key":"1060_CR51","unstructured":"Moritz P, Nishihara R, Wang S, Tumanov A, Liaw R, Liang E, et al. Ray: A Distributed Framework for Emerging AI Applications. http:\/\/arxiv.org\/abs\/1712.05889. Accessed 15 Dec 2017"},{"key":"1060_CR52","unstructured":"Meert W, Hendrickx K, Van Craenendonck T, Robberechts P, Blockeel H, Davis J. DTAIDistance [Internet]. 2020. Available from: https:\/\/github.com\/wannesm\/dtaidistance"},{"key":"1060_CR53","unstructured":"Liaw R, Liang E, Nishihara R, Moritz P, Gonzalez JE, Stoica I. Tune: A Research Platform for Distributed Model Selection and Training. http:\/\/arxiv.org\/abs\/1807.05118. Accessed 13 Jul 2018"},{"key":"1060_CR54","doi-asserted-by":"publisher","DOI":"10.1017\/9781108348973","volume-title":"Bayesian optimization Bayesian optimization","author":"R Garnett","year":"2023","unstructured":"Garnett R. Bayesian optimization Bayesian optimization. Cambridge: Cambridge University Press; 2023."},{"key":"1060_CR55","unstructured":"Rapin J, Teytaud O. Nevergrad - A gradient-free optimization platform. GitHub repository. 
GitHub; 2018."},{"key":"1060_CR56","doi-asserted-by":"publisher","first-page":"1269","DOI":"10.1613\/jair.1.13643","volume":"74","author":"AI Cowen-Rivers","year":"2022","unstructured":"Cowen-Rivers AI, Lyu W, Tutunov R, Wang Z, Grosnit A, Rhys R, et al. HEBO: pushing the limits of sample-efficient hyperparameter optimisation. J Artif Intell Res. 2022;74:1269\u2013349.","journal-title":"J Artif Intell Res"},{"key":"1060_CR57","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1016\/j.aca.2012.11.007","volume":"760","author":"C Beleites","year":"2013","unstructured":"Beleites C, Neugebauer U, Bocklitz T, Krafft C, Popp J. Sample size planning for classification models. Anal Chim Acta. 2013;760:25\u201333.","journal-title":"Anal Chim Acta"},{"issue":"18","key":"1060_CR58","doi-asserted-by":"publisher","first-page":"2535","DOI":"10.3390\/w13182535","volume":"13","author":"R-S Wu","year":"2021","unstructured":"Wu R-S, Hussain F, Lin Y-C, Yeh T-Y, Yu K-C. Characterization of regional groundwater system based on aquifer response to recharge-discharge phenomenon and hierarchical clustering analysis. Water. 2021;13(18):2535.","journal-title":"Water"},{"issue":"6","key":"1060_CR59","doi-asserted-by":"publisher","first-page":"1894","DOI":"10.1007\/s10618-019-00651-1","volume":"33","author":"J Castro Gertrudes","year":"2019","unstructured":"Castro Gertrudes J, Zimek A, Sander J, Campello RJGB. A unified view of density-based methods for semi-supervised clustering and classification. Data Min Knowl Discov. 2019;33(6):1894\u2013952. 
https:\/\/doi.org\/10.1007\/s10618-019-00651-1.","journal-title":"Data Min Knowl Discov"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-025-01060-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-025-01060-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-025-01060-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T07:23:30Z","timestamp":1738740210000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-025-01060-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,5]]},"references-count":59,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1060"],"URL":"https:\/\/doi.org\/10.1186\/s40537-025-01060-6","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,5]]},"assertion":[{"value":"17 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 January 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 February 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable. 
The work does not contain any animal or human trials.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable. No individual data has been included in the current work.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"25"}}