{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T14:27:45Z","timestamp":1768832865400,"version":"3.49.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2022,6,3]],"date-time":"2022-06-03T00:00:00Z","timestamp":1654214400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,3]],"date-time":"2022-06-03T00:00:00Z","timestamp":1654214400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Earth Sci Inform"],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Biplot diagrams are traditionally used for rock discrimination using geochemical data from samples. However, this approach has limitations when facing a high number of variables. Machine learning has been proposed as an alternative to analyze multivariate data for more than 70 years. However, the application of machine learning by geoscientists is still complicated since there are no tools that propose a pipeline that can be followed from preparing the data to evaluating the models. Automated machine learning aims to face this issue by automating the creation and evaluation of machine learning models. The contribution of this work is twofold. First, we propose a methodology that follows a pipeline for the application of supervised and unsupervised learning to geochemical data. Both methods were applied to a dataset of granitic rock samples from 6 blocks in the Peninsular Ranges and the Transverse Ranges Provinces in Southern California. 
For supervised learning, the Decision Trees model offered the best values to classify the samples from this region: accuracy: 87%; precision: 89%; recall: 89%; and F-score: 81%. For unsupervised learning, 2 components were related to pressure effects, and another 2 could be related to water effects. As a second contribution, we propose a web application that follows the proposed methodology to analyze geochemical data using automated machine learning. It allows data preparation using techniques such as imputation and upsampling, the application of supervised and unsupervised learning, and the evaluation of the models. All this without the need to program.<\/jats:p>","DOI":"10.1007\/s12145-022-00821-8","type":"journal-article","created":{"date-parts":[[2022,6,3]],"date-time":"2022-06-03T17:02:37Z","timestamp":1654275757000},"page":"1683-1698","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Automated machine learning pipeline for geochemical analysis"],"prefix":"10.1007","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9668-1132","authenticated-orcid":false,"given":"Germ\u00e1n H.","family":"Alf\u00e9rez","sequence":"first","affiliation":[]},{"given":"Oscar A.","family":"Esteban","sequence":"additional","affiliation":[]},{"given":"Benjamin L.","family":"Clausen","sequence":"additional","affiliation":[]},{"given":"Ana Mar\u00eda Mart\u00ednez","family":"Ardila","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,6,3]]},"reference":[{"key":"821_CR1","unstructured":"Alf\u00e9rez GH, Rodr\u00edguez J, Pompe LR, Clausen B (2015) Interpreting the Geochemistry of Southern California Granitic Rocks Using Machine Learning. Proceedings of the 2015 International Conference on Artificial Intelligence (ICAI 2015), Las Vegas, NV, USA"},{"key":"821_CR2","unstructured":"Alpaydin E (2010) Introduction to Machine Learning. MIT press, ch. 
What is Machine Learning?, pp 1\u20133"},{"issue":"1","key":"821_CR3","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1016\/j.sedgeo.2005.02.004","volume":"177","author":"J Armstrong-Altrin","year":"2005","unstructured":"Armstrong-Altrin J, Verma SP (2005) Critical evaluation of six tectonic setting discrimination diagrams using geochemical data of Neogene sediments from known tectonic settings. Sediment Geol 177(1):115\u2013129","journal-title":"Sediment Geol"},{"key":"821_CR4","doi-asserted-by":"crossref","unstructured":"Baird AK, Miesch AT (1984) Batholithic rocks of Southern California; a model for the petrochemical nature of their source materials. Professional Paper 1284, Tech. Rep. http:\/\/pubs.er.usgs.gov\/publication\/pp1284","DOI":"10.3133\/pp1284"},{"key":"821_CR5","doi-asserted-by":"crossref","unstructured":"Ding C, He X (2004) K-means clustering via principal component analysis. In: Proceedings, Twenty-First international conference on machine learning, ICML 2004, vol 1","DOI":"10.1145\/1015330.1015408"},{"key":"821_CR6","doi-asserted-by":"crossref","unstructured":"Dramsch JS (2020) Chapter one - 70 years of machine learning in geoscience in review. In: Moseley B, Krischer L (eds) Machine Learning in Geosciences, ser. Adv\u00a0 Geophys 61:1\u201355. 
Elsevier.\u00a0http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0065268720300054","DOI":"10.1016\/bs.agph.2020.08.002"},{"key":"821_CR7","doi-asserted-by":"publisher","first-page":"200","DOI":"10.1016\/j.apgeochem.2016.05.016","volume":"75","author":"KJ Ellefsen","year":"2016","unstructured":"Ellefsen KJ, Smith DB (2016) Manual hierarchical clustering of regional geochemical data using a bayesian finite mixture model. Appl Geochem 75:200\u2013210.\u00a0http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0883292716300920","journal-title":"Appl Geochem"},{"key":"821_CR8","doi-asserted-by":"crossref","unstructured":"Feurer M, Hutter F (2019) Hyperparameter Optimization.In: Hutter F, Kotthoff L, and Vanschoren J (eds). Automated Machine Learning: Methods, Systems, Challenges.Springer International Publishing Cham. 3\u201333","DOI":"10.1007\/978-3-030-05318-5_1"},{"key":"821_CR9","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","volume":"63","author":"P Geurts","year":"2006","unstructured":"Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63:3\u201342","journal-title":"Mach Learn"},{"key":"821_CR10","unstructured":"Goyal A (2019) A brief introduction to autoML.\u00a0https:\/\/becominghuman.ai\/a-brief-introduction-to-automl-fa6b598d408"},{"key":"821_CR11","unstructured":"Grinberg M (2014) Flask Web development: Developing Web Applications with Python, 1st ed, O\u2019Reilly Media Inc."},{"issue":"1","key":"821_CR12","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1093\/petrology\/28.1.75","volume":"28","author":"P Gromet","year":"1987","unstructured":"Gromet P, Silver LT (1987) REE variations across the peninsular ranges batholith: implications for batholithic petrogenesis and crustal growth in magmatic arcs. J Petrol 28(1):75\u2013125","journal-title":"J Petrol"},{"key":"821_CR13","unstructured":"Harrington P (2012) Machine Learning in Action. 
Manning"},{"key":"821_CR14","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1016\/j.cageo.2019.07.004","volume":"132","author":"D Hasterok","year":"2019","unstructured":"Hasterok D, Gard M, Bishop CMB, Kelsey D (2019) Chemical identification of metamorphic protoliths using machine learning methods. Comput Geosci 132:56\u201368","journal-title":"Comput Geosci"},{"key":"821_CR15","first-page":"12","volume":"41","author":"R Hildebrand","year":"2014","unstructured":"Hildebrand RS, Whalen JB (2014) Arc and slab-failure magmatism in cordilleran batholiths II - The cretaceous peninsular ranges batholith of Southern and Baja California. Geosci Can 41:12","journal-title":"Geosci Can"},{"key":"821_CR16","doi-asserted-by":"crossref","unstructured":"Hutter F, Kotthoff L, Vanschoren J (2019) Automated machine learning: methods, systems, challenges. Springer Nature","DOI":"10.1007\/978-3-030-05318-5"},{"key":"821_CR17","doi-asserted-by":"crossref","unstructured":"Itano K, Ueki K, Iizuka T, Kuwatani T (2020) Geochemical discrimination of monazite source rock based on machine learning techniques and multinomial logistic regression analysis. Geosciences 10(2)","DOI":"10.3390\/geosciences10020063"},{"key":"821_CR18","doi-asserted-by":"crossref","unstructured":"Jiang Y, Guo H, Jia Y, Cao Y, Hu C (2015) Principal component analysis and hierarchical cluster analyses of arsenic groundwater geochemistry in the Hetao basin, Inner Mongolia. Geochemistry 75(2):197\u2013205","DOI":"10.1016\/j.chemer.2014.12.002"},{"issue":"1","key":"821_CR19","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.gsf.2015.07.003","volume":"7","author":"DJ Lary","year":"2016","unstructured":"Lary DJ, Alavi AH, Gandomi AH, Walker AL (2016) Machine learning in geosciences and remote sensing. Geosci Front 7(1):3\u201310. 
special Issue: Progress of Machine Learning in Geosciences","journal-title":"Geosci Front"},{"key":"821_CR20","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1016\/j.lithos.2015.06.022","volume":"232","author":"C Li","year":"2015","unstructured":"Li C, Arndt NT, Tang Q, Ripley EM (2015) Trace element indiscrimination diagrams. Lithos 232:76\u201383","journal-title":"Lithos"},{"key":"821_CR21","unstructured":"MSV J (2018) Why do developers find it hard to learn machine learning?.\u00a0\u00a0https:\/\/www.forbes.com\/sites\/janakirammsv\/2018\/01\/01\/why-do-developers-find-it-hard-to-learn-machine-learning\/?sh=d47fe096bf6d"},{"key":"821_CR22","unstructured":"Marius P, Balas V, Perescu-Popescu L, Mastorakis N (2009) Multilayer perceptron and neural networks. WSEAS Transactions on Circuits and Systems, vol 8"},{"key":"821_CR23","doi-asserted-by":"publisher","first-page":"103284","DOI":"10.1016\/j.coal.2019.103284","volume":"214","author":"K Maxwell","year":"2019","unstructured":"Maxwell K, Rajabi M, Esterle J (2019) Automated classification of metamorphosed coal from geophysical log data using supervised machine learning techniques. Int J Coal Geol 214:103284","journal-title":"Int J Coal Geol"},{"key":"821_CR24","doi-asserted-by":"crossref","unstructured":"Mohammed M, Khan M, Bashier E (2017) Machine Learning: Algorithms and Applications. CRC Press","DOI":"10.1201\/9781315371658"},{"issue":"3","key":"821_CR25","doi-asserted-by":"publisher","first-page":"128","DOI":"10.14445\/22312803\/IJCTT-V48P126","volume":"48","author":"F Osisanwo","year":"2017","unstructured":"Osisanwo F, Akinsola J, Awodele O, Hinmikaiye J, Olakanmi O, Akinjobi J (2017) Supervised machine learning algorithms: classification and comparison. 
International Journal of Computer Trends and Technology (IJCTT) 48(3):128\u2013138","journal-title":"International Journal of Computer Trends and Technology (IJCTT)"},{"key":"821_CR26","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1016\/0012-821X(73)90129-5","volume":"19","author":"J Pearce","year":"1973","unstructured":"Pearce J, Cann J (1973) Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth Planet Sci Lett 19:290\u2013300","journal-title":"Earth Planet Sci Lett"},{"issue":"10","key":"821_CR27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s00410-016-1292-2","volume":"171","author":"M Petrelli","year":"2016","unstructured":"Petrelli M, Perugini D (2016) Solving petrological problems through machine learning: the study case of tectonic discrimination using geochemical and isotopic data. Contrib Mineral Petrol 171(10):1\u201315","journal-title":"Contrib Mineral Petrol"},{"key":"821_CR28","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1016\/j.jappgeo.2018.06.012","volume":"155","author":"CM Saporetti","year":"2018","unstructured":"Saporetti CM, da Fonseca LG, Pereira E, de Oliveira LC (2018) Machine learning approaches for petrographic classification of carbonatesiliciclastic rocks using well logs and textural information. J Appl Geophys 155:217\u2013225","journal-title":"J Appl Geophys"},{"key":"821_CR29","unstructured":"Scott B, Steenkamp NC (2019) Machine learning in geology.\u00a0\u00a0https:\/\/www.africanmining.co.za\/2019\/07\/29\/machine-learning-in-geology\/"},{"key":"821_CR30","doi-asserted-by":"publisher","first-page":"1327","DOI":"10.1029\/2017GC007401","volume":"19","author":"K Ueki","year":"2018","unstructured":"Ueki K, Hino H, Kuwatani T (2018) Geochemical discrimination and characteristics of magmatic tectonic settings; a machine learning-based approach. 
Geochem Geophys Geosyst 19:1327\u20131347","journal-title":"Geochem Geophys Geosyst"},{"key":"821_CR31","doi-asserted-by":"crossref","unstructured":"Vieira S, Garcia-Dias R, Pinaya W (2019) Machine Learning Methods and Applications to Brain Disorders, ch. A step-by-step tutorial on how to build a machine learning model, pp 343\u2013370","DOI":"10.1016\/B978-0-12-815739-8.00019-5"},{"key":"821_CR32","unstructured":"Yao Q, Wang M, Escalante HJ, Guyon I, Hu Y, Li Y, Tu W, Yang Q, Yu Y (2018) Taking human out of learning applications: A survey on automated machine learning.arXiv:abs\/1810.13306"}],"container-title":["Earth Science Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12145-022-00821-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s12145-022-00821-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12145-022-00821-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,20]],"date-time":"2022-08-20T10:17:02Z","timestamp":1660990622000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s12145-022-00821-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,3]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["821"],"URL":"https:\/\/doi.org\/10.1007\/s12145-022-00821-8","relation":{},"ISSN":["1865-0473","1865-0481"],"issn-type":[{"value":"1865-0473","type":"print"},{"value":"1865-0481","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,3]]},"assertion":[{"value":"12 April 
2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 May 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 June 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"There is no conflict of interest in this work.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interests"}}]}}