{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:02:00Z","timestamp":1760058120303,"version":"build-2065373602"},"reference-count":53,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T00:00:00Z","timestamp":1741910400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>By distributing the training process, local approximation reduces the cost of the standard Gaussian process. An ensemble method aggregates predictions from local Gaussian experts, each trained on different data partitions, under the assumption of perfect diversity among them. While this assumption ensures tractable aggregation, it is frequently violated in practice. Although ensemble methods provide consistent results by modeling dependencies among experts, they incur a high computational cost, scaling cubically with the number of experts. Implementing an expert-selection strategy reduces the number of experts involved in the final aggregation step, thereby improving efficiency. However, selection approaches that assign a fixed set of experts to each data point cannot account for the unique properties of individual data points. This paper introduces a flexible expert-selection approach tailored to the characteristics of individual data points. To achieve this, we frame the selection task as a multi-label classification problem in which experts define the labels, and each data point is associated with specific experts. We discuss in detail the prediction quality, efficiency, and asymptotic properties of the proposed solution. We demonstrate the efficiency of the proposed method through extensive numerical experiments on synthetic and real-world datasets. 
This strategy is easily extendable to distributed learning scenarios and multi-agent models, regardless of Gaussian assumptions regarding the experts.<\/jats:p>","DOI":"10.3390\/e27030307","type":"journal-article","created":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T08:46:46Z","timestamp":1741942006000},"page":"307","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Multilabel Classification for Entry-Dependent Expert Selection in Distributed Gaussian Processes"],"prefix":"10.3390","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1190-5652","authenticated-orcid":false,"given":"Hamed","family":"Jalali","sequence":"first","affiliation":[{"name":"Center for Plant Molecular Biology (ZMBP), University of T\u00fcbingen, 72076 Tuebingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3123-7268","authenticated-orcid":false,"given":"Gjergji","family":"Kasneci","sequence":"additional","affiliation":[{"name":"School of Social Sciences and Technology, Technical University of Munich, 80333 Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,3,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Rasmussen, C.E., and Williams, C.K. (2006). Gaussian Processes for Machine Learning, MIT Press.","DOI":"10.7551\/mitpress\/3206.001.0001"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"10073","DOI":"10.1021\/acs.chemrev.1c00022","article-title":"Gaussian Process Regression for Materials and Molecules","volume":"121","author":"Deringer","year":"2021","journal-title":"Chem. 
Rev."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"101054","DOI":"10.1016\/j.jobe.2019.101054","article-title":"Prediction of building electricity usage using Gaussian Process Regression","volume":"28","author":"Zeng","year":"2020","journal-title":"J. Build. Eng."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"094114","DOI":"10.1063\/1.5017103","article-title":"Gaussian process regression for geometry optimization","volume":"148","author":"Denzel","year":"2018","journal-title":"J. Chem. Phys."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Gramacy, R. (2020). Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences, Chapman and Hall\/CRC.","DOI":"10.1201\/9780367815493"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Le, T., Nguyen, K., Nguyen, V., Nguyen, T.D., and Phung, D. (2017, January 18\u201321). GoGP: Fast online regression with Gaussian processes. Proceedings of the IEEE International Conference on Data Mining, New Orleans, LA, USA.","DOI":"10.1109\/ICDM.2017.35"},{"key":"ref_7","unstructured":"Tobar, F., Bui, T.D., and Turner, R.E. (2015, January 7\u201312). Learning stationary time series using Gaussian processes with nonparametric kernels. Proceedings of the 29th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_8","unstructured":"Deisenroth, M.P., and Ng, J.W. (2015, January 6\u201311). Distributed Gaussian processes. Proceedings of the International Conference on Machine Learning, Lille, France."},{"key":"ref_9","unstructured":"Seeger, M., Williams, C., and Lawrence, N. (2003, January 3\u20136). Fast Forward Selection to Speed Up Sparse Gaussian Process Regression. Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, Key West, FL, USA."},{"key":"ref_10","unstructured":"Titsias, M.K. (2009, January 16\u201318). 
Variational learning of inducing variables in sparse Gaussian processes. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, Clearwater Beach, FL, USA."},{"key":"ref_11","unstructured":"Hensman, J., Fusi, N., and Lawrence, N.D. (2013). Gaussian processes for big data. Uncertainty in Artificial Intelligence, AUAI Press."},{"key":"ref_12","unstructured":"Cheng, C., and Boots, B. (2017, January 4\u20139). Variational Inference for Gaussian Process Models with Linear Complexity. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_13","unstructured":"Burt, D.R., Rasmussen, C.E., and van der Wilk, M. (2019, January 9\u201315). Rates of Convergence for Sparse Variational Gaussian Process Regression. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_14","unstructured":"Bui, T.D., and Turner, R.E. (2014, January 8\u201313). Tree-structured Gaussian process approximations. Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_15","unstructured":"Moore, D., and Russell, S.J. (2015, January 7\u201312). Gaussian process random fields. Proceedings of the 29th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s13748-012-0035-5","article-title":"A survey of methods for distributed machine learning","volume":"2","year":"2013","journal-title":"Prog. Artif. Intell."},{"key":"ref_17","first-page":"30","article-title":"A Survey on Distributed Machine Learning","volume":"53","author":"Zhang","year":"2020","journal-title":"ACM Comput. Surv."},{"key":"ref_18","unstructured":"Liu, J., Huang, J., Zhou, Y., Li, X., Ji, S., Xiong, H., and Dou, D. (2021). From distributed machine learning to federated learning: A survey. 
arXiv."},{"key":"ref_19","unstructured":"Yang, Z., Dai, X., Dubey, A., Hirche, S., and Hattab, G. (2024, January 6\u201310). Whom to Trust? Elective Learning for Distributed Gaussian Process Regression. Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Auckland, New Zealand."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"4405","DOI":"10.1109\/TNNLS.2019.2957109","article-title":"When Gaussian Process Meets Big Data: A Review of Scalable GPs","volume":"31","author":"Liu","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_21","first-page":"1","article-title":"An asymptotic analysis of distributed nonparametric methods","volume":"20","year":"2019","journal-title":"J. Mach. Learn. Res."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1007\/s11222-017-9766-2","article-title":"Nested Kriging predictions for datasets with a large number of observations","volume":"28","author":"Durrande","year":"2018","journal-title":"Stat. Comput."},{"key":"ref_23","unstructured":"Bachoc, F., Durrande, N., Rulli\u00e8re, D., and Chevalier, C. (2017). Some Properties of Nested Kriging Predictors. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Jalali, H., and Kasneci, G. (2022, January 17\u201320). Aggregating the Gaussian Experts\u2019 Predictions via Undirected Graphical Models. Proceedings of the IEEE International Conference on Big Data and Smart Computing (BigComp), Daegu, Republic of Korea.","DOI":"10.1109\/BigComp54360.2022.00014"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Jalali, H., Pawelczyk, M., and Kasneci, G. (2021, January 15\u201318). Model Selection in Local Approximation Gaussian Processes: A Markov Random Fields Approach. 
Proceedings of the IEEE International Conference on Big Data (Big Data), Orlando, FL, USA.","DOI":"10.1109\/BigData52589.2021.9672077"},{"key":"ref_26","unstructured":"Jalali, H., Pawelczyk, M., and Kasneci, G. (2021). Gaussian Experts Selection using Graphical Models. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Herrera, F., Charte, F., Rivera, A.J., and del Jesus, M.J. (2016). Multilabel Classification. Multilabel Classification: Problem Analysis, Metrics and Techniques, Springer International Publishing.","DOI":"10.1007\/978-3-319-41111-8"},{"key":"ref_28","first-page":"1","article-title":"Patchwork kriging for large-scale Gaussian process regression","volume":"19","author":"Park","year":"2018","journal-title":"J. Mach. Learn. Res."},{"key":"ref_29","unstructured":"Blanchard, P., Mhamdi, E.E., Guerraoui, R., and Stainer, J. (2017, January 4\u20139). Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_30","unstructured":"Data, D., and Diggavi, S. (2021, January 18\u201324). Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data. Proceedings of the 38th International Conference on Machine Learning, Virtual."},{"key":"ref_31","unstructured":"Liu, H., Cai, J., Ong, Y., and Wang, Y. (2018, January 10\u201315). Generalized robust Bayesian committee machine for large-scale Gaussian process regression. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1771","DOI":"10.1162\/089976602760128018","article-title":"Training products of experts by minimizing contrastive divergence","volume":"14","author":"Hinton","year":"2002","journal-title":"Neural Comput."},{"key":"ref_33","unstructured":"Cao, Y., and Fleet, D.J. (2014). 
Generalized product of experts for automatic and principled fusion of Gaussian process predictions. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2719","DOI":"10.1162\/089976600300014908","article-title":"A Bayesian committee machine","volume":"12","author":"Tresp","year":"2000","journal-title":"Neural Comput."},{"key":"ref_35","first-page":"10","article-title":"Ensemble approaches for regression: A survey","volume":"45","author":"Soares","year":"2012","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1253","DOI":"10.1073\/pnas.1219097111","article-title":"Ranking and combining multiple predictors without labeled data","volume":"111","author":"Parisi","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_37","unstructured":"Jaffe, A., Fetaya, E., Nadler, B., Jiang, T., and Kluger, Y. (2016, January 9\u201311). Unsupervised ensemble learning with dependent classifiers. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, Cadiz, Spain."},{"key":"ref_38","first-page":"1323","article-title":"Unsupervised supervised learning I: Estimating classification and regression errors without labels","volume":"11","author":"Donmez","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_39","unstructured":"Platanios, E., Blum, A., and Mitchell, T. (July, January 23\u2013). Estimating accuracy from unlabeled data. 
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, Quebec City, QC, Canada."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/biostatistics\/kxm045","article-title":"Sparse inverse covariance estimation with the graphical lasso","volume":"9","author":"Friedman","year":"2008","journal-title":"Biostatistics"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"892","DOI":"10.1198\/jcgs.2011.11051a","article-title":"New insights and faster computations for the graphical lasso","volume":"20","author":"Witten","year":"2011","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Jalali, H., and Kasneci, G. (2021, January 10\u201315). Aggregating Dependent Gaussian Experts in Local Approximation. Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9413079"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1080\/00031305.1992.10475879","article-title":"An Introduction to Kernel and Nearest-Neighbor Non-parametric Regression","volume":"46","author":"Altman","year":"1992","journal-title":"Am. Stat."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Everitt, B., Landau, S., Leese, M., and Stahl, D. (2011). Miscellaneous clustering methods. Cluster Analysis, John Wiley & Sons.","DOI":"10.1002\/9780470977811"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Read, J., and Perez-Cruz, F. (2014). Deep learning for multi-label classification. arXiv.","DOI":"10.4018\/978-1-4666-5202-6.ch142"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"1279","DOI":"10.1093\/jamia\/ocz085","article-title":"ML-Net: Multi-label classification of biomedical texts with deep neural networks","volume":"26","author":"Du","year":"2019","journal-title":"J. Am. Med. Inform. 
Assoc."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1214\/12-STS391","article-title":"Sparse Nonparametric Graphical Models","volume":"27","author":"Lafferty","year":"2012","journal-title":"Stat. Sci."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1214\/19-BA1159","article-title":"Bayesian Inference in Nonparanormal Graphical Models","volume":"15","author":"Mulgrave","year":"2020","journal-title":"Bayesian Anal."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"1637","DOI":"10.1080\/01621459.2017.1356726","article-title":"A Nonparametric Graphical Model for Functional Data with Application to Brain Networks Based on fMRI","volume":"113","author":"Li","year":"2018","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Solea, E., and Dette, H. (2021). Nonparametric and high-dimensional functional graphical models. arXiv.","DOI":"10.1214\/22-EJS2087"},{"key":"ref_51","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, Curran Associates Inc."},{"key":"ref_52","first-page":"28742","article-title":"Self-attention between datapoints: Going beyond individual input-output pairs in deep learning","volume":"34","author":"Kossen","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_53","unstructured":"Nguyen, T., and Bonilla, E. (2014, January 21\u201326). Fast Allocation of Gaussian Process Experts. 
Proceedings of the 31st International Conference on Machine Learning, Beijing, China."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/3\/307\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:53:51Z","timestamp":1760028831000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/3\/307"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,14]]},"references-count":53,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["e27030307"],"URL":"https:\/\/doi.org\/10.3390\/e27030307","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2025,3,14]]}}}