{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T02:35:52Z","timestamp":1773455752696,"version":"3.50.1"},"reference-count":62,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2023,9,14]],"date-time":"2023-09-14T00:00:00Z","timestamp":1694649600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Research Council of Finland","award":["#289500"],"award-info":[{"award-number":["#289500"]}]},{"name":"Research Council of Finland","award":["#319274"],"award-info":[{"award-number":["#319274"]}]},{"name":"Research Council of Finland","award":["#345804"],"award-info":[{"award-number":["#345804"]}]},{"name":"Research Council of Finland","award":["#345805"],"award-info":[{"award-number":["#345805"]}]},{"name":"Research Council of Finland","award":["DP190100580"],"award-info":[{"award-number":["DP190100580"]}]},{"name":"Australian Government through the Australian Research Council\u2019s Discovery Projects funding scheme","award":["#289500"],"award-info":[{"award-number":["#289500"]}]},{"name":"Australian Government through the Australian Research Council\u2019s Discovery Projects funding scheme","award":["#319274"],"award-info":[{"award-number":["#319274"]}]},{"name":"Australian Government through the Australian Research Council\u2019s Discovery Projects funding scheme","award":["#345804"],"award-info":[{"award-number":["#345804"]}]},{"name":"Australian Government through the Australian Research Council\u2019s Discovery Projects funding scheme","award":["#345805"],"award-info":[{"award-number":["#345805"]}]},{"name":"Australian Government through the Australian Research Council\u2019s Discovery Projects funding scheme","award":["DP190100580"],"award-info":[{"award-number":["DP190100580"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>In this paper, a new nonsmooth optimization-based algorithm for solving large-scale regression problems is introduced. The regression problem is modeled as fully-connected feedforward neural networks with one hidden layer, piecewise linear activation, and the L1-loss functions. A modified version of the limited memory bundle method is applied to minimize this nonsmooth objective. In addition, a novel constructive approach for automated determination of the proper number of hidden nodes is developed. Finally, large real-world data sets are used to evaluate the proposed algorithm and to compare it with some state-of-the-art neural network algorithms for regression. The results demonstrate the superiority of the proposed algorithm as a predictive tool in most data sets used in numerical experiments.<\/jats:p>","DOI":"10.3390\/a16090444","type":"journal-article","created":{"date-parts":[[2023,9,15]],"date-time":"2023-09-15T04:06:13Z","timestamp":1694750773000},"page":"444","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Nonsmooth Optimization-Based Hyperparameter-Free Neural Networks for Large-Scale Regression"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8747-4836","authenticated-orcid":false,"given":"Napsu","family":"Karmitsa","sequence":"first","affiliation":[{"name":"Department of Computing, University of Turku, FI-20014 Turku, Finland"}]},{"given":"Sona","family":"Taheri","sequence":"additional","affiliation":[{"name":"School of Science, RMIT University, Melbourne 3000, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0269-8276","authenticated-orcid":false,"given":"Kaisa","family":"Joki","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, University of Turku, FI-20014 Turku, Finland"}]},{"given":"Pauliina","family":"Paasivirta","sequence":"additional","affiliation":[{"name":"Siili Solutions Oyj, FI-60100 Sein\u00e4joki, Finland"}]},{"given":"Adil M.","family":"Bagirov","sequence":"additional","affiliation":[{"name":"Centre for Smart Analytics, Federation University Australia, Ballarat 3350, Australia"}]},{"given":"Marko M.","family":"M\u00e4kel\u00e4","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, University of Turku, FI-20014 Turku, Finland"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.econmod.2020.06.008","article-title":"Artificial neural network regression models in a panel setting: Predicting economic growth","volume":"91","author":"Malte","year":"2020","journal-title":"Econ. Model."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/s10898-017-0535-8","article-title":"Performance of global random search algorithms for large dimensions","volume":"71","author":"Pepelyshev","year":"2018","journal-title":"J. Glob. Optim."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1007\/s10107-006-0728-2","article-title":"Globally Convergent Limited Memory Bundle Method for Large-Scale Nonsmooth Optimization","volume":"109","author":"Haarala","year":"2007","journal-title":"Math. Program."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Bagirov, A.M., Gaudioso, M., Karmitsa, N., M\u00e4kel\u00e4, M.M., and Taheri, S. (2020). Numerical Nonsmooth Optimization: State of the Art Algorithms, Springer.","DOI":"10.1007\/978-3-030-34910-3"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bagirov, A.M., Karmitsa, N., and Taheri, S. (2020). Partitional Clustering via Nonsmooth Optimization: Clustering via Optimization, Springer.","DOI":"10.1007\/978-3-030-34910-3"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Halkola, A., Joki, K., Mirtti, T., M\u00e4kel\u00e4, M.M., Aittokallio, T., and Laajala, T. (2023). OSCAR: Optimal subset cardinality regression using the L0-pseudonorm with applications to prognostic modelling of prostate cancer. PLoS Comput. Biol., 19.","DOI":"10.1371\/journal.pcbi.1010333"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Tuovinen, T., Periaux, J., and Neittaanm\u00e4ki, P. (2022). Computational Sciences and Artificial Intelligence in Industry, Springer.","DOI":"10.1007\/978-3-030-70787-3"},{"key":"ref_8","first-page":"1889","article-title":"Missing value imputation via clusterwise linear regression","volume":"34","author":"Karmitsa","year":"2022","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3374","DOI":"10.1109\/TNNLS.2017.2727545","article-title":"Fast Kronecker product kernel methods via generalized vec trick","volume":"29","author":"Airola","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1109\/TNNLS.2013.2278427","article-title":"Neural network for nonsmooth, nonconvex constrained minimization via smooth approximation","volume":"25","author":"Bian","year":"2014","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/j.neucom.2021.08.060","article-title":"Learning with smooth Hinge losses","volume":"463","author":"JunRu","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Nicosia, G., Pardalos, P., Umeton, R., Giuffrida, G., and Sciacca, V. (2019). Machine Learning, Optimization, and Data Science. LOD 2019, Springer.","DOI":"10.1007\/978-3-030-37599-7"},{"key":"ref_13","unstructured":"Griewank, A., and Rojas, A. (2020, January 14\u201317). Generalized Abs-Linear Learning by Mixed Binary Quadratic Optimization. In Proceedings of African Conference on Research in Computer Science CARI 2020, Thes, Senegal. Available online: https:\/\/hal.science\/hal-02945038."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"369","DOI":"10.1007\/s10994-014-5436-1","article-title":"An efficient primal dual prox method for non-smooth optimization","volume":"98","author":"Yang","year":"2015","journal-title":"Mach. Learn."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1080\/10556780512331318254","article-title":"Ellipsoidal separation for classification problems","volume":"20","author":"Astorino","year":"2005","journal-title":"Optim. Methods Softw."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1289","DOI":"10.1080\/10556788.2020.1855171","article-title":"Robust piecewise linear L1-regression via nonsmooth DC optimization","volume":"37","author":"Bagirov","year":"2022","journal-title":"Optim. Methods Softw."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"5071","DOI":"10.1007\/s00500-019-04255-1","article-title":"Classification in the multiple instance learning framework via spherical separation","volume":"24","author":"Gaudioso","year":"2020","journal-title":"Soft Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1039","DOI":"10.1007\/s10957-013-0458-6","article-title":"Support vector machine polyhedral separability in semisupervised learning","volume":"164","author":"Astorino","year":"2015","journal-title":"J. Optim. Theory Appl."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1109\/TNNLS.2015.2430935","article-title":"The proximal trajectory algorithm in SVM cross validation","volume":"27","author":"Astorino","year":"2015","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/j.ejor.2020.04.032","article-title":"Clusterwise support vector linear regression","volume":"287","author":"Joki","year":"2020","journal-title":"Eur. J. Oper. Res."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1109\/TNN.2002.1000141","article-title":"Neural-network approximation of piecewise continuous functions: Application to friction compensation","volume":"13","author":"Selmic","year":"2002","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_22","unstructured":"Imaizumi, M., and Fukumizu, K. (2019, January 16\u201318). Deep Neural Networks Learn Non-Smooth Functions Effectively. Proceedings of the Machine Learning Research, Naha, Okinawa, Japan."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1007\/s10208-018-09409-5","article-title":"Stochastic subgradient method converges on tame functions","volume":"20","author":"Davies","year":"2020","journal-title":"Found. Comput. Math."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Aggarwal, C. (2018). Neural Networks and Deep Learning, Springer.","DOI":"10.1007\/978-3-319-94463-0"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1988","journal-title":"Nature"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1109\/TNN.2003.809401","article-title":"Learning capability and storage capacity of two-hidden-layer feedforward networks","volume":"14","author":"Huang","year":"2003","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Reed, R., and Marks, R.J. (1998). Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks, The MIT Press.","DOI":"10.7551\/mitpress\/4937.001.0001"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Vicoveanu, P., Vasilache, I., Scripcariu, I., Nemescu, D., Carauleanu, A., Vicoveanu, D., Covali, A., Filip, C., and Socolov, D. (2022). Use of a feed-forward back propagation network for the prediction of small for gestational age newborns in a cohort of pregnant patients with thrombophilia. Diagnostics, 12.","DOI":"10.3390\/diagnostics12041009"},{"key":"ref_29","unstructured":"Broomhead, D., and Lowe, D. (1988). Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks, Royals Signals and Radar Establishment."},{"key":"ref_30","first-page":"100190","article-title":"A machine learning prediction of academic performance of secondary school students using radial basis function neural network","volume":"22","author":"Olusola","year":"2022","journal-title":"Trends Neurosci. Educ."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"887","DOI":"10.1109\/TR.2020.3001232","article-title":"Hybrid learning algorithm of radial basis function networks for reliability analysis","volume":"70","author":"Zhang","year":"2021","journal-title":"IEEE Trans. Reliab."},{"key":"ref_32","unstructured":"Haykin, S. (2007). Neural Networks: A Comprehensive Foundation, Prentice Hall."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2901","DOI":"10.1007\/s13042-018-00913-2","article-title":"Automatic selection of hidden neurons and weights in neural networks using grey wolf optimizer based on a hybrid encoding scheme","volume":"10","author":"Faris","year":"2019","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2099","DOI":"10.1109\/TNN.2008.2004370","article-title":"A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks","volume":"19","author":"Huang","year":"2008","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_35","first-page":"7","article-title":"An improved approach for hidden nodes selection in artificial neural network","volume":"12","author":"Odikwa","year":"2020","journal-title":"Int. J. Appl. Inf. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1109\/TNN.2002.804317","article-title":"Tuning of the structure and parameters of a neural network using an improved genetic algorithm","volume":"11","author":"Leung","year":"2003","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2133","DOI":"10.1080\/01431160802549278","article-title":"How many hidden layers and nodes?","volume":"30","author":"Stathakis","year":"2009","journal-title":"Int. J. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1109\/TNN.2005.860885","article-title":"Tuning the structure and parameters of a neural network by using hybrid Taguchi-genetic algorithm","volume":"17","author":"Tsai","year":"2006","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Bagirov, A.M., Karmitsa, N., and M\u00e4kel\u00e4, M.M. (2014). Introduction to Nonsmooth Optimization: Theory, Practice and Software, Springer.","DOI":"10.1007\/978-3-319-08114-4"},{"key":"ref_40","unstructured":"Clarke, F.H. (1983). Optimization and Nonsmooth Analysis, Wiley-Interscience."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wilamowski, B.M. (2011). The Industrial Electronics Handbook, CRC Press.","DOI":"10.1201\/NOE1439802892"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"837","DOI":"10.1162\/089976604322860721","article-title":"Robust formulations for training multilayer perceptrons","volume":"16","author":"Heikkola","year":"2004","journal-title":"Neural Comput."},{"key":"ref_43","unstructured":"Karmitsa, N., Taheri, S., Joki, K., M\u00e4kinen, P., Bagirov, A., and M\u00e4kel\u00e4, M.M. (2020). Hyperparameter-Free NN Algorithm for Large-Scale Regression Problems; TUCS Technical Report, No. 1213, Turku Centre for Computer Science. Available online: https:\/\/napsu.karmitsa.fi\/publications\/lmbnnr_tucs.pdf."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1043","DOI":"10.1137\/S1052623403428208","article-title":"A nonmonotone line search technique and its application to unconstrained optimization","volume":"14","author":"Zhang","year":"2004","journal-title":"SIAM J. Optim."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1007\/BF01582063","article-title":"Representations of quasi-Newton matrices and their use in limited memory methods","volume":"63","author":"Byrd","year":"1994","journal-title":"Math. Program."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Kiwiel, K.C. (1985). Methods of Descent for Nondifferentiable Optimization, Springer. Lecture Notes in Mathematics 1133.","DOI":"10.1007\/BFb0074500"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1007\/BF00938396","article-title":"Optimization of upper semidifferentiable functions","volume":"4","author":"Bihain","year":"1984","journal-title":"J. Optim. Theory Appl."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1109\/TSMCB.2011.2168604","article-title":"Extreme learning machine for regression and multiclass classification","volume":"42","author":"Huang","year":"2011","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Duch, W., Kacprzyk, J., Oja, E., and Zadro\u017any, S. (2005). Artificial Neural Networks: Formal Models and Their Applications\u2014ICANN 2005, Springer.","DOI":"10.1007\/11550907"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1016\/j.ijepes.2014.02.027","article-title":"Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods","volume":"60","year":"2014","journal-title":"Int. J. Electr. Power Energy Syst."},{"key":"ref_51","unstructured":"Kaya, H., T\u00fcfekci, P., and G\u00fcrgen, S.F. (2012, January 24\u201325). Local and Global Learning Methods for Predicting Power of a Combined Gas & Steam Turbine. Proceedings of the International Conference on Emerging Trends in Computer and Electronics Engineering ICETCEE 2012, Dubai, United Arab Emirates."},{"key":"ref_52","unstructured":"Dua, D., and Karra Taniskidou, E. (2020, November 25). UCI Machine Learning Repository. Available online: http:\/\/archive.ics.uci.edu\/ml."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1016\/S0008-8846(98)00165-3","article-title":"Modeling of strength of high performance concrete using artificial neural networks","volume":"28","author":"Yeh","year":"1998","journal-title":"Cem. Concr. Res."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/0095-0696(78)90006-2","article-title":"Hedonic prices and the demand for clean air","volume":"5","author":"Harrison","year":"1978","journal-title":"J. Environ. Econ. Manag."},{"key":"ref_55","unstructured":"Paredes, E., and Ballester-Ripoll, R. (2023, September 10). SGEMM GPU kernel performance (2018). In UCI Machine Learning Repository. Available online: https:\/\/doi.org\/10.24432\/C5MK70."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Nugteren, C., and Codreanu, V. (2015, January 23\u201325). CLTune: A Generic Auto-Tuner for OpenCL Kernels. Proceedings of the MCSoC: 9th International Symposium on Embedded Multicore\/Many-core Systems-on-Chip, Turin, Italy.","DOI":"10.1109\/MCSoC.2015.10"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Fernandes, K., Vinagre, P., and Cortez, P. (2015, January 8\u201311). A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News. Proceedings of the 17th EPIA 2015\u2014Portuguese Conference on Artificial Intelligence, Coimbra, Portugal.","DOI":"10.1007\/978-3-319-23485-4_53"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"04015066","DOI":"10.1061\/(ASCE)CO.1943-7862.0001047","article-title":"A novel machine learning model for estimation of sale prices of real estate units","volume":"142","author":"Rafiei","year":"2015","journal-title":"ASCE J. Constr. Eng. Manag."},{"key":"ref_59","unstructured":"Buza, K. (2014). Data Analysis, Machine Learning and Knowledge Discovery, Springer International Publishing."},{"key":"ref_60","unstructured":"Krizhevsky, A. (2021, November 14). Learning Multiple Layers of Features from Tiny Images. Available online: https:\/\/www.cs.toronto.edu\/~kriz\/cifar.html."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"121","DOI":"10.5194\/gi-4-121-2015","article-title":"Designing optimal greenhouse gas observing networks that consider performance and cost","volume":"4","author":"Lucas","year":"2015","journal-title":"Geosci. Instrum. Methods Data Syst."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"424","DOI":"10.1137\/21M1428601","article-title":"Optimal convergence rates for the proximal bundle method","volume":"33","author":"Diaz","year":"2023","journal-title":"SIAM J. Optim."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/9\/444\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:51:16Z","timestamp":1760129476000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/9\/444"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,14]]},"references-count":62,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["a16090444"],"URL":"https:\/\/doi.org\/10.3390\/a16090444","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,14]]}}}