{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:12:51Z","timestamp":1760235171959,"version":"build-2065373602"},"reference-count":24,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2021,8,3]],"date-time":"2021-08-03T00:00:00Z","timestamp":1627948800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Linear regression (LR) is a core model in supervised machine learning performing a regression task. One can fit this model using either an analytic\/closed-form formula or an iterative algorithm. Fitting it via the analytic formula becomes a problem when the number of predictors is greater than the number of samples because the closed-form solution contains a matrix inverse that is not defined when having more predictors than samples. The standard approach to solve this issue is using the Moore\u2013Penrose inverse or the L2 regularization. We propose another solution starting from a machine learning model that, this time, is used in unsupervised learning performing a dimensionality reduction task or just a density estimation one\u2014factor analysis (FA)\u2014with one-dimensional latent space. The density estimation task represents our focus since, in this case, it can fit a Gaussian distribution even if the dimensionality of the data is greater than the number of samples; hence, we obtain this advantage when creating the supervised counterpart of factor analysis, which is linked to linear regression. We also create its semisupervised counterpart and then extend it to be usable with missing data. We prove an equivalence to linear regression and create experiments for each extension of the factor analysis model. The resulting algorithms are either a closed-form solution or an expectation\u2013maximization (EM) algorithm. The latter is linked to information theory by optimizing a function containing a Kullback\u2013Leibler (KL) divergence or the entropy of a random variable.<\/jats:p>","DOI":"10.3390\/e23081012","type":"journal-article","created":{"date-parts":[[2021,8,4]],"date-time":"2021-08-04T02:16:07Z","timestamp":1628043367000},"page":"1012","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["A Factor Analysis Perspective on Linear Regression in the \u2018More Predictors than Samples\u2019 Case"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1873-3563","authenticated-orcid":false,"given":"Sebastian","family":"Ciobanu","sequence":"first","affiliation":[{"name":"Faculty of Computer Science, Alexandru Ioan Cuza University of Ia\u015fi, 700506 Ia\u015fi, Romania"}]},{"given":"Liviu","family":"Ciortuz","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science, Alexandru Ioan Cuza University of Ia\u015fi, 700506 Ia\u015fi, Romania"}]}],"member":"1968","published-online":{"date-parts":[[2021,8,3]]},"reference":[{"key":"ref_1","unstructured":"Mitchell, T. (2017). Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression. (Additional Chapter to Machine Learning; McGraw-Hill: New York, NY, USA, 1997.) Published Online. Available online: https:\/\/bit.ly\/39Ueb4o."},{"key":"ref_2","unstructured":"Murphy, K. (2012). Machine Learning: A Probabilistic Perspective, MIT Press."},{"key":"ref_3","unstructured":"Ng, A. (2021, July 31). Machine Learning Course, Lecture Notes, Mixtures of Gaussians and the EM Algorithm. Available online: http:\/\/cs229.stanford.edu\/notes2020spring\/cs229-notes7b.pdf."},{"key":"ref_4","unstructured":"Singh, A. (2021, July 31). Machine Learning Course, Homework 4, pr 1.1; CMU: Pittsburgh, PA, USA, 2010; p. 528 in Ciortuz, L.; Munteanu, A.; B\u0103d\u0103r\u0103u, E. Machine Learning Exercise Book (In Romanian), Available online: https:\/\/bit.ly\/320ZuIk."},{"key":"ref_5","unstructured":"Ng, A. (2021, July 31). Machine Learning Course, Lecture Notes, Part X. Available online: http:\/\/cs229.stanford.edu\/notes2020spring\/cs229-notes9.pdf."},{"key":"ref_6","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press. Available online: http:\/\/www.deeplearningbook.org."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1111\/1467-9868.00196","article-title":"Probabilistic Principal Component Analysis","volume":"61","author":"Tipping","year":"1999","journal-title":"J. R. Stat. Soc. Ser. (Stat. Methodol.)"},{"key":"ref_8","unstructured":"Ciobanu, S. (2019). Exploiting a New Probabilistic Model: Simple-Supervised Factor Analysis. [Master\u2019s Thesis, Alexandru Ioan Cuza University of Ia\u0219i]. Available online: https:\/\/bit.ly\/31UsBx6."},{"key":"ref_9","unstructured":"Ng, A. (2021, July 31). Machine Learning Course, Lecture Notes, Part XI. Available online: http:\/\/cs229.stanford.edu\/notes2020spring\/cs229-notes10.pdf."},{"key":"ref_10","unstructured":"Lawrence, N.D. (2004). Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data. Adv. Neural Inf. Process. Syst., 329\u2013336. Available online: https:\/\/papers.nips.cc\/paper\/2540-gaussian-process-latent-variable-models-for-visualisation-of-high-dimensional-data.pdf."},{"key":"ref_11","first-page":"425","article-title":"Supervised Gaussian Process Latent Variable Model for Dimensionality Reduction","volume":"41","author":"Gao","year":"2010","journal-title":"IEEE Trans. Syst. Man, Cybern. Part (Cybern.)"},{"key":"ref_12","unstructured":"Mitchell, T., Xing, E., and Singh, A. (2021, July 31). Machine Learning Course, Midterm Exam, pr. 5.3; CMU: Pittsburgh, PA, USA, 2010; p. 565 Ciortuz, L.; Munteanu, A.; B\u0103d\u0103r\u0103u, E. Machine Learning Exercise Book (In Romanian), Available online: https:\/\/bit.ly\/320ZuIk."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1016\/j.snb.2014.09.001","article-title":"Bioinspired early detection through gas flow modulation in chemo-sensory systems","volume":"206","author":"Ziyatdinov","year":"2015","journal-title":"Sens. Actuators Chem."},{"key":"ref_14","unstructured":"Spyromitros-Xioufis, E., TSOUMAKAS, G., WILLIAM, G., and Vlahavas, I. (2014). Drawing parallels between multi-label classification and multi-target regression. arXiv."},{"key":"ref_15","unstructured":"Xiaojin, Z., and Zoubin, G. (2002). Learning from Labeled and Unlabeled Data with Label Propagation, Carnegie Mellon University. Technical Report CMU-CALD-02\u2013107."},{"key":"ref_16","unstructured":"Wang, J. (2021, July 31). SSL: Semi-Supervised Learning, Available online: https:\/\/CRAN.R-project.org\/package=SSL."},{"key":"ref_17","unstructured":"Oliver, A., Odena, A., Raffel, C., Cubuk, E.D., and Goodfellow, I.J. (2018). Realistic evaluation of deep semi-supervised learning algorithms. arXiv."},{"key":"ref_18","first-page":"1","article-title":"Mice: Multivariate Imputation by Chained Equations in R","volume":"45","year":"2011","journal-title":"J. Stat. Softw."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v045.i07","article-title":"Amelia II: A Program for Missing Data","volume":"45","author":"Honaker","year":"2011","journal-title":"J. Stat. Softw."},{"key":"ref_20","unstructured":"Ghahramani, Z., and Hinton, G.E. (1996). The EM Algorithm for Mixtures of Factor Analyzers, University of Toronto. Available online: http:\/\/mlg.eng.cam.ac.uk\/zoubin\/papers\/tr-96-1.pdf."},{"key":"ref_21","unstructured":"Bishop, C.M. (2006). Pattern Recognition and Machine Learning, Springer Science + Business Media."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1145\/2934664","article-title":"Apache Spark: A Unified Engine for Big Data Processing","volume":"59","author":"Zaharia","year":"2016","journal-title":"Commun. ACM"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. Ser. (Methodol.)"},{"key":"ref_24","unstructured":"Kingma, D.P., and Welling, M. (2013). Auto-encoding variational bayes. arXiv."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/8\/1012\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:39:59Z","timestamp":1760164799000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/8\/1012"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,3]]},"references-count":24,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2021,8]]}},"alternative-id":["e23081012"],"URL":"https:\/\/doi.org\/10.3390\/e23081012","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2021,8,3]]}}}