{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T11:31:40Z","timestamp":1780054300029,"version":"3.54.0"},"reference-count":290,"publisher":"Emerald","issue":"1-2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,2,5]]},"abstract":"<jats:p>This monograph develops unifying perspectives on the problem of identifying universal low-dimensional features from high-dimensional data for inference tasks in settings involving learning. For such problems, natural notions of universality are introduced, and a local equivalence among them is established. The analysis is naturally expressed via information geometry, which provides both conceptual and computational insights. The development reveals the complementary roles of the singular value decomposition, Hirschfeld-Gebelein-R\u00e9nyi maximal correlation, the canonical correlation and principal component analyses of Hotelling and Pearson, Tishby\u2019s information bottleneck, Wyner\u2019s and G\u00e1cs-K\u00f6rner common information, Ky Fan k-norms, and Breiman and Friedman\u2019s alternating conditional expectations algorithm. 
Among other uses, the framework facilitates understanding and optimizing aspects of learning systems, including multinomial logistic (softmax) regression and neural network architecture, matrix factorization methods for collaborative filtering and other applications, rank-constrained multivariate linear regression, and forms of semi-supervised learning.<\/jats:p>","DOI":"10.1561\/0100000107","type":"journal-article","created":{"date-parts":[[2024,2,5]],"date-time":"2024-02-05T06:31:45Z","timestamp":1707114705000},"page":"1-299","source":"Crossref","is-referenced-by-count":12,"title":["Universal Features for High-Dimensional Learning and Inference"],"prefix":"10.1108","volume":"21","author":[{"given":"Shao-Lun","family":"Huang","sequence":"first","affiliation":[{"name":"Tsinghua-Berkeley Shenzhen Institute","place":["China"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Anuran","family":"Makur","sequence":"additional","affiliation":[{"name":"Purdue University","place":["USA"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Gregory W.","family":"Wornell","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology","place":["USA"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lizhong","family":"Zheng","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology","place":["USA"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"140","published-online":{"date-parts":[[2024,2,5]]},"reference":[{"issue":"(2)","key":"2026032712173409700_ref001","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1109\/TIT.2011.2169536","article-title":"A coordinate system for Gaussian networks","volume":"58","author":"Abbe","year":"(2012)","journal-title":"IEEE Trans. Inform. 
Theory"},{"issue":"(1)","key":"2026032712173409700_ref002","first-page":"1947","article-title":"Emergence of invariance and disentangling in deep representations","volume":"19","author":"Achille","year":"(2018)","journal-title":"J. Mach. Learn. Res."},{"issue":"(6)","key":"2026032712173409700_ref003","first-page":"925","article-title":"Spreading of sets in product spaces and hypercontraction of the Markov operator","volume":"4","author":"Ahlswede","year":"(1976)","journal-title":"Ann. Prob."},{"key":"2026032712173409700_ref004","article-title":"On common information and related characteristics of correlated information sources","author":"Ahlswede","year":"(1974)","journal-title":"Proc. Prague Conf. Inform. Theory"},{"key":"2026032712173409700_ref005","doi-asserted-by":"crossref","first-page":"664","DOI":"10.1007\/11889342_41","volume-title":"General Theory of Information Transfer and Combinatorics","author":"Ahlswede","year":"(2006)"},{"key":"2026032712173409700_ref006","article-title":"A kernel method for canonical correlation analysis","author":"Akaho","year":"(2001)","journal-title":"Proc. Int. Meeting Psychometric Soc. (IMPS)"},{"key":"2026032712173409700_ref007","article-title":"Deep variational information bottleneck","author":"Alemi","year":"(2017)","journal-title":"Proc. Int. Conf. Learning Repr. (ICLR)"},{"key":"2026032712173409700_ref008","doi-asserted-by":"crossref","DOI":"10.1007\/978-4-431-55978-8","volume-title":"Information Geometry and Its Applications","author":"Amari","year":"(2016)"},{"key":"2026032712173409700_ref009","volume-title":"Methods of Information Geometry","author":"Amari","year":"(2000)"},{"key":"2026032712173409700_ref010","article-title":"On hypercontractivity and a data processing inequality","author":"Anantharam","year":"(2014)","journal-title":"Proc. Int. Symp. Inform. 
Theory (ISIT)"},{"key":"2026032712173409700_ref011","article-title":"On hypercontractivity and the mutual information between Boolean functions","author":"Anantharam","year":"(2013)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref012","unstructured":"Anantharam, V., A. A. Gohari, S. Kamath, and C. Nair, \u201cOn hypercontractivity and the data processing inequality studied by Erkip and Cover\u201d, CoRR, vol. abs\/1304.6133, (2013). arXiv: 1304.6133. http:\/\/arxiv.org\/abs\/1304.6133."},{"key":"2026032712173409700_ref013","volume-title":"An Introduction to Multivariate Statistical Analysis","author":"Anderson","year":"(2003)"},{"key":"2026032712173409700_ref014","first-page":"1247","article-title":"Deep canonical correlation analysis","author":"Andrew","year":"(2013)","journal-title":"Proc. Int. Conf. Machine Learning (ICML)"},{"key":"2026032712173409700_ref015","first-page":"34","article-title":"Kernel CCA for multi-view learning of acoustic features using articulatory measurements","author":"Arora","year":"(2012)","journal-title":"Proc. Symp. Mach. Learning Speech, Lang. Process."},{"key":"2026032712173409700_ref016","article-title":"Convex sparse matrix factorizations","author":"Bach","year":"(2008)","journal-title":"CNRS, Paris, France, Tech. Rep. HAL-00345747"},{"key":"2026032712173409700_ref017","first-page":"1","article-title":"Kernel independent component analysis","volume":"3","author":"Bach","year":"(2002)","journal-title":"J. Mach. Learn. Res."},{"key":"2026032712173409700_ref018","article-title":"A probabilistic interpretation of canonical correlation analysis","author":"Bach","year":"(2005)","journal-title":"Dept. Statist., Univ. Calif., Berkeley, CA, Tech. Rep. 688"},{"key":"2026032712173409700_ref019","article-title":"R\u00e9nyi fair inference","author":"Baharlouei","year":"(2020)","journal-title":"Proc. Int. Conf. Learning Repr. 
(ICLR)"},{"key":"2026032712173409700_ref020","first-page":"911","article-title":"Sparse additive functional and kernel CCA","author":"Balakrishnan","year":"(2012)","journal-title":"Proc. Int. Conf. Machine Learning (ICML)"},{"issue":"(1)","key":"2026032712173409700_ref021","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0893-6080(89)90014-2","article-title":"Neural networks and principal component analysis: Learning from examples without local minima","volume":"2","author":"Baldi","year":"(1989)","journal-title":"Neural Netw."},{"key":"2026032712173409700_ref022","article-title":"An information-theoretic approach to transferability in task transfer learning","author":"Bao","year":"(2019)","journal-title":"Proc. Int. Conf. Image Processing (ICIP)"},{"key":"2026032712173409700_ref023","volume-title":"Fairness and Machine Learning: Limitations and Opportunities","author":"Barocas","year":"(2019)"},{"issue":"(3)","key":"2026032712173409700_ref024","doi-asserted-by":"crossref","first-page":"930","DOI":"10.1109\/18.256500","article-title":"Universal approximation bounds for superpositions of a sigmoidal function","volume":"39","author":"Barron","year":"(1993)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(3)","key":"2026032712173409700_ref025","first-page":"1347","article-title":"Approximation of density functions by sequences of exponential families","volume":"19","author":"Barron","year":"(1991)","journal-title":"Ann. 
Stat."},{"key":"2026032712173409700_ref026","doi-asserted-by":"crossref","DOI":"10.1002\/9780470316894","volume-title":"Statistical Factor Analysis and Related Methods","author":"Basilevsky","year":"(1994)"},{"key":"2026032712173409700_ref027","doi-asserted-by":"crossref","first-page":"73","DOI":"10.7551\/mitpress\/6173.003.0008","volume-title":"Semi-Supervised Learning","author":"Basu","year":"(2006)"},{"key":"2026032712173409700_ref028","article-title":"The Netflix Prize","author":"Bennett","year":"(2007)","journal-title":"Proc. KDD Cup and Workshop"},{"key":"2026032712173409700_ref029","first-page":"1","article-title":"Deep generalized canonical correlation analysis","author":"Benton","year":"(2019)","journal-title":"Proc. Workshop Represent. Learning NLP (RePL4NLP)"},{"key":"2026032712173409700_ref030","volume-title":"L\u2019Analyse des Donn\u00e9es, T\u00f4me 2: L\u2019Analyse des Correspondances","author":"Benz\u00e9cri","year":"(1973)"},{"key":"2026032712173409700_ref031","doi-asserted-by":"crossref","DOI":"10.1201\/9780585363035","volume-title":"Correspondence Analysis Handbook","author":"Benz\u00e9cri","year":"(1992)"},{"key":"2026032712173409700_ref032","doi-asserted-by":"crossref","DOI":"10.1137\/1.9781611971262","volume-title":"Nonnegative Matrices in the Mathematical Sciences","author":"Berman","year":"(1994)"},{"key":"2026032712173409700_ref033","first-page":"371","volume-title":"Learning in Graphical Models","author":"Bishop","year":"(1999)"},{"key":"2026032712173409700_ref034","volume-title":"Pattern Recognition and Machine Learning","author":"Bishop","year":"(2006)"},{"key":"2026032712173409700_ref035","unstructured":"Biswal, P., \u201cHypercontractivity and its applications\u201d, CoRR, vol. abs\/1101.2913, (2011). arXiv: 1101.2913. http:\/\/arxiv.org\/abs\/1101.2913."},{"key":"2026032712173409700_ref036","article-title":"Euclidean information theory","author":"Borade","year":"(2008)","journal-title":"Proc. 
Int. Zurich Seminar Commun. (IZS)"},{"issue":"(2)","key":"2026032712173409700_ref037","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.jat.2004.04.010","article-title":"Bernstein polynomials and learning theory","volume":"128","author":"Braess","year":"(2004)","journal-title":"J. Approx. Theory"},{"issue":"(391)","key":"2026032712173409700_ref038","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1080\/01621459.1985.10478157","article-title":"Estimating optimal transformations for multiple regression and correlation","volume":"80","author":"Breiman","year":"(1985)","journal-title":"J. Am. Stat. Assoc."},{"key":"2026032712173409700_ref039","volume-title":"Time Series: Data Analysis and Theory","author":"Brillinger","year":"(1975)"},{"key":"2026032712173409700_ref040","first-page":"6076","article-title":"Fair selective classification via sufficiency","author":"Bu","year":"(2021)","journal-title":"Proc. Int. Conf. Machine Learning (ICML)"},{"key":"2026032712173409700_ref041","article-title":"SDP methods for sensitivity-constrained privacy funnel and information bottleneck problems","author":"Bu","year":"(2021)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref042","article-title":"Theory of bivariate ACE","author":"Buja","year":"(1985)","journal-title":"Dept. Statistics, University of Washington, Seattle, WA, Tech. Rep. 74"},{"issue":"(3)","key":"2026032712173409700_ref043","first-page":"1032","article-title":"Remarks on functional canonical variates, alternating least squares methods and ACE","volume":"18","author":"Buja","year":"(1990)","journal-title":"Ann. Stat."},{"issue":"(1)","key":"2026032712173409700_ref044","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1111\/j.2044-8317.1983.tb00765.x","article-title":"Nonlinear canonical correlation","volume":"36","author":"van der Burg","year":"(1983)","journal-title":"Br. J. Math. Stat. 
Psychol."},{"issue":"(1)","key":"2026032712173409700_ref045","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1007\/BF02869528","article-title":"Estimation of a multivariate density","volume":"18","author":"Cacoullos","year":"(1966)","journal-title":"Ann. Inst. Statist. Math."},{"key":"2026032712173409700_ref046","article-title":"Fundamental limits of perfect privacy","author":"Calmon","year":"(2015)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"issue":"(8)","key":"2026032712173409700_ref047","doi-asserted-by":"crossref","first-page":"5011","DOI":"10.1109\/TIT.2017.2700857","article-title":"Principal inertia components and applications","volume":"63","author":"Calmon","year":"(2017)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref048","article-title":"An exploration of the role of principal inertia components in information theory","author":"Calmon","year":"(2014)","journal-title":"Proc. Inform. Theory Workshop (ITW)"},{"key":"2026032712173409700_ref049","article-title":"Bounds on inference","author":"Calmon","year":"(2013)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"issue":"(6)","key":"2026032712173409700_ref050","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1109\/JPROC.2009.2035722","article-title":"Matrix completion with noise","volume":"98","author":"Cand\u00e8s","year":"(2010)","journal-title":"Proc. IEEE"},{"issue":"(11)","key":"2026032712173409700_ref051","doi-asserted-by":"crossref","first-page":"7235","DOI":"10.1109\/TIT.2011.2161794","article-title":"A probabilistic and RIPless theory of compressed sensing","volume":"57","author":"Cand\u00e8s","year":"(2011)","journal-title":"IEEE Trans. Inform. 
Theory"},{"issue":"(6)","key":"2026032712173409700_ref052","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1007\/s10208-009-9045-5","article-title":"Exact matrix completion via convex optimization","volume":"9","author":"Cand\u00e8s","year":"(2009)","journal-title":"Foundations Comput. Math."},{"issue":"(5)","key":"2026032712173409700_ref053","first-page":"2054","article-title":"The power of convex relaxation: Near-optimal matrix completion","volume":"56","author":"Cand\u00e8s","year":"(2010)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref054","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/9780262033589.001.0001","volume-title":"Semi-Supervised Learning","author":"Chapelle","year":"(2006)"},{"key":"2026032712173409700_ref055","first-page":"165","article-title":"Information bottleneck for Gaussian variables","volume":"6","author":"Chechik","year":"(2005)","journal-title":"J. Mach. Learn. Res."},{"key":"2026032712173409700_ref056","first-page":"195","volume-title":"Problems in Analysis","author":"Cheeger","year":"(1970)"},{"issue":"(4)","key":"2026032712173409700_ref057","first-page":"1","article-title":"Maximally correlated principal component analysis based on deep parameterization learning","volume":"13","author":"Chen","year":"(2019)","journal-title":"ACM Trans. Knowl. Discov. Data"},{"issue":"(1)","key":"2026032712173409700_ref058","doi-asserted-by":"crossref","first-page":"67","DOI":"10.2307\/1403038","article-title":"Elliptically symmetric distributions: A review and bibliography","volume":"49","author":"Chmielewski","year":"(1981)","journal-title":"Int. Stat. Review"},{"key":"2026032712173409700_ref059","volume-title":"Spectral Graph Theory","author":"Chung","year":"(1997)"},{"issue":"(1)","key":"2026032712173409700_ref060","first-page":"22","article-title":"Word association norms, mutual information, and lexicography","volume":"16","author":"Church","year":"(1990)","journal-title":"J. Comput. 
Linguist."},{"issue":"(21)","key":"2026032712173409700_ref061","doi-asserted-by":"crossref","first-page":"7426","DOI":"10.1073\/pnas.0500334102","article-title":"Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps","volume":"102","author":"Coifman","year":"(2005)","journal-title":"Proc. Nat. Acad. Sci."},{"key":"2026032712173409700_ref062","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1023\/A:1022627411411","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"(1995)","journal-title":"Machine Learning"},{"key":"2026032712173409700_ref063","volume-title":"Elements of Information Theory","author":"Cover","year":"(2006)"},{"issue":"(2)","key":"2026032712173409700_ref064","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1111\/j.2517-6161.1958.tb00292.x","article-title":"The regression analysis of binary sequences (with discussion)","volume":"20","author":"Cox","year":"(1958)","journal-title":"J. Roy. Stat. Soc., Ser. B"},{"issue":"(3)","key":"2026032712173409700_ref065","first-page":"454","article-title":"Estimation of distributions using orthogonal expansions","volume":"2","author":"Crain","year":"(1974)","journal-title":"Ann. Stat."},{"key":"2026032712173409700_ref066","first-page":"325","article-title":"Contributions to the problem of maximal correlation","volume":"5","author":"Cs\u00e1ki","year":"(1960)","journal-title":"Publ. Math. Inst. Hung. Acad. Sci."},{"key":"2026032712173409700_ref067","first-page":"311","article-title":"On bivariate stochastic connection","volume":"5","author":"Cs\u00e1ki","year":"(1960)","journal-title":"Publ. Math. Inst. Hung. Acad. Sci."},{"key":"2026032712173409700_ref068","first-page":"27","article-title":"On the general notion of maximum correlation","volume":"8","author":"Cs\u00e1ki","year":"(1963)","journal-title":"Magyar Tud. Akad. Mat. 
Kutat\u00f3 Int K\u00f6zl"},{"issue":"(1\u20134)","key":"2026032712173409700_ref069","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1007\/BF02018661","article-title":"A class of measures of informativity of observation channels","volume":"2","author":"Csisz\u00e1r","year":"(1972)","journal-title":"Periodica Math. Hungarica"},{"key":"2026032712173409700_ref070","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511921889","volume-title":"Information Theory: Coding Theorems for Discrete Memoryless Systems","author":"Csisz\u00e1r","year":"(2011)"},{"issue":"(4)","key":"2026032712173409700_ref071","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1561\/0100000004","article-title":"Information theory and statistics: A tutorial","volume":"1","author":"Csisz\u00e1r","year":"(2004)","journal-title":"Foundations and Trends in Communications and Information Theory"},{"key":"2026032712173409700_ref072","volume-title":"Spectra of Graphs, Theory and Application","author":"Cvetkovic","year":"(1980)"},{"issue":"(4)","key":"2026032712173409700_ref073","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/BF02551274","article-title":"Approximation by superpositions of a sigmoidal function","volume":"2","author":"Cybenko","year":"(1989)","journal-title":"Math. Control, Signals, Systems"},{"key":"2026032712173409700_ref074","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1111\/j.2517-6161.1977.tb01623.x","article-title":"Spherical matrix distributions and a multivariate model","volume":"39","author":"Dawid","year":"(1977)","journal-title":"J. Roy. Stat. Soc., Ser. 
B"},{"issue":"(2)","key":"2026032712173409700_ref075","doi-asserted-by":"crossref","first-page":"343","DOI":"10.2307\/3318742","article-title":"Remarks on the maximum correlation coefficient","volume":"7","author":"Dembo","year":"(2001)","journal-title":"Bernoulli"},{"key":"2026032712173409700_ref076","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4612-5320-4","volume-title":"Large Deviations Techniques and Applications","author":"Dembo","year":"(1998)"},{"issue":"(1)","key":"2026032712173409700_ref077","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"(1977)","journal-title":"J. Roy. Stat. Soc., B"},{"issue":"(1)","key":"2026032712173409700_ref078","first-page":"36","article-title":"Geometric bounds for eigenvalues of Markov chains","volume":"1","author":"Diaconis","year":"(1991)","journal-title":"Ann. Appl. Prob."},{"issue":"(4)","key":"2026032712173409700_ref079","doi-asserted-by":"crossref","first-page":"1949","DOI":"10.1109\/TIT.2019.2939472","article-title":"On the robustness of information-theoretic privacy measures and mechanisms","volume":"66","author":"Diaz","year":"(2020)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(5)","key":"2026032712173409700_ref080","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1109\/TIT.1962.1057738","article-title":"Information transmission with additional noise","volume":"8","author":"Dobrushin","year":"(1962)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref081","first-page":"3112","article-title":"On the importance of asymmetry and monotonicity constraints in maximal correlation analysis","author":"Domanovitz","year":"(2019)","journal-title":"Proc. Int. Symp. Inform. 
Theory (ISIT)"},{"key":"2026032712173409700_ref082","volume-title":"Pattern Classification","author":"Duda","year":"(2000)"},{"key":"2026032712173409700_ref083","unstructured":"Dziugaite, G. K. and D. M. Roy, \u201cNeural network matrix factorization\u201d, CoRR, vol. abs\/1511.06443, (2015). arXiv: 1511.06443. http:\/\/arxiv.org\/abs\/1511.06443."},{"issue":"(3)","key":"2026032712173409700_ref084","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/BF02288367","article-title":"The approximation of one matrix by another of lower rank","volume":"1","author":"Eckart","year":"(1936)","journal-title":"Psychometrika"},{"issue":"(4)","key":"2026032712173409700_ref085","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1109\/TNSE.2017.2716966","article-title":"Network maximal correlation","volume":"4","author":"Feizi","year":"(2017)","journal-title":"IEEE Trans. Netw. Sci., Eng."},{"key":"2026032712173409700_ref086","unstructured":"Feizi, S. and D. Tse, \u201cMaximally correlated principal component analysis\u201d, CoRR, vol. abs\/1702.05471, (2017). arXiv: 1702.05471. https:\/\/arxiv.org\/abs\/1702.05471."},{"issue":"(98)","key":"2026032712173409700_ref087","doi-asserted-by":"crossref","first-page":"298","DOI":"10.21136\/CMJ.1973.101168","article-title":"Algebraic connectivity of graphs","volume":"23","author":"Fiedler","year":"(1973)","journal-title":"Czechoslovak Math. J."},{"issue":"(100)","key":"2026032712173409700_ref088","doi-asserted-by":"crossref","first-page":"619","DOI":"10.21136\/CMJ.1975.101357","article-title":"A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory","volume":"25","author":"Fiedler","year":"(1975)","journal-title":"Czech. Math. 
J."},{"issue":"(2)","key":"2026032712173409700_ref089","first-page":"149","article-title":"Common information is far less than mutual information","volume":"2","author":"G\u00e1cs","year":"(1973)","journal-title":"Probl. Contr. Inform. Theory"},{"key":"2026032712173409700_ref090","article-title":"Dense error correction for low-rank matrices via principal component pursuit","author":"Ganesh","year":"(2010)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"issue":"(6)","key":"2026032712173409700_ref091","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1002\/zamm.19410210604","article-title":"Das statistische problem der korrelation als variations- und eigenwertproblem und sein zusammenhang mit der ausgleichsrechnung","volume":"21","author":"Gebelein","year":"(1941)","journal-title":"Z. Angewandte Math., Mech."},{"key":"2026032712173409700_ref092","volume-title":"Vector Quantization and Signal Compression","author":"Gersho","year":"(1991)"},{"issue":"(3)","key":"2026032712173409700_ref093","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1111\/j.1751-5823.2002.tb00178.x","article-title":"On choosing and bounding probability metrics","volume":"70","author":"Gibbs","year":"(2002)","journal-title":"Int. Stat. Rev."},{"key":"2026032712173409700_ref094","volume-title":"Nonlinear Multivariate Analysis","author":"Gifi","year":"(1990)"},{"issue":"(3)","key":"2026032712173409700_ref095","first-page":"293","article-title":"Iterative maximum likelihood estimation for discrete distributions","volume":"35","author":"Gokhale","year":"(1973)","journal-title":"Sankhy\u0101 B"},{"issue":"(12)","key":"2026032712173409700_ref096","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1145\/138859.138867","article-title":"Using collaborative filtering to weave an information tapestry","volume":"35","author":"Goldberg","year":"(1992)","journal-title":"Commun. 
ACM"},{"key":"2026032712173409700_ref097","doi-asserted-by":"crossref","DOI":"10.56021\/9781421407944","volume-title":"Matrix Computations","author":"Golub","year":"(2012)"},{"issue":"(3\u20134)","key":"2026032712173409700_ref098","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1093\/biomet\/40.3-4.237","article-title":"The population frequencies of species and the estimation of population parameters","volume":"40","author":"Good","year":"(1953)","journal-title":"Biometrika"},{"issue":"(3)","key":"2026032712173409700_ref099","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1214\/aoms\/1177704014","article-title":"Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables","volume":"34","author":"Good","year":"(1963)","journal-title":"Ann. Math. Stat."},{"key":"2026032712173409700_ref100","volume-title":"Deep Learning","author":"Goodfellow","year":"(2017)"},{"key":"2026032712173409700_ref101","first-page":"2262","article-title":"Fairness-aware neural R\u00e9nyi minimization for continuous features","author":"Grari","year":"(2021)","journal-title":"Proc. Int. Joint Conf. Artif. Intell. (IJCAI-20)"},{"key":"2026032712173409700_ref102","volume-title":"Theory and Applications of Correspondence Analysis","author":"Greenacre","year":"(1984)"},{"key":"2026032712173409700_ref103","volume-title":"Correspondence Analysis in Practice","author":"Greenacre","year":"(2016)"},{"key":"2026032712173409700_ref104","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/0167-9473(85)90053-2","article-title":"Maximum likelihood methods for linear and loglinear models in categorical data","volume":"3","author":"Haber","year":"(1985)","journal-title":"Comp. 
Stat., Data Anal."},{"issue":"(2)","key":"2026032712173409700_ref105","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1137\/090771806","article-title":"Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions","volume":"53","author":"Halko","year":"(2011)","journal-title":"SIAM Rev."},{"key":"2026032712173409700_ref106","first-page":"339","volume-title":"Essays in Probability and Statistics","author":"Hall","year":"(1970)"},{"issue":"(3)","key":"2026032712173409700_ref107","doi-asserted-by":"crossref","first-page":"752","DOI":"10.1109\/18.256486","article-title":"Approximation theory of output statistics","volume":"39","author":"Han","year":"(1993)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(2)","key":"2026032712173409700_ref108","first-page":"229","article-title":"The general theory of canonical correlation and its relation to functional analysis","volume":"2","author":"Hannan","year":"(1960)","journal-title":"J. Aust. Math. Soc."},{"key":"2026032712173409700_ref109","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1007\/978-3-662-45171-7_16","volume-title":"Applied Multivariate Statistical Analysis","author":"H\u00e4rdle","year":"(2015)"},{"issue":"(12)","key":"2026032712173409700_ref110","doi-asserted-by":"crossref","first-page":"2639","DOI":"10.1162\/0899766042321814","article-title":"Canonical correlation analysis: An overview with application to learning methods","volume":"16","author":"Hardoon","year":"(2004)","journal-title":"Neural Comput."},{"key":"2026032712173409700_ref111","first-page":"770","article-title":"Deep residual learning for image recognition","author":"He","year":"(2016)","journal-title":"Proc. Conf. Comp. Vision, Pattern Recog. 
(CVPR)"},{"key":"2026032712173409700_ref112","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1017\/S0305004100013517","article-title":"A connection between correlation and contingency","volume":"31","author":"Hirschfeld","year":"(1935)","journal-title":"Proc. Cambridge Phil. Soc."},{"key":"2026032712173409700_ref113","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511840371","volume-title":"Topics in Matrix Analysis","author":"Horn","year":"(1991)"},{"key":"2026032712173409700_ref114","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139020411","volume-title":"Matrix Analysis","author":"Horn","year":"(2012)"},{"issue":"(5)","key":"2026032712173409700_ref115","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/0893-6080(89)90020-8","article-title":"Multilayer feedforward networks are universal approximators","volume":"2","author":"Hornik","year":"(1989)","journal-title":"Neural Netw."},{"key":"2026032712173409700_ref116","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1037\/h0071325","article-title":"Analysis of a complex of statistical variables into principal components","volume":"24","author":"Hotelling","year":"(1933)","journal-title":"J. Ed. Psych."},{"key":"2026032712173409700_ref117","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1093\/biomet\/28.3-4.321","article-title":"Relations between two sets of variates","volume":"28","author":"Hotelling","year":"(1936)","journal-title":"Biometrika"},{"key":"2026032712173409700_ref118","first-page":"531","article-title":"Generalizing bottleneck problems","author":"Hsu","year":"(2018)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref119","first-page":"2671","article-title":"Correspondence analysis using neural networks","author":"Hsu","year":"(2019)","journal-title":"Proc. Int. Conf. Artif. Intell., Stat. 
(AISTATS)"},{"issue":"(12)","key":"2026032712173409700_ref120","doi-asserted-by":"crossref","first-page":"9347","DOI":"10.1109\/TPAMI.2021.3127870","article-title":"Generalizing correspondence analysis for applications in machine learning","volume":"44","author":"Hsu","year":"(2022)","journal-title":"IEEE Trans. Pattern Anal. Machine Intell."},{"key":"2026032712173409700_ref121","article-title":"Communicating type classes through channels: An information geometric view","author":"Huang","year":"(2022)","journal-title":"Proc. Inform. Theory Workshop (ITW)"},{"key":"2026032712173409700_ref122","article-title":"An information-theoretic view of learning in high dimensions: Universal features, maximal correlations, bottlenecks, and common information","author":"Huang","year":"(2018)","journal-title":"Proc. Inform. Theory Appl. Workshop (ITA)"},{"key":"2026032712173409700_ref123","article-title":"An information-theoretic approach to universal feature selection in high-dimensional inference","author":"Huang","year":"(2017)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref124","article-title":"Gaussian universal features, canonical correlations, and common information","author":"Huang","year":"(2018)","journal-title":"Proc. Inform. Theory Workshop (ITW)"},{"key":"2026032712173409700_ref125","article-title":"On the robustness of noisy ACE algorithm and multi-layer residual learning","author":"Huang","year":"(2019)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref126","article-title":"On the sample complexity of HGR maximal correlation functions","author":"Huang","year":"(2019)","journal-title":"Proc. Inform. 
Theory Workshop (ITW)"},{"issue":"(3)","key":"2026032712173409700_ref127","doi-asserted-by":"crossref","first-page":"1951","DOI":"10.1109\/TIT.2020.3044622","article-title":"On the sample complexity of HGR maximal correlation functions for large datasets","volume":"67","author":"Huang","year":"(2021)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(1)","key":"2026032712173409700_ref128","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/JSAIT.2020.2981538","article-title":"An information-theoretic approach to unsupervised feature selection for high-dimensional data","volume":"1","author":"Huang","year":"(2020)","journal-title":"IEEE J. Select. Areas Inform. Theory"},{"key":"2026032712173409700_ref129","article-title":"An information theoretic interpretation to deep neural networks","author":"Huang","year":"(2019)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref130","article-title":"A local characterization for Wyner common information","author":"Huang","year":"(2020)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"issue":"(1)","key":"2026032712173409700_ref131","article-title":"An information theoretic interpretation to deep neural networks","volume":"24","author":"Huang","year":"(2022)","journal-title":"Entropy"},{"key":"2026032712173409700_ref132","article-title":"Linear information coupling problems","author":"Huang","year":"(2012)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"issue":"(7)","key":"2026032712173409700_ref133","doi-asserted-by":"crossref","first-page":"2162","DOI":"10.1016\/j.jspi.2008.10.011","article-title":"Nonlinear measures of association with kernel canonical correlation analysis and applications","volume":"139","author":"Huang","year":"(2009)","journal-title":"J. Stat. 
Planning, Inference"},{"key":"2026032712173409700_ref134","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1093\/biomet\/55.1.179","article-title":"Contingency tables with given marginals","volume":"55","author":"Ireland","year":"(1968)","journal-title":"Biometrika"},{"key":"2026032712173409700_ref135","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1016\/0047-259X(75)90042-1","article-title":"Reduced-rank regression for the multivariate linear model","volume":"5","author":"Izenman","year":"(1975)","journal-title":"J. Multivariate Anal."},{"key":"2026032712173409700_ref136","first-page":"361","article-title":"Estimation with quadratic loss","author":"James","year":"(1961)","journal-title":"Proc. Berkeley Symp. Math. Statist. Prob."},{"key":"2026032712173409700_ref137","volume-title":"Theory of Probability","author":"Jeffreys","year":"(1948)"},{"key":"2026032712173409700_ref138","first-page":"381","article-title":"Interpolated estimation of Markov source parameters from sparse data","author":"Jelinek","year":"(1980)","journal-title":"Proc. Workshop, Patt. Recogn. Practice"},{"issue":"(164)","key":"2026032712173409700_ref139","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1093\/mind\/XLI.164.409","article-title":"Probability: Deductive and inductive problems","volume":"41","author":"Johnson","year":"(1932)","journal-title":"Mind"},{"key":"2026032712173409700_ref140","volume-title":"Principal Component Analysis","author":"Jolliffe","year":"(2002)"},{"key":"2026032712173409700_ref141","article-title":"A new dual to the G\u00e1cs-K\u00f6rner common information defined via the Gray-Wyner system","author":"Kamath","year":"(2010)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref142","article-title":"Non-interactive simulation of joint distributions: The Hirschfeld-Gebelein-R\u00e9nyi maximal correlation and the hypercontractivity ribbon","author":"Kamath","year":"(2012)","journal-title":"Proc. 
Allerton Conf. Commun., Contr., Computing"},{"issue":"(1)","key":"2026032712173409700_ref143","first-page":"56","article-title":"A new data processing inequality and its applications in distributed source and channel coding","volume":"57","author":"Kang","year":"(2010)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(3)","key":"2026032712173409700_ref144","doi-asserted-by":"crossref","first-page":"486","DOI":"10.1109\/72.572090","article-title":"A class of neural networks for independent component analysis","volume":"8","author":"Karhunen","year":"(1997)","journal-title":"IEEE Trans. Neural Netw."},{"issue":"(3)","key":"2026032712173409700_ref145","first-page":"400","article-title":"Estimation of probabilities from sparse data for the language model component of a speech recognizer","volume":"35","author":"Katz","year":"(1984)","journal-title":"IEEE Trans. Acoust., Speech, Signal Processing"},{"key":"2026032712173409700_ref146","first-page":"305","article-title":"Canonical correlation analysis using a neural network","author":"Kay","year":"(1992)","journal-title":"Proc. Symp. Comp. Stat. (COMPSTAT)"},{"issue":"(6)","key":"2026032712173409700_ref147","doi-asserted-by":"crossref","first-page":"2980","DOI":"10.1109\/TIT.2010.2046205","article-title":"Matrix completion from a few entries","volume":"56","author":"Keshavan","year":"(2010)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(1)","key":"2026032712173409700_ref148","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1017\/S0305004100022313","article-title":"An extension of a theorem of Mehler\u2019s on Hermite polynomials","volume":"41","author":"Kibble","year":"(1945)","journal-title":"Proc. Cambridge Phil. Soc."},{"issue":"(4)","key":"2026032712173409700_ref149","first-page":"895","article-title":"Monotone dependence","volume":"6","author":"Kimeldorf","year":"(1978)","journal-title":"Ann. 
Stat."},{"issue":"(12)","key":"2026032712173409700_ref150","first-page":"497","article-title":"\u00dcber die aufl\u00f6sung der gleichungen, auf welche man bei der untersuchung der linearen verteilung galvanischer str\u00f6me gef\u00fchrt wird","volume":"72","author":"Kirchhoff","year":"(1847)","journal-title":"Ann. Phys. Chem."},{"issue":"(30)","key":"2026032712173409700_ref151","first-page":"965","article-title":"Bayesian canonical correlation analysis","volume":"14","author":"Klami","year":"(2013)","journal-title":"J. Mach. Learn. Res."},{"key":"2026032712173409700_ref152","first-page":"181","article-title":"Improved backing-off for M-gram language modeling","author":"Kneser","year":"(1995)","journal-title":"Proc. Int. Conf. Acoust. Speech, Signal Processing (ICASSP)"},{"issue":"(8)","key":"2026032712173409700_ref153","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1109\/MC.2009.263","article-title":"Matrix factorization techniques for recommender systems","volume":"42","author":"Koren","year":"(2009)","journal-title":"Computer"},{"issue":"(2)","key":"2026032712173409700_ref154","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1002\/aic.690370209","article-title":"Nonlinear principal component analysis using autoassociative neural networks","volume":"37","author":"Kramer","year":"(1991)","journal-title":"Am. Inst. Chem. Eng. (AIChE) J."},{"key":"2026032712173409700_ref155","article-title":"Which Boolean functions are most informative?","author":"Kumar","year":"(2013)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref156","unstructured":"Kumar, M., A. Gramfort, and J. Nothman, Machine Learning in Python code for BIRCH scikit-learn.sklearn.cluster.birch. 
https:\/\/github.com\/scikit-learn\/scikit-learn\/blob\/a24c8b46\/sklearn\/cluster\/birch.py."},{"issue":"(5)","key":"2026032712173409700_ref157","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1142\/S012906570000034X","article-title":"Kernel and nonlinear canonical correlation analysis","volume":"10","author":"Lai","year":"(2000)","journal-title":"Int. J. Neural Syst."},{"issue":"(1\u20132)","key":"2026032712173409700_ref158","first-page":"1","article-title":"A reconciliation of \u03c7\u00b2, considered from metrical and enumerative aspects","volume":"13","author":"Lancaster","year":"(1953)","journal-title":"Sankhy\u0101"},{"issue":"(1\u20132)","key":"2026032712173409700_ref159","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1093\/biomet\/44.1-2.289","article-title":"Some properties of the bivariate normal distribution considered in the form of a contingency table","volume":"44","author":"Lancaster","year":"(1957)","journal-title":"Biometrika"},{"key":"2026032712173409700_ref160","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1214\/aoms\/1177706532","article-title":"The structure of bivariate distributions","volume":"29","author":"Lancaster","year":"(1958)","journal-title":"Ann. Math. Stat."},{"key":"2026032712173409700_ref161","volume-title":"The Chi-Squared Distribution","author":"Lancaster","year":"(1969)"},{"issue":"(3)","key":"2026032712173409700_ref162","doi-asserted-by":"crossref","first-page":"434","DOI":"10.1111\/j.2517-6161.1975.tb01558.x","article-title":"Joint probability distributions in the Meixner classes","volume":"37","author":"Lancaster","year":"(1975)","journal-title":"J. Roy. Stat. Soc., Ser. 
B"},{"key":"2026032712173409700_ref163","volume-title":"Essai Philosophique sur les Probabilit\u00e9s","author":"Laplace","year":"(1814)"},{"key":"2026032712173409700_ref164","volume-title":"Geometric Data Analysis: From Correspondence Analysis to Structured Data","author":"Le Roux","year":"(2004)"},{"key":"2026032712173409700_ref165","volume-title":"Multivariate Descriptive Statistical Analysis","author":"Lebart","year":"(1984)"},{"issue":"(11)","key":"2026032712173409700_ref166","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"(1998)","journal-title":"Proc. IEEE"},{"key":"2026032712173409700_ref167","unstructured":"LeCun, Y., C. Cortes, and C. J. C. Burges, MNIST handwritten digit database. http:\/\/yann.lecun.com\/exdb\/mnist."},{"key":"2026032712173409700_ref168","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1016\/S0047-259X(03)00096-4","article-title":"A well-conditioned estimator for large-dimensional covariance matrices","volume":"88","author":"Ledoit","year":"(2004)","journal-title":"J. Multivariate Anal."},{"issue":"(4)","key":"2026032712173409700_ref169","article-title":"A maximal correlation framework for fair machine learning","volume":"24","author":"Lee","year":"(2022)","journal-title":"Entropy"},{"key":"2026032712173409700_ref170","article-title":"A maximal correlation framework to imposing fairness in machine learning","author":"Lee","year":"(2022)","journal-title":"Proc. Int. Conf. Acoust. Speech, Signal Processing (ICASSP)"},{"key":"2026032712173409700_ref171","article-title":"Learning new tricks from old dogs: Multi-source transfer learning from pretrained networks","author":"Lee","year":"(2019)","journal-title":"Advances Neural Inform. Process. Syst. 
(NeurIPS)"},{"issue":"(11)","key":"2026032712173409700_ref172","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0898-1221(00)00101-2","article-title":"A unifying information-theoretic framework for independent component analysis","volume":"39","author":"Lee","year":"(2000)","journal-title":"Comput., Math., Appl."},{"key":"2026032712173409700_ref173","first-page":"2177","article-title":"Neural word embedding as implicit matrix factorization","author":"Levy","year":"(2014)","journal-title":"Advances Neural Inform. Process. Syst. (NIPS)"},{"issue":"(1\/2)","key":"2026032712173409700_ref174","first-page":"173","article-title":"The convex analysis of unitarily invariant matrix functions","volume":"2","author":"Lewis","year":"(1995)","journal-title":"J. Convex Anal."},{"key":"2026032712173409700_ref175","article-title":"Maximal correlation embedding network for multilabel learning with missing labels","author":"Li","year":"(2019)","journal-title":"Proc. Int. Conf. Multimedia, Expo (ICME)"},{"key":"2026032712173409700_ref176","article-title":"Semantically supervised maximal correlation for cross-modal retrieval","author":"Li","year":"(2020)","journal-title":"Proc. Int. Conf. Image Processing (ICIP)"},{"key":"2026032712173409700_ref177","article-title":"Dual feature distributional regularization for defending against adversarial attacks","author":"Li","year":"(2021)","journal-title":"Proc. Int. Conf. Neural Inform. Process. (ICONIP)"},{"key":"2026032712173409700_ref178","first-page":"362","article-title":"Joint mobility pattern mining with urban region partitions","author":"Lian","year":"(2018)","journal-title":"Proc. Int. Conf. Mobile Ubiquit. Syst. (MobiQuitous)"},{"key":"2026032712173409700_ref179","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1007\/s11036-019-01309-4","article-title":"Mining regional mobility patterns for urban dynamic analytics","volume":"25","author":"Lian","year":"(2019)","journal-title":"Mobile Netw. 
Appl."},{"key":"2026032712173409700_ref180","article-title":"Mining mobility patterns with trip-based traffic analysis zones: A deep feature embedding approach","author":"Lian","year":"(2019)","journal-title":"Proc. Intell. Transport. Syst. Conf. (ITSC)"},{"key":"2026032712173409700_ref181","article-title":"Person recognition with HGR maximal correlation on multimodal data","author":"Liang","year":"(2021)","journal-title":"Int. Conf. Patt. Recogn. (ICPR)"},{"issue":"(3)","key":"2026032712173409700_ref182","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1109\/LGRS.2011.2172185","article-title":"Linear versus nonlinear PCA for the classification of hyperspectral data based on the extended morphological profiles","volume":"9","author":"Licciardi","year":"(2012)","journal-title":"IEEE Geosci., Remote Sensing Lett."},{"key":"2026032712173409700_ref183","first-page":"182","article-title":"Note on the general case of the Bayes-Laplace formula for inductive or a posteriori probabilities","volume":"8","author":"Lidstone","year":"(1920)","journal-title":"Trans. Fac. Actuaries"},{"key":"2026032712173409700_ref184","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/BF02126799","article-title":"Ramanujan graphs","volume":"8","author":"Lubotzky","year":"(1988)","journal-title":"Combinatorica"},{"key":"2026032712173409700_ref185","article-title":"An efficient approach for audio-visual emotion recognition with missing labels and missing modalities","author":"Ma","year":"(2021)","journal-title":"Proc. Int. Conf. Multimedia, Expo (ICME)"},{"issue":"(1)","key":"2026032712173409700_ref186","article-title":"Data augmentation for audio-visual emotion recognition with an efficient multimodal conditional GAN","volume":"12","author":"Ma","year":"(2022)","journal-title":"Appl. 
Sci."},{"issue":"(20)","key":"2026032712173409700_ref187","article-title":"Learning better representations for audio-visual emotion recognition with common information","volume":"10","author":"Ma","year":"(2020)","journal-title":"Appl. Sci."},{"key":"2026032712173409700_ref188","article-title":"An end-to-end learning approach for multimodal emotion recognition: Extracting common and private information","author":"Ma","year":"(2019)","journal-title":"Proc. Int. Conf. Multimedia, Expo (ICME)"},{"key":"2026032712173409700_ref189","article-title":"Forgot your password: Correlation dilution","author":"Makhdoumi","year":"(2015)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref190","first-page":"501","article-title":"From the information bottleneck to the privacy funnel","author":"Makhdoumi","year":"(2014)","journal-title":"Proc. Inform. Theory Workshop (ITW)"},{"key":"2026032712173409700_ref191","author":"Makur","year":"(2019)"},{"key":"2026032712173409700_ref192","article-title":"An efficient algorithm for information decomposition and extraction","author":"Makur","year":"(2015)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref193","article-title":"On estimation of modal decompositions","author":"Makur","year":"(2020)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref194","article-title":"Polynomial spectral decomposition of conditional expectation operators","author":"Makur","year":"(2016)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"issue":"(12)","key":"2026032712173409700_ref195","doi-asserted-by":"crossref","first-page":"7716","DOI":"10.1109\/TIT.2017.2760626","article-title":"Polynomial singular value decompositions of a family of source-channel models","volume":"63","author":"Makur","year":"(2017)","journal-title":"IEEE Trans. Inform. 
Theory"},{"key":"2026032712173409700_ref196","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1134\/S0032946020020015","article-title":"Comparison of contraction coefficients for f-divergences","volume":"56","author":"Makur","year":"(2020)","journal-title":"Probl. Inf. Transm."},{"key":"2026032712173409700_ref197","first-page":"4382","article-title":"Fairness-aware learning for continuous attributes and treatments","author":"Mary","year":"(2019)","journal-title":"Proc. Int. Conf. Machine Learning (ICML)"},{"key":"2026032712173409700_ref198","first-page":"1","article-title":"On the convergence rate of Good-Turing estimators","author":"McAllester","year":"(2000)","journal-title":"Proc. Conf. Comput. Learning Theory (COLT)"},{"key":"2026032712173409700_ref199","doi-asserted-by":"crossref","first-page":"2404","DOI":"10.1063\/1.526446","article-title":"Maximum entropy in the problem of moments","volume":"25","author":"Mead","year":"(1984)","journal-title":"J. Math. Phys."},{"key":"2026032712173409700_ref200","first-page":"161","article-title":"Ueber die entwicklung einer function von beliebig vielen variabeln nach Laplaceschen functionen h\u00f6herer ordnung","volume":"66","author":"Mehler","year":"(1866)","journal-title":"J. Reine, Angewandte Math."},{"key":"2026032712173409700_ref201","doi-asserted-by":"crossref","first-page":"1056","DOI":"10.1007\/978-1-4899-7687-1_964","volume-title":"Encyclopedia of Machine Learning and Data Mining","author":"Melville","year":"(2017)"},{"key":"2026032712173409700_ref202","article-title":"Nonlinear feature extraction using generalized canonical correlation analysis","author":"Melzer","year":"(2001)","journal-title":"Proc. Int. Conf. Artif. Neural Netw. (ICANN)"},{"key":"2026032712173409700_ref203","first-page":"1967","article-title":"Nonparametric canonical correlation analysis","author":"Michaeli","year":"(2016)","journal-title":"Proc. Int. Conf. 
Machine Learning (ICML)"},{"issue":"(4)","key":"2026032712173409700_ref204","first-page":"307","article-title":"The Gifi system of descriptive multivariate analysis","volume":"13","author":"Michailidis","year":"(1998)","journal-title":"Stat. Sci."},{"key":"2026032712173409700_ref205","unstructured":"Mikolov, T., K. Chen, G. Corrado, and J. Dean, \u201cEfficient estimation of word representations in vector space\u201d, CoRR, vol. abs\/1301.3781, (2013). arXiv: 1301.3781. http:\/\/arxiv.org\/abs\/1301.3781."},{"key":"2026032712173409700_ref206","first-page":"591","article-title":"An information-theoretic learning algorithm for neural network classification","author":"Miller","year":"(1996)","journal-title":"Advances Neural Inform. Process. Syst. (NIPS)"},{"key":"2026032712173409700_ref207","volume-title":"Perceptrons","author":"Minsky","year":"(1969)"},{"issue":"(2)","key":"2026032712173409700_ref208","first-page":"1069","article-title":"Estimation of (near) low-rank matrices with noise and high-dimensional scaling","volume":"39","author":"Negahban","year":"(2011)","journal-title":"Ann. Stat."},{"issue":"(1)","key":"2026032712173409700_ref209","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1006\/csla.1994.1001","article-title":"On structuring probabilistic dependences in stochastic language modeling","volume":"8","author":"Ney","year":"(1994)","journal-title":"Comput., Speech, Lang."},{"issue":"(3\u20134)","key":"2026032712173409700_ref210","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1080\/03461238.1937.10404821","article-title":"\u2018Smooth\u2019 test for goodness of fit","volume":"20","author":"Neyman","year":"(1937)","journal-title":"Scand. Actuar. J."},{"key":"2026032712173409700_ref211","first-page":"239","article-title":"Contribution to the theory of the \u03c7\u00b2 test","author":"Neyman","year":"(1949)","journal-title":"Proc. Berkeley Symp. Math. Stat. 
Prob."},{"key":"2026032712173409700_ref212","doi-asserted-by":"crossref","DOI":"10.1201\/9781420011203","volume-title":"Multivariate Nonlinear Descriptive Analysis","author":"Nishisato","year":"(2006)"},{"issue":"(3)","key":"2026032712173409700_ref213","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1007\/BF00275687","article-title":"A simplified neuron model as a principal component analyzer","volume":"15","author":"Oja","year":"(1982)","journal-title":"J. Math. Biology"},{"issue":"(6)","key":"2026032712173409700_ref214","doi-asserted-by":"crossref","first-page":"927","DOI":"10.1016\/S0893-6080(05)80089-9","article-title":"Principal components, minor components, and linear neural networks","volume":"5","author":"Oja","year":"(1992)","journal-title":"Neural Netw."},{"issue":"(1)","key":"2026032712173409700_ref215","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/S0925-2312(97)00045-3","article-title":"The nonlinear PCA learning rule in independent component analysis","volume":"17","author":"Oja","year":"(1997)","journal-title":"Neurocomputing"},{"key":"2026032712173409700_ref216","article-title":"Competitive distribution estimation: Why is Good-Turing good","author":"Orlitsky","year":"(2015)","journal-title":"Advances Neural Inform. Process. Syst. (NeurIPS)"},{"key":"2026032712173409700_ref217","author":"Painsky","year":"(2016)"},{"issue":"(2)","key":"2026032712173409700_ref218","article-title":"Nonlinear canonical correlation analysis: A compressed representation approach","volume":"22","author":"Painsky","year":"(2020)","journal-title":"Entropy"},{"issue":"(2)","key":"2026032712173409700_ref219","doi-asserted-by":"crossref","first-page":"1038","DOI":"10.1109\/TIT.2015.2510657","article-title":"Generalized independent component analysis over finite alphabets","volume":"62","author":"Painsky","year":"(2016)","journal-title":"IEEE Trans. Inform. 
Theory"},{"key":"2026032712173409700_ref220","article-title":"Variational minimax estimation of discrete distributions under KL loss","author":"Paninski","year":"(2004)","journal-title":"Advances Neural Inform. Process. Syst. (NeurIPS)"},{"issue":"(3)","key":"2026032712173409700_ref221","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1214\/aoms\/1177704472","article-title":"On estimation of a probability density function and mode","volume":"33","author":"Parzen","year":"(1962)","journal-title":"Ann. Math. Stat."},{"key":"2026032712173409700_ref222","first-page":"71","article-title":"Contributions to the mathematical theory of evolution","volume":"185","author":"Pearson","year":"(1894)","journal-title":"Phil. Trans. Roy. Soc. London, A"},{"issue":"(302)","key":"2026032712173409700_ref223","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1080\/14786440009463897","article-title":"On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling","volume":"50","author":"Pearson","year":"(1900)","journal-title":"Philos. Mag., Series 5"},{"issue":"(11)","key":"2026032712173409700_ref224","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1080\/14786440109462720","article-title":"On lines and planes of closest fit to systems of points in space","volume":"2","author":"Pearson","year":"(1901)","journal-title":"Phil. 
Mag."},{"key":"2026032712173409700_ref225","volume-title":"On the Theory of Contingency and Its Relation to Association and Normal Correlation","author":"Pearson","year":"(1904)"},{"key":"2026032712173409700_ref226","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/978-1-4939-7005-6_7","volume-title":"Convexity and Concentration","author":"Polyanskiy","year":"(2017)"},{"key":"2026032712173409700_ref227","article-title":"Probabilistic clustering using maximal matrix norm couplings","author":"Qiu","year":"(2018)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"issue":"(6)","key":"2026032712173409700_ref228","doi-asserted-by":"crossref","first-page":"3355","DOI":"10.1109\/TIT.2016.2549542","article-title":"Strong data processing inequalities and \u03a6-Sobolev inequalities for discrete channels","volume":"62","author":"Raginsky","year":"(2016)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref229","volume-title":"Linear Statistical Inference and its Applications","author":"Rao","year":"(1965)"},{"issue":"(3)","key":"2026032712173409700_ref230","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1137\/070697835","article-title":"Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization","volume":"52","author":"Recht","year":"(2010)","journal-title":"SIAM Rev."},{"issue":"(1\u20132)","key":"2026032712173409700_ref231","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1007\/BF02063300","article-title":"A new version of the probabilistic generalization of the large sieve","volume":"10","author":"R\u00e9nyi","year":"(1959)","journal-title":"Acta Mathematica Academiae Scientiarum Hungarica"},{"issue":"(3\u20134)","key":"2026032712173409700_ref232","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1007\/BF02024507","article-title":"On measures of dependence","volume":"10","author":"R\u00e9nyi","year":"(1959)","journal-title":"Acta Math. Acad. Sci. 
Hung."},{"issue":"(2)","key":"2026032712173409700_ref233","first-page":"887","article-title":"Estimation of high-dimensional low-rank matrices","volume":"39","author":"Rohde","year":"(2011)","journal-title":"Ann. Stat."},{"issue":"(3)","key":"2026032712173409700_ref234","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1214\/aoms\/1177728190","article-title":"Remarks on some nonparametric estimates of a density function","volume":"27","author":"Rosenblatt","year":"(1956)","journal-title":"Ann. Math. Stat."},{"key":"2026032712173409700_ref235","volume-title":"Principles of Mathematical Analysis","author":"Rudin","year":"(1976)"},{"issue":"(1)","key":"2026032712173409700_ref236","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1147\/rd.41.0066","article-title":"Information theoretical analysis of multivariate correlation","volume":"4","author":"Watanabe","year":"(1960)","journal-title":"IBM J. Res. Develop."},{"key":"2026032712173409700_ref237","doi-asserted-by":"crossref","DOI":"10.1137\/1.9781611970739","volume-title":"Numerical Methods for Large Eigenvalue Problems","author":"Saad","year":"(2011)"},{"issue":"(4)","key":"2026032712173409700_ref238","first-page":"52","article-title":"The maximum correlation coefficient (nonsymmetric case)","volume":"121","author":"Sarmanov","year":"(1958)","journal-title":"Dokl. Akad. Nauk SSSR"},{"issue":"(4)","key":"2026032712173409700_ref239","first-page":"715","article-title":"The maximum correlation coefficient (symmetric case)","volume":"120","author":"Sarmanov","year":"(1958)","journal-title":"Dokl. Akad. Nauk SSSR"},{"key":"2026032712173409700_ref240","first-page":"269","article-title":"Maximum coefficients of multiple correlation","volume":"121","author":"Sarmanov","year":"(1960)","journal-title":"Dokl. Akad. Nauk SSSR"},{"key":"2026032712173409700_ref241","article-title":"Gaussian secure source coding and Wyner\u2019s common information","author":"Satpathy","year":"(2015)","journal-title":"Proc. Int. 
Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref242","article-title":"Chi-square information for invariant learning","author":"Sattigeri","year":"(2020)","journal-title":"Proc. ICML Workshop Uncert., Robust. Deep Learn. (ICML-UDL)"},{"key":"2026032712173409700_ref243","doi-asserted-by":"crossref","first-page":"293","DOI":"10.7551\/mitpress\/6173.003.0022","volume-title":"Semi-Supervised Learning","author":"Saul","year":"(2006)"},{"key":"2026032712173409700_ref244","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1007\/BF01449770","article-title":"Zur theorie der linearen und nichtlinearen integralgleichungen. I. Teil: Entwicklung willk\u00fcrlicher funktionen nach systemen vorgeschriebener","volume":"63","author":"Schmidt","year":"(1907)","journal-title":"Math. Ann."},{"key":"2026032712173409700_ref245","first-page":"583","article-title":"Kernel principal component analysis","author":"Sch\u00f6lkopf","year":"(1997)","journal-title":"Proc. Int. Conf. Artif. Neural Netw. (ICANN)"},{"issue":"(5)","key":"2026032712173409700_ref246","doi-asserted-by":"crossref","first-page":"1299","DOI":"10.1162\/089976698300017467","article-title":"Nonlinear component analysis as a kernel eigenvalue problem","volume":"10","author":"Sch\u00f6lkopf","year":"(1998)","journal-title":"Neural Comput."},{"issue":"(3)","key":"2026032712173409700_ref247","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1109\/TIT.1965.1053799","article-title":"Probability of error of some adaptive pattern-recognition machines","volume":"11","author":"Scudder","year":"(1965)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref248","first-page":"19598","article-title":"Selective regression under fairness criteria","author":"Shah","year":"(2022)","journal-title":"Proc. Int. Conf. 
Machine Learning (ICML)"},{"issue":"(8)","key":"2026032712173409700_ref249","first-page":"888","article-title":"Normalized cuts and image segmentation","volume":"22","author":"Shi","year":"(2000)","journal-title":"IEEE Trans. Pattern Anal. Machine Intell."},{"key":"2026032712173409700_ref250","doi-asserted-by":"crossref","unstructured":"Shwartz-Ziv, R. and N. Tishby, \u201cOpening the black box of deep neural networks via information\u201d, CoRR, vol. abs\/1703.00810, (2017). arXiv: 1703.00810. http:\/\/arxiv.org\/abs\/1703.00810.","DOI":"10.1002\/gbc.20460"},{"key":"2026032712173409700_ref251","volume-title":"Density Estimation for Statistics and Data Analysis","author":"Silverman","year":"(1986)"},{"issue":"(4)","key":"2026032712173409700_ref252","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1137\/0503060","article-title":"On the symmetrized Kronecker power of a matrix and extensions of Mehler\u2019s formula for Hermite polynomials","volume":"3","author":"Slepian","year":"(1972)","journal-title":"SIAM J. Math. Anal."},{"key":"2026032712173409700_ref253","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1145\/345508.345578","article-title":"Document clustering using word clusters via the information bottleneck method","author":"Slonim","year":"(2000)","journal-title":"Proc. Int. Conf. Res., Dev. Inform. Retrieval (ACM SIGIR)"},{"issue":"(2)","key":"2026032712173409700_ref254","doi-asserted-by":"crossref","first-page":"201","DOI":"10.2307\/1412107","article-title":"\u2018General intelligence,\u2019 objectively determined and measured","volume":"15","author":"Spearman","year":"(1904)","journal-title":"Amer. J. 
Psychol."},{"key":"2026032712173409700_ref255","author":"Srebro","year":"(2004)"},{"issue":"(4)","key":"2026032712173409700_ref256","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1137\/1035134","article-title":"On the early history of the singular value decomposition","volume":"35","author":"Stewart","year":"(1993)","journal-title":"SIAM Rev."},{"key":"2026032712173409700_ref257","first-page":"99","volume-title":"SVD and Signal Processing, II: Algorithms, Analysis and Applications","author":"Stewart","year":"(1991)"},{"key":"2026032712173409700_ref258","first-page":"368","article-title":"The information bottleneck method","author":"Tishby","year":"(1999)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref259","first-page":"619","article-title":"Data clustering by Markovian relaxation and the information bottleneck method","author":"Tishby","year":"(2000)","journal-title":"Advances Neural Inform. Process. Syst. (NIPS)"},{"key":"2026032712173409700_ref260","article-title":"Deep learning and the information bottleneck principle","author":"Tishby","year":"(2015)","journal-title":"Proc. Inform. Theory Workshop (ITW)"},{"key":"2026032712173409700_ref261","unstructured":"Tong, X., J. Xu, and S.-L. Huang, \u201cAn information-theoretic method for collaborative distributed learning with limited communication\u201d, CoRR, vol. abs\/2205.06515, (2022). arXiv: 2205.06515. https:\/\/arxiv.org\/abs\/2205.06515."},{"key":"2026032712173409700_ref262","article-title":"On sample complexity of learning shared representations: The asymptotic regime","author":"Tong","year":"(2022)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref263","article-title":"A mathematical framework for quantifying transferability in multi-source transfer learning","author":"Tong","year":"(2021)","journal-title":"Advances Neural Inform. Process. Syst. 
(NeurIPS)"},{"key":"2026032712173409700_ref264","doi-asserted-by":"crossref","DOI":"10.1137\/1.9780898719574","volume-title":"Numerical Linear Algebra","author":"Trefethen","year":"(1997)"},{"issue":"(4)","key":"2026032712173409700_ref265","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1007\/s10208-011-9099-z","article-title":"User-friendly tail bounds for sums of random matrices","volume":"12","author":"Tropp","year":"(2012)","journal-title":"Found. Comp. Math."},{"key":"2026032712173409700_ref266","first-page":"6383","article-title":"Large-scale sparse kernel canonical correlation analysis","author":"Uurtio","year":"(2019)","journal-title":"Proc. Conf. Comput. Learning Theory (COLT)"},{"issue":"(4)","key":"2026032712173409700_ref267","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/MSP.2018.2826566","article-title":"Robust subspace learning: Robust PCA, robust subspace tracking, and robust subspace recovery","volume":"35","author":"Vaswani","year":"(2018)","journal-title":"IEEE Signal Processing Mag."},{"issue":"(2)","key":"2026032712173409700_ref268","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1016\/j.neunet.2003.05.001","article-title":"Generalized neural networks for spectral analysis: Dynamics and Liapunov functions","volume":"17","author":"Vegas","year":"(2004)","journal-title":"Neural Netw."},{"issue":"(3)","key":"2026032712173409700_ref269","doi-asserted-by":"crossref","first-page":"905","DOI":"10.1109\/TNN.2007.891186","article-title":"Variational Bayesian approach to canonical correlation analysis","volume":"18","author":"Wang","year":"(2007)","journal-title":"IEEE Trans. Neural Netw."},{"key":"2026032712173409700_ref270","doi-asserted-by":"crossref","first-page":"329","DOI":"10.6339\/JDS.2004.02(4).156","article-title":"Estimating optimal transformations for multiple regression using the ACE algorithm","volume":"2","author":"Wang","year":"(2004)","journal-title":"J. 
Data Science"},{"issue":"(12)","key":"2026032712173409700_ref271","doi-asserted-by":"crossref","first-page":"8025","DOI":"10.1109\/TIT.2019.2934414","article-title":"Privacy with estimation guarantees","volume":"65","author":"Wang","year":"(2019)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(12)","key":"2026032712173409700_ref272","doi-asserted-by":"crossref","first-page":"8025","DOI":"10.1109\/TIT.2019.2934414","article-title":"Privacy with estimation guarantees","volume":"65","author":"Wang","year":"(2019)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref273","first-page":"5281","article-title":"An efficient approach to informative feature extraction from multimodal data","author":"Wang","year":"(2019)","journal-title":"Proc. AAAI Conf. Artif. Intell. (AAAI)"},{"key":"2026032712173409700_ref274","first-page":"1083","article-title":"On deep multiview representation learning","author":"Wang","year":"(2015)","journal-title":"Proc. Int. Conf. Machine Learning (ICML)"},{"issue":"(3)","key":"2026032712173409700_ref275","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1111\/j.2517-6161.1974.tb01020.x","article-title":"Generalized linear models specified in terms of constraints","volume":"36","author":"Wedderburn","year":"(1974)","journal-title":"J. Roy. Stat. Soc., Ser. B"},{"issue":"(2)","key":"2026032712173409700_ref276","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1111\/j.2517-6161.1958.tb00298.x","article-title":"On the smoothing of probability density functions","volume":"20","author":"Whittle","year":"(1958)","journal-title":"J. Roy. Stat. Soc., Ser. B"},{"issue":"(1)","key":"2026032712173409700_ref277","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1137\/0128010","article-title":"On sequences of pairs of dependent random variables","volume":"28","author":"Witsenhausen","year":"(1975)","journal-title":"SIAM J. Appl. 
Math."},{"issue":"(4)","key":"2026032712173409700_ref278","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1109\/TIT.1970.1054469","article-title":"Transmission of noisy information to a noisy receiver with minimum distortion","volume":"16","author":"Wolf","year":"(1970)","journal-title":"IEEE Trans. Inform. Theory"},{"issue":"(2)","key":"2026032712173409700_ref279","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1109\/TIT.1975.1055346","article-title":"The common information of two dependent random variables","volume":"21","author":"Wyner","year":"(1975)","journal-title":"IEEE Trans. Inform. Theory"},{"key":"2026032712173409700_ref280","article-title":"On the asymptotic sample complexity of HGR maximal correlation functions in semi-supervised learning","author":"Xu","year":"(2019)","journal-title":"Proc. Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref281","doi-asserted-by":"crossref","first-page":"26591","DOI":"10.1109\/ACCESS.2020.2971386","article-title":"Maximal correlation regression","volume":"8","author":"Xu","year":"(2020)","journal-title":"IEEE Access"},{"key":"2026032712173409700_ref282","doi-asserted-by":"crossref","first-page":"102616","DOI":"10.1109\/ACCESS.2020.2998825","article-title":"On the optimal tradeoff between computational efficiency and generalizability of Oja\u2019s algorithm","volume":"8","author":"Xu","year":"(2020)","journal-title":"IEEE Access"},{"key":"2026032712173409700_ref283","article-title":"An information theoretic framework for distributed learning algorithms","author":"Xu","year":"(2021)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref284","article-title":"On the sample complexity of estimating small singular modes","author":"Xu","year":"(2020)","journal-title":"Proc. Int. Symp. Inform. Theory (ISIT)"},{"key":"2026032712173409700_ref285","article-title":"Multivariate feature extraction","author":"Xu","year":"(2022)","journal-title":"Proc. 
Allerton Conf. Commun., Contr., Computing"},{"key":"2026032712173409700_ref286","article-title":"A semi-supervised learning approach for visual question answering based on maximal correlation","author":"Yin","year":"(2021)","journal-title":"Proc. Int. Conf. Syst., Man, Cybern. (SMC)"},{"issue":"(1)","key":"2026032712173409700_ref287","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1007\/BF02288574","article-title":"Maximum likelihood estimation and factor analysis","volume":"6","author":"Young","year":"(1940)","journal-title":"Psychometrika"},{"issue":"(39)","key":"2026032712173409700_ref288","doi-asserted-by":"crossref","first-page":"18280","DOI":"10.1021\/acs.iecr.9b03069","article-title":"Accelerated kernel canonical correlation analysis with fault relevance for nonlinear process fault isolation","volume":"58","author":"Yu","year":"(2019)","journal-title":"Ind. Eng. Chem. Res."},{"key":"2026032712173409700_ref289","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1145\/235968.233324","article-title":"BIRCH: An efficient data clustering method for very large databases","author":"Zhang","year":"(1996)","journal-title":"Proc. ACM Conf. Management Data (SIGMOD)"},{"key":"2026032712173409700_ref290","first-page":"396","article-title":"Multimodal emotion recognition by extracting common and modality-specific information","author":"Zhang","year":"(2018)","journal-title":"Proc. Conf. Embed. Netw. Sensor Syst. 
(SENSYS)"}],"container-title":["Foundations and Trends\u00ae in Communications and Information Theory"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/ftcit\/article-pdf\/21\/1-2\/1\/11158681\/0100000107en.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/www.emerald.com\/ftcit\/article-pdf\/21\/1-2\/1\/11158681\/0100000107en.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T14:10:36Z","timestamp":1777471836000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.emerald.com\/ftcit\/article\/21\/1-2\/1\/1332645\/Universal-Features-for-High-Dimensional-Learning"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,5]]},"references-count":290,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2024,2,5]]}},"URL":"https:\/\/doi.org\/10.1561\/0100000107","relation":{},"ISSN":["1567-2190","1567-2328"],"issn-type":[{"value":"1567-2190","type":"print"},{"value":"1567-2328","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,5]]}}}