{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T16:26:18Z","timestamp":1754151978903,"version":"3.41.2"},"reference-count":61,"publisher":"IOP Publishing","issue":"3","license":[{"start":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T00:00:00Z","timestamp":1753056000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T00:00:00Z","timestamp":1753056000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/100000893","name":"Simons Foundation","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100000893","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2025,9,30]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>We present the information-ordered bottleneck (IOB), a neural layer designed to adaptively compress data into latent variables ordered by likelihood maximization. Without retraining, IOB nodes can be truncated at any bottleneck width, capturing the most crucial information in the first latent variables. Unifying several prior approaches, we demonstrate that IOB models achieve efficient compression of essential information for a given encoding architecture, while also assigning a semantically meaningful ordering to latent representations. IOBs demonstrate a remarkable ability to compress embeddings of high-dimensional image and text data, leveraging the performance of SOTA architectures such as CNNs, transformers, and diffusion models. Moreover, we introduce a novel theory for estimating global intrinsic dimensionality with IOBs and show that they recover SOTA dimensionality estimates for complex synthetic data. Furthermore, we showcase the utility of these models for exploratory analysis through applications on heterogeneous datasets, enabling computer-aided discovery of dataset complexity.<\/jats:p>","DOI":"10.1088\/2632-2153\/ade94d","type":"journal-article","created":{"date-parts":[[2025,6,27]],"date-time":"2025-06-27T18:56:15Z","timestamp":1751050575000},"page":"035010","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Ordered embeddings and intrinsic dimensionalities with information-ordered bottlenecks"],"prefix":"10.1088","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3207-8868","authenticated-orcid":true,"given":"Matthew","family":"Ho","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8328-1447","authenticated-orcid":false,"given":"Xiaosheng","family":"Zhao","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5854-8269","authenticated-orcid":false,"given":"Benjamin D","family":"Wandelt","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2025,7,21]]},"reference":[{"article-title":"Deep variational information bottleneck","year":"2016","author":"Alemi","key":"mlstade94dbib1"},{"key":"mlstade94dbib2","doi-asserted-by":"publisher","first-page":"5","DOI":"10.3847\/1538-4365\/ab917f","article-title":"Speculator: emulating stellar population synthesis for fast and accurate galaxy spectra and photometry","volume":"249","author":"Alsing","year":"2020","journal-title":"Astrophys. J. Suppl. Ser."},{"key":"mlstade94dbib3","doi-asserted-by":"publisher","first-page":"1768","DOI":"10.1007\/s10618-018-0578-6","article-title":"Extreme-value-theoretic estimation of local intrinsic dimensionality","volume":"32","author":"Amsaleg","year":"2018","journal-title":"Data Min. Knowl. Discovery"},{"key":"mlstade94dbib4","first-page":"p 32","article-title":"Intrinsic dimension of data representations in deep neural networks","author":"Ansuini","year":"2019"},{"article-title":"S7 21318 on district line (training), bow road","year":"2013","author":"Aubrey","key":"mlstade94dbib5"},{"key":"mlstade94dbib6","doi-asserted-by":"publisher","first-page":"1368","DOI":"10.3390\/e23101368","article-title":"Scikit-dimension: a python package for intrinsic dimension estimation","volume":"23","author":"Bac","year":"2021","journal-title":"Entropy"},{"key":"mlstade94dbib7","doi-asserted-by":"publisher","first-page":"91","DOI":"10.3847\/1538-4357\/abfc4d","article-title":"Extracting the main trend in a data set: the sequencer algorithm","volume":"916","author":"Baron","year":"2021","journal-title":"Astrophys. J."},{"article-title":"Fondue: an algorithm to find the optimal dimensionality of the latent representations of variational autoencoders","year":"2022","author":"Bonheme","key":"mlstade94dbib8"},{"article-title":"How complex are galaxies? a non-parametric estimation of the intrinsic dimensionality of wide-band photometric data","year":"2024","author":"Cadiou","key":"mlstade94dbib9"},{"key":"mlstade94dbib10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1745-6150-2-2","article-title":"Component retention in principal component analysis with application to cdna microarray data","volume":"2","author":"Cangelosi","year":"2007","journal-title":"Biol. Direct"},{"article-title":"Gilles simon serving to alexandr dolgopolov. The boodles, stoke park 2012","year":"2013","author":"Carine06","key":"mlstade94dbib11"},{"article-title":"untroubled","year":"2008","author":"cloudcricket","key":"mlstade94dbib12"},{"article-title":"Giraffe","year":"2013","author":"de Jong","key":"mlstade94dbib13"},{"key":"mlstade94dbib14","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-017-11873-y","article-title":"Estimating the intrinsic dimension of datasets by a minimal neighborhood information","volume":"7","author":"Facco","year":"2017","journal-title":"Sci. Rep."},{"key":"mlstade94dbib15","first-page":"pp 265","article-title":"Manifold-adaptive dimension estimation","author":"Farahmand","year":"2007"},{"article-title":"Barquera vietnamita","year":"2011","author":"Garrido","key":"mlstade94dbib16"},{"key":"mlstade94dbib17","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1109\/JSAIT.2020.2991561","article-title":"The information bottleneck problem and its applications in machine learning","volume":"1","author":"Goldfeld","year":"2020","journal-title":"IEEE J. Sel. Areas Inf. Theory"},{"key":"mlstade94dbib18","first-page":"pp 241","article-title":"Medical image denoising using convolutional denoising autoencoders","author":"Gondara","year":"2016"},{"key":"mlstade94dbib19","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1016\/0167-2789(83)90298-1","article-title":"Measuring the strangeness of strange attractors","volume":"9","author":"Grassberger","year":"1983","journal-title":"Physica D"},{"key":"mlstade94dbib20","doi-asserted-by":"publisher","first-page":"358","DOI":"10.1007\/s11263-008-0144-6","article-title":"Translated poisson mixture model for stratification learning","volume":"80","author":"Haro","year":"2008","journal-title":"Int. J. Comput. Vis."},{"key":"mlstade94dbib21","first-page":"pp 16000","article-title":"Masked autoencoders are scalable vision learners","author":"He","year":"2022"},{"key":"mlstade94dbib22","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1146\/annurev-astro-081913-035722","article-title":"The coevolution of galaxies and supermassive black holes: insights from surveys of the contemporary Universe","volume":"52","author":"Heckman","year":"2014","journal-title":"Annu. Rev. Astron. Astrophys."},{"article-title":"Beta-vae: learning basic visual concepts with a constrained variational framework","year":"2016","author":"Higgins","key":"mlstade94dbib23"},{"key":"mlstade94dbib24","doi-asserted-by":"crossref","DOI":"10.33232\/001c.120559","article-title":"Ltu-ili: an all-in-one framework for implicit inference in astrophysics and cosmology","author":"Ho","year":"2024"},{"key":"mlstade94dbib25","first-page":"pp 12225","article-title":"Intrinsic dimensionality estimation using normalizing flows","volume":"vol 35","author":"Horvat","year":"2022"},{"key":"mlstade94dbib26","article-title":"Learning robust statistics for simulation-based inference under model misspecification","volume":"vol 36","author":"Huang","year":"2024"},{"key":"mlstade94dbib27","doi-asserted-by":"publisher","DOI":"10.1098\/rsta.2015.0202","article-title":"Principal component analysis: a review and recent developments","volume":"374","author":"Jolliffe","year":"2016","journal-title":"Phil. Trans. R. Soc. A"},{"article-title":"Adam: a method for stochastic optimization","year":"2014","author":"Kingma","key":"mlstade94dbib28"},{"article-title":"Auto-encoding variational bayes","year":"2013","author":"Kingma","key":"mlstade94dbib29"},{"key":"mlstade94dbib30","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"mlstade94dbib31","doi-asserted-by":"publisher","DOI":"10.1038\/srep25696","article-title":"Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data","volume":"6","author":"Lenz","year":"2016","journal-title":"Sci. Rep."},{"key":"mlstade94dbib32","first-page":"pp 740","article-title":"Microsoft coco: common objects in context","volume":"vol 13","author":"Lin","year":"2014"},{"key":"mlstade94dbib33","doi-asserted-by":"publisher","first-page":"1081","DOI":"10.1038\/s42003-024-06788-0","article-title":"Semantic redundancy-aware implicit neural compression for multidimensional biomedical image data","volume":"7","author":"Ma","year":"2024","journal-title":"Commun. Biol."},{"key":"mlstade94dbib34","first-page":"pp 4402","article-title":"Disentangling disentanglement in variational autoencoders","author":"Mathieu","year":"2019"},{"article-title":"Dsprites: disentanglement testing sprites dataset","year":"2017","author":"Matthey","key":"mlstade94dbib35"},{"key":"mlstade94dbib36","first-page":"p 31","article-title":"Joint autoregressive and hierarchical priors for learned image compression","author":"Minnen","year":"2018"},{"article-title":"Bookshelf","year":"2007","author":"Murch","key":"mlstade94dbib37"},{"key":"mlstade94dbib38","first-page":"p 32","article-title":"Pytorch: An imperative style, high-performance deep learning library","author":"Paszke","year":"2019"},{"key":"mlstade94dbib39","doi-asserted-by":"publisher","first-page":"569","DOI":"10.1007\/s10851-022-01077-z","article-title":"Pca-ae: Principal component analysis autoencoder for organising the latent space of generative networks","volume":"64","author":"Pham","year":"2022","journal-title":"J. Math. Imaging Vis."},{"key":"mlstade94dbib40","doi-asserted-by":"publisher","first-page":"45","DOI":"10.3847\/1538-3881\/ab9644","article-title":"Dimensionality reduction of sdss spectra with variational autoencoders","volume":"160","author":"Portillo","year":"2020","journal-title":"Astron. J."},{"key":"mlstade94dbib41","first-page":"pp 8748","article-title":"Learning transferable visual models from natural language supervision","author":"Radford","year":"2021"},{"article-title":"Hierarchical text-conditional image generation with clip latents","year":"2022","author":"Ramesh","key":"mlstade94dbib42"},{"key":"mlstade94dbib43","first-page":"pp 8821","article-title":"Zero-shot text-to-image generation","author":"Ramesh","year":"2021"},{"key":"mlstade94dbib44","doi-asserted-by":"publisher","first-page":"1946","DOI":"10.1093\/mnras\/stad1332","article-title":"Mapping the x-ray variability of grs 1915+ 105 with machine learning","volume":"523","author":"Ricketts","year":"2023","journal-title":"Mon. Not. R. Astron. Soc."},{"key":"mlstade94dbib45","first-page":"pp 1746","article-title":"Learning ordered representations with nested dropout","author":"Rippel","year":"2014"},{"key":"mlstade94dbib46","first-page":"pp 10684","article-title":"High-resolution image synthesis with latent diffusion models","author":"Rombach","year":"2022"},{"key":"mlstade94dbib47","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1007\/s10994-012-5294-7","article-title":"Novel high intrinsic dimensionality estimators","volume":"89","author":"Rozza","year":"2012","journal-title":"Mach. Learn."},{"article-title":"Sparsity-inducing categorical prior improves robustness of the information bottleneck","year":"2022","author":"Samaddar","key":"mlstade94dbib48"},{"key":"mlstade94dbib49","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/ab3985","article-title":"On the information bottleneck theory of deep learning","author":"Saxe","year":"2019","journal-title":"J. Stat. Mech."},{"key":"mlstade94dbib50","first-page":"pp 583","article-title":"Kernel principal component analysis","author":"Sch\u00f6lkopf","year":"2005"},{"key":"mlstade94dbib51","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1088\/0004-6256\/146\/2\/32","article-title":"The multi-object, fiber-fed spectrographs for the sloan digital sky survey and the baryon oscillation spectroscopic survey","volume":"146","author":"Smee","year":"2013","journal-title":"Astron. J."},{"article-title":"Triangular dropout: variable network width without retraining","year":"2022","author":"Staley","key":"mlstade94dbib52"},{"key":"mlstade94dbib53","doi-asserted-by":"publisher","first-page":"16777","DOI":"10.1109\/ACCESS.2019.2895022","article-title":"Bayesian compressed vector autoregression for financial time-series analysis and forecasting","volume":"7","author":"Taveeapiradeecharoen","year":"2019","journal-title":"IEEE Access"},{"key":"mlstade94dbib54","first-page":"pp 21205","article-title":"Lidl: Local intrinsic dimension estimation using approximate likelihood","author":"Tempczyk","year":"2022"},{"key":"mlstade94dbib55","first-page":"p 07","article-title":"The information bottleneck method","volume":"vol 49","author":"Tishby","year":"2001"},{"key":"mlstade94dbib56","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1214\/aoms\/1177732360","article-title":"The large-sample distribution of the likelihood ratio for testing composite hypotheses","volume":"9","author":"Wilks","year":"1938","journal-title":"Ann. Math. Stat."},{"key":"mlstade94dbib57","doi-asserted-by":"publisher","first-page":"2603","DOI":"10.1086\/425626","article-title":"Spectral classification of quasars in the sloan digital sky survey: eigenspectra, redshift and luminosity effects","volume":"128","author":"Yip","year":"2004a","journal-title":"Astron. J."},{"key":"mlstade94dbib58","doi-asserted-by":"publisher","first-page":"585","DOI":"10.1086\/422429","article-title":"Distributions of galaxy spectral types in the sloan digital sky survey","volume":"128","author":"Yip","year":"2004b","journal-title":"Astron. J."},{"key":"mlstade94dbib59","doi-asserted-by":"publisher","first-page":"1579","DOI":"10.1086\/301513","article-title":"The sloan digital sky survey: technical summary","volume":"120","author":"York","year":"2000","journal-title":"Astron. J."},{"article-title":"P7240356","year":"2010","author":"Yurasko","key":"mlstade94dbib60"},{"key":"mlstade94dbib61","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-021-89328-8","article-title":"On local intrinsic dimensionality of deformation in complex materials","volume":"11","author":"Zhou","year":"2021","journal-title":"Sci. Rep."}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T10:53:17Z","timestamp":1753095197000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ade94d"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,21]]},"references-count":61,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,7,21]]},"published-print":{"date-parts":[[2025,9,30]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ade94d","relation":{},"ISSN":["2632-2153"],"issn-type":[{"type":"electronic","value":"2632-2153"}],"subject":[],"published":{"date-parts":[[2025,7,21]]},"assertion":[{"value":"Ordered embeddings and intrinsic dimensionalities with information-ordered bottlenecks","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2025 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2024-04-01","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2025-06-27","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2025-07-21","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}