{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T12:17:11Z","timestamp":1777033031366,"version":"3.51.4"},"reference-count":26,"publisher":"MIT Press - Journals","issue":"8","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,7,14]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The t-distributed stochastic neighbor embedding (t-SNE) method is one of the leading techniques for data visualization and clustering. This method finds lower-dimensional embedding of data points while minimizing distortions in distances between neighboring data points. By construction, t-SNE discards information about large-scale structure of the data. We show that adding a global cost function to the t-SNE cost function makes it possible to cluster the data while preserving global intercluster data structure. We test the new global t-SNE (g-SNE) method on one synthetic and two real data sets on flower shapes and human brain cells. We find that significant and meaningful global structure exists in both the plant and human brain data sets. In all cases, g-SNE outperforms t-SNE and UMAP in preserving the global structure. Topological analysis of the clustering result makes it possible to find an appropriate trade-off of data distribution across scales. We find differences in how data are distributed across scales between the two subjects that were part of the human brain data set. Thus, by striving to produce both accurate clustering and positioning between clusters, the g-SNE method can identify new aspects of data organization across scales.<\/jats:p>","DOI":"10.1162\/neco_a_01504","type":"journal-article","created":{"date-parts":[[2022,7,7]],"date-time":"2022-07-07T23:46:08Z","timestamp":1657237568000},"page":"1637-1651","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":31,"title":["Using Global t-SNE to Preserve Intercluster Data Structure"],"prefix":"10.1162","volume":"34","author":[{"given":"Yuansheng","family":"Zhou","sequence":"first","affiliation":[{"name":"Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A."},{"name":"Division of Biological Sciences, University of California San Diego, La Jolla, CA 92037, U.S.A. yuz461@ucsd.edu"}]},{"given":"Tatyana O.","family":"Sharpee","sequence":"additional","affiliation":[{"name":"Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A."},{"name":"Department of Physics, University of California San Diego, La Jolla, CA 92037, U.S.A. sharpee@salk.edu"}]}],"member":"281","published-online":{"date-parts":[[2022,7,14]]},"reference":[{"key":"2022071522332837300_B1","author":"Allen Institute for Brain Science","year":"2014","journal-title":"Allen human brain atlas."},{"issue":"6","key":"2022071522332837300_B2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.2594","article-title":"viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia","volume":"31","author":"Amir","year":"2013","journal-title":"Nature Biotechnology"},{"key":"2022071522332837300_B3","volume-title":"Technical white paper: Microarray data normalization","author":"Atlas","year":"2013"},{"issue":"1","key":"2022071522332837300_B4","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.4314","article-title":"Dimensionality reduction for visualizing single-cell data using UMAP","volume":"37","author":"Becht","year":"2019","journal-title":"Nature Biotechnology"},{"key":"2022071522332837300_B5","author":"Belkina","year":"2018","journal-title":"Automated optimal parameters for t-distributed stochastic neighbor embedding improve visualization and allow analysis of large datasets."},{"key":"2022071522332837300_B6","article-title":"Perplexity-free t-SNE and twice student tt-SNE","volume-title":"Proceedings of the European Symposium on Artificial Neural Networks.","author":"De Bodt","year":"2018"},{"key":"2022071522332837300_B7","first-page":"488","article-title":"Visualizing class structure of multidimensional data","author":"Dhillon","year":"1998","journal-title":"Symposium on the Interface: Computing Science and Statistics"},{"issue":"1","key":"2022071522332837300_B8","doi-asserted-by":"crossref","DOI":"10.1038\/s41467-018-04368-5","article-title":"Interpretable dimensionality reduction of single cell transcriptome data with deep generative models","volume":"9","author":"Ding","year":"2018","journal-title":"Nature Communications"},{"issue":"2","key":"2022071522332837300_B9","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1111\/j.1469-1809.1936.tb02137.x","article-title":"The use of multiple measurements in taxonomic problems","volume":"7","author":"Fisher","year":"1936","journal-title":"Annals of Eugenics"},{"issue":"44","key":"2022071522332837300_B10","doi-asserted-by":"publisher","first-page":"13455","DOI":"10.1073\/pnas.1506407112","article-title":"Clique topology reveals intrinsic geometric structure in neural correlations","volume":"112","author":"Giusti","year":"2015","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"7416","key":"2022071522332837300_B11","doi-asserted-by":"publisher","DOI":"10.1038\/nature11405","article-title":"An anatomically comprehensive atlas of the adult human brain transcriptome","volume":"489","author":"Hawrylycz","year":"2012","journal-title":"Nature"},{"key":"2022071522332837300_B12","author":"Kobak","year":"2018","journal-title":"The art of using t-SNE for single-cell transcriptomics"},{"issue":"1","key":"2022071522332837300_B13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/BF02289565","article-title":"Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis","volume":"29","author":"Kruskal","year":"1964","journal-title":"Psychometrika"},{"issue":"3","key":"2022071522332837300_B14","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-018-0308-4","article-title":"Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data","volume":"16","author":"Linderman","year":"2019","journal-title":"Nature Methods"},{"issue":"November","key":"2022071522332837300_B15","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"Journal of Machine Learning Research"},{"issue":"4","key":"2022071522332837300_B16","doi-asserted-by":"publisher","first-page":"966","DOI":"10.1016\/j.celrep.2015.12.082","article-title":"Single-cell RNA-sequencing reveals a continuous spectrum of differentiation in hematopoietic cells","volume":"14","author":"Macaulay","year":"2016","journal-title":"Cell Reports"},{"key":"2022071522332837300_B17","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1016\/j.ymeth.2014.10.004","article-title":"Visualizing the spatial gene expression organization in the brain through non-linear similarity embeddings","volume":"73","author":"Mahfouz","year":"2015","journal-title":"Methods"},{"key":"2022071522332837300_B18","author":"McInnes","year":"2018","journal-title":"UMAP: Uniform manifold approximation and projection for dimension reduction"},{"issue":"6235","key":"2022071522332837300_B19","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1126\/science.aaa0355","article-title":"The human transcriptome across tissues and individuals","volume":"348","author":"Mel\u00e9","year":"2015","journal-title":"Science"},{"issue":"7540","key":"2022071522332837300_B20","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"2022071522332837300_B21","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1111\/cgf.12878","article-title":"Hierarchical stochastic neighbor embedding","volume":"35","author":"Pezzotti","year":"2016","journal-title":"Computer Graphics Forum"},{"key":"2022071522332837300_B22","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1038\/nbt.4103","article-title":"Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain","volume":"36","author":"Raj","year":"2018","journal-title":"Nature Biotechnology"},{"issue":"4468","key":"2022071522332837300_B23","doi-asserted-by":"publisher","first-page":"390","DOI":"10.1126\/science.210.4468.390","article-title":"Multidimensional scaling, tree-fitting, and clustering","volume":"210","author":"Shepard","year":"1980","journal-title":"Science"},{"issue":"10","key":"2022071522332837300_B24","doi-asserted-by":"crossref","DOI":"10.23915\/distill.00002","article-title":"How to use t-SNE effectively","volume":"1","author":"Wattenberg","year":"2016","journal-title":"Distill"},{"issue":"6","key":"2022071522332837300_B25","doi-asserted-by":"publisher","first-page":"656","DOI":"10.1016\/j.cels.2018.10.015","article-title":"Visualizing and interpreting single-cell gene expression datasets with similarity weighted nonnegative embedding","volume":"7","author":"Wu","year":"2018","journal-title":"Cell Systems"},{"issue":"8","key":"2022071522332837300_B26","doi-asserted-by":"crossref","DOI":"10.1126\/sciadv.aaq1458","article-title":"Hyperbolic geometry of the olfactory space","volume":"4","author":"Zhou","year":"2018","journal-title":"Science Advances"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/34\/8\/1637\/2034896\/neco_a_01504.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/34\/8\/1637\/2034896\/neco_a_01504.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,15]],"date-time":"2022-07-15T22:34:01Z","timestamp":1657924441000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/34\/8\/1637\/111786\/Using-Global-t-SNE-to-Preserve-Intercluster-Data"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,14]]},"references-count":26,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,7,14]]},"published-print":{"date-parts":[[2022,7,14]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01504","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,8]]},"published":{"date-parts":[[2022,7,14]]}}}