{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,1]],"date-time":"2026-01-01T13:50:00Z","timestamp":1767275400306,"version":"3.37.3"},"reference-count":21,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,12,4]],"date-time":"2022-12-04T00:00:00Z","timestamp":1670112000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,12,4]],"date-time":"2022-12-04T00:00:00Z","timestamp":1670112000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100005416","name":"Norges Forskningsr\u00e5d","doi-asserted-by":"publisher","award":["287284"],"award-info":[{"award-number":["287284"]}],"id":[{"id":"10.13039\/501100005416","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Stat Comput"],"published-print":{"date-parts":[[2023,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Neighbor embedding (NE) aims to preserve pairwise similarities between data items and has been shown to yield an effective principle for data visualization. However, even the best existing NE methods such as stochastic neighbor embedding (SNE) may leave large-scale patterns hidden, for example clusters, despite strong signals being present in the data. To address this, we propose a new cluster visualization method based on the Neighbor Embedding principle. We first present a family of Neighbor Embedding methods that generalizes SNE by using non-normalized Kullback\u2013Leibler divergence with a scale parameter. In this family, much better cluster visualizations often appear with a parameter value different from the one corresponding to SNE. We also develop an efficient software that employs asynchronous stochastic block coordinate descent to optimize the new family of objective functions. Our experimental results demonstrate that the method consistently and substantially improves the visualization of data clusters compared with the state-of-the-art NE approaches. The code of our method is publicly available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/rozyangno\/sce\">https:\/\/github.com\/rozyangno\/sce<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s11222-022-10186-z","type":"journal-article","created":{"date-parts":[[2022,12,4]],"date-time":"2022-12-04T11:02:27Z","timestamp":1670151747000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Stochastic cluster embedding"],"prefix":"10.1007","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8412-5684","authenticated-orcid":false,"given":"Zhirong","family":"Yang","sequence":"first","affiliation":[]},{"given":"Yuwei","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Denis","family":"Sedov","sequence":"additional","affiliation":[]},{"given":"Samuel","family":"Kaski","sequence":"additional","affiliation":[]},{"given":"Jukka","family":"Corander","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,12,4]]},"reference":[{"key":"10186_CR1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-5056-2","volume-title":"Differential-Geometrical Methods in Statistics","author":"S Amari","year":"1985","unstructured":"Amari, S.: Differential-Geometrical Methods in Statistics. Springer, Berlin (1985)"},{"issue":"5415","key":"10186_CR2","first-page":"1","volume":"10","author":"A Belkina","year":"2019","unstructured":"Belkina, A., Ciccolella, C., Anno, R., Halpert, R., Spidlen, J., Snyder-Cappione, J.: Automated optimized parameters for t-distributed stochastic neighbor embedding improve visualization and analysis of large datasets. Nat. Commun. 10(5415), 1\u201312 (2019)","journal-title":"Nat. Commun."},{"key":"10186_CR3","doi-asserted-by":"crossref","unstructured":"Borgo, R., Lee, B., Bach, B., Fabrikant, S., Jianu, R., Kerren, A., Kobourov, S., McGee, F., Micallef, L., von Landesberger, T., Ballweg, K., Diehl, S., Simonetto, P., Zhou, M.: Crowdsourcing for information visualization: Promises and pitfalls. In: Archambault, D., Purchase, H., Ho\u00dffeld, T. (Eds.) Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments, Cham, Springer International Publishing. pp. 96\u2013138 (2017). ISBN 978-3-319-66435-4","DOI":"10.1007\/978-3-319-66435-4_5"},{"key":"10186_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jpdc.2019.04.008","volume":"131","author":"DM Chan","year":"2019","unstructured":"Chan, D.M., Rao, R., Huang, F., Canny, J.F.: Gpu accelerated t-distributed stochastic neighbor embedding. J. Parallel Distrib. Comput. 131, 1\u201313 (2019)","journal-title":"J. Parallel Distrib. Comput."},{"issue":"1","key":"10186_CR5","doi-asserted-by":"publisher","first-page":"58","DOI":"10.3390\/rs9010058","volume":"9","author":"Y Chen","year":"2017","unstructured":"Chen, Y., Hakala, T., Karjalainen, M., Feng, Z., Tang, J., Litkey, P., Kukko, A., Jaakkola, A., Hyypp\u00e4, J.: Uav-borne profiling radar for forest research. Remote Sens. 9(1), 58 (2017)","journal-title":"Remote Sens."},{"key":"10186_CR6","unstructured":"Hinton, G., Roweis, S.: Stochastic neighbor embedding. In: Advances in Neural Information Processing Systems (NIPS), pp. 857\u2013864 (2003)"},{"key":"10186_CR7","doi-asserted-by":"crossref","unstructured":"Kangasr\u00e4\u00e4si\u00f6, A., Athukorala, K., Howes, A., Corander, J., Kaski, S., Oulasvirta, A.: Inferring cognitive models from data using approximate Bayesian computation. In: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI), pp. 1295\u20131306 (2017)","DOI":"10.1145\/3025453.3025576"},{"issue":"16","key":"10186_CR8","first-page":"1","volume":"19","author":"J Lintusaari","year":"2018","unstructured":"Lintusaari, J., Vuollekoski, H., Kangasr\u00e4\u00e4si\u00f6, A., Skyt\u00e9n, K., J\u00e4rvenp\u00e4\u00e4, M., Marttinen, P., Gutmann, M.U., Vehtari, A., Corander, J., Kaski, S.: Elfi: engine for likelihood-free inference. J. Mach. Learn. Res. 19(16), 1\u20137 (2018)","journal-title":"J. Mach. Learn. Res."},{"key":"10186_CR9","doi-asserted-by":"crossref","unstructured":"McInnes, L., Healy, J., Melville, J.: UMAP: uniform manifold approximation and projection for dimension reduction. arXiv e-prints (2018)","DOI":"10.21105\/joss.00861"},{"issue":"6","key":"10186_CR10","doi-asserted-by":"publisher","first-page":"1588","DOI":"10.1109\/TVCG.2017.2674978","volume":"23","author":"L Micallef","year":"2017","unstructured":"Micallef, L., Palmas, G., Oulasvirta, A., Weinkauf, T.: Towards perceptual optimization of the visual design of scatterplots. IEEE Trans. Vis. Comput. Gr. 23(6), 1588\u20131599 (2017)","journal-title":"IEEE Trans. Vis. Comput. Gr."},{"issue":"1\u20132","key":"10186_CR11","first-page":"1","volume":"144","author":"P Richt\u00e1rik","year":"2011","unstructured":"Richt\u00e1rik, P., Tak\u00e1\u010d, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144(1\u20132), 1\u201338 (2011)","journal-title":"Math. Program."},{"key":"10186_CR12","doi-asserted-by":"crossref","unstructured":"Schulz, A., Hinder, F., Hammer, B.: Deepview: visualizing classification boundaries of deep neural networks as scatter plots using discriminative dimensionality reduction. In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2305\u20132311 (2020)","DOI":"10.24963\/ijcai.2020\/319"},{"key":"10186_CR13","unstructured":"Sj\u00e4lander, M., Jahre, M., Tufte, G., Reissmann, N.: EPIC: An energy-efficient, high-performance GPGPU computing research infrastructure (2019)"},{"key":"10186_CR14","volume-title":"Introduction to data mining","author":"P Tan","year":"2005","unstructured":"Tan, P., Steinbach, M., Karpatne, A., Kumar, V.: Introduction to data mining. Addison Wesley, Boston (2005)"},{"key":"10186_CR15","doi-asserted-by":"crossref","unstructured":"Tang, J., Liu, J., Zhang, M., Mei, Q.: Visualizing large-scale and high-dimensional data. In: Proceedings of International Conference on World Wide Web (WWW), pp. 287\u2013297 (2016)","DOI":"10.1145\/2872427.2883041"},{"key":"10186_CR16","first-page":"3221","volume":"15","author":"L van der Maaten","year":"2014","unstructured":"van der Maaten, L.: Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 15, 3221\u20133245 (2014)","journal-title":"J. Mach. Learn. Res."},{"key":"10186_CR17","unstructured":"van der Maaten, L., Hinton, G.: Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579\u20132605 (2008)"},{"key":"10186_CR18","unstructured":"Venna, J., Peltonen, J., Nybo, K., Aidos, H., Kaski, S.: Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J. Mach. Learn. Res. 11, 451\u2013490 (2010)"},{"key":"10186_CR19","unstructured":"Vladymyrov, M., Carreira-Perpi\u00f1\u00e1n, M.: Linear-time training of nonlinear low-dimensional embeddings. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 968\u2013977 (2014)"},{"key":"10186_CR20","unstructured":"Yang, Z., Peltonen, J., Kaski, S.: Scalable optimization of neighbor embedding for visualization. In: Proceedings of International Conference on Machine Learning (ICML), pp. 127\u2013135 (2013)"},{"key":"10186_CR21","unstructured":"Yang, Z., Peltonen, J., Kaski, S.: Optimization equivalence of divergences improves neighbor embedding. In: Proceedings of International Conference on Machine Learning (ICML), pp. 460\u2013468 (2014)"}],"container-title":["Statistics and Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-022-10186-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11222-022-10186-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-022-10186-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,13]],"date-time":"2023-02-13T22:47:42Z","timestamp":1676328462000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11222-022-10186-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,4]]},"references-count":21,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,2]]}},"alternative-id":["10186"],"URL":"https:\/\/doi.org\/10.1007\/s11222-022-10186-z","relation":{},"ISSN":["0960-3174","1573-1375"],"issn-type":[{"type":"print","value":"0960-3174"},{"type":"electronic","value":"1573-1375"}],"subject":[],"published":{"date-parts":[[2022,12,4]]},"assertion":[{"value":"7 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 November 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 December 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Potential conflicts of interests in the reviewing process: ntnu.no, aalto.fi, helsinki.fi, uio.no, nsl.fi, and sanger.ac.uk. The research involved a user study on the <i>s<\/i>-value choice in GSNE. All participants were informed about the tasks. We collected only results from those who consented to the tasks. No identifying information or personal privacy was recorded in the user study.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"12"}}