{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T03:42:33Z","timestamp":1772077353434,"version":"3.50.1"},"reference-count":33,"publisher":"Wiley","license":[{"start":{"date-parts":[[2022,1,6]],"date-time":"2022-01-06T00:00:00Z","timestamp":1641427200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["91948303"],"award-info":[{"award-number":["91948303"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Scientific Programming"],"published-print":{"date-parts":[[2022,1,6]]},"abstract":"<jats:p>Density peaks clustering (DPC) is a well-known density-based clustering algorithm that can deal with nonspherical clusters well. However, DPC has high computational complexity and space complexity in calculating local density <jats:inline-formula>\n                     <a:math xmlns:a=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M1\">\n                        <a:mi>\u03c1<\/a:mi>\n                     <\/a:math>\n                  <\/jats:inline-formula> and distance <jats:inline-formula>\n                     <c:math xmlns:c=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M2\">\n                        <c:mi>\u03b4<\/c:mi>\n                     <\/c:math>\n                  <\/jats:inline-formula>, which makes it suitable only for small-scale data sets. In addition, for clustering high-dimensional data, the performance of DPC still needs to be improved. High-dimensional data not only make the data distribution more complex but also lead to more computational overheads. To address the above issues, we propose an improved density peaks clustering algorithm, which combines feature reduction and data sampling strategy. Specifically, features of the high-dimensional data are automatically extracted by principal component analysis (PCA), auto-encoder (AE), and t-distributed stochastic neighbor embedding (t-SNE). Next, in order to reduce the computational overhead, we propose a novel data sampling method for the low-dimensional feature data. Firstly, the data distribution in the low-dimensional feature space is estimated by the Quasi-Monte Carlo (QMC) sequence with low-discrepancy characteristics. Then, the representative QMC points are selected according to their cell densities. Next, the selected QMC points are used to calculate <jats:inline-formula>\n                     <e:math xmlns:e=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M3\">\n                        <e:mi>\u03c1<\/e:mi>\n                     <\/e:math>\n                  <\/jats:inline-formula> and <jats:inline-formula>\n                     <g:math xmlns:g=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M4\">\n                        <g:mi>\u03b4<\/g:mi>\n                     <\/g:math>\n                  <\/jats:inline-formula> instead of the original data points. In general, the number of the selected QMC points is much smaller than that of the initial data set. Finally, a two-stage classification strategy based on the QMC points clustering results is proposed to classify the original data set. Compared with current works, our proposed algorithm can reduce the computational complexity from <jats:inline-formula>\n                     <i:math xmlns:i=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M5\">\n                        <i:mi>O<\/i:mi>\n                        <i:mfenced open=\"(\" close=\")\" separators=\"|\">\n                           <i:mrow>\n                              <i:msup>\n                                 <i:mrow>\n                                    <i:mi>n<\/i:mi>\n                                 <\/i:mrow>\n                                 <i:mrow>\n                                    <i:mn>2<\/i:mn>\n                                 <\/i:mrow>\n                              <\/i:msup>\n                           <\/i:mrow>\n                        <\/i:mfenced>\n                     <\/i:math>\n                  <\/jats:inline-formula> to <jats:inline-formula>\n                     <n:math xmlns:n=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M6\">\n                        <n:mi>O<\/n:mi>\n                        <n:mfenced open=\"(\" close=\")\" separators=\"|\">\n                           <n:mrow>\n                              <n:mi>N<\/n:mi>\n                              <n:mi>n<\/n:mi>\n                           <\/n:mrow>\n                        <\/n:mfenced>\n                     <\/n:math>\n                  <\/jats:inline-formula>, where <jats:inline-formula>\n                     <s:math xmlns:s=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M7\">\n                        <s:mi>N<\/s:mi>\n                     <\/s:math>\n                  <\/jats:inline-formula> denotes the number of selected QMC points and <jats:inline-formula>\n                     <u:math xmlns:u=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M8\">\n                        <u:mi>n<\/u:mi>\n                     <\/u:math>\n                  <\/jats:inline-formula> is the size of original data set, typically <jats:inline-formula>\n                     <w:math xmlns:w=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"M9\">\n                        <w:mi>N<\/w:mi>\n                        <w:mo>\u226a<\/w:mo>\n                        <w:mi>n<\/w:mi>\n                     <\/w:math>\n                  <\/jats:inline-formula>. Experimental results demonstrate that the proposed algorithm can effectively reduce the computational overhead and improve the model performance.<\/jats:p>","DOI":"10.1155\/2022\/8046620","type":"journal-article","created":{"date-parts":[[2022,1,6]],"date-time":"2022-01-06T20:50:10Z","timestamp":1641502210000},"page":"1-17","source":"Crossref","is-referenced-by-count":3,"title":["Density Peaks Clustering Based on Feature Reduction and Quasi-Monte Carlo"],"prefix":"10.1155","volume":"2022","author":[{"given":"Zhihui","family":"Hu","sequence":"first","affiliation":[{"name":"Artificial Intelligence Research Center, Defense Innovation Institute, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoran","family":"Wei","sequence":"additional","affiliation":[{"name":"Ocean College, Zhejiang University, Zhoushan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoxu","family":"Han","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7224-1274","authenticated-orcid":true,"given":"Guang","family":"Kou","sequence":"additional","affiliation":[{"name":"Artificial Intelligence Research Center, Defense Innovation Institute, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Artificial Intelligence Research Center, Defense Innovation Institute, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xueyi","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Science, China Jiliang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yefei","family":"Bai","sequence":"additional","affiliation":[{"name":"Ocean College, Zhejiang University, Zhoushan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","reference":[{"key":"1","doi-asserted-by":"publisher","DOI":"10.1007\/s41870-020-00427-7"},{"key":"2","first-page":"281","article-title":"Some methods for classification and analysis of multivariate observations","author":"J. Macqueen"},{"key":"3","first-page":"226","article-title":"A density-based algorithm for discovering clusters in large spatial databases with noise","volume":"96","author":"M. Ester","year":"1996","journal-title":"KDD"},{"key":"4","doi-asserted-by":"publisher","DOI":"10.1126\/science.1136800"},{"key":"5","doi-asserted-by":"publisher","DOI":"10.1126\/science.1242072"},{"key":"6","doi-asserted-by":"publisher","DOI":"10.7544\/issn1000-1239.2016.20150616"},{"key":"7","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2016.2628747"},{"key":"8","doi-asserted-by":"publisher","DOI":"10.1016\/j.physa.2019.03.012"},{"key":"9","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2016.03.011"},{"key":"10","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2019.06.064"},{"key":"11","doi-asserted-by":"publisher","DOI":"10.1126\/science.1127647"},{"issue":"11","key":"12","article-title":"Visualizing Data Using T-SNE","volume":"9","author":"L. Van der Maaten","year":"2008","journal-title":"Journal of Machine Learning Research"},{"key":"13","doi-asserted-by":"publisher","DOI":"10.1016\/j.aei.2020.101105"},{"key":"14","doi-asserted-by":"publisher","DOI":"10.1016\/j.forsciint.2020.110194"},{"key":"15","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbiomech.2020.110106"},{"key":"16","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.105454"},{"key":"17","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2016.02.001"},{"key":"18","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2018.03.031"},{"key":"19","doi-asserted-by":"publisher","DOI":"10.1109\/spac.2017.8304248"},{"key":"20","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.06.087"},{"key":"21","doi-asserted-by":"publisher","DOI":"10.1109\/access.2019.2926579"},{"key":"22","doi-asserted-by":"publisher","DOI":"10.1109\/access.2019.2957242"},{"issue":"1","key":"23","first-page":"981","article-title":"Sampling methods for the Nystr\u00f6m method","volume":"13","author":"S. Kumar","year":"2012","journal-title":"Journal of Machine Learning Research"},{"key":"24","doi-asserted-by":"publisher","DOI":"10.1016\/j.jco.2021.101587"},{"key":"25","doi-asserted-by":"publisher","DOI":"10.1109\/iccss52145.2020.9336844"},{"key":"26","doi-asserted-by":"publisher","DOI":"10.1007\/bf01386213"},{"key":"27","doi-asserted-by":"publisher","DOI":"10.4064\/aa-41-4-337-351"},{"key":"28","doi-asserted-by":"publisher","DOI":"10.1007\/bf01294651"},{"key":"29","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-8-3"},{"key":"30","volume-title":"UCI Machine Learning Repository","author":"K. Bache","year":"2013"},{"key":"31","doi-asserted-by":"publisher","DOI":"10.1145\/1217299.1217303"},{"key":"32","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2005.09.012"},{"key":"33","doi-asserted-by":"publisher","DOI":"10.1109\/tkde.2016.2609423"}],"container-title":["Scientific Programming"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/sp\/2022\/8046620.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/sp\/2022\/8046620.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/sp\/2022\/8046620.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,6]],"date-time":"2022-01-06T20:50:16Z","timestamp":1641502216000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/sp\/2022\/8046620\/"}},"subtitle":[],"editor":[{"given":"Jiangbo","family":"Qian","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2022,1,6]]},"references-count":33,"alternative-id":["8046620","8046620"],"URL":"https:\/\/doi.org\/10.1155\/2022\/8046620","relation":{},"ISSN":["1875-919X","1058-9244"],"issn-type":[{"value":"1875-919X","type":"electronic"},{"value":"1058-9244","type":"print"}],"subject":[],"published":{"date-parts":[[2022,1,6]]}}}