{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:27:53Z","timestamp":1760146073526,"version":"build-2065373602"},"reference-count":32,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2024,9,27]],"date-time":"2024-09-27T00:00:00Z","timestamp":1727395200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100011065","name":"Shahid Rajaee Teacher Training University","doi-asserted-by":"publisher","award":["4891","GIU23\/022","PID2021-126701OB-I00"],"award-info":[{"award-number":["4891","GIU23\/022","PID2021-126701OB-I00"]}],"id":[{"id":"10.13039\/501100011065","id-type":"DOI","asserted-by":"publisher"}]},{"name":"University of the Basque Country UPV\/EHU","award":["4891","GIU23\/022","PID2021-126701OB-I00"],"award-info":[{"award-number":["4891","GIU23\/022","PID2021-126701OB-I00"]}]},{"name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades, AEI","award":["4891","GIU23\/022","PID2021-126701OB-I00"],"award-info":[{"award-number":["4891","GIU23\/022","PID2021-126701OB-I00"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Recently, considerable attention has been directed toward graph-based semi-supervised learning (GSSL) as an effective approach for data labeling. Despite the progress achieved by current methodologies, several limitations persist. Firstly, many studies treat all samples equally in terms of weight and influence, disregarding the potential increased importance of samples near decision boundaries. Secondly, the detection of outlier-labeled data is crucial, as it can significantly impact model performance. Thirdly, existing models often struggle with predicting labels for unseen test data, restricting their utility in practical applications. Lastly, most graph-based algorithms rely on affinity matrices that capture pairwise similarities across all data points, thus limiting their scalability to large-scale databases. In this paper, we propose a novel GSSL algorithm tailored for large-scale databases, leveraging anchor points to mitigate the challenges posed by large affinity matrices. Additionally, our method enhances the influence of nodes near decision boundaries by assigning different weights based on their importance and using a mapping function from feature space to label space. Leveraging this mapping function enables direct label prediction for test samples without requiring iterative learning processes. Experimental evaluations on two extensive datasets (Norb and Covtype) demonstrate that our approach is scalable and outperforms existing GSSL methods in terms of performance metrics.<\/jats:p>","DOI":"10.3390\/info15100591","type":"journal-article","created":{"date-parts":[[2024,9,27]],"date-time":"2024-09-27T11:19:33Z","timestamp":1727435973000},"page":"591","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Graph-Based Semi-Supervised Learning with Bipartite Graph for Large-Scale Data and Prediction of Unseen Data"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-7152-5167","authenticated-orcid":false,"given":"Mohammad","family":"Alemi","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran 16785-163, Iran"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0372-6144","authenticated-orcid":false,"given":"Alireza","family":"Bosaghzadeh","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran 16785-163, Iran"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6581-9680","authenticated-orcid":false,"given":"Fadi","family":"Dornaika","sequence":"additional","affiliation":[{"name":"Faculty of Computer Engineering, University of the Basque Country, 20018 San Sebastian, Spain"},{"name":"IKERBASQUE, Basque Foundation for Science, 48009 Bilbao, Spain"}]}],"member":"1968","published-online":{"date-parts":[[2024,9,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1109\/TNNLS.2019.2908504","article-title":"Fast semisupervised learning with bipartite graph for large-scale data","volume":"31","author":"He","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2240","DOI":"10.1109\/TNNLS.2014.2308325","article-title":"Semi-supervised domain adaptation on manifolds","volume":"25","author":"Cheng","year":"2014","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2039","DOI":"10.1109\/TPAMI.2010.35","article-title":"Semi-supervised classification via local spline regression","volume":"32","author":"Xiang","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","unstructured":"Joachims, T. (1999, January 27\u201330). Transductive inference for text classification using support vector machines. Proceedings of the International Conference on Machine Learning (ICML), Bled, Slovenia."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Blum, A., and Mitchell, T. (1998, January 24\u201326). Combining labeled and unlabeled data with co-training. Proceedings of the Eleventh Annual Conference on Computational Learning Theory, Madison, WI, USA.","DOI":"10.1145\/279943.279962"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"549","DOI":"10.1007\/s00521-009-0305-8","article-title":"A general graph-based semi-supervised learning with novel class discovery","volume":"19","author":"Nie","year":"2010","journal-title":"Neural Comput. Appl."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1167","DOI":"10.1109\/TKDE.2019.2901853","article-title":"Semi-supervised learning with auto-weighting feature and adaptive graph","volume":"32","author":"Nie","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_8","first-page":"5257","article-title":"Semi-supervised learning via bipartite graph construction with adaptive neighbors","volume":"35","author":"Wang","year":"2022","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1016\/j.neunet.2021.11.015","article-title":"Multiple-view flexible semi-supervised classification through consistent graph construction and label propagation","volume":"146","author":"Ziraki","year":"2022","journal-title":"Neural Netw."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"8174","DOI":"10.1109\/TNNLS.2022.3155478","article-title":"Graph-based semi-supervised learning: A comprehensive review","volume":"34","author":"Song","year":"2022","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1921","DOI":"10.1109\/TIP.2010.2044958","article-title":"Flexible manifold embedding: A framework for semi-supervised and unsupervised dimension reduction","volume":"19","author":"Nie","year":"2010","journal-title":"IEEE Trans. Image Process."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Li, Y., Nie, F., Huang, H., and Huang, J. (2015, January 25\u201330). Large-scale multi-view spectral clustering via bipartite graph. Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA.","DOI":"10.1609\/aaai.v29i1.9598"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2786","DOI":"10.1109\/TCSVT.2018.2869875","article-title":"Accelerating flexible manifold embedding for scalable semi-supervised learning","volume":"29","author":"Qiu","year":"2018","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_14","first-page":"3111","article-title":"Bipartite graph based multi-view clustering","volume":"34","author":"Li","year":"2020","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Aromal, M., Rasool, A., Dubey, A., and Roy, B. (2021, January 4\u20136). Optimized Weighted Samples Based Semi-supervised Learning. Proceedings of the 2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India.","DOI":"10.1109\/ICESC51422.2021.9532994"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.asoc.2019.03.005","article-title":"Weighted samples based semi-supervised classification","volume":"79","author":"Chen","year":"2019","journal-title":"Appl. Soft Comput."},{"key":"ref_17","unstructured":"Zhu, X., Ghahramani, Z., and Lafferty, J.D. (2003, January 21\u201324). Semi-supervised learning using gaussian fields and harmonic functions. Proceedings of the 20th International Conference on Machine Learning (ICML-03), Washington, DC, USA."},{"key":"ref_18","unstructured":"Zhou, D., Bousquet, O., Lal, T., Weston, J., and Sch\u00f6lkopf, B. (2003). Learning with local and global consistency. Adv. Neural Inf. Process. Syst., 16."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Nie, F., Cai, G., and Li, X. (2017, January 4\u20139). Multi-view clustering and semi-supervised classification with adaptive neighbours. Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.10909"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1109\/TKDE.2019.2903810","article-title":"GMC: Graph-based multi-view clustering","volume":"32","author":"Wang","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/j.patrec.2018.06.024","article-title":"Fast spectral clustering learning with hierarchical bipartite graph for large-scale data","volume":"130","author":"Yang","year":"2020","journal-title":"Pattern Recognit. Lett."},{"key":"ref_22","unstructured":"Liu, W., He, J., and Chang, S.-F. (2010, January 21\u201324). Large graph construction for scalable semi-supervised learning. Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1864","DOI":"10.1109\/TKDE.2016.2535367","article-title":"Scalable semi-supervised learning by efficient anchor graph regularization","volume":"28","author":"Wang","year":"2016","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_24","unstructured":"Wang, Z., Wang, L., Chan, R., and Zeng, T. (2019). Large-scale semi-supervised learning via graph structure learning over high-dense points. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1016\/j.inffus.2020.09.007","article-title":"Joint auto-weighted graph fusion and scalable semi-supervised learning","volume":"66","author":"Bahrami","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_26","first-page":"29885","article-title":"Topology-imbalance learning for semi-supervised node classification","volume":"34","author":"Chen","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Sun, Q., Li, J., Yuan, H., Fu, X., Peng, H., Ji, C., Li, Q., and Yu, P.S. (2022, January 17\u201322). Position-aware structure learning for graph topology-imbalance by relieving under-reaching and over-squashing. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA.","DOI":"10.1145\/3511808.3557419"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.patrec.2014.02.020","article-title":"Label propagation through minimax paths for scalable semi-supervised learning","volume":"45","author":"Kim","year":"2014","journal-title":"Pattern Recognit. Lett."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1979","DOI":"10.1109\/TNNLS.2014.2363679","article-title":"MTC: A fast and robust graph-based transductive learning method","volume":"26","author":"Zhang","year":"2014","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_30","unstructured":"Sindhwani, V., Niyogi, P., Belkin, M., and Keerthi, S. (2005, January 7\u201311). Linear manifold regularization for large scale semi-supervised learning. Proceedings of the 22nd ICML Workshop on Learning with Partially Classified Training Data, Bonn, Germany."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2016\/6425257","article-title":"Mitigation of Effects of Occlusion on Object Recognition with Deep Neural Networks through Low-Level Image Completion","volume":"2016","author":"Chandler","year":"2016","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/S0167-7152(96)00140-X","article-title":"Sparse spatial autoregressions","volume":"33","author":"Pace","year":"1997","journal-title":"Stat. Probab. Lett."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/10\/591\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:05:21Z","timestamp":1760112321000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/10\/591"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,27]]},"references-count":32,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2024,10]]}},"alternative-id":["info15100591"],"URL":"https:\/\/doi.org\/10.3390\/info15100591","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2024,9,27]]}}}