{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T04:23:46Z","timestamp":1769574226866,"version":"3.49.0"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T00:00:00Z","timestamp":1751932800000},"content-version":"vor","delay-in-days":7,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32130071"],"award-info":[{"award-number":["32130071"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["QNTD202510"],"award-info":[{"award-number":["QNTD202510"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Protein clustering and classification are critical for understanding protein functions and interactions, particularly within structure-based predictions. Traditional sequence-based clustering often overlooks the pivotal role of tertiary structure in determining protein function. Structural clustering remains limited and challenging, with existing methods struggling to achieve high accuracy and manage complex data. This study focuses on the tertiary structures of Verticillium dahliae proteins, employing deep learning techniques for effective clustering and classification. Using AlphaFold2, we predicted protein structures and generated C\u03b1 atom distance matrices. We introduced a novel Unique Nuclear Sequence Element (UNSE) neural network to enhance feature extraction, constructing weighted distance matrices by integrating C\u03b1 distances with Pfam annotations. This method effectively captures complex structural relationships. Additionally, Basic Local Alignment Search Tool (BLAST) sequence alignments validated the sequence similarity within protein families, ensuring the biological relevance of clustering results. We applied clustering algorithms to both raw and weighted matrices, comparing their performance against traditional sequence-based and other structure-based methods, including DeepGO and DeepFRI. Evaluation metrics such as Silhouette Score, ${F}_{max}$, and AUPR demonstrated that our weighted matrix approach significantly outperforms conventional methods in accuracy and robustness. These findings confirm that integrating deep learning with weighted distance matrices effectively captures structural and functional protein characteristics, providing a robust tool for structural biology.<\/jats:p>","DOI":"10.1093\/bib\/bbaf331","type":"journal-article","created":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T11:58:40Z","timestamp":1750420720000},"source":"Crossref","is-referenced-by-count":1,"title":["Deep learning\u2013enhanced clustering and classification of protein molecule tertiary structures using weighted distance matrices"],"prefix":"10.1093","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-8660-8992","authenticated-orcid":false,"given":"Junlong","family":"Liu","sequence":"first","affiliation":[{"name":"School of Technology, Beijing Forestry University , 35 Qinghua East Road, Haidian District, Beijing 100083 ,","place":["China"]}]},{"given":"Jiaming","family":"Xiao","sequence":"additional","affiliation":[{"name":"School of Technology, Beijing Forestry University , 35 Qinghua East Road, Haidian District, Beijing 100083 ,","place":["China"]}]},{"given":"Xunwen","family":"Su","sequence":"additional","affiliation":[{"name":"School of Technology, Beijing Forestry University , 35 Qinghua East Road, Haidian District, Beijing 100083 ,","place":["China"]},{"name":"National Facility Preservation Bank for Forestry and Grassland Germplasm Resources, Beijing Forestry University , 35 Qinghua East Road, Haidian District, Beijing 100083 ,","place":["China"]}]},{"given":"Yonglin","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Efficient Production of Forest Resources, College of Forestry, Beijing Forestry University , 35 Qinghua East Road, Haidian District, Beijing 100083 ,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,7,8]]},"reference":[{"key":"2025070800374425600_ref1","doi-asserted-by":"publisher","first-page":"D553","DOI":"10.1093\/nar\/gkab1054","article-title":"SCOPe: improvements to the structural classification of proteins-extended database to facilitate variant interpretation and machine learning","volume":"50","author":"Chandonia","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2025070800374425600_ref2","doi-asserted-by":"publisher","first-page":"2606","DOI":"10.1016\/j.febslet.2010.04.043","article-title":"A structural classification of substrate-binding proteins","volume":"584","author":"Berntsson","year":"2010","journal-title":"FEBS Lett"},{"key":"2025070800374425600_ref3","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1021\/jf00062a035","article-title":"Structure-function relationships in food proteins: subunit interactions in heat-induced gelation of 7S, 11S, and soy isolate proteins","volume":"33","author":"Utsumi","year":"1985","journal-title":"J Agric Food Chem"},{"key":"2025070800374425600_ref4","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1038\/s41579-019-0299-x","article-title":"Evolutionary classification of CRISPR\u2013Cas systems: a burst of class 2 and derived variants","volume":"18","author":"Makarova","year":"2020","journal-title":"Nat Rev Microbiol"},{"key":"2025070800374425600_ref5","doi-asserted-by":"publisher","first-page":"E29","DOI":"10.1093\/nar\/gkab1207","article-title":"Identification and classification of reverse transcriptases in bacterial genomes and metagenomes","volume":"50","author":"Sharifi","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2025070800374425600_ref6","doi-asserted-by":"publisher","first-page":"639","DOI":"10.1038\/s41580-024-00718-y","article-title":"Opportunities and challenges in design and optimization of protein function","volume":"25","author":"Listov","year":"2024","journal-title":"Nat Rev Mol Cell Biol"},{"key":"2025070800374425600_ref7","doi-asserted-by":"publisher","first-page":"1259","DOI":"10.1002\/prot.22030","article-title":"Fast protein tertiary structure retrieval based on global surface shape similarity","volume":"72","author":"Sael","year":"2008","journal-title":"Proteins Struct Funct Genet"},{"key":"2025070800374425600_ref8","doi-asserted-by":"publisher","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat Biotechnol"},{"key":"2025070800374425600_ref9","doi-asserted-by":"publisher","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning\u2013based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat Methods"},{"key":"2025070800374425600_ref10","doi-asserted-by":"publisher","first-page":"990","DOI":"10.1101\/gr.200535.115","article-title":"Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks","volume":"26","author":"Kelley","year":"2016","journal-title":"Genome Res"},{"key":"2025070800374425600_ref11","doi-asserted-by":"publisher","first-page":"i121","DOI":"10.1093\/bioinformatics\/btw255","article-title":"Convolutional neural network architectures for predicting DNA-protein binding","volume":"32","author":"Zeng","year":"2016","journal-title":"Bioinformatics"},{"key":"2025070800374425600_ref12","doi-asserted-by":"publisher","first-page":"W20","DOI":"10.1093\/nar\/gkh435","article-title":"BLAST: At the core of a powerful and diverse set of sequence analysis tools","volume":"32","author":"McGinnis","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2025070800374425600_ref13","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1186\/s13059-019-1900-3","article-title":"Benchmarking principal component analysis for large-scale single-cell RNA-sequencing","volume":"21","author":"Tsuyuzaki","year":"2020","journal-title":"Genome Biol"},{"key":"2025070800374425600_ref14","doi-asserted-by":"publisher","first-page":"1503","DOI":"10.1093\/bioinformatics\/bty813","article-title":"High precision protein functional site detection using 3D convolutional neural networks","volume":"35","author":"Torng","year":"2019","journal-title":"Bioinformatics"},{"key":"2025070800374425600_ref15","doi-asserted-by":"publisher","first-page":"302","DOI":"10.1186\/s12859-017-1702-0","article-title":"3D deep convolutional neural networks for amino acid environment similarity analysis","volume":"18","author":"Torng","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2025070800374425600_ref16","doi-asserted-by":"publisher","first-page":"3182","DOI":"10.1016\/j.cell.2023.05.041","article-title":"Discovery of deaminase functions by structure-based protein clustering","volume":"186","author":"Huang","year":"2023","journal-title":"Cell"},{"key":"2025070800374425600_ref17","doi-asserted-by":"publisher","first-page":"2351","DOI":"10.1038\/s41467-023-37896-w","article-title":"Sequence-structure-function relationships in the microbial protein universe","volume":"14","author":"Koehler Leman","year":"2023","journal-title":"Nat Commun"},{"key":"2025070800374425600_ref18","doi-asserted-by":"publisher","first-page":"637","DOI":"10.1038\/s41586-023-06510-w","article-title":"Clustering predicted structures at the scale of the known protein universe","volume":"622","author":"Barrio-Hernandez","year":"2023","journal-title":"Nature"},{"key":"2025070800374425600_ref19","doi-asserted-by":"publisher","first-page":"2087","DOI":"10.13345\/j.cjb.230675","article-title":"Advances in bioinformatics-based protein function prediction","volume":"40","author":"He","year":"2024","journal-title":"Sheng Wu Gong Cheng Xue Bao"},{"key":"2025070800374425600_ref20","doi-asserted-by":"publisher","first-page":"975","DOI":"10.1038\/s41587-023-01917-2","article-title":"Protein remote homology detection and structural alignment using deep learning","volume":"42","author":"Hamamsy","year":"2024","journal-title":"Nat Biotechnol"},{"key":"2025070800374425600_ref21","doi-asserted-by":"publisher","first-page":"4086","DOI":"10.1111\/pce.15004","article-title":"Rhizobacterial bacillus enrichment in soil enhances smoke tree resistance to Verticillium wilt","volume":"47","author":"Guo","year":"2024","journal-title":"Plant Cell Environ"},{"key":"2025070800374425600_ref22","doi-asserted-by":"publisher","first-page":"950","DOI":"10.1038\/s41589-024-01638-w","article-title":"The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins","volume":"20","author":"Agarwal","year":"2024","journal-title":"Nat Chem Biol"},{"key":"2025070800374425600_ref23","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2025070800374425600_ref24","doi-asserted-by":"publisher","first-page":"2618","DOI":"10.1016\/j.csbj.2021.04.049","article-title":"Representations of protein structure for exploring the conformational space: a speed\u2013accuracy trade-off","volume":"19","author":"Postic","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"key":"2025070800374425600_ref25","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1007\/978-3-319-24574-4_28","article-title":"U-net: convolutional networks for biomedical image segmentation","volume":"9351","author":"Ronneberger","year":"2015","journal-title":"Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)"},{"key":"2025070800374425600_ref26","first-page":"1776","article-title":"Review of medical image segmentation based on UNet","volume":"17","author":"Xu","year":"2023","journal-title":"J Front Comput Sci Technol"},{"key":"2025070800374425600_ref27","volume-title":"Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2019; 2019-September:1013\u20131017","author":"Lai"},{"key":"2025070800374425600_ref28","first-page":"316","article-title":"Spatial group-wise enhance: enhancing semantic feature learning in CNN","volume":"2023","author":"Li","year":"2022","journal-title":"Computer Vision \u2013 ACCV"},{"key":"2025070800374425600_ref29","doi-asserted-by":"publisher","first-page":"2011","DOI":"10.1109\/TPAMI.2019.2913372","article-title":"Squeeze-and-excitation networks","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2025070800374425600_ref30","doi-asserted-by":"publisher","first-page":"6439","DOI":"10.1007\/s10462-022-10325-y","article-title":"Data clustering: application and trends","volume":"56","author":"Oyewole","year":"2023","journal-title":"Artif Intell Rev"},{"key":"2025070800374425600_ref31","article-title":"Instance-dependent confidence and early stopping for reinforcement learning","volume":"24","author":"Xia","journal-title":"Journal of Machine Learning Research"},{"key":"2025070800374425600_ref32","doi-asserted-by":"publisher","first-page":"14514","DOI":"10.1109\/TMC.2024.3447000","article-title":"FLrce: resource-efficient federated learning with early-stopping strategy","volume":"23","author":"Niu","year":"2024","journal-title":"IEEE Trans Mob Comput"},{"key":"2025070800374425600_ref33","doi-asserted-by":"publisher","first-page":"100","DOI":"10.1038\/s43586-022-00184-w","article-title":"Principal component analysis","volume":"2","author":"Greenacre","year":"2022","journal-title":"Nat Rev Methods Primers"},{"key":"2025070800374425600_ref34","doi-asserted-by":"publisher","first-page":"110","DOI":"10.1038\/s41592-023-02087-4","article-title":"AlphaFold predictions are valuable hypotheses and accelerate but do not replace experimental structure determination","volume":"21","author":"Terwilliger","year":"2024","journal-title":"Nat Methods"},{"key":"2025070800374425600_ref35","article-title":"Genome-wide identification, phylogeny and expression profile of vesicle fusion components in Verticillium dahliae","volume":"8","author":"Yang","year":"2013","journal-title":"PLoS One"},{"key":"2025070800374425600_ref38","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2302478120","article-title":"Feedback regulation of ubiquitination and phase separation of HECT E3 ligases","volume":"120","author":"Li","journal-title":"Proceedings of the National Academy of Sciences of the United States of America"},{"key":"2025070800374425600_ref36","doi-asserted-by":"publisher","DOI":"10.1038\/s41392-020-00315-3","article-title":"Recent advances in the development of protein\u2013protein interactions modulators: mechanisms and clinical trials","volume":"5","author":"Lu","journal-title":"Signal Transduct Target Ther"},{"key":"2025070800374425600_ref39","doi-asserted-by":"publisher","DOI":"10.1093\/plcell\/koab221","article-title":"Verticillium dahliae effector VDAL protects MYB6 from degradation by interacting with PUB25 and PUB26 E3 ligases to enhance Verticillium wilt resistance","volume":"33","author":"Ma","year":"2021","journal-title":"Plant Cell"},{"key":"2025070800374425600_ref40","first-page":"8","article-title":"The dynamic transcriptome and metabolomics profiling in Verticillium dahliae inoculated Arabidopsis thaliana","author":"Su","year":"2018","journal-title":"Sci Rep"},{"key":"2025070800374425600_ref37","doi-asserted-by":"publisher","DOI":"10.3390\/ijms24043630","article-title":"Hypothetical protein VDAG_07742 is required for Verticillium dahliae pathogenicity in potato","volume":"24","author":"Wang","year":"2023","journal-title":"Int J Mol Sci"},{"key":"2025070800374425600_ref41","first-page":"1588","article-title":"Functional analyses of the hypothetical protein VDAG_07165 of Verticillium dahliae","volume":"42","author":"Wang","year":"2023","journal-title":"Mycosystema"},{"key":"2025070800374425600_ref43","first-page":"1406","article-title":"Advances in the application of AlphaFold2: a protein structure prediction model","volume":"40","author":"Zhang","year":"2024","journal-title":"Shengwu Gongcheng Xuebao"},{"key":"2025070800374425600_ref44","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1016\/j.neucom.2023.01.043","article-title":"PDBI: a partitioning Davies-Bouldin index for clustering evaluation","volume":"528","author":"Ros","year":"2023","journal-title":"Neurocomputing"},{"key":"2025070800374425600_ref45","doi-asserted-by":"crossref","DOI":"10.1088\/1757-899X\/569\/5\/052024","article-title":"An improved index for clustering validation based on silhouette index and Calinski-Harabasz index","volume":"569","author":"Wang","year":"2019","journal-title":"IOP Conf Ser Mater Sci Eng"},{"key":"2025070800374425600_ref42","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1038\/s41579-022-00710-3","article-title":"Evasion of plant immunity by microbial pathogens","volume":"20","author":"Wang","year":"2022","journal-title":"Nat Rev Microbiol"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/4\/bbaf331\/63694197\/bbaf331.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/4\/bbaf331\/63694197\/bbaf331.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T04:37:55Z","timestamp":1751949475000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf331\/8191418"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":45,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,7,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf331","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7]]},"article-number":"bbaf331"}}