{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T18:56:47Z","timestamp":1775069807686,"version":"3.50.1"},"reference-count":35,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T00:00:00Z","timestamp":1761091200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:sec>\n                    <jats:title>Introduction<\/jats:title>\n                    <jats:p>\n                      Accurate prediction of protein-protein interactions (PPIs) is crucial for understanding cellular functions and advancing the development of drugs. While existing\n                      <jats:italic>in-silico<\/jats:italic>\n                      methods leverage direct sequence embeddings from Protein Language Models (PLMs) or apply Graph Neural Networks (GNNs) to 3D protein structures, the main focus of this study is to investigate less computationally intensive alternatives. This work introduces a novel framework for the downstream task of PPI prediction via link prediction.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>\n                      We introduce a two-stage graph representation learning framework,\n                      <jats:italic>ProtGram-DirectGCN<\/jats:italic>\n                      . First, we developed\n                      <jats:italic>ProtGram<\/jats:italic>\n                      , a novel approach that models a protein's primary structure as a hierarchy of globally inferred n-gram graphs. 
In these graphs, residue transition probabilities, aggregated from a large sequence corpus, define the edge weights of a directed graph of paired residues. Second, we propose a custom directed graph convolutional neural network,\n                      <jats:italic>DirectGCN<\/jats:italic>\n                      , which features a unique convolutional layer that processes information through separate path-specific (incoming, outgoing, undirected) and shared transformations, combined via a learnable gating mechanism.\n                      <jats:italic>DirectGCN<\/jats:italic>\n                      is applied to the\n                      <jats:italic>ProtGram<\/jats:italic>\n                      graphs to learn residue-level embeddings, which are then pooled via an attention mechanism to generate protein-level embeddings for the prediction task.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>\n                      The efficacy of the\n                      <jats:italic>DirectGCN<\/jats:italic>\n                      model was first established on standard node classification benchmarks, where its performance is comparable to that of established methods on general datasets, while demonstrating specialization for complex, directed, and dense heterophilic graph structures. 
When applied to PPI prediction, the full\n                      <jats:italic>ProtGram-DirectGCN<\/jats:italic>\n                      framework achieves robust predictive power despite being trained on limited data.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>\n                      Our results suggest that a globally inferred, directed graph-based representation of sequence transitions offers a potent and computationally distinct alternative to resource-intensive PLMs for the task of PPI prediction. Future work will involve testing\n                      <jats:italic>ProtGram-DirectGCN<\/jats:italic>\n                      on a wider range of bioinformatics tasks.\n                    <\/jats:p>\n                  <\/jats:sec>","DOI":"10.3389\/fbinf.2025.1651623","type":"journal-article","created":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T05:32:46Z","timestamp":1761111166000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Inferred global dense residue transition graphs from primary structure sequences enable protein interaction prediction via directed graph convolutional neural networks"],"prefix":"10.3389","volume":"5","author":[{"given":"Islam Akef","family":"Ebeid","sequence":"first","affiliation":[]},{"given":"Haoteng","family":"Tang","sequence":"additional","affiliation":[]},{"given":"Pengfei","family":"Gu","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,10,22]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1126\/science.181.4096.223","article-title":"Principles that govern the folding of protein 
chains","volume":"181","author":"Anfinsen","year":"1973","journal-title":"Science"},{"key":"B2","doi-asserted-by":"publisher","first-page":"P10008","DOI":"10.1088\/1742-5468\/2008\/10\/p10008","article-title":"Fast unfolding of communities in large networks","volume":"2008","author":"Blondel","year":"2008","journal-title":"J. Stat. Mech."},{"key":"B3","doi-asserted-by":"publisher","first-page":"1901","DOI":"10.48550\/arXiv.2005.14165","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"B4","first-page":"257","article-title":"Cluster-GCN: an efficient algorithm for training deep and large graph convolutional networks","volume-title":"Proc. 25th ACM SIGKDD Int. Conf. Knowl. Discov. & Data Min.","author":"Chiang","year":"2019"},{"key":"B5","doi-asserted-by":"crossref","first-page":"1724","DOI":"10.3115\/v1\/D14-1179","article-title":"Learning phrase representations using RNN encoder\u2013decoder for statistical machine translation","volume-title":"Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)","author":"Cho","year":"2014"},{"key":"B6","doi-asserted-by":"publisher","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"UniProt: the universal protein knowledgebase in 2023","volume":"51","author":"Consortium","year":"2023","journal-title":"Nucleic Acids Res."},{"key":"B7","first-page":"4171","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers)","author":"Devlin","year":"2019"},{"key":"B8","doi-asserted-by":"publisher","first-page":"1042","DOI":"10.1126\/science.1219021","article-title":"The protein-folding problem, 50 years 
on","volume":"338","author":"Dill","year":"2012","journal-title":"Science"},{"key":"B9","doi-asserted-by":"publisher","first-page":"7112","DOI":"10.1109\/tpami.2021.3095381","article-title":"ProtTrans: toward understanding the language of life through self-supervised learning","volume":"44","author":"Elnaggar","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"B10","doi-asserted-by":"publisher","first-page":"e47","DOI":"10.1093\/nar\/gkn159","article-title":"Using support vector machine combined with auto covariance to predict protein-protein interactions from protein sequences","volume":"3","author":"Guo","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.02216","article-title":"Inductive representation learning on large graphs","author":"Hamilton","year":"2018"},{"key":"B12","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"B13","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1145\/775047.775126","article-title":"SimRank: a measure of structural-context similarity","volume-title":"Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining","author":"Jeh","year":"2002"},{"key":"B14","doi-asserted-by":"publisher","first-page":"8360","DOI":"10.1038\/s41598-022-12201-9","article-title":"Prediction of protein\u2013protein interaction using graph neural networks","volume":"12","author":"Jha","year":"2022","journal-title":"Sci. 
Rep."},{"key":"B15","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"B16","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1609.02907","article-title":"Semi-supervised classification with graph convolutional networks","author":"Kipf","year":"2017","journal-title":"arXiv:1609.02907 [cs.LG]"},{"key":"B17","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"B18","article-title":"Distributed representations of words and phrases and their compositionality","volume-title":"Advances in neural information processing systems","author":"Mikolov","year":"2013"},{"key":"B19","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1002\/pro.3978","article-title":"The BioGRID database: a comprehensive biomedical resource of curated protein, genetic, and chemical interactions","volume":"30","author":"Oughtred","year":"2021","journal-title":"Protein Sci."},{"key":"B20","doi-asserted-by":"publisher","first-page":"1233","DOI":"10.1016\/j.str.2010.08.007","article-title":"Transient protein-protein interactions: structural, functional, and network properties","volume":"18","author":"Perkins","year":"2010","journal-title":"Structure"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1910.10683","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","author":"Raffel","year":"2023"},{"key":"B22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2014\/147648","article-title":"Protein-protein interaction detection: methods and 
analysis","volume":"2014","author":"Rao","year":"2014","journal-title":"Int. J. Proteomics"},{"key":"B23","doi-asserted-by":"crossref","DOI":"10.1101\/2020.12.15.422761","article-title":"Transformer protein language models are unsupervised structure learners","volume-title":"International conference on learning representations","author":"Rao","year":"2020"},{"key":"B24","article-title":"Edge directionality improves learning on heterophilic graphs","author":"Rossi","year":"2023"},{"key":"B25","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1038\/nrd3681","article-title":"Diagnosing the decline in pharmaceutical R&D efficiency","volume":"11","author":"Scannell","year":"2012","journal-title":"Nat. Rev. Drug Discov."},{"key":"B26","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1109\/tnn.2008.2005605","article-title":"The graph neural network model","volume":"20","author":"Scarselli","year":"2009","journal-title":"IEEE Trans. Neural Netw."},{"key":"B27","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1038\/nrd.2016.29","article-title":"Small molecules, big targets: drug discovery faces the protein\u2013protein interaction challenge","volume":"15","author":"Scott","year":"2016","journal-title":"Nat. Rev. 
Drug Discov."},{"key":"B28","doi-asserted-by":"publisher","first-page":"969","DOI":"10.1016\/j.cels.2021.08.010","article-title":"D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions","volume":"12","author":"Sledzieski","year":"2021","journal-title":"Cell Syst."},{"key":"B29","doi-asserted-by":"publisher","first-page":"343","DOI":"10.1016\/j.ymeth.2012.07.028","article-title":"Negative protein\u2013protein interaction datasets derived from large-scale two-hybrid experiments","volume":"58","author":"Trabuco","year":"2012","journal-title":"Methods"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.03762","article-title":"Attention is all you need","author":"Vaswani","year":"2017","journal-title":"arXiv:1706.03762"},{"key":"B31","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1710.10903","article-title":"Graph attention networks","author":"Veli\u010dkovi\u0107","year":"2018","journal-title":"arXiv:1710.10903 [stat.ML]"},{"key":"B32","doi-asserted-by":"publisher","first-page":"986","DOI":"10.1016\/j.cell.2011.02.016","article-title":"Interactome networks and human disease","volume":"144","author":"Vidal","year":"2011","journal-title":"Cell"},{"key":"B33","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1016\/0022-2836(78)90302-9","article-title":"Computer analysis of protein-protein interaction","volume":"124","author":"Wodak","year":"1978","journal-title":"J. Mol. 
Biol."},{"key":"B34","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1810.00826","article-title":"How powerful are graph neural networks?","author":"Xu","year":"2019","journal-title":"arXiv:1810.00826"},{"key":"B35","doi-asserted-by":"publisher","first-page":"6135","DOI":"10.3390\/molecules27186135","article-title":"Graph neural network for protein\u2013protein interaction prediction: a comparative study","volume":"27","author":"Zhou","year":"2022","journal-title":"Molecules"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1651623\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T05:32:49Z","timestamp":1761111169000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1651623\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,22]]},"references-count":35,"alternative-id":["10.3389\/fbinf.2025.1651623"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1651623","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,22]]},"article-number":"1651623"}}