{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T18:15:40Z","timestamp":1773252940755,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T00:00:00Z","timestamp":1771372800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Science Foundation Major Research Instrumentation","award":["2117941"],"award-info":[{"award-number":["2117941"]}]},{"name":"Institute of Information & Communications Technology Planning & Evaluation"},{"name":"Korean government","award":["2021-0-01581"],"award-info":[{"award-number":["2021-0-01581"]}]},{"name":"Bio & Medical Technology Development Program","award":["RS-2024-00441423"],"award-info":[{"award-number":["RS-2024-00441423"]}]},{"name":"Basic Science Research Program","award":["RS-2023-00276255"],"award-info":[{"award-number":["RS-2023-00276255"]}]},{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Prediction of Compound\u2013Protein Interactions (CPI) in bacteria is crucial to advance various pharmaceutical and chemical engineering fields, including biocatalysis, drug discovery, and industrial processing. 
However, current CPI models cannot be applied to bacterial CPI prediction due to the lack of curated negative interaction samples.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We propose a novel Positive-Unlabeled (PU) learning framework, named BIN-PU, to address this limitation. BIN-PU generates pseudo positive and negative labels from known positive interaction data, enabling effective training of deep learning models for CPI prediction. We also propose a weighted positive loss function that assigns weights to truly positive samples. We have validated BIN-PU coupled with multiple CPI backbone models, comparing the performance with the existing PU models using bacterial cytochrome P450 (CYP) data. Extensive experiments demonstrate the superiority of BIN-PU over the benchmark models in predicting CPIs with only truly positive samples. Furthermore, we have validated BIN-PU on additional bacterial proteins obtained from literature review, human CYP datasets, and uncurated data for its reproducibility. We have also validated the CPI prediction for the uncurated CYP data with biological and biophysical experiments. 
BIN-PU represents a significant advancement in CPI prediction for bacterial proteins, opening new possibilities for improving predictive models in related biological interaction tasks.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The source code and data are available at https:\/\/github.com\/datax-lab\/CYP.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btag067","type":"journal-article","created":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T12:49:37Z","timestamp":1770382177000},"source":"Crossref","is-referenced-by-count":0,"title":["Prediction of bacterial protein\u2013compound interactions with only positive samples"],"prefix":"10.1093","volume":"42","author":[{"given":"Ki-Hwa","family":"Kim","sequence":"first","affiliation":[{"name":"Genome-Based BioIT Convergence Institute , Asan, 31460,","place":["Republic of Korea"]}]},{"given":"Avinash","family":"Yaganapu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Nevada, Las Vegas , Las Vegas, NV, 89154,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4332-6217","authenticated-orcid":false,"given":"Sai","family":"Kosaraju","sequence":"additional","affiliation":[{"name":"Department of Computer Science, California State Polytechnic University , Pomona, CA, 91768,","place":["United States"]}]},{"given":"Aashish","family":"Bhatt","sequence":"additional","affiliation":[{"name":"Department of Biotechnology and Pharmaceutical Sciences, Western University of Health Sciences , Pomona, CA, 91766,","place":["United States"]}]},{"given":"Yun Lyna","family":"Luo","sequence":"additional","affiliation":[{"name":"Department of Biotechnology and Pharmaceutical Sciences, Western University of Health Sciences , Pomona, CA, 91766,","place":["United States"]}]},{"given":"Sai 
Phani","family":"Parsa","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Nevada, Las Vegas , Las Vegas, NV, 89154,","place":["United States"]}]},{"given":"Juyeon","family":"Park","sequence":"additional","affiliation":[{"name":"Division of Computer Science and Engineering, Sun Moon University , Asan, 31460,","place":["Republic of Korea"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0089-1002","authenticated-orcid":false,"given":"Hyun","family":"Lee","sequence":"additional","affiliation":[{"name":"Division of Computer Science and Engineering, Sun Moon University , Asan, 31460,","place":["Republic of Korea"]}]},{"given":"Jun Hyuck","family":"Lee","sequence":"additional","affiliation":[{"name":"Division of Life Sciences, Korea Polar Research Institute , Incheon, 21990,","place":["Republic of Korea"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8407-5039","authenticated-orcid":false,"given":"Tae-Jin","family":"Oh","sequence":"additional","affiliation":[{"name":"Genome-Based BioIT Convergence Institute , Asan, 31460,","place":["Republic of Korea"]},{"name":"Department of AI Biomedical Engineering, Graduate School, Sun Moon University , Asan, 31460,","place":["Republic of Korea"]},{"name":"Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University , Asan, 31460,","place":["Republic of Korea"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9565-9523","authenticated-orcid":false,"given":"Mingon","family":"Kang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Nevada, Las Vegas , Las Vegas, NV, 89154,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2026,2,18]]},"reference":[{"key":"2026031019335660300_btag067-B1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2206.01206"},{"key":"2026031019335660300_btag067-B2","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1016\/j.fbp.2014.09.011","article-title":"Microbial and 
enzymatic technologies used for the production of natural aroma compounds: synthesis, recovery modeling, and bioprocesses","volume":"94","author":"Ben Akacha","year":"2015","journal-title":"Food Bioprod Process"},{"key":"2026031019335660300_btag067-B3","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1007\/s10994-020-05877-5","article-title":"Learning from positive and unlabeled data: a survey","volume":"109","author":"Bekker","year":"2020","journal-title":"Mach Learn"},{"key":"2026031019335660300_btag067-B4","doi-asserted-by":"publisher","first-page":"2397","DOI":"10.1093\/bioinformatics\/btp433","article-title":"Supervised prediction of drug\u2013target interactions using bipartite local models","volume":"25","author":"Bleakley","year":"2009","journal-title":"Bioinformatics"},{"key":"2026031019335660300_btag067-B5","doi-asserted-by":"publisher","first-page":"D444","DOI":"10.1093\/nar\/gkae1082","article-title":"Interpro: the protein sequence classification resource in 2025","volume":"53","author":"Blum","year":"2025","journal-title":"Nucleic Acids Res"},{"key":"2026031019335660300_btag067-B6","doi-asserted-by":"publisher","article-title":"Chai-1: Decoding the molecular interactions of life","DOI":"10.1101\/2024.10.10.615955"},{"key":"2026031019335660300_btag067-B7","doi-asserted-by":"publisher","first-page":"4406","DOI":"10.1093\/bioinformatics\/btaa524","article-title":"Transformercpi: improving compound\u2013protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments","volume":"36","author":"Chen","year":"2020","journal-title":"Bioinformatics"},{"key":"2026031019335660300_btag067-B8","doi-asserted-by":"publisher","first-page":"3697","DOI":"10.18632\/oncotarget.1984","article-title":"Quantitative network mapping of the human kinome interactome reveals new clues for rational kinase inhibitor discovery and individualized cancer 
therapy","volume":"5","author":"Cheng","year":"2014","journal-title":"Oncotarget"},{"key":"2026031019335660300_btag067-B9","doi-asserted-by":"publisher","first-page":"2373","DOI":"10.1039\/c2mb25110h","article-title":"Prediction of chemical\u2013protein interactions: multitarget-qsar versus computational chemogenomic methods","volume":"8","author":"Cheng","year":"2012","journal-title":"Mol Biosyst"},{"key":"2026031019335660300_btag067-B10","doi-asserted-by":"publisher","first-page":"1832","DOI":"10.1109\/TCBB.2016.2570211","article-title":"Effectively identifying compound\u2013protein interactions by learning from positive and unlabeled examples","volume":"15","author":"Cheng","year":"2018","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2026031019335660300_btag067-B11","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1016\/j.neucom.2014.10.081","article-title":"A robust ensemble approach to learn from positive and unlabeled data using SVM base models","volume":"160","author":"Claesen","year":"2015","journal-title":"Neurocomputing (AMST)"},{"key":"2026031019335660300_btag067-B12","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1002\/1873-3468.13297","article-title":"Bacterial cyp154c8 catalyzes carbon\u2013carbon bond cleavage in steroids","volume":"593","author":"Dangi","year":"2019","journal-title":"FEBS Lett"},{"key":"2026031019335660300_btag067-B13","doi-asserted-by":"publisher","first-page":"1683","DOI":"10.1111\/febs.14729","article-title":"Characterization of two steroid hydroxylases from different Streptomyces spp. 
and their ligand-bound and -unbound crystal structures","volume":"286","author":"Dangi","year":"2019","journal-title":"FEBS J"},{"key":"2026031019335660300_btag067-B14","doi-asserted-by":"publisher","first-page":"1419","DOI":"10.1080\/17460913.2024.2398337","article-title":"Application of microbial enzymes in medicine and industry: current status and future perspectives","volume":"19","author":"Darbandi","year":"2024","journal-title":"Future Microbiol"},{"key":"2026031019335660300_btag067-B15","doi-asserted-by":"publisher","DOI":"10.1128\/aem.02186-22","article-title":"Improved 2\u03b1-hydroxylation efficiency of steroids by cyp154c2 using structure-guided rational design","volume":"89","author":"Gao","year":"2023","journal-title":"Appl Environ Microbiol"},{"key":"2026031019335660300_btag067-B16","first-page":"2255","author":"Hou","year":"2018"},{"key":"2026031019335660300_btag067-B17","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1080\/07388551.2022.2029344","article-title":"Efficient heterologous expression of cytochrome p450 enzymes in microorganisms for the biosynthesis of natural products","volume":"43","author":"Hu","year":"2023","journal-title":"Crit Rev Biotechnol"},{"key":"2026031019335660300_btag067-B18","doi-asserted-by":"publisher","first-page":"7806","DOI":"10.1609\/aaai.v35i9.16953","article-title":"Predictive adversarial learning from positive and unlabeled data","volume":"35","author":"Hu","year":"2021","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"2026031019335660300_btag067-B19","doi-asserted-by":"publisher","first-page":"852","DOI":"10.1002\/cbic.201500524","article-title":"Crystal structure of cyp106a2 in substrate-free and substrate-bound form","volume":"17","author":"Janocha","year":"2016","journal-title":"Chembiochem"},{"key":"2026031019335660300_btag067-B20","doi-asserted-by":"publisher","first-page":"4749668","DOI":"10.1155\/2024\/4749668","article-title":"Pu-gnn: a positive-unlabeled learning method for polypharmacy 
side-effects detection based on graph neural networks","volume":"2024","author":"Keshavarz","year":"2024","journal-title":"Int J Intell Syst"},{"key":"2026031019335660300_btag067-B21","doi-asserted-by":"publisher","first-page":"107451","DOI":"10.1016\/j.jmgm.2019.107451","article-title":"Docking structurally similar analogues: dealing with the false-positive","volume":"93","author":"Khanjiwala","year":"2019","journal-title":"J Mol Graph Model"},{"key":"2026031019335660300_btag067-B22","doi-asserted-by":"publisher","first-page":"387","DOI":"10.4014\/jmb.2211.11031","article-title":"Crystal structure and biochemical analysis of a cytochrome p450 steroid hydroxylase (bacyp106a6) from bacillus species","volume":"33","author":"Kim","year":"2023","journal-title":"J Microbiol Biotechnol"},{"key":"2026031019335660300_btag067-B23","doi-asserted-by":"publisher","first-page":"1472","DOI":"10.4014\/jmb.1706.06013","article-title":"Crystal structure and functional characterization of a cytochrome p450 (bacyp106a2) from bacillus sp. 
pamc 23377","volume":"27","author":"Kim","year":"2017","journal-title":"J Microbiol Biotechnol"},{"key":"2026031019335660300_btag067-B24","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/s13321-018-0271-1","article-title":"A confidence predictor for logd using conformal regression and a support-vector machine","volume":"10","author":"Lapins","year":"2018","journal-title":"J Cheminform"},{"key":"2026031019335660300_btag067-B25","first-page":"S14","article-title":"Charter on CME in the European union","volume":"72","author":"Leibbrandt","year":"1996","journal-title":"Postgraduate Med J"},{"key":"2026031019335660300_btag067-B26","article-title":"Integrated protein\u2013ligand interaction database (v0.1)","author":"Lim","year":"2019","journal-title":"Data Set"},{"key":"2026031019335660300_btag067-B27","doi-asserted-by":"publisher","DOI":"10.1016\/B978-0-443-19059-9.00021-9"},{"key":"2026031019335660300_btag067-B28","doi-asserted-by":"publisher","first-page":"e4792","DOI":"10.1002\/pro.4792","article-title":"Ucsf chimerax: tools for structure building and analysis","volume":"32","author":"Meng EC, Goddard TD, Pettersen EF","year":"2023","journal-title":"Protein Sci Publ Protein Soc"},{"key":"2026031019335660300_btag067-B29","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1186\/1479-7364-4-1-59","article-title":"The cytochrome p450 homepage","volume":"4","author":"Nelson","year":"2009","journal-title":"Hum Genomics"},{"key":"2026031019335660300_btag067-B30","doi-asserted-by":"publisher","first-page":"1477","DOI":"10.1093\/bib\/bbad484","article-title":"Mulinforcpi: enhancing precision of compound\u2013protein interaction prediction through novel perspectives on multi-level information integration","volume":"25","author":"Nguyen","year":"2023","journal-title":"Brief Bioinform"},{"key":"2026031019335660300_btag067-B31","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1002\/9783527673261.ch10","article-title":"Structure-based methods for 
predicting the sites and products of metabolism","volume":"4","author":"Oostenbrink","year":"2014","journal-title":"Drug Metabolism Predict"},{"key":"2026031019335660300_btag067-B32","doi-asserted-by":"publisher","first-page":"e39777","DOI":"10.1016\/j.heliyon.2024.e39777","article-title":"Functional characterization and unraveling the structural determinants of novel steroid hydroxylase cyp154c7 from Streptomyces sp. pamc26508","volume":"10","author":"Paudel","year":"2024","journal-title":"Heliyon"},{"key":"2026031019335660300_btag067-B33","doi-asserted-by":"publisher","first-page":"108079","DOI":"10.1016\/j.biotechadv.2022.108079","article-title":"Folding of heterologous proteins in bacterial cell factories: cellular mechanisms and engineering strategies","volume":"63","author":"Rong","year":"2023","journal-title":"Biotechnol Adv"},{"key":"2026031019335660300_btag067-B34","doi-asserted-by":"publisher","first-page":"597","DOI":"10.3390\/biom3030597","article-title":"Microbial enzymes with special characteristics for biotechnological applications","volume":"3","author":"Singh Nigam","year":"2013","journal-title":"Biomolecules"},{"key":"2026031019335660300_btag067-B35","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1016\/j.bbapap.2017.07.011","article-title":"Cyp106a2-a versatile biocatalyst with high potential for biotechnological production of selectively hydroxylated steroid and terpenoid compounds","volume":"1866","author":"Schmitz","year":"2018","journal-title":"Biochim Biophys Acta Proteins Proteom"},{"key":"2026031019335660300_btag067-B36","doi-asserted-by":"publisher","first-page":"1839","DOI":"10.1016\/j.ygeno.2018.12.007","article-title":"Predicting drug\u2013target interactions using lasso with random Forest based on evolutionary information and chemical 
structure","volume":"111","author":"Shi","year":"2019","journal-title":"Genomics"},{"key":"2026031019335660300_btag067-B37","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1128\/CMR.00030-10","article-title":"Challenges of antibacterial discovery","volume":"24","author":"Silver","year":"2011","journal-title":"Clin Microbiol Rev"},{"key":"2026031019335660300_btag067-B38","doi-asserted-by":"publisher","first-page":"464","DOI":"10.4014\/jmb.2010.10020","article-title":"Enzymatic characterization and comparison of two steroid hydroxylases cyp154c3-1 and cyp154c3-2 from Streptomyces species","volume":"31","author":"Subedi","year":"2021","journal-title":"J Microbiol Biotechnol"},{"key":"2026031019335660300_btag067-B39","doi-asserted-by":"publisher","first-page":"S3","DOI":"10.1186\/1752-0509-7-S6-S3","article-title":"Scalable prediction of compound\u2013protein interactions using minwise hashing","volume":"7","author":"Tabei","year":"2013","journal-title":"BMC Syst Biol"},{"key":"2026031019335660300_btag067-B40","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1093\/bioinformatics\/bty535","article-title":"Compound\u2013protein interaction prediction with end-to-end learning of neural networks for graphs and sequences","volume":"35","author":"Tsubaki","year":"2019","journal-title":"Bioinformatics"},{"key":"2026031019335660300_btag067-B41","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1038\/nrmicro3508","article-title":"Bacterial protein networks: properties and functions","volume":"13","author":"Typas","year":"2015","journal-title":"Nat Rev Microbiol"},{"key":"2026031019335660300_btag067-B42","doi-asserted-by":"publisher","first-page":"1392","DOI":"10.3390\/ijms22031392","article-title":"Ssnet: a deep learning approach for protein\u2013ligand interaction prediction","volume":"22","author":"Verma","year":"2021","journal-title":"Int J Mol 
Sci"},{"key":"2026031019335660300_btag067-B43","doi-asserted-by":"publisher","first-page":"3090","DOI":"10.1016\/j.csbj.2024.08.002","article-title":"Investigation of in silico studies for cytochrome p450 isoforms specificity","volume":"23","author":"Wei","year":"2024","journal-title":"Comput Struct Biotechnol J"},{"key":"2026031019335660300_btag067-B44","doi-asserted-by":"publisher","first-page":"i232","DOI":"10.1093\/bioinformatics\/btn162","article-title":"Prediction of drug\u2013target interaction networks from the integration of chemical and genomic spaces","volume":"24","author":"Yamanishi","year":"2008","journal-title":"Bioinformatics (Oxford, England)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag067\/66981731\/btag067.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btag067\/66981731\/btag067.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btag067\/66981731\/btag067.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T23:34:03Z","timestamp":1773185643000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btag067\/8489904"}},"subtitle":[],"editor":[{"given":"Jianlin","family":"Cheng","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,2,18]]},"references-count":44,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,2,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btag067","relation":{},"ISSN":["1367-4803","1367-4811"],"issn
-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,3]]},"published":{"date-parts":[[2026,2,18]]},"article-number":"btag067"}}