{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T10:56:58Z","timestamp":1781693818196,"version":"3.54.5"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2023,7,19]],"date-time":"2023-07-19T00:00:00Z","timestamp":1689724800000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"JSPS","doi-asserted-by":"publisher","award":["JP21K12104"],"award-info":[{"award-number":["JP21K12104"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,9,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Accurately identifying phage\u2013host relationships from their genome sequences is still challenging, especially for those phages and hosts with less homologous sequences. In this work, focusing on identifying the phage\u2013host relationships at the species and genus level, we propose a contrastive learning based approach to learn whole-genome sequence embeddings that can take account of phage\u2013host interactions (PHIs). Contrastive learning is used to make phages infecting the same hosts close to each other in the new representation space. Specifically, we rephrase whole-genome sequences with frequency chaos game representation (FCGR) and learn latent embeddings that \u2018encapsulate\u2019 phages and host relationships through contrastive learning. The contrastive learning method works well on the imbalanced dataset. Based on the learned embeddings, a proposed pipeline named CL4PHI can predict known hosts and unseen hosts in training. We compare our method with two recently proposed state-of-the-art learning-based methods on their benchmark datasets. The experiment results demonstrate that the proposed method using contrastive learning improves the prediction accuracy on known hosts and demonstrates a zero-shot prediction capability on unseen hosts.<\/jats:p>\n               <jats:p>In terms of potential applications, the rapid pace of genome sequencing across different species has resulted in a vast amount of whole-genome sequencing data that require efficient computational methods for identifying phage\u2013host interactions. The proposed approach is expected to address this need by efficiently processing whole-genome sequences of phages and prokaryotic hosts and capturing features related to phage\u2013host relationships for genome sequence representation. This approach can be used to accelerate the discovery of phage\u2013host interactions and aid in the development of phage-based therapies for infectious diseases.<\/jats:p>","DOI":"10.1093\/bib\/bbad239","type":"journal-article","created":{"date-parts":[[2023,7,19]],"date-time":"2023-07-19T10:59:29Z","timestamp":1689764369000},"source":"Crossref","is-referenced-by-count":18,"title":["Zero-shot-capable identification of phage\u2013host relationships with whole-genome sequence representation by contrastive learning"],"prefix":"10.1093","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5598-2521","authenticated-orcid":false,"given":"Yao-zhong","family":"Zhang","sequence":"first","affiliation":[{"name":"The Institute of Medical Science, The University of Tokyo Division of Health Medical Intelligence, Human Genome Center, , Shirokanedai 4-6-1, Minato-ku, 108-8639 Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yunjie","family":"Liu","sequence":"additional","affiliation":[{"name":"The Institute of Medical Science, The University of Tokyo Division of Health Medical Intelligence, Human Genome Center, , Shirokanedai 4-6-1, Minato-ku, 108-8639 Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zeheng","family":"Bai","sequence":"additional","affiliation":[{"name":"The Institute of Medical Science, The University of Tokyo Division of Health Medical Intelligence, Human Genome Center, , Shirokanedai 4-6-1, Minato-ku, 108-8639 Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kosuke","family":"Fujimoto","sequence":"additional","affiliation":[{"name":"Department of Immunology and Genomics, Graduate School of Medicine, Osaka Metropolitan University , Asahi-machi 1-4-3, Abeno-ku, 545-8585 Osaka , Japan"},{"name":"Division of Metagenome Medicine, Human Genome Center, The Institute of Medical Science, The University of Tokyo , Shirokanedai 4-6-1, Minato-ku, 108-8639 Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Satoshi","family":"Uematsu","sequence":"additional","affiliation":[{"name":"Department of Immunology and Genomics, Graduate School of Medicine, Osaka Metropolitan University , Asahi-machi 1-4-3, Abeno-ku, 545-8585 Osaka , Japan"},{"name":"Division of Metagenome Medicine, Human Genome Center, The Institute of Medical Science, The University of Tokyo , Shirokanedai 4-6-1, Minato-ku, 108-8639 Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Seiya","family":"Imoto","sequence":"additional","affiliation":[{"name":"The Institute of Medical Science, The University of Tokyo Division of Health Medical Intelligence, Human Genome Center, , Shirokanedai 4-6-1, Minato-ku, 108-8639 Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2023,7,18]]},"reference":[{"issue":"1","key":"2023092216481811400_ref1","doi-asserted-by":"crossref","first-page":"4498","DOI":"10.1038\/ncomms5498","article-title":"A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes","volume":"5","author":"Dutilh","year":"2014","journal-title":"Nat Commun"},{"issue":"10","key":"2023092216481811400_ref2","doi-asserted-by":"crossref","first-page":"1985","DOI":"10.1101\/gr.138297.112","article-title":"Crispr targeting reveals a reservoir of common phages associated with the human gut microbiome","volume":"22","author":"Stern","year":"2012","journal-title":"Genome Res"},{"issue":"20","key":"2023092216481811400_ref3","doi-asserted-by":"crossref","first-page":"5839","DOI":"10.1093\/nar\/gkl732","article-title":"Phage_finder: automated identification and classification of prophage regions in complete bacterial genome sequences","volume":"34","author":"Fouts","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023092216481811400_ref4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2164-7-8","article-title":"Evidence of host-virus co-evolution in tetranucleotide usage patterns of bacteriophages and eukaryotic viruses","volume":"7","author":"Pride","year":"2006","journal-title":"BMC Genomics"},{"issue":"19","key":"2023092216481811400_ref5","doi-asserted-by":"crossref","first-page":"3113","DOI":"10.1093\/bioinformatics\/btx383","article-title":"Wish: who is the host? Predicting prokaryotic hosts from metagenomic phage contigs","volume":"33","author":"Galiez","year":"2017","journal-title":"Bioinformatics"},{"key":"2023092216481811400_ref6","doi-asserted-by":"crossref","volume-title":"vhulk, a new tool for bacteriophage host prediction based on annotated genomic features and deep neural networks.","author":"Amgarten","DOI":"10.1101\/2020.12.06.413476"},{"issue":"2","key":"2023092216481811400_ref7","doi-asserted-by":"crossref","first-page":"543","DOI":"10.1093\/bioinformatics\/btab585","article-title":"Hophage: an ab initio tool for identifying hosts of phage fragments from metaviromes","volume":"38","author":"Tan","year":"2022","journal-title":"Bioinformatics"},{"issue":"7","key":"2023092216481811400_ref8","doi-asserted-by":"crossref","DOI":"10.1016\/j.patter.2021.100274","article-title":"Rafah: host prediction for viruses of bacteria and archaea based on protein content","volume":"2","author":"Coutinho","year":"2021","journal-title":"Patterns"},{"issue":"1","key":"2023092216481811400_ref9","doi-asserted-by":"crossref","first-page":"bbab385","DOI":"10.1093\/bib\/bbab385","article-title":"Deephost: phage host prediction with convolutional neural network","volume":"23","author":"Ruohan","year":"2022","journal-title":"Brief Bioinform"},{"key":"2023092216481811400_ref10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-10-421","article-title":"Blast+: architecture and applications","volume":"10","author":"Camacho","year":"2009","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"2023092216481811400_ref11","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1093\/nar\/gkw1002","article-title":"Alignment-free ${d}\\_2^{\\ast }$ oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences","volume":"45","author":"Ahlgren","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023092216481811400_ref12","first-page":"1","article-title":"Prokaryotic virus host predictor: a Gaussian model for host prediction of prokaryotic viruses in metagenomics","volume":"19","author":"Congyu","year":"2021","journal-title":"BMC Biol"},{"key":"2023092216481811400_ref13","doi-asserted-by":"crossref","first-page":"bbac182","DOI":"10.1093\/bib\/bbac182","article-title":"Cherry: a computational method for accurate prediction of virus-prokaryotic interactions using a graph encoder-decoder model","volume":"23","author":"Shang","year":"2022","journal-title":"Brief Bioinform"},{"issue":"10","key":"2023092216481811400_ref14","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1093\/oxfordjournals.molbev.a026048","article-title":"Genomic signature: characterization and classification of species assessed by chaos game representation of sequences","volume":"16","author":"Deschavanne","year":"1999","journal-title":"Mol Biol Evol"},{"key":"2023092216481811400_ref15","doi-asserted-by":"crossref","first-page":"6263","DOI":"10.1016\/j.csbj.2021.11.008","article-title":"Chaos game representation and its applications in bioinformatics","volume":"19","author":"L\u00f6chel","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"issue":"8","key":"2023092216481811400_ref16","doi-asserted-by":"crossref","first-page":"2163","DOI":"10.1093\/nar\/18.8.2163","article-title":"Chaos game representation of gene structure","volume":"18","author":"Joel","year":"1990","journal-title":"Nucleic Acids Res"},{"key":"2023092216481811400_ref17","first-page":"539","article-title":"Learning a similarity metric discriminatively, with application to face verification","volume-title":"2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905)","author":"Chopra","year":"2005"},{"key":"2023092216481811400_ref18","doi-asserted-by":"crossref","DOI":"10.3389\/fmicb.2020.579452","article-title":"Motley crew: overview of the currently available phage diversity","volume":"11","author":"Zrelovs","year":"2020","journal-title":"Front Microbiol"},{"key":"2023092216481811400_ref19","doi-asserted-by":"crossref","DOI":"10.3389\/fmicb.2022.946070","article-title":"Daily reports on phage-host interactions","volume":"13","author":"Albrycht","year":"2022","journal-title":"Front Microbiol"},{"issue":"3","key":"2023092216481811400_ref20","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1016\/j.chom.2020.06.005","article-title":"Metagenome data on intestinal phage-bacteria associations aids the development of phage therapy against pathobionts","volume":"28","author":"Fujimoto","year":"2020","journal-title":"Cell Host Microbe"},{"key":"2023092216481811400_ref21","doi-asserted-by":"crossref","first-page":"e985","DOI":"10.7717\/peerj.985","article-title":"Virsorter: mining viral signal from microbial genomic data","volume":"3","author":"Roux","year":"2015","journal-title":"PeerJ"},{"key":"2023092216481811400_ref22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40168-017-0283-5","article-title":"Virfinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data","volume":"5","author":"Ren","year":"2017","journal-title":"Microbiome"},{"key":"2023092216481811400_ref23","doi-asserted-by":"crossref","DOI":"10.7717\/peerj.1603","article-title":"Phylopythias+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes","volume":"4","author":"Gregor","year":"2016","journal-title":"PeerJ"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/5\/bbad239\/51710669\/bbad239.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/5\/bbad239\/51710669\/bbad239.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T17:01:00Z","timestamp":1695402060000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbad239\/7225997"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,18]]},"references-count":23,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,9,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbad239","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,9]]},"published":{"date-parts":[[2023,7,18]]},"article-number":"bbad239"}}