{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:59Z","timestamp":1772138039970,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T00:00:00Z","timestamp":1597881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DMS-1613338"],"award-info":[{"award-number":["DMS-1613338"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI-1846216"],"award-info":[{"award-number":["DBI-1846216"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Institutes of Health\/National Institute of General Medical Sciences","award":["R01GM120507"],"award-info":[{"award-number":["R01GM120507"]}]},{"name":"PhRMA Foundation Research Starter Grant in Informatics"},{"name":"Johnson and Johnson WiSTEM2D Award and Sloan Research Fellowship"},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DGE-1829071"],"award-info":[{"award-number":["DGE-1829071"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,6,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Gene clustering is a widely used technique that has enabled computational prediction of unknown gene functions within a species. However, it remains a challenge to refine gene function prediction by leveraging evolutionarily conserved genes in another species. This challenge calls for a new computational algorithm to identify gene co-clusters in two species, so that genes in each co-cluster exhibit similar expression levels in each species and strong conservation between the species.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we develop the bipartite tight spectral clustering (BiTSC) algorithm, which identifies gene co-clusters in two species based on gene orthology information and gene expression data. BiTSC novelly implements a formulation that encodes gene orthology as a bipartite network and gene expression data as node covariates. This formulation allows BiTSC to adopt and combine the advantages of multiple unsupervised learning techniques: kernel enhancement, bipartite spectral clustering, consensus clustering, tight clustering and hierarchical clustering. As a result, BiTSC is a flexible and robust algorithm capable of identifying informative gene co-clusters without forcing all genes into co-clusters. Another advantage of BiTSC is that it does not rely on any distributional assumptions. Beyond cross-species gene co-clustering, BiTSC also has wide applications as a general algorithm for identifying tight node co-clusters in any bipartite network with node covariates. We demonstrate the accuracy and robustness of BiTSC through comprehensive simulation studies. In a real data example, we use BiTSC to identify conserved gene co-clusters of Drosophila melanogaster and Caenorhabditis elegans, and we perform a series of downstream analysis to both validate BiTSC and verify the biological significance of the identified co-clusters.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The Python package BiTSC is open-access and available at https:\/\/github.com\/edensunyidan\/BiTSC.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa741","type":"journal-article","created":{"date-parts":[[2020,8,13]],"date-time":"2020-08-13T07:56:13Z","timestamp":1597305373000},"page":"1225-1233","source":"Crossref","is-referenced-by-count":5,"title":["Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0727-7005","authenticated-orcid":false,"given":"Yidan Eden","family":"Sun","sequence":"first","affiliation":[{"name":"University of California Department of Statistics, , Los Angeles, CA 90095-1554, USA"}]},{"given":"Heather J","family":"Zhou","sequence":"additional","affiliation":[{"name":"University of California Department of Statistics, , Los Angeles, CA 90095-1554, USA"}]},{"given":"Jingyi Jessica","family":"Li","sequence":"additional","affiliation":[{"name":"University of California Department of Statistics, , Los Angeles, CA 90095-1554, USA"},{"name":"University of California Department of Human Genetics, , Los Angeles, CA 90095-7088, USA"},{"name":"University of California Department of Computational Medicine, , Los Angeles, CA 90095-1766, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,5,11]]},"reference":[{"key":"2023051800332017900_btaa741-B1","doi-asserted-by":"crossref","first-page":"e9","DOI":"10.1371\/journal.pbio.0020009","article-title":"Similarities and differences in genome-wide expression data of six organisms","volume":"2","author":"Bergmann","year":"2003","journal-title":"PLoS Biol"},{"key":"2023051800332017900_btaa741-B2","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-0450-1","volume-title":"Pattern Recognition with Fuzzy Objective Function Algorithms","author":"Bezdek","year":"1981"},{"key":"2023051800332017900_btaa741-B3","doi-asserted-by":"crossref","first-page":"e1000707","DOI":"10.1371\/journal.pcbi.1000707","article-title":"Modeling co-expression across species for complex traits: insights to the difference of human and mouse embryonic stem cells","volume":"6","author":"Cai","year":"2010","journal-title":"PLoS Comput. Biol"},{"key":"2023051800332017900_btaa741-B4","doi-asserted-by":"crossref","first-page":"e0164295","DOI":"10.1371\/journal.pone.0164295","article-title":"Cross-species analysis of gene expression and function in prefrontal cortex, hippocampus and striatum","volume":"11","author":"Chen","year":"2016","journal-title":"PLoS One"},{"key":"2023051800332017900_btaa741-B5","first-page":"1","article-title":"The igraph software package for complex network research","volume":"1695","author":"Csardi","year":"2006","journal-title":"InterJournal Complex Syst"},{"key":"2023051800332017900_btaa741-B6","first-page":"1","author":"Dede","year":"2013"},{"key":"2023051800332017900_btaa741-B7","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1145\/502512.502550","volume-title":"Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD \u201901","author":"Dhillon","year":"2001"},{"key":"2023051800332017900_btaa741-B8","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1080\/01969727308546046","article-title":"A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters","volume":"3","author":"Dunn","year":"1973","journal-title":"J. Cybern"},{"key":"2023051800332017900_btaa741-B9","doi-asserted-by":"crossref","first-page":"4029","DOI":"10.1093\/nar\/28.20.4029","article-title":"Automatic detection of conserved gene clusters in multiple genomes by graph comparison and P-quasi grouping","volume":"28","author":"Fujibuchi","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023051800332017900_btaa741-B10","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1038\/nature13424","article-title":"Comparative analysis of the transcriptome across distant species","volume":"512","author":"Gerstein","year":"2014","journal-title":"Nature"},{"key":"2023051800332017900_btaa741-B11","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1007\/BF02289588","article-title":"Hierarchical clustering schemes","volume":"32","author":"Johnson","year":"1967","journal-title":"Psychometrika"},{"key":"2023051800332017900_btaa741-B12","doi-asserted-by":"crossref","first-page":"016107","DOI":"10.1103\/PhysRevE.83.016107","article-title":"Stochastic blockmodels and community structure in networks","volume":"83","author":"Karrer","year":"2011","journal-title":"Phys. Rev. E"},{"key":"2023051800332017900_btaa741-B13","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1146\/annurev.genet.39.073003.114725","article-title":"Orthologs, paralogs, and evolutionary genomics","volume":"39","author":"Koonin","year":"2005","journal-title":"Annu. Rev. Genet"},{"key":"2023051800332017900_btaa741-B14","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1186\/1471-2105-14-70","article-title":"A novel method for cross-species gene expression analysis","volume":"14","author":"Kristiansson","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023051800332017900_btaa741-B15","doi-asserted-by":"crossref","first-page":"012805","DOI":"10.1103\/PhysRevE.90.012805","article-title":"Efficiently inferring community structure in bipartite networks","volume":"90","author":"Larremore","year":"2014","journal-title":"Phys. Rev. E"},{"key":"2023051800332017900_btaa741-B16","doi-asserted-by":"crossref","first-page":"2416","DOI":"10.1093\/bioinformatics\/btq451","article-title":"Cross-species queries of large gene expression databases","volume":"26","author":"Le","year":"2010","journal-title":"Bioinformatics"},{"key":"2023051800332017900_btaa741-B17","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1101\/gr.1910904","article-title":"Coexpression analysis of human genes across many microarray data sets","volume":"14","author":"Lee","year":"2004","journal-title":"Genome Res"},{"key":"2023051800332017900_btaa741-B1405533","doi-asserted-by":"publisher","first-page":"1086","DOI":"10.1101\/gr.170100.113","article-title":"Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data","volume":"24","author":"Li","year":"2014","journal-title":"Genome Research"},{"key":"2023051800332017900_btaa741-B18","doi-asserted-by":"crossref","first-page":"D572","DOI":"10.1093\/nar\/gkj118","article-title":"Treefam: a curated database of phylogenetic trees of animal gene families","volume":"34","author":"Li","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023051800332017900_btaa741-B19","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1093\/bioinformatics\/bts670","article-title":"Drug-target interaction prediction by learning from local information and neighbors","volume":"29","author":"Mei","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051800332017900_btaa741-B20","doi-asserted-by":"crossref","first-page":"D419","DOI":"10.1093\/nar\/gky1038","article-title":"Panther version 14: more genomes, a new panther go-slim and improvements in enrichment analysis tools","volume":"47","author":"Mi","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023051800332017900_btaa741-B21","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/A:1023949509487","article-title":"Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data","volume":"52","author":"Monti","year":"2003","journal-title":"Mach. Learn"},{"key":"2023051800332017900_btaa741-B22","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by rna-seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. Methods"},{"key":"2023051800332017900_btaa741-B23","doi-asserted-by":"crossref","first-page":"1654","DOI":"10.1093\/bioinformatics\/btt202","article-title":"NETAL: a new graph-based method for global alignment of protein\u2013protein interaction networks","volume":"29","author":"Neyshabur","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051800332017900_btaa741-B24","first-page":"849","volume-title":"On Spectral Clustering: Analysis and an Algorithm","author":"Ng","year":"2001"},{"key":"2023051800332017900_btaa741-B25","first-page":"4129","volume-title":"Advances in Neural Information Processing Systems 30","author":"Nie","year":"2017"},{"key":"2023051800332017900_btaa741-B26","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-86659-3","volume-title":"Evolution by Gene Duplication","author":"Ohno","year":"1970"},{"key":"2023051800332017900_btaa741-B27","author":"Razaee","year":"2017"},{"key":"2023051800332017900_btaa741-B28","first-page":"1","article-title":"Matched bipartite block model with covariates","volume":"20","author":"Razaee","year":"2019","journal-title":"J. Mach. Learn. Res"},{"key":"2023051800332017900_btaa741-B29","doi-asserted-by":"crossref","DOI":"10.1186\/1752-0509-4-8","article-title":"A general co-expression network-based approach to gene expression analysis: comparison and applications","volume":"4","author":"Ruan","year":"2010","journal-title":"BMC Syst. Biol"},{"key":"2023051800332017900_btaa741-B30","doi-asserted-by":"crossref","first-page":"2931","DOI":"10.1093\/bioinformatics\/btu409","article-title":"MAGNA: maximizing accuracy in global network alignment","volume":"30","author":"Saraph","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051800332017900_btaa741-B31","doi-asserted-by":"crossref","first-page":"D922","DOI":"10.1093\/nar\/gkt1055","article-title":"Treefam v9: a new website, more species and orthology-on-the-fly","volume":"42","author":"Schreiber","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023051800332017900_btaa741-B32","doi-asserted-by":"crossref","first-page":"12763","DOI":"10.1073\/pnas.0806627105","article-title":"Global alignment of multiple protein interaction networks with application to functional orthology detection","volume":"105","author":"Singh","year":"2008","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051800332017900_btaa741-B33","doi-asserted-by":"crossref","first-page":"4725","DOI":"10.1093\/nar\/gkh815","article-title":"Gene co-regulation is highly conserved in the evolution of eukaryotes and prokaryotes","volume":"32","author":"Snel","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023051800332017900_btaa741-B34","doi-asserted-by":"crossref","first-page":"170185","DOI":"10.1038\/sdata.2017.185","article-title":"An rna-seq atlas of gene expression in mouse and rat normal tissues","volume":"4","author":"S\u00f6llner","year":"2017","journal-title":"Sci. Data"},{"key":"2023051800332017900_btaa741-B35","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1126\/science.1087447","article-title":"A gene-coexpression network for global discovery of conserved genetic modules","volume":"302","author":"Stuart","year":"2003","journal-title":"Science"},{"key":"2023051800332017900_btaa741-B36","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1186\/s13059-015-0853-4","article-title":"Meta-analysis of RNA-seq expression data across species, tissues and studies","volume":"16","author":"Sudmant","year":"2015","journal-title":"Genome Biol"},{"key":"2023051800332017900_btaa741-B37","doi-asserted-by":"crossref","first-page":"i137","DOI":"10.1093\/bioinformatics\/btw278","article-title":"A cross-species bi-clustering approach to identifying conserved co-regulated genes","volume":"32","author":"Sun","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051800332017900_btaa741-B38","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1007\/978-3-662-48221-6_2","volume-title":"Algorithms in Bioinformatics","author":"Sun","year":"2015"},{"key":"2023051800332017900_btaa741-B39","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1126\/science.278.5338.631","article-title":"A genomic perspective on protein families","volume":"278","author":"Tatusov","year":"1997","journal-title":"Science"},{"key":"2023051800332017900_btaa741-B40","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1016\/S0167-7799(02)02032-2","article-title":"Conservation of gene co-regulation in prokaryotes and eukaryotes","volume":"20","author":"Teichmann","year":"2002","journal-title":"Trends Biotechnol"},{"key":"2023051800332017900_btaa741-B41","doi-asserted-by":"crossref","first-page":"2405","DOI":"10.1093\/bioinformatics\/btl406","article-title":"Evaluation and comparison of gene clustering methods in microarray analysis","volume":"22","author":"Thalamuthu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023051800332017900_btaa741-B42","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1111\/j.0006-341X.2005.031032.x","article-title":"Tight clustering: a resampling-based approach for identifying stable and tight patterns in data","volume":"61","author":"Tseng","year":"2005","journal-title":"Biometrics"},{"key":"2023051800332017900_btaa741-B43","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1016\/S0168-9525(03)00056-8","article-title":"Predicting gene function by conserved co-expression","volume":"19","author":"van Noort","year":"2003","journal-title":"Trends Genet"},{"key":"2023051800332017900_btaa741-B44","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"RNA-seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat. Rev. Genet"},{"key":"2023051800332017900_btaa741-B45","first-page":"817","author":"Whang","year":"2013"},{"key":"2023051800332017900_btaa741-B46","doi-asserted-by":"crossref","first-page":"R100","DOI":"10.1186\/gb-2014-15-8-r100","article-title":"Orthoclust: an orthology-based network framework for clustering data across multiple species","volume":"15","author":"Yan","year":"2014","journal-title":"Genome Biol"},{"key":"2023051800332017900_btaa741-B47","doi-asserted-by":"crossref","first-page":"2266","DOI":"10.1214\/12-AOS1036","article-title":"Consistency of community detection in networks under degree-corrected stochastic block models","volume":"40","author":"Zhao","year":"2012","journal-title":"Ann. Statist"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa741\/37910560\/btaa741.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/9\/1225\/50359700\/btaa741.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/9\/1225\/50359700\/btaa741.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T20:34:24Z","timestamp":1684355664000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/9\/1225\/5894545"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,5,1]]},"references-count":48,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2021,6,9]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa741","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/865378","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,5,1]]},"published":{"date-parts":[[2021,5,1]]}}}