{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T17:20:42Z","timestamp":1761844842827,"version":"3.37.3"},"reference-count":9,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2018,2,7]],"date-time":"2018-02-07T00:00:00Z","timestamp":1517961600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Estimation of the hidden population structure is an important step in many genetic studies. Often the aim is also to identify which sequence locations are the most discriminative between groups of samples for a given data partition. Automated discovery of interesting patterns that are present in the data can help to generate new biological hypotheses.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We introduce Kpax3, a Bayesian method for bi-clustering multiple sequence alignments. Influence of individual sites will be determined in a supervised manner by using informative prior distributions for the model parameters. Our inference method uses an implementation of both split-merge and Gibbs sampler type MCMC algorithms to traverse the joint posterior of partitions of samples and variables. 
We use a large Rotavirus sequence dataset to demonstrate the ability of Kpax3 to generate biologically important hypotheses about differential selective pressures across a virus protein.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Kpax3 is implemented as a Julia package and released under the MIT license. Source code and documentation are available at: https:\/\/github.com\/albertopessia\/Kpax3.jl.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty056","type":"journal-article","created":{"date-parts":[[2018,2,6]],"date-time":"2018-02-06T12:09:23Z","timestamp":1517918963000},"page":"2132-2133","source":"Crossref","is-referenced-by-count":6,"title":["Kpax3: Bayesian bi-clustering of large sequence datasets"],"prefix":"10.1093","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8607-9191","authenticated-orcid":false,"given":"Alberto","family":"Pessia","sequence":"first","affiliation":[{"name":"Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland"}]},{"given":"Jukka","family":"Corander","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland"},{"name":"Department of Biostatistics, University of Oslo, Oslo, Norway"},{"name":"Pathogen Genomics, Wellcome Trust Sanger Institute, CB10 1SA Hinxton, UK"}]}],"member":"286","published-online":{"date-parts":[[2018,2,7]]},"reference":[{"key":"2023012713393974700_bty056-B1","doi-asserted-by":"crossref","first-page":"S50","DOI":"10.1097\/INF.0b013e3181967bee","article-title":"Rotavirus 
overview","volume":"28","author":"Bernstein","year":"2009","journal-title":"Pediatric Infect. Dis. J"},{"key":"2023012713393974700_bty056-B2","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1137\/141000671","article-title":"Julia: a fresh approach to numerical computing","volume":"59","author":"Bezanson","year":"2017","journal-title":"SIAM Rev"},{"key":"2023012713393974700_bty056-B3","doi-asserted-by":"crossref","first-page":"D660","DOI":"10.1093\/nar\/gkt1268","article-title":"Virus variation resource \u2013 recent updates and future directions","volume":"42","author":"Brister","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023012713393974700_bty056-B4","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1038\/ng.2895","article-title":"Dense genomic sampling identifies highways of pneumococcal recombination","volume":"46","author":"Chewapreecha","year":"2014","journal-title":"Nat. Genet"},{"key":"2023012713393974700_bty056-B5","first-page":"5","article-title":"Rotavirus serotypes: classification and importance in epidemiology, immunity, and vaccine development","volume":"18","author":"Hoshino","year":"2000","journal-title":"J. Health Popul. Nutrit"},{"key":"2023012713393974700_bty056-B6","doi-asserted-by":"crossref","first-page":"10748","DOI":"10.1073\/pnas.162366899","article-title":"The total influenza vaccine failure of 1947 revisited: major intrasubtypic antigenic change can explain failure of vaccine in a post-World War II epidemic","volume":"99","author":"Kilbourne","year":"2002","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023012713393974700_bty056-B7","doi-asserted-by":"crossref","first-page":"2466","DOI":"10.1093\/bioinformatics\/btl411","article-title":"Bayesian search of functionally divergent protein subgroups and their function specific residues","volume":"22","author":"Marttinen","year":"2006","journal-title":"Bioinformatics"},{"volume-title":"Mathematical Classification and Clustering, Volume 11 of Nonconvex Optimization and Its Applications","year":"1996","author":"Mirkin","key":"2023012713393974700_bty056-B8"},{"key":"2023012713393974700_bty056-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1099\/mgen.0.000025","article-title":"K-Pax2: Bayesian identification of cluster-defining amino acid positions in large sequence datasets","volume":"1","author":"Pessia","year":"2015","journal-title":"Microb. Genomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/12\/2132\/48936028\/bioinformatics_34_12_2132.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/12\/2132\/48936028\/bioinformatics_34_12_2132.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T14:19:42Z","timestamp":1674829182000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/12\/2132\/4841711"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,2,7]]},"references-count":9,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2018,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty056","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2018,6,15]]},"published":{"date-parts":[[2018,2,7]]}}}