{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T03:46:36Z","timestamp":1762659996695,"version":"3.37.3"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"22-23","license":[{"start":{"date-parts":[[2020,12,1]],"date-time":"2020-12-01T00:00:00Z","timestamp":1606780800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Strategy Research Project","award":["9042348"],"award-info":[{"award-number":["9042348"]}]},{"DOI":"10.13039\/100007567","name":"CityU","doi-asserted-by":"publisher","award":["7005215"],"award-info":[{"award-number":["7005215"]}],"id":[{"id":"10.13039\/100007567","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The microbial community plays an essential role in human diseases and physiological activities. The functions of microbes can differ due to strain-level differences in the genome sequences. Shotgun metagenomic sequencing allows us to profile the strains in microbial communities practically. However, current methods are underdeveloped due to the highly similar sequences among strains. We observe that strain genotypes at the same single nucleotide variant (SNV) locus can be inferred from the genotype frequencies. 
Also, the variants in different loci covered by the same reads can provide evidence that they reside on the same strain.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>These insights inspire us to design PStrain, an optimization method that utilizes genotype frequencies and the reads which cover multiple SNV loci to profile strains iteratively based on SNVs in a set of MetaPhlAn2 marker genes. Compared to the state-of-the-art methods, PStrain, on average, improved the performance of inferring strain abundances and genotypes by 87.75% and 59.45%, respectively. We have applied the PStrain package to a dataset with two cohorts of colorectal cancer (CRC) and found that the sequences of Bacteroides coprocola strains are significantly different between CRC and control samples, which is the first report of the potential role of B. coprocola in the gut microbiota of CRC.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/wshuai294\/PStrain.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa1056","type":"journal-article","created":{"date-parts":[[2020,12,10]],"date-time":"2020-12-10T04:10:01Z","timestamp":1607573401000},"page":"5499-5506","source":"Crossref","is-referenced-by-count":15,"title":["PStrain: an iterative microbial strains profiling algorithm for shotgun metagenomic sequencing 
data"],"prefix":"10.1093","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1922-4878","authenticated-orcid":false,"given":"Shuai","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong"}]},{"given":"Yiqi","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong"}]},{"given":"Shuaicheng","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong"}]}],"member":"286","published-online":{"date-parts":[[2020,12,21]]},"reference":[{"key":"2023062708450506700_btaa1056-B1","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1093\/bioinformatics\/btu641","article-title":"Sigma: strain-level inference of genomes from metagenomic analysis for biosurveillance","volume":"31","author":"Ahn","year":"2015","journal-title":"Bioinformatics"},{"key":"2023062708450506700_btaa1056-B2","doi-asserted-by":"crossref","first-page":"2260","DOI":"10.1038\/s41467-017-02209-5","article-title":"Strain profiling and epidemiology of bacterial species from metagenomic sequencing","volume":"8","author":"Albanese","year":"2017","journal-title":"Nat. Commun"},{"key":"2023062708450506700_btaa1056-B3","doi-asserted-by":"crossref","first-page":"868","DOI":"10.1128\/AAC.43.4.868","article-title":"Impact of gyrA and parC mutations on quinolone resistance, doubling time, and supercoiling degree of Escherichia coli","volume":"43","author":"Bagel","year":"1999","journal-title":"Antimicrob. 
Agents Chemother"},{"key":"2023062708450506700_btaa1056-B4","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1038\/s41586-018-0386-6","article-title":"Structure and function of the global topsoil microbiome","volume":"560","author":"Bahram","year":"2018","journal-title":"Nature"},{"key":"2023062708450506700_btaa1056-B5","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1111\/j.1469-0691.2012.03916.x","article-title":"The microbiome as a human organ","volume":"18","author":"Baquero","year":"2012","journal-title":"Clin. Microbiol. Infect"},{"key":"2023062708450506700_btaa1056-B6","doi-asserted-by":"crossref","first-page":"e415","DOI":"10.7717\/peerj.415","article-title":"Strain-and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products","volume":"2","author":"Beitel","year":"2014","journal-title":"PeerJ"},{"key":"2023062708450506700_btaa1056-B7","first-page":"15","article-title":"Gut metagenomes of type 2 diabetic patients have characteristic single-nucleotide polymorphism distribution in Bacteroides coprocola","volume":"5","author":"Chen","year":"2017","journal-title":"Mbio"},{"key":"2023062708450506700_btaa1056-B8","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1038\/nbt.1966","article-title":"Efficient de novo assembly of single-cell bacterial genomes from short-read data sets","volume":"29","author":"Chitsaz","year":"2011","journal-title":"Nat. Biotechnol"},{"key":"2023062708450506700_btaa1056-B10","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1038\/nbt.3329","article-title":"Detection of low-abundance bacterial strains in metagenomic datasets by eigengenome partitioning","volume":"33","author":"Cleary","year":"2015","journal-title":"Nat. 
Biotechnol"},{"key":"2023062708450506700_btaa1056-B11","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1186\/s40168-019-0648-z","article-title":"Urban metagenomics uncover antibiotic resistance reservoirs in coastal beach and sewage waters","volume":"7","author":"Fresia","year":"2019","journal-title":"Microbiome"},{"year":"2004","author":"Fuglede","key":"2023062708450506700_btaa1056-B12"},{"key":"2023062708450506700_btaa1056-B13","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1016\/j.cell.2014.12.038","article-title":"Extensive strain-level copy-number variation across human gut microbiome species","volume":"160","author":"Greenblum","year":"2015","journal-title":"Cell"},{"key":"2023062708450506700_btaa1056-B14","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1016\/S0140-6736(03)12489-0","article-title":"Gut flora in health and disease","volume":"361","author":"Guarner","year":"2003","journal-title":"Lancet"},{"key":"2023062708450506700_btaa1056-B15","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"Art: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"year":"2020","author":"Inga","key":"2023062708450506700_btaa1056-B16"},{"key":"2023062708450506700_btaa1056-B17","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023062708450506700_btaa1056-B18","doi-asserted-by":"crossref","first-page":"W256","DOI":"10.1093\/nar\/gkz239","article-title":"Interactive tree of life (itol) v4: recent updates and new developments","volume":"47","author":"Letunic","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023062708450506700_btaa1056-B19","doi-asserted-by":"crossref","first-page":"2987","DOI":"10.1093\/bioinformatics\/btr509","article-title":"A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data","volume":"27","author":"Li","year":"2011","journal-title":"Bioinformatics"},{"key":"2023062708450506700_btaa1056-B20","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and samtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023062708450506700_btaa1056-B21","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nrmicro3344","article-title":"The gut microbiota, bacterial metabolites and colorectal cancer","volume":"12","author":"Louis","year":"2014","journal-title":"Nat. Rev. Microbiol"},{"key":"2023062708450506700_btaa1056-B22","doi-asserted-by":"crossref","first-page":"7200","DOI":"10.1073\/pnas.1015622108","article-title":"Genome sequencing of environmental Escherichia coli expands understanding of the ecology and speciation of the model bacterial species","volume":"108","author":"Luo","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062708450506700_btaa1056-B23","doi-asserted-by":"crossref","first-page":"1045","DOI":"10.1038\/nbt.3319","article-title":"Constrains identifies microbial strains in metagenomic datasets","volume":"33","author":"Luo","year":"2015","journal-title":"Nat. 
Biotechnol"},{"key":"2023062708450506700_btaa1056-B24","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/j.watres.2018.12.069","article-title":"New insights into antibiotic resistome in drinking water and management perspectives: a metagenomic based study of small-sized microbes","volume":"152","author":"Ma","year":"2019","journal-title":"Water Res"},{"key":"2023062708450506700_btaa1056-B25","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1186\/1471-2164-13-74","article-title":"Gemsim: general, error-model based simulator of next-generation sequencing data","volume":"13","author":"McElroy","year":"2012","journal-title":"BMC Genomics"},{"key":"2023062708450506700_btaa1056-B26","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1016\/j.ygeno.2010.03.001","article-title":"Assembly algorithms for next-generation sequencing data","volume":"95","author":"Miller","year":"2010","journal-title":"Genomics"},{"key":"2023062708450506700_btaa1056-B27","doi-asserted-by":"crossref","first-page":"1128","DOI":"10.1073\/pnas.1010992108","article-title":"Strain-resolved community genomic analysis of gut microbial colonization in a premature infant","volume":"108","author":"Morowitz","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062708450506700_btaa1056-B28","doi-asserted-by":"crossref","first-page":"1612","DOI":"10.1101\/gr.201863.115","article-title":"An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography","volume":"26","author":"Nayfach","year":"2016","journal-title":"Genome Res"},{"key":"2023062708450506700_btaa1056-B29","doi-asserted-by":"crossref","first-page":"822","DOI":"10.1038\/nbt.2939","article-title":"Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes","volume":"32","author":"Nielsen","year":"2014","journal-title":"Nat. 
Biotechnol"},{"key":"2023062708450506700_btaa1056-B30","doi-asserted-by":"crossref","first-page":"e9490","DOI":"10.1371\/journal.pone.0009490","article-title":"Fasttree 2\u2013approximately maximum-likelihood trees for large alignments","volume":"5","author":"Price","year":"2010","journal-title":"PLoS One"},{"key":"2023062708450506700_btaa1056-B31","doi-asserted-by":"crossref","first-page":"431","DOI":"10.3389\/fmicb.2018.00431","article-title":"Diversity and contributions to nitrogen cycling and carbon fixation of soil salinity shaped microbial communities in Tarim Basin","volume":"9","author":"Ren","year":"2018","journal-title":"Front. Microbiol"},{"key":"2023062708450506700_btaa1056-B32","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1038\/nature11711","article-title":"Genomic variation landscape of the human gut microbiome","volume":"493","author":"Schloissnig","year":"2013","journal-title":"Nature"},{"key":"2023062708450506700_btaa1056-B33","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/nmeth.3802","article-title":"Strain-level microbial epidemiology and population genomics from shotgun metagenomics","volume":"13","author":"Scholz","year":"2016","journal-title":"Nat. Methods"},{"key":"2023062708450506700_btaa1056-B34","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/j.anaerobe.2005.05.001","article-title":"A dynamic partnership: celebrating our gut flora","volume":"11","author":"Sears","year":"2005","journal-title":"Anaerobe"},{"key":"2023062708450506700_btaa1056-B35","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1038\/nmeth.2066","article-title":"Metagenomic microbial community profiling using unique clade-specific marker genes","volume":"9","author":"Segata","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023062708450506700_btaa1056-B36","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1016\/j.chom.2018.01.003","article-title":"Strain tracking reveals the determinants of bacterial engraftment in the human gut following fecal microbiota transplantation","volume":"23","author":"Smillie","year":"2018","journal-title":"Cell Host Microbe"},{"key":"2023062708450506700_btaa1056-B37","doi-asserted-by":"crossref","first-page":"8922","DOI":"10.1073\/pnas.95.15.8922","article-title":"Pathogenic adaptation of Escherichia coli by natural variation of the FimH adhesin","volume":"95","author":"Sokurenko","year":"1998","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062708450506700_btaa1056-B38","doi-asserted-by":"crossref","first-page":"1789","DOI":"10.1093\/bioinformatics\/bty844","article-title":"Strain-gems: optimized subspecies identification from microbiome data based on accurate variant modeling","volume":"35","author":"Tan","year":"2019","journal-title":"Bioinformatics"},{"key":"2023062708450506700_btaa1056-B39","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1038\/ismej.2016.99","article-title":"Metagenomic covariation along densely sampled environmental gradients in the red sea","volume":"11","author":"Thompson","year":"2017","journal-title":"ISME J"},{"key":"2023062708450506700_btaa1056-B40","doi-asserted-by":"crossref","first-page":"902","DOI":"10.1038\/nmeth.3589","article-title":"Metaphlan2 for enhanced metagenomic taxonomic profiling","volume":"12","author":"Truong","year":"2015","journal-title":"Nat. 
Methods"},{"key":"2023062708450506700_btaa1056-B41","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1101\/gr.216242.116","article-title":"Microbial strain-level population structure and genetic diversity from metagenomes","volume":"27","author":"Truong","year":"2017","journal-title":"Genome Res"},{"key":"2023062708450506700_btaa1056-B42","doi-asserted-by":"crossref","first-page":"1027","DOI":"10.1038\/nature05414","article-title":"An obesity-associated gut microbiome with increased capacity for energy harvest","volume":"444","author":"Turnbaugh","year":"2006","journal-title":"Nature"},{"key":"2023062708450506700_btaa1056-B43","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1002\/0471250953.bi1110s43","article-title":"From fastq data to high-confidence variant calls: the genome analysis toolkit best practices pipeline","volume":"43","author":"Van der Auwera","year":"2013","journal-title":"Curr. Protoc. Bioinf"},{"key":"2023062708450506700_btaa1056-B44","doi-asserted-by":"crossref","first-page":"4223","DOI":"10.1016\/j.febslet.2014.09.039","article-title":"Meta-analyses of human gut microbes associated with obesity and IBD","volume":"588","author":"Walters","year":"2014","journal-title":"FEBS Lett"},{"key":"2023062708450506700_btaa1056-B45","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1136\/gutjnl-2015-309800","article-title":"Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer","volume":"66","author":"Yu","year":"2017","journal-title":"Gut"},{"key":"2023062708450506700_btaa1056-B46","doi-asserted-by":"crossref","first-page":"766","DOI":"10.15252\/msb.20145645","article-title":"Potential of fecal microbiota for early-stage detection of colorectal cancer","volume":"10","author":"Zeller","year":"2014","journal-title":"Mol. Syst. 
Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa1056\/35433191\/btaa1056.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/22-23\/5499\/50716997\/btaa1056.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/22-23\/5499\/50716997\/btaa1056.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T08:47:40Z","timestamp":1687855660000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/22-23\/5499\/6042705"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,12,1]]},"references-count":45,"journal-issue":{"issue":"22-23","published-print":{"date-parts":[[2021,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa1056","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2020,12,1]]},"published":{"date-parts":[[2020,12,1]]}}}