{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T04:52:12Z","timestamp":1774068732758,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"21","license":[{"start":{"date-parts":[[2020,7,22]],"date-time":"2020-07-22T00:00:00Z","timestamp":1595376000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000199","name":"United States Department of Agriculture","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000199","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100005825","name":"National Institute of Food and Agriculture","doi-asserted-by":"publisher","award":["IOW03617"],"award-info":[{"award-number":["IOW03617"]}],"id":[{"id":"10.13039\/100005825","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,1,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Next-generation amplicon sequencing is a powerful tool for investigating microbial communities. A main challenge is to distinguish true biological variants from errors caused by amplification and sequencing. In traditional analyses, such errors are eliminated by clustering reads within a sequence similarity threshold, usually 97%, and constructing operational taxonomic units, but the arbitrary threshold leads to low resolution and high false-positive rates. Recently developed \u2018denoising\u2019 methods have proven able to resolve single-nucleotide amplicon variants, but they still miss low-frequency sequences, especially those near more frequent sequences, because they ignore the sequencing quality information.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We introduce AmpliCI, a reference-free, model-based method for rapidly resolving the number, abundance and identity of error-free sequences in massive Illumina amplicon datasets. AmpliCI considers the quality information and allows the data, not an arbitrary threshold or an external database, to drive conclusions. AmpliCI estimates a finite mixture model, using a greedy strategy to gradually select error-free sequences and approximately maximize the likelihood. AmpliCI has better performance than three popular denoising methods, with acceptable computation time and memory usage.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Source code is available at https:\/\/github.com\/DormanLab\/AmpliCI.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary material are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa648","type":"journal-article","created":{"date-parts":[[2020,7,16]],"date-time":"2020-07-16T15:13:02Z","timestamp":1594912382000},"page":"5151-5158","source":"Crossref","is-referenced-by-count":21,"title":["AmpliCI: a high-resolution model-based approach for denoising Illumina amplicon data"],"prefix":"10.1093","volume":"36","author":[{"given":"Xiyu","family":"Peng","sequence":"first","affiliation":[{"name":"Department of Statistics , Ames, IA 50011, USA"},{"name":"Interdepartmental Program in Bioinformatics and Computational Biology , Ames, IA 50011, USA"}]},{"given":"Karin S","family":"Dorman","sequence":"additional","affiliation":[{"name":"Department of Statistics , Ames, IA 50011, USA"},{"name":"Interdepartmental Program in Bioinformatics and Computational Biology , Ames, IA 50011, USA"},{"name":"Department of Genetics, Development and Cell Biology, Iowa State University , Ames, IA 50011, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,7,22]]},"reference":[{"key":"2023062408071177700_btaa648-B1","doi-asserted-by":"crossref","first-page":"e00191","DOI":"10.1128\/mSystems.00191-16","article-title":"Deblur rapidly resolves single-nucleotide community sequence patterns","volume":"2","author":"Amir","year":"2017","journal-title":"mSystems"},{"key":"2023062408071177700_btaa648-B2","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1186\/s40168-018-0543-z","article-title":"Quantification of variation and the impact of biomass in targeted 16S rRNA gene sequencing studies","volume":"6","author":"Bender","year":"2018","journal-title":"Microbiome"},{"key":"2023062408071177700_btaa648-B3","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nmeth.2276","article-title":"Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing","volume":"10","author":"Bokulich","year":"2013","journal-title":"Nat. Methods"},{"key":"2023062408071177700_btaa648-B4","first-page":"e934v2","article-title":"A standardized, extensible framework for optimizing classification improves marker-gene taxonomic assignments","volume":"3","author":"Bokulich","year":"2015","journal-title":"PeerJ PrePrints"},{"key":"2023062408071177700_btaa648-B5","doi-asserted-by":"crossref","first-page":"e00062","DOI":"10.1128\/mSystems.00062-16","article-title":"mockrobiota: a public resource for microbiome bioinformatics benchmarking","volume":"1","author":"Bokulich","year":"2016","journal-title":"mSystems"},{"key":"2023062408071177700_btaa648-B6","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1038\/nmeth.3869","article-title":"DADA2: high-resolution sample inference from Illumina amplicon data","volume":"13","author":"Callahan","year":"2016","journal-title":"Nat. Methods"},{"key":"2023062408071177700_btaa648-B7","doi-asserted-by":"crossref","first-page":"2639","DOI":"10.1038\/ismej.2017.119","article-title":"Exact sequence variants should replace operational taxonomic units in marker-gene data analysis","volume":"11","author":"Callahan","year":"2017","journal-title":"ISME J"},{"key":"2023062408071177700_btaa648-B8","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1038\/nmeth.f.303","article-title":"QIIME allows analysis of high-throughput community sequencing data","volume":"7","author":"Caporaso","year":"2010","journal-title":"Nat. Methods"},{"key":"2023062408071177700_btaa648-B9","doi-asserted-by":"crossref","first-page":"4516","DOI":"10.1073\/pnas.1000080107","article-title":"Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample","volume":"108","author":"Caporaso","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062408071177700_btaa648-B10","doi-asserted-by":"crossref","first-page":"2460","DOI":"10.1093\/bioinformatics\/btq461","article-title":"Search and clustering orders of magnitude faster than BLAST","volume":"26","author":"Edgar","year":"2010","journal-title":"Bioinformatics"},{"key":"2023062408071177700_btaa648-B11","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1038\/nmeth.2604","article-title":"UPARSE: highly accurate OTU sequences from microbial amplicon reads","volume":"10","author":"Edgar","year":"2013","journal-title":"Nat. Methods"},{"key":"2023062408071177700_btaa648-B12","doi-asserted-by":"publisher","author":"Edgar","year":"2016","DOI":"10.1101\/074252"},{"key":"2023062408071177700_btaa648-B13","doi-asserted-by":"publisher","author":"Edgar","year":"2016","DOI":"10.1101\/081257"},{"key":"2023062408071177700_btaa648-B14","doi-asserted-by":"crossref","first-page":"e3889","DOI":"10.7717\/peerj.3889","article-title":"Accuracy of microbial community diversity estimated by closed- and open-reference OTUs","volume":"5","author":"Edgar","year":"2017","journal-title":"PeerJ"},{"key":"2023062408071177700_btaa648-B15","doi-asserted-by":"crossref","first-page":"2371","DOI":"10.1093\/bioinformatics\/bty113","article-title":"Updating the 97% identity threshold for 16S ribosomal RNA OTUs","volume":"34","author":"Edgar","year":"2018","journal-title":"Bioinformatics"},{"key":"2023062408071177700_btaa648-B16","doi-asserted-by":"crossref","first-page":"1111","DOI":"10.1111\/2041-210X.12114","article-title":"Oligotyping: differentiating between closely related microbial taxa using 16S rRNA gene data","volume":"4","author":"Eren","year":"2013","journal-title":"Methods Ecol. Evol"},{"key":"2023062408071177700_btaa648-B17","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1038\/ismej.2014.195","article-title":"Minimum entropy decomposition: unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences","volume":"9","author":"Eren","year":"2015","journal-title":"ISME J"},{"key":"2023062408071177700_btaa648-B18","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1101\/gr.8.3.186","article-title":"Base-calling of automated sequencer traces using phred. II. Error probabilities","volume":"8","author":"Ewing","year":"1998","journal-title":"Genome Res"},{"key":"2023062408071177700_btaa648-B19","doi-asserted-by":"crossref","first-page":"e21","DOI":"10.1093\/nar\/gkx1201","article-title":"SeekDeep: single-base resolution de novo clustering for amplicon deep sequencing","volume":"46","author":"Hathaway","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023062408071177700_btaa648-B20","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"ART: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023062408071177700_btaa648-B21","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. Classif"},{"key":"2023062408071177700_btaa648-B22","doi-asserted-by":"crossref","first-page":"R143","DOI":"10.1186\/gb-2007-8-7-r143","article-title":"Accuracy and quality of massively parallel DNA pyrosequencing","volume":"8","author":"Huse","year":"2007","journal-title":"Genome Biol"},{"key":"2023062408071177700_btaa648-B23","doi-asserted-by":"crossref","first-page":"1889","DOI":"10.1111\/j.1462-2920.2010.02193.x","article-title":"Ironing out the wrinkles in the rare biosphere through improved OTU clustering","volume":"12","author":"Huse","year":"2010","journal-title":"Environ. Microbiol"},{"key":"2023062408071177700_btaa648-B24","doi-asserted-by":"crossref","first-page":"5029","DOI":"10.1038\/s41467-019-13036-1","article-title":"Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis","volume":"10","author":"Johnson","year":"2019","journal-title":"Nat. Commun"},{"key":"2023062408071177700_btaa648-B25","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/B978-1-4832-3211-9.50009-7","volume-title":"Mammalian Protein Metabolism","author":"Jukes","year":"1969"},{"key":"2023062408071177700_btaa648-B26","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1038\/s41579-018-0029-9","article-title":"Best practices for analysing microbiomes","volume":"16","author":"Knight","year":"2018","journal-title":"Nat. Rev. Microbiol"},{"key":"2023062408071177700_btaa648-B27","doi-asserted-by":"crossref","first-page":"2567","DOI":"10.1073\/pnas.0409727102","article-title":"Genomic insights that advance the species definition for prokaryotes","volume":"102","author":"Konstantinidis","year":"2005","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062408071177700_btaa648-B28","doi-asserted-by":"crossref","first-page":"e00003","DOI":"10.1128\/mSystems.00003-15","article-title":"Open-source sequence clustering methods improve the state of the art","volume":"1","author":"Kopylova","year":"2016","journal-title":"mSystems"},{"key":"2023062408071177700_btaa648-B29","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1186\/s13059-019-1659-6","article-title":"Analysis of error profiles in deep next-generation sequencing data","volume":"20","author":"Ma","year":"2019","journal-title":"Genome Biol"},{"key":"2023062408071177700_btaa648-B30","doi-asserted-by":"crossref","first-page":"8988","DOI":"10.1038\/srep08988","article-title":"The vaginal microbiome during pregnancy and the postpartum period in a European population","volume":"5","author":"MacIntyre","year":"2015","journal-title":"Sci. Rep"},{"key":"2023062408071177700_btaa648-B31","doi-asserted-by":"crossref","DOI":"10.1002\/0471721182","volume-title":"Finite Mixture Models. Wiley Series in Probability and Statistics","author":"McLachlan","year":"2000"},{"key":"2023062408071177700_btaa648-B32","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1214\/09-SS053","article-title":"Finite mixture models and model-based clustering","volume":"4","author":"Melnykov","year":"2010","journal-title":"Stat. Surv"},{"key":"2023062408071177700_btaa648-B33","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1186\/s12859-016-1061-2","article-title":"IPED: a highly efficient denoising tool for Illumina MiSeq paired-end 16S rRNA gene amplicon sequencing data","volume":"17","author":"Mysara","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023062408071177700_btaa648-B34","doi-asserted-by":"crossref","first-page":"e90","DOI":"10.1093\/nar\/gkr344","article-title":"Sequence-specific error profile of Illumina sequencers","volume":"39","author":"Nakamura","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023062408071177700_btaa648-B35","doi-asserted-by":"crossref","first-page":"e5364","DOI":"10.7717\/peerj.5364","article-title":"Denoising the denoisers: an independent evaluation of microbiome sequence error-correction approaches","volume":"6","author":"Nearing","year":"2018","journal-title":"PeerJ"},{"key":"2023062408071177700_btaa648-B36","doi-asserted-by":"crossref","first-page":"D590","DOI":"10.1093\/nar\/gks1219","article-title":"The SILVA ribosomal RNA gene database project: improved data processing and web-based tools","volume":"41","author":"Quast","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023062408071177700_btaa648-B37","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1038\/nmeth.1361","article-title":"Accurate determination of microbial diversity from 454 pyrosequencing data","volume":"6","author":"Quince","year":"2009","journal-title":"Nat. Methods"},{"key":"2023062408071177700_btaa648-B38","doi-asserted-by":"crossref","first-page":"1929","DOI":"10.1099\/ijs.0.000161","article-title":"Cautionary tale of using 16S rRNA gene sequence similarity values in identification of human-associated bacterial species","volume":"65","author":"Rossi-Tamisier","year":"2015","journal-title":"Int. J. Syst. Evol. Microbiol"},{"key":"2023062408071177700_btaa648-B39","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s12859-016-0976-y","article-title":"Illumina error profiles: resolving fine-scale variation in metagenomic sequencing data","volume":"17","author":"Schirmer","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023062408071177700_btaa648-B40","doi-asserted-by":"crossref","first-page":"3219","DOI":"10.1128\/AEM.02810-10","article-title":"Assessing and improving methods used in operational taxonomic unit-based approaches for 16S rRNA gene sequence analysis","volume":"77","author":"Schloss","year":"2011","journal-title":"Appl. Environ. Microbiol"},{"key":"2023062408071177700_btaa648-B41","first-page":"152","article-title":"Taxonomic parameters revisited: tarnished gold standards","volume":"33","author":"Stackebrandt","year":"2006","journal-title":"Microbiol. Today"},{"key":"2023062408071177700_btaa648-B42","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1099\/00207713-44-4-846","article-title":"Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology","volume":"44","author":"Stackebrandt","year":"1994","journal-title":"Int. J. Syst. Evol. Microbiol"},{"key":"2023062408071177700_btaa648-B43","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/ismej.2014.117","article-title":"Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution","volume":"9","author":"Tikhonov","year":"2015","journal-title":"ISME J"},{"key":"2023062408071177700_btaa648-B44","doi-asserted-by":"crossref","first-page":"S52","DOI":"10.1186\/1471-2105-12-S1-S52","article-title":"Repeat-aware modeling and correction of short read errors","volume":"12","author":"Yang","year":"2011","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa648\/34001361\/btaa648.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/21\/5151\/50692739\/btaa648.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/21\/5151\/50692739\/btaa648.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T19:31:33Z","timestamp":1687635093000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/21\/5151\/5875058"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,7,22]]},"references-count":44,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2021,1,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa648","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.02.23.961227","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,11,1]]},"published":{"date-parts":[[2020,7,22]]}}}