{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T21:42:56Z","timestamp":1768686176335,"version":"3.49.0"},"reference-count":33,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2022,2,3]],"date-time":"2022-02-03T00:00:00Z","timestamp":1643846400000},"content-version":"vor","delay-in-days":2,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Biological Information ITALY and ELIXIR-CONVERGE","award":["H2020-INFRADEV-2019-2"],"award-info":[{"award-number":["H2020-INFRADEV-2019-2"]}]},{"name":"FAIR lifescience data management services","award":["GA 871075"],"award-info":[{"award-number":["GA 871075"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,2,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Nucleotide sequences reference collections or databases are fundamental components in DNA barcoding and metabarcoding data analyses pipelines. In such analyses, the accurate taxonomic assignment is a crucial aspect, relying directly on the availability of comprehensive and curated reference sequence collection and its taxonomy information. The currently wide use of the mitochondrial cytochrome oxidase subunit-I (COXI) as a standard DNA barcode marker in metazoan biodiversity studies highlights the need to shed light on the availability of the related relevant information from different data sources and their eventual integration. To adequately address data integration process, many aspects should be markedly considered starting from DNA sequence curation followed by taxonomy alignment with solid reference backbone and metadata harmonization according to universal standards. Here, we present MetaCOXI, an integrated collection of curated metazoan COXI DNA sequences with their associated harmonized taxonomy and metadata. This collection was built on the two most extensive available data resources, namely the European Nucleotide Archive (ENA) and the Barcode of Life Data System (BOLD). The current release contains more than 5.6 million entries (39.1% unique to BOLD, 3.6% unique to ENA, and 57.2% shared between both), their related taxonomic classification based on NCBI reference taxonomy, and their available main metadata relevant to environmental DNA studies, such as geographical coordinates, sampling country and host species. MetaCOXI is available in standard universal formats (\u2018fasta\u2019 for sequences &amp; \u2018tsv\u2019 for taxonomy and metadata), which can be easily incorporated in standard or specific DNA barcoding and\/or metabarcoding data analysis pipelines.<\/jats:p>\n               <jats:p>Database URL: https:\/\/github.com\/bachob5\/MetaCOXI<\/jats:p>","DOI":"10.1093\/database\/baab084","type":"journal-article","created":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T20:09:02Z","timestamp":1641586142000},"source":"Crossref","is-referenced-by-count":17,"title":["MetaCOXI: an integrated collection of metazoan mitochondrial cytochrome oxidase subunit-I DNA sequences"],"prefix":"10.1093","volume":"2022","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4419-0729","authenticated-orcid":false,"given":"Bachir","family":"Balech","sequence":"first","affiliation":[{"name":"Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council of Italy, via Amendola 122\/O, Bari 70126, Italy"}]},{"given":"Anna","family":"Sandionigi","sequence":"additional","affiliation":[{"name":"Research and Development Department, Quantia Consulting srl, via Francesco Petrarca 20, Mariano Comense 22066, Italy"}]},{"given":"Marinella","family":"Marzano","sequence":"additional","affiliation":[{"name":"Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council of Italy, via Amendola 122\/O, Bari 70126, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3663-0859","authenticated-orcid":false,"given":"Graziano","family":"Pesole","sequence":"additional","affiliation":[{"name":"Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council of Italy, via Amendola 122\/O, Bari 70126, Italy"},{"name":"Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari \u2018A. Moro\u2019, via Orabona 4, Bari 70126, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1488-6052","authenticated-orcid":false,"given":"Monica","family":"Santamaria","sequence":"additional","affiliation":[{"name":"Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council of Italy, via Amendola 122\/O, Bari 70126, Italy"}]}],"member":"286","published-online":{"date-parts":[[2022,2,5]]},"reference":[{"key":"2022021913191085600_R1","doi-asserted-by":"crossref","first-page":"2045","DOI":"10.1111\/j.1365-294X.2012.05470.x","article-title":"Towards next-generation biodiversity assessment using DNA metabarcoding","volume":"21","author":"Taberlet","year":"2012","journal-title":"Mol. Ecol."},{"key":"2022021913191085600_R2","doi-asserted-by":"crossref","DOI":"10.1038\/s42003-021-02031-2","article-title":"Environmental DNA provides higher resolution assessment of riverine biodiversity and ecosystem function via spatio-temporal nestedness and turnover partitioning","volume":"4","author":"Seymour","year":"2021","journal-title":"Commun. Biol."},{"key":"2022021913191085600_R3","doi-asserted-by":"crossref","first-page":"4258","DOI":"10.1111\/mec.15643","article-title":"Environmental DNA: what\u2019s behind the term? Clarifying the terminology and recommendations for its future use in biomonitoring","volume":"29","author":"Pawlowski","year":"2020","journal-title":"Mol. Ecol."},{"key":"2022021913191085600_R4","doi-asserted-by":"crossref","DOI":"10.3390\/microorganisms8020308","article-title":"The human oral microbiome in health and disease: from sequences to ecosystems","volume":"8","author":"Willis","year":"2020","journal-title":"Microorganisms"},{"key":"2022021913191085600_R5","article-title":"Past, present, and future perspectives of environmental DNA (eDNA) metabarcoding: a systematic review in methods, monitoring, and applications of global eDNA","volume":"17","author":"Ruppert","year":"2019","journal-title":"Glob. Ecol. Conserv."},{"key":"2022021913191085600_R6","doi-asserted-by":"crossref","first-page":"1857","DOI":"10.1111\/mec.15060","article-title":"DNA metabarcoding\u2014Need for robust experimental designs to draw sound ecological conclusions","volume":"28","author":"Zinger","year":"2019","journal-title":"Mol. Ecol."},{"key":"2022021913191085600_R7","doi-asserted-by":"crossref","first-page":"5872","DOI":"10.1111\/mec.14350","article-title":"Environmental DNA metabarcoding: transforming how we survey animal and plant communities","volume":"26","author":"Deiner","year":"2017","journal-title":"Mol. Ecol."},{"key":"2022021913191085600_R8","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1098\/rspb.2002.2218","article-title":"Biological identifications through DNA barcodes","volume":"270","author":"Hebert","year":"2003","journal-title":"Proc. R. Soc. B Biol. Sci."},{"key":"2022021913191085600_R9","doi-asserted-by":"crossref","DOI":"10.7717\/peerj.4845","article-title":"Tackling critical parameters in metazoan meta-barcoding experiments: a preliminary study based on coxI DNA barcode","volume":"6","author":"Balech","year":"2018","journal-title":"PeerJ"},{"key":"2022021913191085600_R10","doi-asserted-by":"crossref","DOI":"10.1002\/eap.1877","article-title":"Estimating the biodiversity of terrestrial invertebrates on a forested Island using DNA barcodes and metabarcoding data","volume":"29","author":"Dopheide","year":"2019","journal-title":"Ecol. Appl."},{"key":"2022021913191085600_R11","doi-asserted-by":"crossref","DOI":"10.1186\/s12862-019-1346-y","article-title":"DNA barcoding a unique avifauna: an important tool for evolution, systematics and conservation","volume":"19","author":"Tizard","year":"2019","journal-title":"BMC Evol. Biol."},{"key":"2022021913191085600_R12","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1016\/j.tig.2012.08.001","article-title":"The golden age of DNA metasystematics","volume":"28","author":"Hajibabaei","year":"2012","journal-title":"Trends Genet."},{"key":"2022021913191085600_R13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12526-020-01093-5","article-title":"Significant taxon sampling gaps in DNA databases limit the operational use of marine macrofauna metabarcoding","volume":"50","author":"Hestetun","year":"2020","journal-title":"Mar. Biodivers."},{"key":"2022021913191085600_R14","doi-asserted-by":"crossref","DOI":"10.7717\/peerj.4705","article-title":"DNA metabarcoding of littoral hard-bottom communities: high diversity and database gaps revealed by two molecular markers","volume":"6","author":"Wangensteen","year":"2018","journal-title":"PeerJ"},{"key":"2022021913191085600_R15","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1111\/j.1471-8286.2007.01678.x","article-title":"BOLD: the barcode of life data system: barcoding","volume":"7","author":"Ratnasingham","year":"2007","journal-title":"Mol. Ecol. Notes"},{"key":"2022021913191085600_R16","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkq967","article-title":"The European nucleotide archive","volume":"39","author":"Leinonen","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2022021913191085600_R17","article-title":"A reference cytochrome c oxidase subunit I database curated for hierarchical classification of arthropod metabarcoding data","volume":"2018","author":"Richardson","year":"2018","journal-title":"PeerJ"},{"key":"2022021913191085600_R18","doi-asserted-by":"crossref","DOI":"10.1038\/sdata.2017.27","article-title":"Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples","volume":"4","author":"Machida","year":"2017","journal-title":"Sci. Data"},{"key":"2022021913191085600_R19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2018.156","article-title":"A database of metazoan cytochrome c oxidase subunit I gene sequences derived from GenBank with CO-ARBitrator","volume":"5","author":"Heller","year":"2018","journal-title":"Sci. Data"},{"key":"2022021913191085600_R20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41597-020-0549-9","article-title":"MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding","volume":"7","author":"Arranz","year":"2020","journal-title":"Sci. Data"},{"key":"2022021913191085600_R21","doi-asserted-by":"crossref","DOI":"10.5334\/dsj-2021-011","article-title":"Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences","volume":"20","author":"Damerow","year":"2021","journal-title":"Data Sci. J."},{"key":"2022021913191085600_R22","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/nbt.1823","article-title":"Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications","volume":"29","author":"Yilmaz","year":"2011","journal-title":"Nat. Biotechnol."},{"key":"2022021913191085600_R23","doi-asserted-by":"crossref","DOI":"10.1186\/s13326-016-0097-6","article-title":"The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation","volume":"7","author":"Buttigieg","year":"2016","journal-title":"J. Biomed. Semantics"},{"key":"2022021913191085600_R24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.18","article-title":"Comment: the FAIR Guiding Principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci. Data"},{"key":"2022021913191085600_R25","doi-asserted-by":"crossref","DOI":"10.1093\/database\/baaa062","article-title":"NCBI Taxonomy: a comprehensive update on curation, resources and tools","volume":"2020","author":"Schoch","year":"2020","journal-title":"Database"},{"key":"2022021913191085600_R26","doi-asserted-by":"crossref","DOI":"10.1016\/j.ympev.2020.106857","article-title":"Mitogenomics reveals phylogenetic relationships of Arcoida (Mollusca, Bivalvia) and multiple independent expansions and contractions in mitochondrial genome size","volume":"150","author":"Kong","year":"2020","journal-title":"Mol. Phylogenet. Evol."},{"key":"2022021913191085600_R27","doi-asserted-by":"crossref","DOI":"10.1038\/srep15894","article-title":"A DNA mini-barcoding system for authentication of processed fish products","volume":"5","author":"Shokralla","year":"2015","journal-title":"Sci. Rep."},{"key":"2022021913191085600_R28","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1007\/978-1-61779-591-6_15","article-title":"DNA mini-barcodes","volume":"858","author":"Hajibabaei","year":"2012","journal-title":"Methods Mol. Biol."},{"key":"2022021913191085600_R29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-020-74918-9","article-title":"NGS-based barcoding with mini- COI gene target is useful for pet food market surveys aimed at mislabelling detection","volume":"10","author":"Palumbo","year":"2020","journal-title":"Sci. Rep."},{"key":"2022021913191085600_R30","doi-asserted-by":"crossref","first-page":"D412","DOI":"10.1093\/nar\/gkaa913","article-title":"Pfam: the protein families database in 2021","volume":"49","author":"Mistry","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"2022021913191085600_R31","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile HMM searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput. Biol."},{"key":"2022021913191085600_R32","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-10-421","article-title":"BLAST+: architecture and applications","volume":"10","author":"Camacho","year":"2009","journal-title":"BMC Bioinform."},{"key":"2022021913191085600_R33","doi-asserted-by":"crossref","first-page":"2460","DOI":"10.1093\/bioinformatics\/btq461","article-title":"Search and clustering orders of magnitude faster than BLAST","volume":"26","author":"Edgar","year":"2010","journal-title":"Bioinformatics"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baab084\/42558847\/baab084.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baab084\/42558847\/baab084.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,2,19]],"date-time":"2022-02-19T13:19:51Z","timestamp":1645276791000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baab084\/6521297"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,1]]},"references-count":33,"URL":"https:\/\/doi.org\/10.1093\/database\/baab084","relation":{},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,2,1]]},"published":{"date-parts":[[2022,2,1]]},"article-number":"baab084"}}