{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,16]],"date-time":"2026-03-16T22:39:50Z","timestamp":1773700790108,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"20","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Studies of the biochemical functions and activities of uncultivated microorganisms in the environment require analysis of DNA sequences for phylogenetic characterization and for the development of sequence-based assays for the detection of microorganisms. The numbers of sequences for genes that are indicators of environmentally important functions such as nitrogen (N 2 ) fixation have been rapidly growing over the past few decades. Obtaining these sequences from the National Center for Biotechnology Information\u2019s GenBank database is problematic because of annotation errors, nomenclature variation and paralogues; moreover, GenBank\u2019s structure and tools are not conducive to searching solely by function. For some genes, such as the nifH gene commonly used to assess community potential for N 2 fixation, manual collection and curation are becoming intractable because of the large number of sequences in GenBank and the large number of highly similar paralogues. If analysis is to keep pace with sequence discovery, an automated retrieval and curation system is necessary.<\/jats:p><jats:p>Results: ARBitrator uses a two-step process composed of a broad collection of potential homologues followed by screening with a best hit strategy to conserved domains. 34 420 nifH sequences were identified in GenBank as of November 20, 2012. The false-positive rate is \u223c0.033%. 
ARBitrator rapidly updates a public nifH sequence database, and we show that it can be adapted for other genes.<\/jats:p><jats:p>Availability and implementation: Java source and executable code are freely available to non-commercial users at http:\/\/pmc.ucsc.edu\/\u223cwwwzehr\/research\/database\/ .<\/jats:p><jats:p>Contact: \u00a0zehrj@ucsc.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary information is available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu417","type":"journal-article","created":{"date-parts":[[2014,7,3]],"date-time":"2014-07-03T00:27:22Z","timestamp":1404347242000},"page":"2883-2890","source":"Crossref","is-referenced-by-count":67,"title":["ARBitrator: a software pipeline for on-demand retrieval of auto-curated <i>nifH<\/i> sequences from GenBank"],"prefix":"10.1093","volume":"30","author":[{"given":"Philip","family":"Heller","sequence":"first","affiliation":[{"name":"1 Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA, 2 Department of Energy (DOE) Joint Genome Institute, Walnut Creek, CA 94598, USA and 3 Department of Ocean Sciences, University of California, Santa Cruz, CA 95064, USA"}]},{"given":"H. 
James","family":"Tripp","sequence":"additional","affiliation":[{"name":"1 Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA, 2 Department of Energy (DOE) Joint Genome Institute, Walnut Creek, CA 94598, USA and 3 Department of Ocean Sciences, University of California, Santa Cruz, CA 95064, USA"}]},{"given":"Kendra","family":"Turk-Kubo","sequence":"additional","affiliation":[{"name":"1 Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA, 2 Department of Energy (DOE) Joint Genome Institute, Walnut Creek, CA 94598, USA and 3 Department of Ocean Sciences, University of California, Santa Cruz, CA 95064, USA"}]},{"given":"Jonathan P.","family":"Zehr","sequence":"additional","affiliation":[{"name":"1 Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA, 2 Department of Energy (DOE) Joint Genome Institute, Walnut Creek, CA 94598, USA and 3 Department of Ocean Sciences, University of California, Santa Cruz, CA 95064, USA"}]}],"member":"286","published-online":{"date-parts":[[2014,7,2]]},"reference":[{"key":"2023012711562049900_btu417-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023012711562049900_btu417-B2","doi-asserted-by":"crossref","first-page":"23D","DOI":"10.1093\/nar\/gkh045","article-title":"GenBank: update","volume":"32","author":"Benson","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B3","doi-asserted-by":"crossref","first-page":"205","DOI":"10.3389\/fmicb.2011.00205","article-title":"An alternative path for the evolution of biological nitrogen fixation","volume":"2","author":"Boyd","year":"2011","journal-title":"Front. 
Microbiol."},{"key":"2023012711562049900_btu417-B4","doi-asserted-by":"crossref","first-page":"6590","DOI":"10.1128\/jb.176.21.6590-6598.1994","article-title":"Cloning, DNA sequencing, and characterization of a nifD-homologous gene from the archaeon Methanosarcina barkeri 227 which resembles nifD1 from the eubacterium Clostridium pasteurianum","volume":"176","author":"Chien","year":"1994","journal-title":"J. Bacteriol."},{"key":"2023012711562049900_btu417-B5","doi-asserted-by":"crossref","first-page":"1301","DOI":"10.1099\/mic.0.26585-0","article-title":"NifH and NifD phylogenies: an evolutionary basis for understanding nitrogen fixation capabilities of methanotrophic bacteria","volume":"150","author":"Dedysh","year":"2004","journal-title":"Microbiology"},{"key":"2023012711562049900_btu417-B6","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden Markov models","volume":"9","author":"Eddy","year":"1998","journal-title":"Bioinform. Rev."},{"key":"2023012711562049900_btu417-B8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s002390010061","article-title":"Molecular evolution of nitrogen fixation: the evolutionary history of the nifD, nifK, nifE, and nifN genes","volume":"51","author":"Fani","year":"2000","journal-title":"J. Mol. Evol."},{"key":"2023012711562049900_btu417-B9","doi-asserted-by":"crossref","first-page":"W29","DOI":"10.1093\/nar\/gkr367","article-title":"HMMER web server: interactive sequence similarity searching","volume":"39","author":"Finn","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023012711562049900_btu417-B10","doi-asserted-by":"crossref","first-page":"1790","DOI":"10.1111\/j.1462-2920.2011.02488.x","article-title":"A global census of nitrogenase diversity","volume":"13","author":"Gaby","year":"2011","journal-title":"Environ. 
Microbiol."},{"key":"2023012711562049900_btu417-B11","doi-asserted-by":"crossref","first-page":"bau001","DOI":"10.1093\/database\/bau001","article-title":"A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria","volume":"2014","author":"Gaby","year":"2014","journal-title":"Database"},{"key":"2023012711562049900_btu417-B12","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1099\/ijs.0.02821-0","article-title":"Molecular phylogeny of the heterocystous cyanobacteria (subsections IV and V) based on nifD","volume":"54","author":"Henson","year":"2004","journal-title":"Int. J. Syst. Evol. Microbiol."},{"key":"2023012711562049900_btu417-B13","doi-asserted-by":"crossref","first-page":"1591","DOI":"10.1099\/ijs.0.02958-0","article-title":"Comparison of 16S rRNA, nifD , recA , gyrB , rpoB and fusA genes within the family Geobacteraceae fam. nov","volume":"54","author":"Holmes","year":"2004","journal-title":"Int. J. Syst. Evol. Microbiol."},{"key":"2023012711562049900_btu417-B14","doi-asserted-by":"crossref","first-page":"680","DOI":"10.1093\/bioinformatics\/btq003","article-title":"CD-HIT Suite: a web server for clustering and comparing biological sequences","volume":"26","author":"Huang","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012711562049900_btu417-B15","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1080\/10409230391036766","article-title":"Nitrogen fixation: the mechanism of the Mo-dependent nitrogenase","volume":"38","author":"Igarashi","year":"2003","journal-title":"Critical Rev. Biochem. Mol. Biol."},{"key":"2023012711562049900_btu417-B16","doi-asserted-by":"crossref","first-page":"3258","DOI":"10.1128\/jb.171.6.3258-3267.1989","article-title":"Two nifA-like genes required for expression of alternative nitrogenases by Azotobacter vinelandii","volume":"171","author":"Joerger","year":"1989","journal-title":"J. 
Bacteriol."},{"key":"2023012711562049900_btu417-B17","doi-asserted-by":"crossref","first-page":"D29","DOI":"10.1093\/nar\/gki098","article-title":"The EMBL nucleotide sequence database","volume":"33","author":"Kanz","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B18","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.1006\/jmbi.1994.1104","article-title":"Protein modeling using hidden Markov models","volume":"235","author":"Krogh","year":"1994","journal-title":"J. Mol. Biol."},{"key":"2023012711562049900_btu417-B19","doi-asserted-by":"crossref","first-page":"D16","DOI":"10.1093\/nar\/gkl913","article-title":"EMBL nucleotide sequence database in 2006","volume":"35","author":"Kulikova","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B21","doi-asserted-by":"crossref","first-page":"5705","DOI":"10.1128\/jb.173.18.5705-5711.1991","article-title":"Identification of an alternative nitrogenase system in Rhodospirillum rubrum","volume":"173","author":"Lehman","year":"1991","journal-title":"J. Bacteriol."},{"key":"2023012711562049900_btu417-B23","doi-asserted-by":"crossref","first-page":"5308","DOI":"10.1128\/AEM.67.11.5308-5314.2001","article-title":"Recovery and phylogenetic analysis of nifh sequences from diazotrophic bacteria associated with dead aboveground biomass of spartina alterniflora","volume":"67","author":"Lovell","year":"2001","journal-title":"Appl. Environ. 
Microbiol."},{"key":"2023012711562049900_btu417-B24","doi-asserted-by":"crossref","first-page":"1363","DOI":"10.1093\/nar\/gkh293","article-title":"ARB: a software environment for sequence data","volume":"32","author":"Ludwig","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B25","first-page":"1","article-title":"CDD: a database of conserved domain alignments with links to domain three-dimensional structure","volume":"30","author":"Marchler-Bauer","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B26","doi-asserted-by":"crossref","first-page":"D225","DOI":"10.1093\/nar\/gkq1189","article-title":"CDD: a conserved domain database for the functional annotation of proteins","volume":"39","author":"Marchler-Bauer","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B27","doi-asserted-by":"crossref","first-page":"2747","DOI":"10.1128\/aem.62.8.2747-2752.1996","article-title":"Diversity of nitrogen fixation genes in the symbiotic intestinal microflora of the termite Reticulitermes speratus","volume":"62","author":"Ohkuma","year":"1996","journal-title":"Appl. Environ. Microbiol."},{"issue":"Pt. 8","key":"2023012711562049900_btu417-B28","doi-asserted-by":"crossref","first-page":"2557","DOI":"10.1099\/00221287-148-8-2557","article-title":"Conflicting phylogeographic patterns in rRNA and nifD indicate regionally restricted gene transfer in Bradyrhizobium","volume":"148","author":"Parker","year":"2002","journal-title":"Microbiology"},{"key":"2023012711562049900_btu417-B29","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1093\/molbev\/msh047","article-title":"The natural history of nitrogen fixation","volume":"21","author":"Raymond","year":"2004","journal-title":"Mol. Biol. 
Evol."},{"key":"2023012711562049900_btu417-B30","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1111\/j.1365-2699.2010.02284.x","article-title":"Rhizobial hitchhikers from Down Under: invasional meltdown in a plant-bacteria mutualism?","volume":"37","author":"Rodr\u00edguez-Echeverr\u00eda","year":"2010","journal-title":"J. Biogeogr."},{"key":"2023012711562049900_btu417-B31","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1146\/annurev.micro.62.081307.162737","article-title":"Biosynthesis of the iron-molybdenum cofactor of nitrogenase","volume":"62","author":"Rubio","year":"2008","journal-title":"Annu. Rev. Microbiol."},{"key":"2023012711562049900_btu417-B32","first-page":"406","article-title":"The neighbor-joining method: a new method for reconstructing phylogenetic trees","volume":"4","author":"Saitou","year":"1987","journal-title":"Mol. Biol. Evol."},{"key":"2023012711562049900_btu417-B33","doi-asserted-by":"crossref","first-page":"3278","DOI":"10.1128\/aem.57.11.3278-3286.1991","article-title":"Frankia genus-specific characterization by polymerase chain reaction","volume":"57","author":"Simonet","year":"1991","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012711562049900_btu417-B34","doi-asserted-by":"crossref","first-page":"7392","DOI":"10.1128\/JB.00876-07","article-title":"Expression and association of group IV nitrogenase NifD and NifH homologs in the non-nitrogen-fixing archaeon Methanocaldococcus jannaschii","volume":"189","author":"Staples","year":"2007","journal-title":"J. Bacteriol."},{"key":"2023012711562049900_btu417-B35","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1111\/j.1574-6941.1996.tb00342.x","article-title":"Consortial N 2 fixation: a strategy for meeting nitrogen requirements of marine and terrestrial cyanobacterial mats","volume":"21","author":"Steppe","year":"1996","journal-title":"FEMS Microbiol. 
Ecol."},{"key":"2023012711562049900_btu417-B36","doi-asserted-by":"crossref","first-page":"8792","DOI":"10.1093\/nar\/gkr576","article-title":"Misannotations of rRNA can now generate 90% false positive protein matches in metatranscriptomic studies","volume":"39","author":"Tripp","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012711562049900_btu417-B37","doi-asserted-by":"crossref","first-page":"1414","DOI":"10.1128\/jb.177.5.1414-1417.1995","article-title":"Remarkable N2-fixing bacterial diversity detected in rice roots by molecular evolutionary analysis of nifH gene sequences","volume":"177","author":"Ueda","year":"1995","journal-title":"J. Bacteriol."},{"key":"2023012711562049900_btu417-B38","first-page":"43","article-title":"Phylogenetic classification of nitrogen-fixing organisms","volume-title":"Biological nitrogen fixation","author":"Young","year":"1992"},{"key":"2023012711562049900_btu417-B39","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1007\/1-4020-3054-1_14","article-title":"The phylogeny and evolution of nitrogenases","volume-title":"Genomes and Genomics of Nitrogen-Fixing Organisms","author":"Young","year":"2005"},{"key":"2023012711562049900_btu417-B40","doi-asserted-by":"crossref","first-page":"2522","DOI":"10.1128\/aem.55.10.2522-2526.1989","article-title":"Use of degenerate oligonucleotides for amplification of the nifH gene from the marine cyanobacterium Trichodesmium thiebautii","volume":"55","author":"Zehr","year":"1989","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012711562049900_btu417-B41","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1007\/BF00183062","article-title":"Problems and Promises of Assaying the Genetic Potential for Nitrogen Fixation in the Marine Environment","volume":"32","author":"Zehr","year":"1996","journal-title":"Microb. 
Ecol."},{"key":"2023012711562049900_btu417-B42","doi-asserted-by":"crossref","first-page":"2527","DOI":"10.1128\/aem.61.7.2527-2532.1995","article-title":"Diversity of heterotrophic nitrogen fixation genes in a marine cyanobacterial mat","volume":"61","author":"Zehr","year":"1995","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012711562049900_btu417-B43","doi-asserted-by":"crossref","first-page":"1443","DOI":"10.1099\/00221287-143-4-1443","article-title":"Phylogeny of cyanobacterial nifH genes: evolutionary implications and potential applications to natural assemblages","volume":"143","author":"Zehr","year":"1997","journal-title":"Microbiology"},{"key":"2023012711562049900_btu417-B44","doi-asserted-by":"crossref","first-page":"3444","DOI":"10.1128\/AEM.64.9.3444-3450.1998","article-title":"New nitrogen-fixing microorganisms detected in oligotrophic oceans by amplification of nitrogenase (nifH) genes","volume":"64","author":"Zehr","year":"1998","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012711562049900_btu417-B45","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1046\/j.1462-2920.2003.00451.x","article-title":"Nitrogenase gene diversity and microbial community structure: a cross-system comparison","volume":"5","author":"Zehr","year":"2003","journal-title":"Environ. 
Microbiol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/20\/2883\/48929918\/bioinformatics_30_20_2883.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/20\/2883\/48929918\/bioinformatics_30_20_2883.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T20:01:22Z","timestamp":1716926482000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/20\/2883\/2422235"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,7,2]]},"references-count":42,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2014,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu417","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,10,15]]},"published":{"date-parts":[[2014,7,2]]}}}