{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,26]],"date-time":"2026-04-26T23:36:55Z","timestamp":1777246615614,"version":"3.51.4"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: A bacterial polysaccharide utilization locus (PUL) is a set of physically linked genes that orchestrate the breakdown of a specific glycan. PULs are prevalent in the Bacteroidetes phylum and are key to the digestion of complex carbohydrates, notably by the human gut microbiota. A given Bacteroidetes genome can encode dozens of different PULs whose boundaries and precise gene content are difficult to predict.<\/jats:p>\n               <jats:p>Results: Here, we present a fully automated approach for PUL prediction using genomic context and domain annotation alone. By combining the detection of a pair of marker genes with operon prediction using intergenic distances, and queries to the carbohydrate-active enzymes database (www.cazy.org), our predictor achieved above 86% accuracy in two Bacteroides species with extensive experimental PUL characterization.<\/jats:p>\n               <jats:p>Availability and implementation: PUL predictions in 67 Bacteroidetes genomes from the human gut microbiota and two additional species, from the canine oral sphere and from the environment, are presented in our database accessible at www.cazy.org\/PULDB\/index.php.<\/jats:p>\n               <jats:p>Contact: \u00a0bernard.henrissat@afmb.univ-mrs.fr<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu716","type":"journal-article","created":{"date-parts":[[2014,10,30]],"date-time":"2014-10-30T05:01:31Z","timestamp":1414645291000},"page":"647-655","source":"Crossref","is-referenced-by-count":224,"title":["Automatic prediction of polysaccharide utilization loci in Bacteroidetes species"],"prefix":"10.1093","volume":"31","author":[{"given":"Nicolas","family":"Terrapon","sequence":"first","affiliation":[{"name":"1 Centre National de la Recherche Scientifique, CNRS UMR 7257, 13288 Marseille, France, 2Aix-Marseille Universit\u00e9, AFMB, 13288 Marseille, France, 3Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne NE2 4HH, UK and 4Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia"},{"name":"1 Centre National de la Recherche Scientifique, CNRS UMR 7257, 13288 Marseille, France, 2Aix-Marseille Universit\u00e9, AFMB, 13288 Marseille, France, 3Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne NE2 4HH, UK and 4Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia"}]},{"given":"Vincent","family":"Lombard","sequence":"additional","affiliation":[{"name":"1 Centre National de la Recherche Scientifique, CNRS UMR 7257, 13288 Marseille, France, 2Aix-Marseille Universit\u00e9, AFMB, 13288 Marseille, France, 3Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne NE2 4HH, UK and 4Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia"}]},{"given":"Harry J.","family":"Gilbert","sequence":"additional","affiliation":[{"name":"1 Centre National de la Recherche Scientifique, CNRS UMR 7257, 13288 Marseille, France, 2Aix-Marseille Universit\u00e9, AFMB, 13288 Marseille, France, 3Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne NE2 4HH, UK and 4Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia"}]},{"given":"Bernard","family":"Henrissat","sequence":"additional","affiliation":[{"name":"1 Centre National de la Recherche Scientifique, CNRS UMR 7257, 13288 Marseille, France, 2Aix-Marseille Universit\u00e9, AFMB, 13288 Marseille, France, 3Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne NE2 4HH, UK and 4Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia"},{"name":"1 Centre National de la Recherche Scientifique, CNRS UMR 7257, 13288 Marseille, France, 2Aix-Marseille Universit\u00e9, AFMB, 13288 Marseille, France, 3Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne NE2 4HH, UK and 4Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia"}]}],"member":"286","published-online":{"date-parts":[[2014,10,28]]},"reference":[{"key":"2023020116162494900_btu716-B1","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1006\/jmbi.2001.4776","article-title":"Domain combinations in archaeal, eubacterial and eukaryotic proteomes","volume":"310","author":"Apic","year":"2001","journal-title":"J. Mol. Biol."},{"key":"2023020116162494900_btu716-B2","doi-asserted-by":"crossref","first-page":"i34","DOI":"10.1093\/bioinformatics\/btg1003","article-title":"Predicting bacterial transcription units using sequence and expression data","volume":"19","author":"Bockhorst","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020116162494900_btu716-B3","doi-asserted-by":"crossref","first-page":"34614","DOI":"10.1074\/jbc.M112.397380","article-title":"Multidomain carbohydrate-binding proteins involved in Bacteroides thetaiotaomicron starch metabolism","volume":"287","author":"Cameron","year":"2012","journal-title":"J. Biol. Chem."},{"key":"2023020116162494900_btu716-B4","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1016\/S0968-0004(98)01274-2","article-title":"Conservation of gene order: a fingerprint of proteins that physically interact","volume":"23","author":"Dandekar","year":"1998","journal-title":"Trends Biochem. Sci."},{"key":"2023020116162494900_btu716-B5","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1038\/nrmicro3050","article-title":"The abundance and variety of carbohydrate-active enzymes in the human gut microbiota","volume":"11","author":"El Kaoutari","year":"2013","journal-title":"Nat. Rev. Microbiol."},{"key":"2023020116162494900_btu716-B6","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/47056","article-title":"Protein interaction maps for complete genomes based on gene fusion events","volume":"402","author":"Enright","year":"1999","journal-title":"Nature"},{"key":"2023020116162494900_btu716-B7","doi-asserted-by":"crossref","first-page":"1216","DOI":"10.1093\/nar\/29.5.1216","article-title":"Prediction of operons in microbial genomes","volume":"29","author":"Ermolaeva","year":"2001","journal-title":"Nucleic Acids Res."},{"issue":"Database issue","key":"2023020116162494900_btu716-B8","doi-asserted-by":"crossref","first-page":"D222","DOI":"10.1093\/nar\/gkt1223","article-title":"Pfam: the protein families database","volume":"42","author":"Finn","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023020116162494900_btu716-B9","doi-asserted-by":"crossref","first-page":"289","DOI":"10.4161\/gmic.19897","article-title":"Microbial degradation of complex carbohydrates in the gut","volume":"3","author":"Flint","year":"2012","journal-title":"Gut Microbes"},{"key":"2023020116162494900_btu716-B10","doi-asserted-by":"crossref","first-page":"1632","DOI":"10.1101\/gr.183801","article-title":"Annotation transfer for genomics: measuring functional divergence in multi-domain proteins","volume":"11","author":"Hegyi","year":"2001","journal-title":"Genome Res."},{"key":"2023020116162494900_btu716-B11","doi-asserted-by":"crossref","first-page":"36328","DOI":"10.1074\/jbc.M806115200","article-title":"Structural and functional analysis of a glycoside hydrolase family 97 enzyme from Bacteroides thetaiotaomicron","volume":"283","author":"Kitamura","year":"2008","journal-title":"J. Biol. Chem."},{"key":"2023020116162494900_btu716-B12","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1016\/j.tim.2005.06.005","article-title":"TonB-dependent trans-envelope signalling: the exception or the rule?","volume":"13","author":"Koebnik","year":"2005","journal-title":"Trends Microbiol."},{"key":"2023020116162494900_btu716-B13","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1016\/j.str.2009.12.010","article-title":"SusG: a unique cell-membrane-associated alpha-amylase from a prominent human gut symbiont targets complex starch molecules","volume":"18","author":"Koropatkin","year":"2010","journal-title":"Structure"},{"key":"2023020116162494900_btu716-B14","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1016\/j.str.2008.03.017","article-title":"Starch catabolism by a prominent human gut symbiont is directed by the recognition of amylose helices","volume":"16","author":"Koropatkin","year":"2008","journal-title":"Structure"},{"key":"2023020116162494900_btu716-B15","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1093\/glycob\/4.6.759","article-title":"A calculation of all possible oligosaccharide isomers both branched and linear yields 1.05 x 10(12) structures for a reducing hexasaccharide: the Isomer Barrier to development of single-method saccharide sequencing or synthesis systems","volume":"4","author":"Laine","year":"1994","journal-title":"Glycobiology"},{"key":"2023020116162494900_btu716-B16","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1038\/nature12907","article-title":"A discrete genetic locus confers xyloglucan metabolism in select human gut Bacteroidetes","volume":"506","author":"Larsbrink","year":"2014","journal-title":"Nature"},{"key":"2023020116162494900_btu716-B17","doi-asserted-by":"crossref","first-page":"1022","DOI":"10.1038\/4441022a","article-title":"Microbial ecology: human gut microbes associated with obesity","volume":"444","author":"Ley","year":"2006","journal-title":"Nature"},{"issue":"Database issue","key":"2023020116162494900_btu716-B18","doi-asserted-by":"crossref","first-page":"D490","DOI":"10.1093\/nar\/gkt1178","article-title":"The carbohydrate-active enzymes database (CAZy) in 2013","volume":"42","author":"Lombard","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023020116162494900_btu716-B19","doi-asserted-by":"crossref","first-page":"1050","DOI":"10.1111\/j.1365-2958.2011.07750.x","article-title":"The genome and surface proteome of Capnocytophaga canimorsus reveal a key role of glycan foraging systems in host glycoproteins deglycosylation","volume":"81","author":"Manfredi","year":"2011","journal-title":"Mol. Microbiol."},{"issue":"Database issue","key":"2023020116162494900_btu716-B20","doi-asserted-by":"crossref","first-page":"D459","DOI":"10.1093\/nar\/gkn757","article-title":"DOOR: a database for prokaryotic operons","volume":"37","author":"Mao","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023020116162494900_btu716-B21","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1126\/science.285.5428.751","article-title":"Detecting protein function and protein-protein interactions from genome sequences","volume":"285","author":"Marcotte","year":"1999","journal-title":"Science"},{"issue":"Database issue","key":"2023020116162494900_btu716-B22","doi-asserted-by":"crossref","first-page":"D123","DOI":"10.1093\/nar\/gkr975","article-title":"IMG\/M: the integrated metagenome data management and comparative analysis system","volume":"40","author":"Markowitz","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023020116162494900_btu716-B23","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1016\/j.chom.2008.09.007","article-title":"Mucosal glycan foraging enhances fitness and transmission of a saccharolytic human gut bacterial symbiont","volume":"4","author":"Martens","year":"2008","journal-title":"Cell Host Microbe"},{"key":"2023020116162494900_btu716-B24","doi-asserted-by":"crossref","DOI":"10.1074\/jbc.M109.008094","article-title":"Coordinate regulation of glycan degradation and polysaccharide capsule biosynthesis by a prominent human gut symbiont","author":"Martens","year":"2009","journal-title":"J. Biol. Chem."},{"key":"2023020116162494900_btu716-B25","doi-asserted-by":"crossref","first-page":"e1001221","DOI":"10.1371\/journal.pbio.1001221","article-title":"Recognition and degradation of plant cell wall polysaccharides by two human gut symbionts","volume":"9","author":"Martens","year":"2011","journal-title":"PLoS Biol."},{"key":"2023020116162494900_btu716-B26","doi-asserted-by":"crossref","first-page":"6864","DOI":"10.1128\/AEM.01495-09","article-title":"Novel features of the polysaccharide-digesting gliding bacterium Flavobacterium johnsoniae as revealed by genome sequence analysis","volume":"75","author":"McBride","year":"2009","journal-title":"Appl. Environ. Microbiol."},{"key":"2023020116162494900_btu716-B27","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1093\/ajcn\/39.2.338","article-title":"The contribution of the large intestine to energy supplies in man","volume":"39","author":"McNeil","year":"1984","journal-title":"Am. J. Clin. Nutr."},{"key":"2023020116162494900_btu716-B28","doi-asserted-by":"crossref","first-page":"e1001637","DOI":"10.1371\/journal.pbio.1001637","article-title":"Effects of diet on resource utilization by a model human gut microbiota containing Bacteroides cellulosilyticus WH2, a symbiont with an extensive glycobiome","volume":"11","author":"McNulty","year":"2013","journal-title":"PLoS Biol."},{"key":"2023020116162494900_btu716-B29","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1093\/bioinformatics\/btt640","article-title":"DoMosaics: software for domain arrangement visualization and domain-centric analysis of proteins","volume":"30","author":"Moore","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020116162494900_btu716-B30","doi-asserted-by":"crossref","first-page":"S329","DOI":"10.1093\/bioinformatics\/18.suppl_1.S329","article-title":"A powerful non-homology method for the prediction of operons in prokaryotes","volume":"18","author":"Moreno-Hagelsieb","year":"2002","journal-title":"Bioinformatics"},{"key":"2023020116162494900_btu716-B31","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1186\/1471-2164-14-873","article-title":"Polysaccharides utilization in human gut bacterium Bacteroides thetaiotaomicron: comparative genomics reconstruction of metabolic and regulatory networks","volume":"14","author":"Ravcheev","year":"2013","journal-title":"BMC Genomics"},{"key":"2023020116162494900_btu716-B32","doi-asserted-by":"crossref","first-page":"6652","DOI":"10.1073\/pnas.110147297","article-title":"Operons in Escherichia coli: genomic analyses and predictions","volume":"97","author":"Salgado","year":"2000","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020116162494900_btu716-B33","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1016\/j.tibs.2008.04.012","article-title":"New substrates for TonB-dependent transport: do we only see the \u2018tip of the iceberg'?","volume":"33","author":"Schauer","year":"2008","journal-title":"Trends Biochem. Sci."},{"key":"2023020116162494900_btu716-B34","doi-asserted-by":"crossref","first-page":"1630","DOI":"10.1101\/gr.094607.109","article-title":"JBrowse: a next-generation genome browser","volume":"19","author":"Skinner","year":"2009","journal-title":"Genome Res."},{"issue":"Database issue","key":"2023020116162494900_btu716-B35","doi-asserted-by":"crossref","first-page":"D627","DOI":"10.1093\/nar\/gkr1020","article-title":"ProOpDB: Prokaryotic Operon DataBase","volume":"40","author":"Taboada","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023020116162494900_btu716-B36","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1016\/j.tibs.2014.02.005","article-title":"How do gut microbes break down dietary fiber?","volume":"39","author":"Terrapon","year":"2014","journal-title":"Trends Biochem. Sci"},{"key":"2023020116162494900_btu716-B37","doi-asserted-by":"crossref","first-page":"3077","DOI":"10.1093\/bioinformatics\/btp560","article-title":"Detection of new protein domains using co-occurrence: application to Plasmodium falciparum","volume":"25","author":"Terrapon","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020116162494900_btu716-B38","doi-asserted-by":"crossref","first-page":"880","DOI":"10.1093\/bioinformatics\/bti123","article-title":"Operon prediction without a training set","volume":"21","author":"Westover","year":"2005","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/5\/647\/49011417\/bioinformatics_31_5_647.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/5\/647\/49011417\/bioinformatics_31_5_647.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T00:27:54Z","timestamp":1675297674000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/5\/647\/317967"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,10,28]]},"references-count":38,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2015,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu716","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,3,1]]},"published":{"date-parts":[[2014,10,28]]}}}