{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,15]],"date-time":"2024-09-15T21:18:32Z","timestamp":1726435112065},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1937,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.5"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The identification of non-coding functional regions of the human genome remains one of the main challenges of genomics. By observing how a given region evolved over time, one can detect signs of negative or positive selection hinting that the region may be functional. With the quickly increasing number of vertebrate genomes to compare with our own, this type of approach is set to become extremely powerful, provided the right analytical tools are available.<\/jats:p>\n               <jats:p>Results: A large number of approaches have been proposed to measure signs of past selective pressure, usually in the form of reduced mutation rate. Here, we propose a radically different approach to the detection of non-coding functional region: instead of measuring past evolutionary rates, we build a machine learning classifier to predict current substitution rates in human based on the inferred evolutionary events that affected the region during vertebrate evolution. We show that different types of evolutionary events, occurring along different branches of the phylogenetic tree, bring very different amounts of information. 
We propose a number of simple machine learning classifiers and show that a Support-Vector Machine (SVM) predictor clearly outperforms existing tools at predicting human non-coding functional sites. Comparison to external evidences of selection and regulatory function confirms that these SVM predictions are more accurate than those of other approaches.<\/jats:p>\n               <jats:p>Availability: The predictor and predictions made are available at http:\/\/www.mcb.mcgill.ca\/~blanchem\/sadri.<\/jats:p>\n               <jats:p>Contact: \u00a0blanchem@mcb.mcgill.ca<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr241","type":"journal-article","created":{"date-parts":[[2011,6,17]],"date-time":"2011-06-17T23:32:32Z","timestamp":1308353552000},"page":"i266-i274","source":"Crossref","is-referenced-by-count":6,"title":["Predicting site-specific human selective pressure using evolutionary signatures"],"prefix":"10.1093","volume":"27","author":[{"given":"Javad","family":"Sadri","sequence":"first","affiliation":[{"name":"1 School of Computer Science, McGill University, 3630 University, Montreal, QC, Canada H3A 2B2, 2Department of Computer Engineering, Faculty of Engineering, University of Birjand, Birjand, Iran and 3Department of Computer Science, Universit\u00e9 du Qu\u00e9bec \u00e0 Montr\u00e9al, Montreal, QC, Canada H3C 3P8"},{"name":"1 School of Computer Science, McGill University, 3630 University, Montreal, QC, Canada H3A 2B2, 2Department of Computer Engineering, Faculty of Engineering, University of Birjand, Birjand, Iran and 3Department of Computer Science, Universit\u00e9 du Qu\u00e9bec \u00e0 Montr\u00e9al, Montreal, QC, Canada H3C 3P8"}]},{"given":"Abdoulaye Banire","family":"Diallo","sequence":"additional","affiliation":[{"name":"1 School of Computer Science, McGill University, 3630 University, Montreal, QC, Canada H3A 2B2, 2Department of Computer Engineering, Faculty of Engineering, University of Birjand, Birjand, Iran and 3Department of Computer Science, 
Universit\u00e9 du Qu\u00e9bec \u00e0 Montr\u00e9al, Montreal, QC, Canada H3C 3P8"}]},{"given":"Mathieu","family":"Blanchette","sequence":"additional","affiliation":[{"name":"1 School of Computer Science, McGill University, 3630 University, Montreal, QC, Canada H3A 2B2, 2Department of Computer Engineering, Faculty of Engineering, University of Birjand, Birjand, Iran and 3Department of Computer Science, Universit\u00e9 du Qu\u00e9bec \u00e0 Montr\u00e9al, Montreal, QC, Canada H3C 3P8"}]}],"member":"286","published-online":{"date-parts":[[2011,6,14]]},"reference":[{"key":"2023012512130033500_B1","doi-asserted-by":"crossref","first-page":"e254","DOI":"10.1371\/journal.pcbi.0030254","article-title":"Analysis of sequence conservation at nucleotide resolution","volume":"3","author":"Asthana","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023012512130033500_B2","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1101\/gr.1933104","article-title":"Aligning multiple genomic sequences with the threaded blockset aligner","volume":"14","author":"Blanchette","year":"2004","journal-title":"Genome Res."},{"key":"2023012512130033500_B3","doi-asserted-by":"crossref","first-page":"2412","DOI":"10.1101\/gr.2800104","article-title":"Reconstructing large regions of an ancestral mammalian genome in silico","volume":"14","author":"Blanchette","year":"2004","journal-title":"Genome Res."},{"key":"2023012512130033500_B4","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1126\/science.1081331","article-title":"Phylogenetic shadowing of primate sequences to find functional regions of the human genome","volume":"299","author":"Boffelli","year":"2003","journal-title":"Science"},{"key":"2023012512130033500_B5","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1101\/gr.926603","article-title":"LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA","volume":"13","author":"Brudno","year":"2003","journal-title":"Genome 
Res."},{"key":"2023012512130033500_B6","doi-asserted-by":"crossref","first-page":"901","DOI":"10.1101\/gr.3577405","article-title":"Distribution and intensity of constraint in mammalian genomic sequence","volume":"15","author":"Cooper","year":"2005","journal-title":"Genome Res."},{"key":"2023012512130033500_B7","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511801389","volume-title":"An Introduction to Support Vector Machines and other Kernel-Based Learning Methods.","author":"Cristianini","year":"2000"},{"key":"2023012512130033500_B8","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1101\/gr.1939804","article-title":"Accurate identification of novel human genes through simultaneous gene prediction in human, mouse, and rat","volume":"14","author":"Dewey","year":"2004","journal-title":"Genome Res."},{"key":"2023012512130033500_B9","doi-asserted-by":"crossref","first-page":"446","DOI":"10.1089\/cmb.2007.A006","article-title":"Exact and heuristic algorithms for the indel maximum likelihood problem","volume":"14","author":"Diallo","year":"2007","journal-title":"J. Comput. 
Biol."},{"key":"2023012512130033500_B10","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1093\/bioinformatics\/btp600","article-title":"Ancestors 1.0: a web server for ancestral sequence reconstruction","volume":"26","author":"Diallo","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512130033500_B11","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1186\/1471-2105-7-400","article-title":"Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints","volume":"7","author":"Dowell","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012512130033500_B12","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1038\/nature05874","article-title":"Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project","volume":"447","author":"ENCODE-Project-Consortium","year":"2007","journal-title":"Nature"},{"key":"2023012512130033500_B13","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1093\/nar\/gkh458","article-title":"VISTA: computational tools for comparative genomics","volume":"32","author":"Frazer","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012512130033500_B14","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1089\/cmb.2006.13.379","article-title":"Using multiple alignments to improve gene prediction","volume":"13","author":"Gross","year":"2006","journal-title":"J. Comput. 
Biol."},{"key":"2023012512130033500_B15","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1038\/nature07829","article-title":"Histone modifications at human enhancers reflect global cell-type-specific gene expression","volume":"459","author":"Heintzman","year":"2009","journal-title":"Nature"},{"key":"2023012512130033500_B16","doi-asserted-by":"crossref","first-page":"851","DOI":"10.1038\/nature06258","article-title":"A second generation human haplotype map of over 3.1 million SNPs","volume":"449","author":"International HapMap Consortium","year":"2007","journal-title":"Nature"},{"key":"2023012512130033500_B17","article-title":"Making large-scale SVM learning practical","volume-title":"Advances in Kernel Methods - Support Vector Learning.","author":"Joachims","year":"1999"},{"key":"2023012512130033500_B18","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1038\/nature01644","article-title":"Sequencing and comparison of yeast species to identify genes and regulatory elements","volume":"423","author":"Kellis","year":"2003","journal-title":"Nature"},{"key":"2023012512130033500_B19","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511623486","volume-title":"The Neutral Theory of Molecular Evolution.","author":"Kimura","year":"1983"},{"key":"2023012512130033500_B20","doi-asserted-by":"crossref","first-page":"D668","DOI":"10.1093\/nar\/gkl928","article-title":"The UCSC genome browser database: update 2007","volume":"35","author":"Kuhn","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012512130033500_B21","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1093\/nar\/gkh383","article-title":"rVISTA 2.0: evolutionary analysis of transcription factor binding sites","volume":"32","author":"Loots","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012512130033500_B22","doi-asserted-by":"crossref","first-page":"2507","DOI":"10.1101\/gr.1602203","article-title":"Identification and characterization of multi-species conserved 
sequences","volume":"13","author":"Margulies","year":"2003","journal-title":"Genome Res."},{"key":"2023012512130033500_B23","doi-asserted-by":"crossref","first-page":"4795","DOI":"10.1073\/pnas.0409882102","article-title":"An initial strategy for the systematic identification of functional elements in the human genome by low-redundancy comparative sequencing","volume":"102","author":"Margulies","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512130033500_B24","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1101\/gr.6034307","article-title":"Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome","volume":"17","author":"Margulies","year":"2007","journal-title":"Genome Res."},{"key":"2023012512130033500_B25","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1101\/gr.6761107","article-title":"28-way vertebrate alignment and conservation track in the UCSC genome browser","volume":"17","author":"Miller","year":"2007","journal-title":"Genome Res."},{"key":"2023012512130033500_B26","volume-title":"The Statistical Processes of Evolutionary Theory.","author":"Moran","year":"1962"},{"key":"2023012512130033500_B27","doi-asserted-by":"crossref","first-page":"R98","DOI":"10.1186\/gb-2004-5-12-r98","article-title":"MONKEY: identifying conserved transcription-factor binding sites in multiple alignments using a binding site-specific evolutionary model","volume":"5","author":"Moses","year":"2004","journal-title":"Genome Biol."},{"key":"2023012512130033500_B28","doi-asserted-by":"crossref","first-page":"e130","DOI":"10.1371\/journal.pcbi.0020130","article-title":"Large-scale turnover of functional transcription factor binding sites in Drosophila","volume":"2","author":"Moses","year":"2006","journal-title":"PLoS Comput. 
Biol."},{"key":"2023012512130033500_B29","doi-asserted-by":"crossref","first-page":"e33","DOI":"10.1371\/journal.pcbi.0020033","article-title":"Identification and classification of conserved RNA secondary structures in the human genome","volume":"2","author":"Pedersen","year":"2006","journal-title":"PLoS Comput. Biol."},{"key":"2023012512130033500_B30","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1101\/gr.097857.109","article-title":"Detection of nonneutral substitution rates on mammalian phylogenies","volume":"20","author":"Pollard","year":"2010","journal-title":"Genome Res."},{"key":"2023012512130033500_B31","article-title":"An empirical study of the naive Bayes classifier","volume-title":"IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence","author":"Rish","year":"2001"},{"key":"2023012512130033500_B32","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/nmeth890","article-title":"Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays","volume":"3","author":"Sabo","year":"2006","journal-title":"Nat. Methods"},{"key":"2023012512130033500_B33","volume-title":"Nearest-Neighbor Methods in Learning and Vision.","author":"Shakhnarovich","year":"2005"},{"key":"2023012512130033500_B34","doi-asserted-by":"crossref","first-page":"468","DOI":"10.1093\/molbev\/msh039","article-title":"Phylogenetic estimation of context-dependent substitution rates by maximum likelihood","volume":"21","author":"Siepel","year":"2004","journal-title":"Mol. Biol. 
Evol."},{"key":"2023012512130033500_B35","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1101\/gr.3715005","article-title":"Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes","volume":"15","author":"Siepel","year":"2005","journal-title":"Genome Res."},{"key":"2023012512130033500_B36","first-page":"190","article-title":"New methods for detecting lineage-specific selection","volume-title":"Proceedings of the 10th International Conference on Research in Computational Molecular Biology","author":"Siepel","year":"2006"},{"key":"2023012512130033500_B37","doi-asserted-by":"crossref","first-page":"1763","DOI":"10.1101\/gr.7128207","article-title":"Targeted discovery of novel human exons by comparative genomics","volume":"17","author":"Siepel","year":"2007","journal-title":"Genome Res."},{"key":"2023012512130033500_B38","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1038\/nature06340","article-title":"Discovery of functional elements in 12 drosophila genomes using evolutionary signatures","volume":"450","author":"Stark","year":"2007","journal-title":"Nature"},{"key":"2023012512130033500_B39","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1038\/nature01262","article-title":"Initial sequencing and comparative analysis of the mouse genome","volume":"420","author":"The International Mouse Genome Sequencing Consortium","year":"2002","journal-title":"Nature"},{"key":"2023012512130033500_B40","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/nature01858","article-title":"Comparative analyses of multi-species sequences from targeted genomic regions","volume":"424","author":"Thomas","year":"2003","journal-title":"Nature"},{"key":"2023012512130033500_B41","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning 
Theory.","author":"Vapnik","year":"1995"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/13\/i266\/48872714\/bioinformatics_27_13_i266.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/13\/i266\/48872714\/bioinformatics_27_13_i266.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T14:26:27Z","timestamp":1674656787000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/13\/i266\/182309"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,6,14]]},"references-count":41,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2011,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr241","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,7,1]]},"published":{"date-parts":[[2011,6,14]]}}}