{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T19:10:06Z","timestamp":1675192206751},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Rapidly advancing genome technology has allowed access to a large number of diverse genomes and annotation data. We have defined a systems model that integrates assembly data, comparative genomics, gene predictions, mRNA and EST alignments and physiological tissue expression. Using these as predictive parameters, we engineered a machine learning approach to decipher putative active regions in the genome.<\/jats:p>\n               <jats:p>Results: Analysis of genomic sequences showed nucleosome-free region (NFR) modules containing a higher percentage of conserved regions, RNA-encoding sequences, CpG islands, splice sites and GC-rich areas. In contrast, random in silico fragments revealed higher percentages of DNA repeats and a lower conservation. The larger conserved sequences from the Vista enhancer browser (VEB) showed a greater percentage of short DNA sequence matches and RNA coding regions in multiple species.<\/jats:p>\n               <jats:p>Our model can predict small regulatory regions in the genome with &gt;95% prediction accuracy using NFR modules and &gt;85% prediction accuracy with VEB elements. 
Ultimately, this systems model can be applied to any organism to identify candidate transcriptional modules on a genome scale.<\/jats:p>\n               <jats:p>Contact: \u00a0myar@seas.upenn.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn622","type":"journal-article","created":{"date-parts":[[2008,12,4]],"date-time":"2008-12-04T01:47:06Z","timestamp":1228355226000},"page":"353-357","source":"Crossref","is-referenced-by-count":2,"title":["A predictive model for identifying mini-regulatory modules in the mouse genome"],"prefix":"10.1093","volume":"25","author":[{"given":"Mahesh","family":"Yaragatti","sequence":"first","affiliation":[{"name":"Biotechnology Program, CIS, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104, USA"}]},{"given":"Ted","family":"Sandler","sequence":"additional","affiliation":[{"name":"Biotechnology Program, CIS, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104, USA"}]},{"given":"Lyle","family":"Ungar","sequence":"additional","affiliation":[{"name":"Biotechnology Program, CIS, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104, USA"}]}],"member":"286","published-online":{"date-parts":[[2008,12,3]]},"reference":[{"key":"2023013110002008900_B1","doi-asserted-by":"crossref","first-page":"656","DOI":"10.1101\/gr.4866006","article-title":"Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression","volume":"5","author":"Blanchette","year":"2006","journal-title":"Genome Res"},{"key":"2023013110002008900_B2","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1073\/pnas.97.1.262","article-title":"Knowledge-based analysis of microarray gene expression data by using support vector machines","volume":"97","author":"Brown","year":"2000","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013110002008900_B3","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1101\/sqb.2003.68.245","article-title":"The share of human genomic DNA under selection estimated from human\u2013mouse genomic alignments.","volume":"68","author":"Chiaromonte","year":"2003","journal-title":"Cold Spring Harb. Symp. Quant. Biol"},{"key":"2023013110002008900_B4","unstructured":"Chih-Chung\n              C\n            \n            \u00a0Chih-JenL\n          LIBSVM, a library for support vector machines.\n          last accessed date September 2008\n          Available at http:\/\/www.csie.ntu.edu.tw\/~cjlin\/libsvm"},{"key":"2023013110002008900_B5","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1126\/science.1084337","article-title":"Finding functional features in Saccharomyces genomes by phylogenetic footprinting","volume":"301","author":"Cliften","year":"2003","journal-title":"Science"},{"key":"2023013110002008900_B6","doi-asserted-by":"crossref","first-page":"2381","DOI":"10.1101\/gr.1271603","article-title":"A biophysical approach to transcription factor binding site discovery","volume":"13","author":"Djordjevic","year":"2003","journal-title":"Genome Res"},{"key":"2023013110002008900_B7","doi-asserted-by":"crossref","first-page":"1455","DOI":"10.1101\/gr.4140006","article-title":"Locating mammalian transcription factor binding sites: a survey of computational and experimental techniques","volume":"12","author":"Elnitski","year":"2006","journal-title":"Genome Res"},{"key":"2023013110002008900_B8","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1038\/nature05874","article-title":"Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project","volume":"447","author":"ENCODE Project Consortium","year":"2007","journal-title":"Nature"},{"key":"2023013110002008900_B9","doi-asserted-by":"crossref","first-page":"3585","DOI":"10.1093\/nar\/gkl372","article-title":"Computational identification of 
transcriptional regulatory elements in DNA sequence","volume":"12","author":"GuhaThakurta","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023013110002008900_B10","doi-asserted-by":"crossref","first-page":"D773","DOI":"10.1093\/nar\/gkm966","article-title":"The UCSC Genome Browser Database: 2008 update","volume":"36","author":"Karolchik","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023013110002008900_B11","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1038\/nature01644","article-title":"Sequencing and comparison of yeast species to identify genes and regulatory elements","volume":"423","author":"Kellis","year":"2003","journal-title":"Nature"},{"key":"2023013110002008900_B12","first-page":"451","article-title":"Eukaryotic regulatory element conservation analysis and identification using comparative genomics","volume":"3","author":"Liu","year":"2008","journal-title":"Genome Res"},{"key":"2023013110002008900_B13","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/S0022-5193(03)00082-1","article-title":"Vector space classification of DNA sequences","volume":"2","author":"M\u00fcller","year":"2003","journal-title":"J. Theor. 
Biol."},{"key":"2023013110002008900_B14","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1038\/nature05295","article-title":"In vivo enhancer analysis of human conserved non-coding sequences","volume":"444","author":"Pennacchio","year":"2006","journal-title":"Nature"},{"key":"2023013110002008900_B15","doi-asserted-by":"crossref","first-page":"i273","DOI":"10.1093\/bioinformatics\/btg1038","article-title":"Genome-wide discovery of transcriptional modules from DNA sequence and gene expression","volume":"19","author":"Segal","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013110002008900_B16","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1101\/gr.3715005","article-title":"Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes","volume":"8","author":"Siepel","year":"2005","journal-title":"Genome Res"},{"key":"2023013110002008900_B17","doi-asserted-by":"crossref","first-page":"i292","DOI":"10.1093\/bioinformatics\/btg1040","article-title":"A probabilistic method to detect regulatory modules","volume":"19","author":"Sinha","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013110002008900_B18","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1016\/S0959-437X(96)80030-X","article-title":"The origin of interspersed repeats in the human genome","volume":"6","author":"Smit","year":"1996","journal-title":"Curr. Opin. Genet. Dev."},{"key":"2023013110002008900_B19","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory.","author":"Vapnik","year":"1995"},{"key":"2023013110002008900_B20","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1038\/ng.2007.55","article-title":"Ultraconservation identifies a small subset of extremely constrained developmental enhancers","volume":"40","author":"Visel","year":"2008","journal-title":"Nat. 
Genet."},{"key":"2023013110002008900_B21","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1038\/nature01262","article-title":"Initial sequencing and comparative analysis of the mouse genome","volume":"420","author":"Waterston","year":"2002","journal-title":"Nature"},{"key":"2023013110002008900_B22","doi-asserted-by":"crossref","first-page":"930","DOI":"10.1101\/gr.073460.107","article-title":"Identification of active transcriptional regulatory modules by the functional assay of DNA from nucleosome-free regions","volume":"6","author":"Yaragatti","year":"2008","journal-title":"Genome Res"},{"key":"2023013110002008900_B23","doi-asserted-by":"crossref","first-page":"1896","DOI":"10.1126\/science.279.5358.1896","article-title":"Genomic cis-regulatory logic: experimental and computational analysis of a sea urchin gene","volume":"279","author":"Yuh","year":"1998","journal-title":"Science"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/3\/353\/48982639\/bioinformatics_25_3_353.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/3\/353\/48982639\/bioinformatics_25_3_353.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T18:39:21Z","timestamp":1675190361000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/3\/353\/243176"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,12,3]]},"references-count":23,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2009,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn622","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"
subject":[],"published-other":{"date-parts":[[2009,2,1]]},"published":{"date-parts":[[2008,12,3]]}}}