{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T01:40:12Z","timestamp":1776735612300,"version":"3.51.2"},"reference-count":61,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2018,9,1]],"date-time":"2018-09-01T00:00:00Z","timestamp":1535760000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004052","name":"King Abdullah University of Science and Technology","doi-asserted-by":"publisher","award":["BAS\/1\/1606-01-01"],"award-info":[{"award-number":["BAS\/1\/1606-01-01"]}],"id":[{"id":"10.13039\/501100004052","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Recognition of different genomic signals and regions (GSRs) in DNA is crucial for understanding genome organization, gene regulation, and gene function, which in turn generate better genome and gene annotations. Although many methods have been developed to recognize GSRs, their pure computational identification remains challenging. Moreover, various GSRs usually require a specialized set of features for developing robust recognition models. Recently, deep-learning (DL) methods have been shown to generate more accurate prediction models than \u2018shallow\u2019 methods without the need to develop specialized features for the problems in question. 
Here, we explore the potential use of DL for the recognition of GSRs.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed DeepGSR, an optimized DL architecture for the prediction of different types of GSRs. The performance of the DeepGSR structure is evaluated on the recognition of polyadenylation signals (PAS) and translation initiation sites (TIS) of different organisms: human, mouse, bovine and fruit fly. The results show that DeepGSR outperformed the state-of-the-art methods, reducing the classification error rate of the PAS and TIS prediction in the human genome by up to 29% and 86%, respectively. Moreover, the cross-organisms and genome-wide analyses we performed, confirmed the robustness of DeepGSR and provided new insights into the conservation of examined GSRs across species.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>DeepGSR is implemented in Python using Keras API; it is available as open-source software and can be obtained at https:\/\/doi.org\/10.5281\/zenodo.1117159.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty752","type":"journal-article","created":{"date-parts":[[2018,8,31]],"date-time":"2018-08-31T11:29:28Z","timestamp":1535714968000},"page":"1125-1132","source":"Crossref","is-referenced-by-count":71,"title":["DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and 
regions"],"prefix":"10.1093","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9820-4129","authenticated-orcid":false,"given":"Manal","family":"Kalkatawi","sequence":"first","affiliation":[{"name":"Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"},{"name":"Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Arturo","family":"Magana-Mora","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"},{"name":"Drilling Technology Team, EXPEC-ARC, Saudi Aramco, Dhahran, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Boris","family":"Jankovic","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5435-4750","authenticated-orcid":false,"given":"Vladimir B","family":"Bajic","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2018,9,1]]},"reference":[{"key":"2023013107275638500_bty752-B1","doi-asserted-by":"crossref","first-page":"i24","DOI":"10.1093\/bioinformatics\/btn172","article-title":"ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles","volume":"24","author":"Abeel","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B2","doi-asserted-by":"crossref","DOI":"10.1093\/database\/baw093","article-title":"The Ensembl Gene Annotation 
System","author":"Aken","year":"2016","journal-title":"Database: The Journal of Biological Databases and Curation (Oxford)"},{"key":"2023013107275638500_bty752-B3","article-title":"Theano: a Python framework for fast computation of mathematical expressions","author":"Al-Rfou","year":"2016"},{"key":"2023013107275638500_bty752-B4","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023013107275638500_bty752-B5","first-page":"389","article-title":"Artificial neural networks based systems for recognition of genomic signals and regions: a review","volume":"26","author":"Bajic","year":"2002","journal-title":"Informatica"},{"key":"2023013107275638500_bty752-B6","article-title":"Theano: new features and speed improvements","author":"Bastien","year":"2012","journal-title":"CoRR Abs\/1211.5590"},{"key":"2023013107275638500_bty752-B7","first-page":"281","article-title":"Random search for hyper-parameter optimization","volume":"13","author":"Bergstra","year":"2012","journal-title":"J. Machine Learn. Res"},{"key":"2023013107275638500_bty752-B8","volume-title":"Genome, Chapter 7","author":"Brown","year":"2002"},{"key":"2023013107275638500_bty752-B9","author":"Burge","year":"1997"},{"key":"2023013107275638500_bty752-B10","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1016\/j.ab.2014.06.022","article-title":"iTIS-PseTNC: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition","volume":"462","author":"Chen","year":"2014","journal-title":"Anal. 
Biochem"},{"key":"2023013107275638500_bty752-B11","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1109\/ACCESS.2014.2325029","article-title":"Big data deep learning: challenges and perspectives","volume":"2","author":"Chen","year":"2014","journal-title":"IEEE Access"},{"key":"2023013107275638500_bty752-B12","author":"Chollet","year":"2015"},{"key":"2023013107275638500_bty752-B13","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/B978-0-12-410471-6.00007-4","volume-title":"Bioinformatics for Beginners, Chapter 7","author":"Choudhuri","year":"2014"},{"key":"2023013107275638500_bty752-B14","doi-asserted-by":"crossref","first-page":"364.","DOI":"10.2174\/138920209789177593","article-title":"Genomic signal processing","volume":"10","author":"Dougherty","year":"2009","journal-title":"Curr. Genomics"},{"key":"2023013107275638500_bty752-B15","doi-asserted-by":"crossref","first-page":"496","DOI":"10.1038\/nrg3482","article-title":"Alternative cleavage and polyadenylation: extent, regulation and function","volume":"14","author":"Elkon","year":"2013","journal-title":"Nat. Rev. 
Genet"},{"key":"2023013107275638500_bty752-B16","doi-asserted-by":"crossref","first-page":"D37","DOI":"10.1093\/nar\/gkn597","article-title":"DiProDB: a database for dinucleotide properties","volume":"37","author":"Friedel","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023013107275638500_bty752-B17","first-page":"249","volume-title":"Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics","author":"Glorot","year":"2010"},{"key":"2023013107275638500_bty752-B18","first-page":"315","volume-title":"Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics","author":"Glorot","year":"2011"},{"key":"2023013107275638500_bty752-B19","doi-asserted-by":"crossref","first-page":"D663","DOI":"10.1093\/nar\/gkw1016","article-title":"FlyBase at 25: looking to the future","volume":"45","author":"Gramates","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023013107275638500_bty752-B20","first-page":"105","volume-title":"Systemic Approaches in Bioinformatics and Computational Systems Biology: Recent Advances","author":"Haitham","year":"2012"},{"key":"2023013107275638500_bty752-B21","doi-asserted-by":"crossref","first-page":"767","DOI":"10.1093\/bioinformatics\/btv661","article-title":"BRAKER1: unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS","volume":"32","author":"Hoff","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B22","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1109\/TCBB.2008.95","article-title":"SCS: signal, context, and structure features for genome-wide human promoter recognition","volume":"7","author":"Jia","year":"2010","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"key":"2023013107275638500_bty752-B23","doi-asserted-by":"crossref","first-page":"1484","DOI":"10.1093\/bioinformatics\/btt161","article-title":"Dragon PolyA Spotter: predictor of poly(A) 
motifs within human genomic DNA sequences","volume":"29","author":"Kalkatawi","year":"2013","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B24","doi-asserted-by":"crossref","first-page":"2605","DOI":"10.1093\/bioinformatics\/bty166","article-title":"DeepSol: a deep learning framework for sequence-based protein solubility prediction","volume":"34","author":"Khurana","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B25","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2023013107275638500_bty752-B26","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1093\/bioinformatics\/btx680","article-title":"DEEPre: sequence-based enzyme EC number prediction by deep learning","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B27","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/oxfordjournals.molbev.a025626","article-title":"Asymmetric substitution patterns in the two DNA strands of bacteria","volume":"13","author":"Lobry","year":"1996","journal-title":"Mol. Biol. Evol"},{"key":"2023013107275638500_bty752-B28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2017\/7027016","article-title":"Feature extraction and fusion using deep convolutional neural networks for face detection","volume":"2017","author":"Lu","year":"2017","journal-title":"Math. 
Problems Eng"},{"key":"2023013107275638500_bty752-B29","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1093\/bioinformatics\/bts638","article-title":"Dragon TIS Spotter: an Arabidopsis-derived predictor of translation initiation sites in plants","volume":"29","author":"Magana-Mora","year":"2013","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B30","doi-asserted-by":"crossref","DOI":"10.1038\/s41598-017-04281-9","article-title":"OmniGA: optimized omnivariate decision trees for generalizable classification models","volume":"7","author":"Magana-Mora","year":"2017","journal-title":"Sci. Rep"},{"key":"2023013107275638500_bty752-B31","doi-asserted-by":"crossref","first-page":"620.","DOI":"10.1186\/s12864-017-4033-7","article-title":"Omni-PolyA: a method and tool for accurate recognition of Poly(A) signals in human genomic DNA","volume":"18","author":"Magana-Mora","year":"2017","journal-title":"BMC Genomics"},{"key":"2023013107275638500_bty752-B32","first-page":"851","article-title":"Deep learning in bioinformatics","volume":"18","author":"Min","year":"2016","journal-title":"Brief Bioinform"},{"key":"2023013107275638500_bty752-B33","first-page":"197","article-title":"A coding measure scheme employing electron-ion interaction pseudopotential (EIIP)","volume":"1","author":"Nair","year":"2006","journal-title":"Bioinformation"},{"key":"2023013107275638500_bty752-B34","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1145\/1365490.1365500","article-title":"Scalable parallel programming with CUDA","volume":"6","author":"Nickolls","year":"2008","journal-title":"Queue"},{"key":"2023013107275638500_bty752-B35","volume-title":"Neural Networks and Deep Learning","author":"Nielsen","year":"2015"},{"key":"2023013107275638500_bty752-B36","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1101\/gr.10.4.511","article-title":"GeneID in Drosophila","volume":"10","author":"Parra","year":"2000","journal-title":"Genome 
Res"},{"key":"2023013107275638500_bty752-B37","first-page":"55","article-title":"Early stopping - But when?","volume":"1524","author":"Prechelt","year":"1998","journal-title":"Neural Networks"},{"key":"2023013107275638500_bty752-B38","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1007\/978-1-84628-780-0_9","volume-title":"Networks: From Biology to Theory","author":"Prohaska","year":"2007"},{"key":"2023013107275638500_bty752-B39","doi-asserted-by":"crossref","first-page":"e107.","DOI":"10.1093\/nar\/gkw226","article-title":"DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences","volume":"44","author":"Quang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013107275638500_bty752-B40","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1093\/bioinformatics\/btq033","article-title":"BEDTools: a flexible suite of utilities for comparing genomic features","volume":"26","author":"Quinlan","year":"2010","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B41","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1101\/gr.10.4.529","article-title":"Gene finding in Drosophila melanogaster","volume":"10","author":"Reese","year":"2000","journal-title":"Genome Res"},{"key":"2023013107275638500_bty752-B42","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/3-540-45727-5_10","volume-title":"Computational Biology","author":"Schiex","year":"2001"},{"key":"2023013107275638500_bty752-B43","doi-asserted-by":"crossref","first-page":"i387","DOI":"10.1093\/bioinformatics\/bti1002","article-title":"A motif-based framework for recognizing sequence families","volume":"21","author":"Sharan","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B44","doi-asserted-by":"crossref","first-page":"i639","DOI":"10.1093\/bioinformatics\/btw427","article-title":"DeepChrome: deep-learning for predicting gene expression from histone 
modifications","volume":"32","author":"Singh","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B45","doi-asserted-by":"crossref","first-page":"i6","DOI":"10.1093\/bioinformatics\/btn170","article-title":"POIMs: positional oligomer importance matrices \u2014understanding support vector machine-based signal detectors","volume":"24","author":"Sonnenburg","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B46","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Machine Learn. Res"},{"key":"2023013107275638500_bty752-B47","doi-asserted-by":"crossref","first-page":"W309","DOI":"10.1093\/nar\/gkh379","article-title":"AUGUSTUS: a web server for gene finding in eukaryotes","volume":"32","author":"Stanke","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023013107275638500_bty752-B48","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1126\/science.286.5439.455","article-title":"The mammalian gene collection","volume":"286","author":"Strausberg","year":"1999","journal-title":"Science"},{"key":"2023013107275638500_bty752-B49","doi-asserted-by":"crossref","first-page":"2324","DOI":"10.1101\/gr.095976.109","article-title":"The completion of the mammalian gene collection (MGC)","volume":"19","author":"Temple","year":"2009","journal-title":"Genome Res"},{"key":"2023013107275638500_bty752-B50","doi-asserted-by":"crossref","first-page":"e0171410.","DOI":"10.1371\/journal.pone.0171410","article-title":"Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural networks","volume":"12","author":"Umarov","year":"2017","journal-title":"PLoS One"},{"key":"2023013107275638500_bty752-B51","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/0375-9601(73)90506-9","article-title":"General model pseudopotential for positive 
ions","volume":"45","author":"Veljkovi\u0107","year":"1973","journal-title":"Phys. Lett"},{"key":"2023013107275638500_bty752-B52","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1103\/PhysRevLett.29.105","article-title":"Simple general-model pseudopotential","volume":"29","author":"Veljkovi\u0107","year":"1972","journal-title":"Phys. Rev. Lett"},{"key":"2023013107275638500_bty752-B53","doi-asserted-by":"crossref","first-page":"2740","DOI":"10.1093\/bioinformatics\/bty179","article-title":"Deep learning improves antimicrobial peptide recognition","volume":"34","author":"Veltri","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B54","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1002\/humu.21547","article-title":"Single base-pair substitutions at the translation initiation sites of human genes as a cause of inherited disease","volume":"32","author":"Wolf","year":"2011","journal-title":"Human Mutat"},{"key":"2023013107275638500_bty752-B55","doi-asserted-by":"crossref","first-page":"1859","DOI":"10.1093\/bioinformatics\/bti310","article-title":"GMAP: a genomic mapping and alignment program for mRNA and EST sequences","volume":"21","author":"Wu","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B56","doi-asserted-by":"crossref","first-page":"i316","DOI":"10.1093\/bioinformatics\/btt218","article-title":"Poly(A) motif prediction using spectral latent features from human DNA sequences","volume":"29","author":"Xie","year":"2013","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B57","doi-asserted-by":"crossref","first-page":"2675","DOI":"10.1093\/bioinformatics\/btx296","article-title":"A deep learning framework for improving long-range residue\u2013residue contact prediction using a hierarchical 
strategy","volume":"33","author":"Xiong","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B58","doi-asserted-by":"crossref","first-page":"i121","DOI":"10.1093\/bioinformatics\/btw255","article-title":"Convolutional neural network architectures for predicting DNA-protein binding","volume":"32","author":"Zeng","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B59","doi-asserted-by":"crossref","first-page":"i234","DOI":"10.1093\/bioinformatics\/btx247","article-title":"TITER: predicting translation initiation sites by deep learning","volume":"33","author":"Zhang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013107275638500_bty752-B60","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning-based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat Methods"},{"key":"2023013107275638500_bty752-B61","first-page":"18","author":"Zuo","year":"2015"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/7\/1125\/48968470\/bioinformatics_35_7_1125.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/7\/1125\/48968470\/bioinformatics_35_7_1125.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T10:32:33Z","timestamp":1675161153000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/7\/1125\/5089227"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2018,9,1
]]},"references-count":61,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2019,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty752","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,4,1]]},"published":{"date-parts":[[2018,9,1]]}}}