{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T14:00:17Z","timestamp":1761487217738},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Owing to the complete sequencing of human and many other genomes, huge amounts of DNA sequence data have been accumulated. In bioinformatics, an important issue is how to predict the complete structure of genes from the genomic DNA sequence, especially the human genome. A crucial part in the gene structure prediction is to determine the precise exon\u2013intron boundaries, i.e. the splice sites, in the coding region.<\/jats:p>\n               <jats:p>Results: We have developed a dependency graph model to fully capture the intrinsic interdependency between base positions in a splice site. The establishment of dependency between two positions is based on a \u03c7\u00b2-test from known sample data. 
To facilitate statistical inference, we have expanded the dependency graph (which is usually a graph with cycles that make probabilistic reasoning very difficult, if not impossible) into a Bayesian network (which is a directed acyclic graph that facilitates statistical reasoning).<\/jats:p>\n               <jats:p>When compared with the existing models such as weight matrix model, weight array model, maximal dependence decomposition, Cai et al.'s tree model as well as the less-studied second-order and third-order Markov chain models, the expanded Bayesian networks from our dependency graph models perform the best in nearly all the cases studied.<\/jats:p>\n               <jats:p>Availability: Software (a program called DGSplicer) and datasets used are available at http:\/\/csrl.ee.nthu.edu.tw\/bioinf\/<\/jats:p>\n               <jats:p>Contact: \u00a0cclu@ee.nthu.edu.tw<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti025","type":"journal-article","created":{"date-parts":[[2004,9,17]],"date-time":"2004-09-17T00:13:37Z","timestamp":1095380017000},"page":"471-482","source":"Crossref","is-referenced-by-count":58,"title":["Prediction of splice sites with dependency graphs and their expanded bayesian networks"],"prefix":"10.1093","volume":"21","author":[{"given":"Te-Ming","family":"Chen","sequence":"first","affiliation":[]},{"given":"Chung-Chin","family":"Lu","sequence":"additional","affiliation":[]},{"given":"Wen-Hsiung","family":"Li","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2004,9,16]]},"reference":[{"key":"2023013107235703800_B1","doi-asserted-by":"crossref","unstructured":"Arita, M., Tsuda, K., Asai, K. 2002Modeling splicing sites with pairwise correlations. Bioinformatics18(Suppl. 2),S27\u2013S34","DOI":"10.1093\/bioinformatics\/18.suppl_2.S27"},{"key":"2023013107235703800_B2","doi-asserted-by":"crossref","unstructured":"Brunak, S., Engelbrecht, J., Knudsen, S. 
1991Prediction of human mRNA donor and acceptor sites from the DNA sequence. J. Mol. Biol.22049\u201365","DOI":"10.1016\/0022-2836(91)90380-O"},{"key":"2023013107235703800_B3","unstructured":"Burge, C. and Karlin, S. 1997Prediction of complete gene structures in human genomic DNA. J. Mol. Biol.26878\u201394"},{"key":"2023013107235703800_B4","doi-asserted-by":"crossref","unstructured":"Cai, D., Delcher, A., Kao, B., Kasif, S. 2000Modeling splice sites with Bayes networks. Bioinformatics16152\u2013158","DOI":"10.1093\/bioinformatics\/16.2.152"},{"key":"2023013107235703800_B5","doi-asserted-by":"crossref","unstructured":"Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G. Biological Sequence Analysis: Probabilistic Models of Protein and Nucleic Acids1998, Cambridge, MA  Cambridge University Press","DOI":"10.1017\/CBO9780511790492"},{"key":"2023013107235703800_B6","doi-asserted-by":"crossref","unstructured":"Ewens, W.J. and Grant, G.R. Statistical Methods in Bioinformatics: An Introduction2001, NY  Springer-Verlag","DOI":"10.1007\/978-1-4757-3247-4"},{"key":"2023013107235703800_B7","doi-asserted-by":"crossref","unstructured":"Hebsgaard, S.M., Korning, P.G., Tolstrup, N., Engelbrecht, J., Rouz\u00e9, P., Brunak, S. 1996Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res.24,  pp. 3439\u20133452","DOI":"10.1093\/nar\/24.17.3439"},{"key":"2023013107235703800_B8","unstructured":"Henderson, J., Salzberg, S., Fasman, K. 1997Finding genes in human DNA with a hidden Markov model. J. Comput. Biol.4127\u2013141"},{"key":"2023013107235703800_B9","doi-asserted-by":"crossref","unstructured":"Khodarev, N.N., Park, J., Kataoka, Y., Nodzenski, E., Khorasani, L., Hellman, S., Roizman, B., Weichselbaum, R.R., Pelizzari, C.A. 2003Receiver operating characteristic analysis: a general tool for DNA array data filtration and performance estimation. 
Genomics81202\u2013209","DOI":"10.1016\/S0888-7543(02)00042-3"},{"key":"2023013107235703800_B10","unstructured":"Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001Initial sequencing and analysis of the human genome. Nature409860\u2013921"},{"key":"2023013107235703800_B11","unstructured":"Mathe, C., Sagot, M., Schiex, T., Rouz\u00e9, P. 2002Current methods of gene prediction, their strengths and weaknesses. Nucleic Acids Res.304103\u20134117"},{"key":"2023013107235703800_B12","unstructured":"Mathews, C.K., van Holde, K.E., Ahern, K.G. Biochemistry2000 3rd edn. , San Francisco, CA  Addison Wesley Longman"},{"key":"2023013107235703800_B13","unstructured":"Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference1988, San Mateo, CA  Morgan Kaufmann"},{"key":"2023013107235703800_B14","doi-asserted-by":"crossref","unstructured":"Pertea, M., Lin, X., Salzberg, S.L. 2001GeneSplicer: a new computational method for splice site prediction. Nucleic Acids Res.29,  pp. 1185\u20131190","DOI":"10.1093\/nar\/29.5.1185"},{"key":"2023013107235703800_B15","doi-asserted-by":"crossref","unstructured":"Reese, M.G., Eeckman, F.H., Kulp, D., Haussler, D. 1997Improved splice site recognition in Genie. J. Comput. Biol.4311\u2013324","DOI":"10.1145\/267521.267766"},{"key":"2023013107235703800_B16","doi-asserted-by":"crossref","unstructured":"Salzberg, S., Delcher, A., Fasman, K., Henderson, J. 1998A decision tree system for finding genes in DNA. J. Comput. Biol.5667\u2013680","DOI":"10.1089\/cmb.1998.5.667"},{"key":"2023013107235703800_B17","doi-asserted-by":"crossref","unstructured":"Staden, R. 1984Computer methods to locate signals in nucleic acid sequences. Nucleic Acids Res.12505\u2013519","DOI":"10.1007\/978-1-4684-4973-0_4"},{"key":"2023013107235703800_B18","doi-asserted-by":"crossref","unstructured":"Tolstrup, N., Rouz\u00e9, P., Brunak, S. 
1997A branch point consensus from Arabidopsis found by non-circular analysis allows for better prediction of acceptor sites. Nucleic Acids Res.253159\u20133163","DOI":"10.1093\/nar\/25.15.3159"},{"key":"2023013107235703800_B19","unstructured":"Weaver, R.F. Molecular Biology1999, NY  WCB McGraw-Hill"},{"key":"2023013107235703800_B20","doi-asserted-by":"crossref","unstructured":"Yeo, G. and Burge, C.B. 2004Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol.11,  pp. 377\u2013394","DOI":"10.1089\/1066527041410418"},{"key":"2023013107235703800_B21","unstructured":"Zhang, M.Q. 2002Computational prediction of eukaryotic protein-coding genes. Nat. Rev. Genet.3698\u2013709"},{"key":"2023013107235703800_B22","unstructured":"Zhang, M.Q. and Marr, T.G. 1993A weight array method for splicing signal analysis. Comput. Appl. Biosci.9499\u2013509"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/4\/471\/48965206\/bioinformatics_21_4_471.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/4\/471\/48965206\/bioinformatics_21_4_471.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T10:14:56Z","timestamp":1675160096000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/4\/471\/203185"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,9,16]]},"references-count":22,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2005,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti025","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"sub
ject":[],"published-other":{"date-parts":[[2005,2,15]]},"published":{"date-parts":[[2004,9,16]]}}}