{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:42Z","timestamp":1772138082082,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2017,1,31]],"date-time":"2017-01-31T00:00:00Z","timestamp":1485820800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Sichuan Youth Science and Technology Foundation of China"},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities of China","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Previously constructed classifiers in predicting eukaryotic essential genes integrated a variety of features including experimental ones. If we can obtain satisfactory prediction using only nucleotide (sequence) information, it would be more promising. Three groups recently identified essential genes in human cancer cell lines using wet experiments and it provided wonderful opportunity to accomplish our idea. 
Here we improved the Z curve method into the \u03bb-interval form to denote nucleotide composition and association information and used it to construct the SVM classifying model.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Our model accurately predicted human gene essentiality with an AUC higher than 0.88 both for 5-fold cross-validation and jackknife tests. These results demonstrated that the essentiality of human genes could be reliably reflected by only sequence information. We re-predicted the negative dataset by our Pheg server and 118 genes were additionally predicted as essential. Among them, 20 were found to be homologues in mouse essential genes, indicating that some of the 118 genes were indeed essential, however previous experiments overlooked them. As the first available server, Pheg could predict essentiality for anonymous gene sequences of human. It is also hoped the \u03bb-interval Z curve method could be effectively extended to classification issues of other DNA elements.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and Implementation<\/jats:title>\n                    <jats:p>http:\/\/cefg.uestc.edu.cn\/Pheg<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx055","type":"journal-article","created":{"date-parts":[[2017,1,30]],"date-time":"2017-01-30T08:28:22Z","timestamp":1485764902000},"page":"1758-1764","source":"Crossref","is-referenced-by-count":68,"title":["Accurate prediction of human essential genes using only nucleotide composition and association 
information"],"prefix":"10.1093","volume":"33","author":[{"given":"Feng-Biao","family":"Guo","sequence":"first","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China"}]},{"given":"Chuan","family":"Dong","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China"}]},{"given":"Hong-Li","family":"Hua","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China"}]},{"given":"Shuo","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China"}]},{"given":"Hao","family":"Luo","sequence":"additional","affiliation":[{"name":"Department of Physics, Tianjin University, Tianjin, China"}]},{"given":"Hong-Wan","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China"}]},{"given":"Yan-Ting","family":"Jin","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 
China"}]},{"given":"Kai-Yue","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China"}]}],"member":"286","published-online":{"date-parts":[[2017,1,31]]},"reference":[{"key":"2023020205312701400_btx055-B1","doi-asserted-by":"crossref","first-page":"1092","DOI":"10.1126\/science.aac7557","article-title":"Gene essentiality and synthetic lethality in haploid human cells","volume":"350","author":"Blomen","year":"2015","journal-title":"Science"},{"key":"2023020205312701400_btx055-B2","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1016\/S0006-291X(03)01192-6","article-title":"ZCURVE_CoV: a new system to recognize protein coding genes in coronavirus genomes, and its applications in analyzing SARS-CoV genomes","volume":"307","author":"Chen","year":"2003","journal-title":"Biochem. Biophys. Res. 
Commun"},{"key":"2023020205312701400_btx055-B3","doi-asserted-by":"crossref","first-page":"D901","DOI":"10.1093\/nar\/gkr986","article-title":"OGEE: an online gene essentiality database","volume":"40","author":"Chen","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B4","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1093\/bioinformatics\/bti058","article-title":"Understanding protein dispensability through machine-learning analysis of high-throughput data","volume":"21","author":"Chen","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020205312701400_btx055-B5","doi-asserted-by":"crossref","first-page":"910","DOI":"10.1186\/1471-2164-14-910","article-title":"A new computational strategy for predicting essential genes","volume":"14","author":"Cheng","year":"2013","journal-title":"BMC Genomics"},{"key":"2023020205312701400_btx055-B6","doi-asserted-by":"crossref","first-page":"102.","DOI":"10.1186\/1752-0509-3-102","article-title":"How to identify essential genes from molecular networks?","volume":"3","author":"del Rio","year":"2009","journal-title":"BMC Syst. Biol"},{"key":"2023020205312701400_btx055-B7","doi-asserted-by":"crossref","first-page":"795","DOI":"10.1093\/nar\/gkq784","article-title":"Investigating the predictability of essential genes across distantly related organisms using an integrative approach","volume":"39","author":"Deng","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B8","doi-asserted-by":"crossref","first-page":"2893","DOI":"10.1039\/C6MB00374E","article-title":"Combining the pseudo dinucleotide composition with the Z curve method to improve the accuracy of predicting DNA elements: a case study in recombination spots","volume":"12","author":"Dong","year":"2016","journal-title":"Mol. 
Biosyst"},{"key":"2023020205312701400_btx055-B9","first-page":"1871","article-title":"LIBLINEAR: a library for large linear classification","volume":"9","author":"Fan","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023020205312701400_btx055-B10","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1016\/j.cels.2015.12.007","article-title":"Essential human genes","volume":"1","author":"Fraser","year":"2015","journal-title":"Cell Syst"},{"key":"2023020205312701400_btx055-B11","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1038\/nature08497","article-title":"An oestrogen-receptor-alpha-bound human chromatin interactome","volume":"462","author":"Fullwood","year":"2009","journal-title":"Nature"},{"key":"2023020205312701400_btx055-B12","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1093\/bioinformatics\/btg467","article-title":"Comparison of various algorithms for recognizing short coding sequences of human genes","volume":"20","author":"Gao","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020205312701400_btx055-B13","doi-asserted-by":"crossref","first-page":"10738","DOI":"10.1038\/srep10738","article-title":"Flux balance analysis predicts essential genes in clear cell renal cell carcinoma metabolism","volume":"5","author":"Gatto","year":"2015","journal-title":"Sci. 
Rep"},{"key":"2023020205312701400_btx055-B14","doi-asserted-by":"crossref","first-page":"1780","DOI":"10.1093\/nar\/gkg254","article-title":"ZCURVE: a new system for recognizing protein-coding genes in bacterial and archaeal genomes","volume":"31","author":"Guo","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B15","doi-asserted-by":"crossref","first-page":"9.","DOI":"10.1186\/1471-2105-7-9","article-title":"ZCURVE_V: a new self-training system for recognizing protein-coding genes in viral and phage genomes","volume":"7","author":"Guo","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020205312701400_btx055-B16","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1023\/A:1012487302797","article-title":"Gene selection for cancer classification using support vector machines","volume":"46","author":"Guyon","year":"2002","journal-title":"Mach. Learn"},{"key":"2023020205312701400_btx055-B17","doi-asserted-by":"crossref","first-page":"1515","DOI":"10.1016\/j.cell.2015.11.015","article-title":"High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities","volume":"163","author":"Hart","year":"2015","journal-title":"Cell"},{"key":"2023020205312701400_btx055-B18","doi-asserted-by":"crossref","first-page":"164.","DOI":"10.1186\/s12859-016-1015-8","article-title":"BAGEL: a computational framework for identifying essential genes from pooled library screens","volume":"17","author":"Hart","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023020205312701400_btx055-B19","doi-asserted-by":"crossref","first-page":"bas008","DOI":"10.1093\/database\/bas008","article-title":"Tracking and coordinating an international curation effort for the CCDS Project","volume":"2012","author":"Harte","year":"2012","journal-title":"Database (Oxford)"},{"key":"2023020205312701400_btx055-B20","doi-asserted-by":"crossref","first-page":"W85","DOI":"10.1093\/nar\/gkv491","article-title":"ZCURVE 3.0: 
identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes","volume":"43","author":"Hua","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B21","doi-asserted-by":"crossref","first-page":"562","DOI":"10.1016\/j.tcb.2011.07.005","article-title":"Essence of life: essential genes of minimal genomes","volume":"21","author":"Juhas","year":"2011","journal-title":"Trends Cell Biol"},{"key":"2023020205312701400_btx055-B22","doi-asserted-by":"crossref","first-page":"1421","DOI":"10.1101\/gr.3992505","article-title":"Metabolic functions of duplicate genes in Saccharomyces cerevisiae","volume":"15","author":"Kuepfer","year":"2005","journal-title":"Genome Res"},{"key":"2023020205312701400_btx055-B23","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1038\/nature19057","article-title":"Analysis of protein-coding genetic variation in 60,706 humans","volume":"536","author":"Lek","year":"2016","journal-title":"Nature"},{"key":"2023020205312701400_btx055-B24","doi-asserted-by":"crossref","first-page":"2133","DOI":"10.1105\/tpc.15.00051","article-title":"Characteristics of plant essential genes allow for within- and between-Species prediction of lethal mutant phenotypes","volume":"27","author":"Lloyd","year":"2015","journal-title":"Plant Cell"},{"key":"2023020205312701400_btx055-B25","doi-asserted-by":"crossref","first-page":"D574","DOI":"10.1093\/nar\/gkt1131","article-title":"DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements","volume":"42","author":"Luo","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B26","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1016\/j.biocel.2003.08.013","article-title":"GS-Finder: a program to find bacterial gene start sites with a self-training method","volume":"36","author":"Ou","year":"2004","journal-title":"Int. J. Biochem. 
Cell Biol"},{"key":"2023020205312701400_btx055-B27","doi-asserted-by":"crossref","first-page":"87.","DOI":"10.1186\/1752-0509-6-87","article-title":"Iteration method for predicting essential proteins based on orthology and protein-protein interaction networks","volume":"6","author":"Peng","year":"2012","journal-title":"BMC Syst. Biol"},{"key":"2023020205312701400_btx055-B28","doi-asserted-by":"crossref","first-page":"1126","DOI":"10.1101\/gr.5144106","article-title":"Predicting essential genes in fungal genomes","volume":"16","author":"Seringhaus","year":"2006","journal-title":"Genome Res"},{"key":"2023020205312701400_btx055-B29","doi-asserted-by":"crossref","first-page":"2949","DOI":"10.1093\/bioinformatics\/btm479","article-title":"Improved BLAST searches using longer words for protein seeding","volume":"23","author":"Shiryev","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020205312701400_btx055-B30","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkj109","article-title":"BioGRID: a general repository for interaction datasets","volume":"34","author":"Stark","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B31","doi-asserted-by":"crossref","first-page":"1096","DOI":"10.1126\/science.aac7041","article-title":"Identification and characterization of essential genes in the human genome","volume":"350","author":"Wang","year":"2015","journal-title":"Science"},{"key":"2023020205312701400_btx055-B32","doi-asserted-by":"crossref","first-page":"e72343.","DOI":"10.1371\/journal.pone.0072343","article-title":"Geptop: a gene essentiality prediction tool for sequenced bacterial genomes based on orthology and phylogeny","volume":"8","author":"Wei","year":"2013","journal-title":"PLoS One"},{"key":"2023020205312701400_btx055-B33","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.gene.2013.08.018","article-title":"Z curve theory-based analysis of the dynamic nature of nucleosome positioning in 
Saccharomyces cerevisiae","volume":"530","author":"Wu","year":"2013","journal-title":"Gene"},{"key":"2023020205312701400_btx055-B34","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1016\/j.snb.2015.02.025","article-title":"Feature selection and analysis on correlated gas sensor data with recursive feature elimination","volume":"212","author":"Yan","year":"2015","journal-title":"Sens. Actuators. B. Chem"},{"key":"2023020205312701400_btx055-B35","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1186\/1471-2105-9-113","article-title":"Human Pol II promoter recognition based on primary sequences and free energy of dinucleotides","volume":"9","author":"Yang","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023020205312701400_btx055-B36","doi-asserted-by":"crossref","first-page":"1246","DOI":"10.1093\/bioinformatics\/bts120","article-title":"Predicting the lethal phenotype of the knockout mouse by integrating comprehensive genomic data","volume":"28","author":"Yuan","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020205312701400_btx055-B37","doi-asserted-by":"crossref","first-page":"2804","DOI":"10.1093\/nar\/28.14.2804","article-title":"Recognition of protein coding genes in the yeast genome at better than 95% accuracy based on the Z curve","volume":"28","author":"Zhang","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B38","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1006\/jtbi.1997.0401","article-title":"A symmetrical theory of DNA sequences and its applications","volume":"187","author":"Zhang","year":"1997","journal-title":"J. Theor. Biol"},{"key":"2023020205312701400_btx055-B39","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1006\/jmbi.1994.1263","article-title":"A graphic approach to analyzing codon usage in 1562 Escherichia coli protein coding sequences","volume":"238","author":"Zhang","year":"1994","journal-title":"J. Mol. 
Biol"},{"key":"2023020205312701400_btx055-B40","doi-asserted-by":"crossref","first-page":"6313","DOI":"10.1093\/nar\/19.22.6313","article-title":"Analysis of distribution of bases in the coding sequences by a diagrammatic technique","volume":"19","author":"Zhang","year":"1991","journal-title":"Nucleic Acids Res"},{"key":"2023020205312701400_btx055-B41","doi-asserted-by":"crossref","first-page":"767","DOI":"10.1080\/07391102.1994.10508031","article-title":"Z curves, an intutive tool for visualizing and analyzing the DNA sequences","volume":"11","author":"Zhang","year":"1994","journal-title":"J. Biomol. Struct. Dyn"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/12\/1758\/49039793\/bioinformatics_33_12_1758.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/12\/1758\/49039793\/bioinformatics_33_12_1758.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T00:35:13Z","timestamp":1675298113000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/12\/1758\/2964734"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,1,31]]},"references-count":41,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2017,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx055","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/084129","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,6,15]]},"published":{"date-parts":[[2017,1,31]]}}}