{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T05:46:59Z","timestamp":1675316819655},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"17","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Massive amounts of genome-wide gene expression data have become available, motivating the development of computational approaches that leverage this information to predict gene function. Among successful approaches, supervised machine learning methods, such as Support Vector Machines (SVMs), have shown superior prediction accuracy. However, these methods lack the simple biological intuition provided by co-expression networks (CNs), limiting their practical usefulness.<\/jats:p>\n               <jats:p>Results: In this work, we present Discriminative Local Subspaces (DLS), a novel method that combines supervised machine learning and co-expression techniques with the goal of systematically predicting genes involved in specific biological processes of interest. Unlike traditional CNs, DLS uses the knowledge available in Gene Ontology (GO) to generate informative training sets that guide the discovery of expression signatures: expression patterns that are discriminative for genes involved in the biological process of interest. By linking genes co-expressed with these signatures, DLS is able to construct a discriminative CN that links both known and previously uncharacterized genes for the selected biological process. This article focuses on the algorithm behind DLS and shows its predictive power using an Arabidopsis thaliana dataset and a representative set of 101 GO terms from the Biological Process Ontology. Our results show that DLS has a higher average accuracy than both SVMs and CNs. 
Thus, DLS is able to provide the prediction accuracy of supervised learning methods while maintaining the intuitive understanding of CNs.<\/jats:p>\n               <jats:p>Availability: A MATLAB\u00ae implementation of DLS is available at http:\/\/virtualplant.bio.puc.cl\/cgi-bin\/Lab\/tools.cgi<\/jats:p>\n               <jats:p>Contact: tfpuelma@uc.cl<\/jats:p>\n               <jats:p>Supplementary Information: Supplementary data are available at http:\/\/bioinformatics.mpimp-golm.mpg.de\/.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts455","type":"journal-article","created":{"date-parts":[[2012,7,21]],"date-time":"2012-07-21T13:29:51Z","timestamp":1342877391000},"page":"2256-2264","source":"Crossref","is-referenced-by-count":7,"title":["Discriminative local subspaces in gene expression data for effective gene function prediction"],"prefix":"10.1093","volume":"28","author":[{"given":"Tomas","family":"Puelma","sequence":"first","affiliation":[{"name":"1 Department of Molecular Genetics and Microbiology, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Functional Genomics and 2Department of Computer Science, Millennium Nucleus Center for Plant Functional Genomics, Pontificia Universidad Catolica de Chile, Santiago, Chile"}]},{"given":"Rodrigo A.","family":"Guti\u00e9rrez","sequence":"additional","affiliation":[{"name":"1 Department of Molecular Genetics and Microbiology, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Functional Genomics and 2Department of Computer Science, Millennium Nucleus Center for Plant Functional Genomics, Pontificia Universidad Catolica de Chile, Santiago, 
Chile"}]},{"given":"Alvaro","family":"Soto","sequence":"additional","affiliation":[{"name":"1 Department of Molecular Genetics and Microbiology, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Functional Genomics and 2Department of Computer Science, Millennium Nucleus Center for Plant Functional Genomics, Pontificia Universidad Catolica de Chile, Santiago, Chile"}]}],"member":"286","published-online":{"date-parts":[[2012,7,20]]},"reference":[{"key":"2023012512561683300_bts455-B1","doi-asserted-by":"crossref","first-page":"1866","DOI":"10.1126\/science.1089072","article-title":"Biological networks: the tinkerer as an engineer","volume":"301","author":"Alon","year":"2003","journal-title":"Science"},{"key":"2023012512561683300_bts455-B2","doi-asserted-by":"crossref","first-page":"6745","DOI":"10.1073\/pnas.96.12.6745","article-title":"Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays","volume":"96","author":"Alon","year":"1999","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012512561683300_bts455-B3","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. 
Genet."},{"key":"2023012512561683300_bts455-B4","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1016\/j.neucom.2010.02.016","article-title":"Rule extraction from support vector machines: a review","volume":"74","author":"Barakat","year":"2010","journal-title":"Neurocomputing"},{"key":"2023012512561683300_bts455-B5","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1093\/bioinformatics\/btk048","article-title":"Hierarchical multi-label prediction of gene function","volume":"22","author":"Barutcuoglu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012512561683300_bts455-B6","doi-asserted-by":"crossref","first-page":"9709","DOI":"10.1073\/pnas.1100958108","article-title":"Genome-wide network model capturing seed germination reveals coordinated regulation of plant cellular phase transitions","volume":"108","author":"Bassel","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012512561683300_bts455-B7","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1186\/1471-2164-9-495","article-title":"Prosecutor: parameter-free inference of gene function for prokaryotes using DNA microarray data, genomic context and multiple gene annotation sources","volume":"9","author":"Blom","year":"2008","journal-title":"BMC Genomics"},{"key":"2023012512561683300_bts455-B8","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1016\/j.febslet.2004.07.055","article-title":"Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments","volume":"573","author":"Breitling","year":"2004","journal-title":"FEBS Lett."},{"key":"2023012512561683300_bts455-B9","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1073\/pnas.97.1.262","article-title":"Knowledge-based analysis of microarray gene expression data by using support vector machines","volume":"97","author":"Brown","year":"2000","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023012512561683300_bts455-B10","first-page":"1","article-title":"LIBSVM: a library for support vector machines","volume":"3","author":"Chang","year":"2011","journal-title":"ACM Transactions on Intelligent Systems and Technology"},{"key":"2023012512561683300_bts455-B11","first-page":"93","article-title":"Biclustering of expression data","volume":"8","author":"Cheng","year":"2000","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. Biol."},{"key":"2023012512561683300_bts455-B12","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"2023012512561683300_bts455-B13","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012512561683300_bts455-B14","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1145\/1081870.1081878","article-title":"Rule extraction from linear support vector machines","volume-title":"Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, KDD \u201905","author":"Fung","year":"2005"},{"key":"2023012512561683300_bts455-B15","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1023\/A:1012487302797","article-title":"Gene selection for cancer classification using support vector machines","volume":"46","author":"Guyon","year":"2002","journal-title":"Mach. 
Learn."},{"key":"2023012512561683300_bts455-B16","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1104\/pp.108.117366","article-title":"Annotating genes of known and unknown function by large-scale coexpression analysis","volume":"147","author":"Horan","year":"2008","journal-title":"Plant Physiol."},{"key":"2023012512561683300_bts455-B17","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1016\/j.mib.2004.08.012","article-title":"Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction","volume":"7","author":"Jansen","year":"2004","journal-title":"Curr. Opin. Microbiol."},{"key":"2023012512561683300_bts455-B18","doi-asserted-by":"crossref","first-page":"2087","DOI":"10.1126\/science.1061603","article-title":"A gene expression map for Caenorhabditis elegans","volume":"293","author":"Kim","year":"2001","journal-title":"Science"},{"key":"2023012512561683300_bts455-B19","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1038\/nbt.1603","article-title":"Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana","volume":"28","author":"Lee","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012512561683300_bts455-B20","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1109\/TCBB.2004.2","article-title":"Biclustering algorithms for biological data analysis: a survey","volume":"1","author":"Madeira","year":"2004","journal-title":"IEEE\/ACM Trans. Comput. Biol. 
Bioinform."},{"key":"2023012512561683300_bts455-B21","doi-asserted-by":"crossref","first-page":"1703","DOI":"10.1101\/gr.192502","article-title":"Systematic learning of gene functional classes from DNA array expression data by using multilayer perceptrons","volume":"12","author":"Mateos","year":"2002","journal-title":"Genome Res."},{"key":"2023012512561683300_bts455-B22","first-page":"1","volume-title":"Machine Learning","author":"Mitchell","year":"1997","edition":"1"},{"key":"2023012512561683300_bts455-B23","doi-asserted-by":"crossref","first-page":"1267","DOI":"10.1093\/bioinformatics\/btq121","article-title":"CoP: a database for characterizing co-expressed gene modules with biological information in plants","volume":"26","author":"Ogata","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512561683300_bts455-B24","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1214\/aoms\/1177704472","article-title":"On estimation of a probability density function and mode","volume":"33","author":"Parzen","year":"1962","journal-title":"Ann. Math. 
Stat."},{"key":"2023012512561683300_bts455-B25","doi-asserted-by":"crossref","first-page":"1122","DOI":"10.1093\/bioinformatics\/btl060","article-title":"A systematic comparison and evaluation of biclustering methods for gene expression data","volume":"22","author":"Preli\u0107","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012512561683300_bts455-B26","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1038\/35065725","article-title":"Exploring complex networks","volume":"410","author":"Strogatz","year":"2001","journal-title":"Nature"},{"key":"2023012512561683300_bts455-B27","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1126\/science.1087447","article-title":"A gene-coexpression network for global discovery of conserved genetic modules","volume":"302","author":"Stuart","year":"2003","journal-title":"Science"},{"key":"2023012512561683300_bts455-B28","doi-asserted-by":"crossref","DOI":"10.1201\/9781420036275.ch26","article-title":"Biclustering algorithms: a survey","volume-title":"Handbook of Computational Molecular Biology","author":"Tanay","year":"2005"},{"key":"2023012512561683300_bts455-B29","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1111\/j.1749-6632.2002.tb04888.x","article-title":"Pattern recognition techniques in microarray data analysis: a survey","volume":"980","author":"Valafar","year":"2002","journal-title":"Ann. NY Acad. 
Sci."},{"key":"2023012512561683300_bts455-B30","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1104\/pp.109.136028","article-title":"Unraveling transcriptional control in Arabidopsis using cis-regulatory elements and coexpression networks","volume":"150","author":"Vandepoele","year":"2009","journal-title":"Plant Physiol."},{"key":"2023012512561683300_bts455-B31","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1186\/1471-2105-7-91","article-title":"Bias in error estimation when using cross-validation for model selection","volume":"7","author":"Varma","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012512561683300_bts455-B32","doi-asserted-by":"crossref","first-page":"1198","DOI":"10.1101\/gr.9.12.1198","article-title":"Prediction of gene function by genome-scale expression analysis: prostate cancer-associated genes","volume":"9","author":"Walker","year":"1999","journal-title":"Genome Res."},{"key":"2023012512561683300_bts455-B33","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/ICCV.2009.5459207","article-title":"An HOG-LBP human detector with partial occlusion handling","volume-title":"Computer Vision, 2009 IEEE 12th International Conference on","author":"Wang","year":"2009"},{"key":"2023012512561683300_bts455-B34","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1093\/bib\/5.4.328","article-title":"Biological applications of support vector machines","volume":"5","author":"Yang","year":"2004","journal-title":"Brief. 
Bioinform."},{"key":"2023012512561683300_bts455-B35","doi-asserted-by":"crossref","first-page":"517","DOI":"10.1007\/s00726-008-0077-y","article-title":"Protein function prediction with high-throughput data","volume":"35","author":"Zhao","year":"2008","journal-title":"Amino Acids"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/17\/2256\/48874807\/bioinformatics_28_17_2256.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/17\/2256\/48874807\/bioinformatics_28_17_2256.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T18:16:14Z","timestamp":1674670574000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/17\/2256\/246581"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,7,20]]},"references-count":35,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2012,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts455","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,9,1]]},"published":{"date-parts":[[2012,7,20]]}}}