{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T16:38:24Z","timestamp":1770741504454,"version":"3.49.0"},"reference-count":16,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2018,6,27]],"date-time":"2018-06-27T00:00:00Z","timestamp":1530057600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000057","name":"NIGMS","doi-asserted-by":"publisher","award":["U54GM114838"],"award-info":[{"award-number":["U54GM114838"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Big Data to Knowledge"},{"name":"BD2K"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Several large-scale efforts have been made to collect gene expression signatures from a variety of biological conditions, such as response of cell lines to treatment with drugs, or tumor samples with different characteristics. These gene signature collections are utilized through bioinformatics tools for \u2018signature matching\u2019, whereby a researcher studying an expression profile can identify previously cataloged biological conditions most related to their profile. Signature matching tools typically retrieve from the collection the signature that has highest similarity to the user-provided profile. 
Alternatively, classification models may be applied where each biological condition in the signature collection is a class label; however, such models are trained on the collection of available signatures and may not generalize to the novel cellular context or cell line of the researcher\u2019s expression profile.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present an advanced multi-way classification algorithm for signature matching, called SigMat, that is trained on a large signature collection from a well-studied cellular context, but can also classify signatures from other cell types by relying on an additional, small collection of signatures representing the target cell type. It uses these \u2018tuning data\u2019 to learn two additional parameters that help adapt its predictions for other cellular contexts. SigMat outperforms other similarity scores and classification methods in identifying the correct label of a query expression profile from as many as 244 or 500 candidate classes (drug treatments) cataloged by the LINCS L1000 project. 
SigMat retains its high accuracy in cross-cell line applications even when the amount of tuning data is severely limited.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>SigMat is available on GitHub at https:\/\/github.com\/JinfengXiao\/SigMat.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty251","type":"journal-article","created":{"date-parts":[[2018,4,17]],"date-time":"2018-04-17T19:10:38Z","timestamp":1523992238000},"page":"i547-i554","source":"Crossref","is-referenced-by-count":10,"title":["SigMat: a classification scheme for gene signature matching"],"prefix":"10.1093","volume":"34","author":[{"given":"Jinfeng","family":"Xiao","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA"}]},{"given":"Charles","family":"Blatti","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA"}]},{"given":"Saurabh","family":"Sinha","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA"},{"name":"Carl R. 
Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, USA"}]}],"member":"286","published-online":{"date-parts":[[2018,6,27]]},"reference":[{"key":"2023051605131670800_bty251-B1","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1186\/1471-2105-15-79","article-title":"The characteristic direction: a geometrical approach to identify differentially expressed genes","volume":"15","author":"Clark","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023051605131670800_bty251-B2","doi-asserted-by":"crossref","first-page":"2652","DOI":"10.1038\/srep02652","article-title":"Exploring TCGA Pan-Cancer data at the UCSC Cancer Genomics Browser","volume":"3","author":"Cline","year":"2013","journal-title":"Sci. Rep"},{"key":"2023051605131670800_bty251-B3","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511801389","volume-title":"An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods","author":"Cristianini","year":"2000"},{"key":"2023051605131670800_bty251-B4","doi-asserted-by":"crossref","first-page":"3.","DOI":"10.1186\/1471-2105-7-3","article-title":"Gene selection and classification of microarray data using random forest","volume":"7","author":"Diaz-Uriarte","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023051605131670800_bty251-B5","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1093\/nar\/30.1.207","article-title":"Gene Expression Omnibus: NCBI gene expression and hybridization array data repository","volume":"30","author":"Edgar","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023051605131670800_bty251-B6","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1186\/1471-2105-12-381","article-title":"Discovering biological connections between experimental conditions based on common patterns of differential gene expression","volume":"12","author":"Gower","year":"2011","journal-title":"BMC 
Bioinformatics"},{"key":"2023051605131670800_bty251-B7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v011.i09","article-title":"kernlab\u2014an S4 package for kernel methods in R","volume":"11","author":"Karatzoglou","year":"2004","journal-title":"J. Stat. Softw"},{"key":"2023051605131670800_bty251-B8","doi-asserted-by":"crossref","first-page":"1929","DOI":"10.1126\/science.1132939","article-title":"The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease","volume":"313","author":"Lamb","year":"2006","journal-title":"Science"},{"key":"2023051605131670800_bty251-B9","first-page":"18","article-title":"Classification and regression by randomForest","volume":"2","author":"Liaw","year":"2002","journal-title":"R News"},{"key":"2023051605131670800_bty251-B10","doi-asserted-by":"crossref","first-page":"1739","DOI":"10.1093\/bioinformatics\/btr260","article-title":"Molecular signatures database (MSigDB) 3.0","volume":"27","author":"Liberzon","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051605131670800_bty251-B11","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1109\/21.97458","article-title":"A survey of decision tree classifier methodology","volume":"21","author":"Safavian","year":"1991","journal-title":"IEEE Trans. Syst. Man Cybern"},{"key":"2023051605131670800_bty251-B12","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.1016\/j.cell.2017.10.049","article-title":"A next generation connectivity map: L1000 platform and the first 1,000,000 profiles","volume":"171","author":"Subramanian","year":"2017","journal-title":"Cell"},{"key":"2023051605131670800_bty251-B13","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1186\/1755-8794-1-51","article-title":"Expression-based Pathway Signature Analysis (EPSA): mining publicly available microarray data for insight into human disease","volume":"1","author":"Tenenbaum","year":"2008","journal-title":"BMC Med. 
Genomics"},{"key":"2023051605131670800_bty251-B14","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1056\/NEJMoa021967","article-title":"A gene-expression signature as a predictor of survival in breast cancer","volume":"347","author":"van de Vijver","year":"2002","journal-title":"N. Engl. J. Med"},{"key":"2023051605131670800_bty251-B15","doi-asserted-by":"crossref","first-page":"R133.","DOI":"10.1186\/gb-2007-8-7-r133","article-title":"Strategy for encoding and comparison of gene expression signatures","volume":"8","author":"Yi","year":"2007","journal-title":"Genome Biol"},{"key":"2023051605131670800_bty251-B16","doi-asserted-by":"crossref","first-page":"bar026","DOI":"10.1093\/database\/bar026","article-title":"International Cancer Genome Consortium Data Portal\u2014a one-stop shop for cancer genomics data","volume":"2011","author":"Zhang","year":"2011","journal-title":"Database (Oxford)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i547\/50316126\/bioinformatics_34_13_i547.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i547\/50316126\/bioinformatics_34_13_i547.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T05:14:14Z","timestamp":1684214054000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/13\/i547\/5045781"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,6,27]]},"references-count":16,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2018,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty251","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","ty
pe":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,7,1]]},"published":{"date-parts":[[2018,6,27]]}}}