{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T01:45:45Z","timestamp":1768009545680,"version":"3.49.0"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2021,5,13]],"date-time":"2021-05-13T00:00:00Z","timestamp":1620864000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["P41 GM103712"],"award-info":[{"award-number":["P41 GM103712"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"One-Bit Matrix Completion","award":["1BMC"],"award-info":[{"award-number":["1BMC"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,10,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>High throughput and high content screening are extensively used to determine the effect of small molecule compounds and other potential therapeutics upon particular targets as part of the early drug development process. However, screening is typically used to find compounds that have a desired effect but not to identify potential undesirable side effects. This is because the size of the search space precludes measuring the potential effect of all compounds on all targets. Active machine learning has been proposed as a solution to this problem.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In this article, we describe an improved imputation method, Impute by Committee, for completion of matrices containing categorical values. We compare this method to existing approaches in the context of modeling the effects of many compounds on many targets using latent similarities between compounds and conditions. We also compare these methods for the task of driving active learning in well-characterized settings for synthetic and real datasets. Our new approach performed the best overall both in the accuracy of matrix completion itself and in the number of experiments needed to train an accurate predictive model compared to random selection of experiments. We further improved upon the performance of our new method by developing an adaptive switching strategy for active learning that iteratively chooses between different matrix completion methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>A Reproducible Research Archive containing all data and code is available at http:\/\/murphylab.cbd.cmu.edu\/software.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab322","type":"journal-article","created":{"date-parts":[[2021,4,29]],"date-time":"2021-04-29T19:14:33Z","timestamp":1619723673000},"page":"3538-3545","source":"Crossref","is-referenced-by-count":4,"title":["Evaluation of categorical matrix completion algorithms: toward improved active learning for drug discovery"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8206-511X","authenticated-orcid":false,"given":"Huangqingbo","family":"Sun","sequence":"first","affiliation":[{"name":"Computational Biology Department, Carnegie Mellon University , Pittsburgh, PA 15213, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0358-901X","authenticated-orcid":false,"given":"Robert F","family":"Murphy","sequence":"additional","affiliation":[{"name":"Computational Biology Department, Carnegie Mellon University , Pittsburgh, PA 15213, USA"},{"name":"Department of Biological Sciences, Carnegie Mellon University , Pittsburgh, PA 15213, USA"},{"name":"Department of Biomedical Engineering, Carnegie Mellon University , Pittsburgh, PA 15213, USA"},{"name":"Machine Learning Department, Carnegie Mellon University , Pittsburgh, PA 15213, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,5,13]]},"reference":[{"key":"2023051609025423400_btab322-B1","first-page":"1","article-title":"Survey on multiclass classification methods","volume":"19","author":"Aly","year":"2005","journal-title":"Neural. Netw"},{"key":"2023051609025423400_btab322-B2","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1016\/j.jcss.2008.07.003","article-title":"Agnostic active learning","volume":"75","author":"Balcan","year":"2009","journal-title":"J. Comput. Syst. Sci"},{"key":"2023051609025423400_btab322-B3","doi-asserted-by":"crossref","first-page":"1956","DOI":"10.1137\/080738970","article-title":"A singular value thresholding algorithm for matrix completion","volume":"20","author":"Cai","year":"2010","journal-title":"SIAM J. Optim"},{"key":"2023051609025423400_btab322-B4","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1109\/JPROC.2009.2035722","article-title":"Matrix completion with noise","volume":"98","author":"Candes","year":"2010","journal-title":"Proc. IEEE"},{"key":"2023051609025423400_btab322-B5","doi-asserted-by":"crossref","first-page":"2053","DOI":"10.1109\/TIT.2010.2044061","article-title":"The power of convex relaxation: near-optimal matrix completion","volume":"56","author":"Candes","year":"2010","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023051609025423400_btab322-B6","first-page":"369","author":"Cao","year":"2015"},{"key":"2023051609025423400_btab322-B7","author":"Chen","year":"2020"},{"key":"2023051609025423400_btab322-B8","first-page":"3447","author":"Chiang","year":"2015"},{"key":"2023051609025423400_btab322-B9","first-page":"705","author":"Cohn","year":"1995"},{"key":"2023051609025423400_btab322-B10","doi-asserted-by":"crossref","first-page":"22858","DOI":"10.1002\/anie.201909987","article-title":"Autonomous discovery in the chemical sciences part I: progress","volume":"59","author":"Coley","year":"2020","journal-title":"Angew. Chem. Int. Ed"},{"key":"2023051609025423400_btab322-B11","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1093\/imaiai\/iau006","article-title":"1-bit matrix completion","volume":"3","author":"Davenport","year":"2014","journal-title":"Inf. Inference J. IMA"},{"key":"2023051609025423400_btab322-B12","doi-asserted-by":"crossref","first-page":"770","DOI":"10.3389\/fphar.2020.00770","article-title":"Accelerating therapeutics for opportunities in medicine: a paradigm shift in drug discovery","volume":"11","author":"Hinkson","year":"2020","journal-title":"Front. Pharmacol"},{"key":"2023051609025423400_btab322-B13","doi-asserted-by":"crossref","first-page":"3195","DOI":"10.1093\/bioinformatics\/btx390","article-title":"Matrix completion with side information and its applications in predicting the antigenicity of influenza viruses","volume":"33","author":"Huang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051609025423400_btab322-B14","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1186\/1471-2105-15-143","article-title":"Efficient discovery of responses of proteins to compounds using active learning","volume":"15","author":"Kangas","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023051609025423400_btab322-B15","doi-asserted-by":"crossref","first-page":"2950","DOI":"10.1214\/15-EJS1093","article-title":"Adaptive multinomial matrix completion","volume":"9","author":"Klopp","year":"2015","journal-title":"Electron. J. Stat"},{"key":"2023051609025423400_btab322-B16","first-page":"1727","author":"Lafond","year":"2014"},{"key":"2023051609025423400_btab322-B17","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1021\/acs.jcim.5b00332","article-title":"Feasibility of active machine learning for multiclass compound classification","volume":"56","author":"Lang","year":"2016","journal-title":"J. Chem. Inf. Model"},{"key":"2023051609025423400_btab322-B18","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1038\/nature11159","article-title":"Large-scale prediction and testing of drug activity on side-effect targets","volume":"486","author":"Lounkine","year":"2012","journal-title":"Nature"},{"key":"2023051609025423400_btab322-B19","first-page":"2287","article-title":"Spectral regularization algorithms for learning large incomplete matrices","volume":"11","author":"Mazumder","year":"2010","journal-title":"J. Mach. Learn. Res"},{"key":"2023051609025423400_btab322-B20","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1038\/nchembio.576","article-title":"An active role for machine learning in drug development","volume":"7","author":"Murphy","year":"2011","journal-title":"Nat. Chem. Biol"},{"key":"2023051609025423400_btab322-B21","doi-asserted-by":"crossref","first-page":"e83996","DOI":"10.1371\/journal.pone.0083996","article-title":"Efficient modeling and active learning discovery of biological responses","volume":"8","author":"Naik","year":"2013","journal-title":"PLoS One"},{"key":"2023051609025423400_btab322-B22","doi-asserted-by":"crossref","first-page":"e10047","DOI":"10.7554\/eLife.10047","article-title":"Active machine learning-driven experimentation to determine compound effects on protein patterns","volume":"5","author":"Naik","year":"2016","journal-title":"Elife"},{"key":"2023051609025423400_btab322-B23","first-page":"73","article-title":"Practical considerations for active machine learning in drug discovery","volume":"32\u201333","author":"Reker","year":"2020","journal-title":"Drug Discov. Today Technol"},{"key":"2023051609025423400_btab322-B24","doi-asserted-by":"crossref","first-page":"381","DOI":"10.4155\/fmc-2016-0197","article-title":"Active learning for computational chemogenomics","volume":"9","author":"Reker","year":"2017","journal-title":"Future Med. Chem"},{"key":"2023051609025423400_btab322-B25","author":"Settles","year":"2009"},{"key":"2023051609025423400_btab322-B26","author":"Sun","year":"2020"},{"key":"2023051609025423400_btab322-B27","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1016\/j.chembiol.2017.11.009","article-title":"Drug target commons: a community effort to build a consensus knowledge base for drug-target interactions","volume":"25","author":"Tang","year":"2018","journal-title":"Cell Chem. Biol"},{"key":"2023051609025423400_btab322-B28","author":"Wang","year":"2018"},{"key":"2023051609025423400_btab322-B29","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1021\/ci025620t","article-title":"Active learning with support vector machines in the drug discovery process","volume":"43","author":"Warmuth","year":"2003","journal-title":"J. Chem. Inf. Comput. Sci"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab322\/38657095\/btab322.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3538\/50338374\/btab322.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3538\/50338374\/btab322.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T09:04:54Z","timestamp":1684227894000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/20\/3538\/6275258"}},"subtitle":[],"editor":[{"given":"Jinbo","family":"Xu","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,5,13]]},"references-count":29,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2021,10,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab322","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,10,15]]},"published":{"date-parts":[[2021,5,13]]}}}