{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T15:19:06Z","timestamp":1771341546180,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2020,8,7]],"date-time":"2020-08-07T00:00:00Z","timestamp":1596758400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["K99 LM012926-02"],"award-info":[{"award-number":["K99 LM012926-02"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 LM010098"],"award-info":[{"award-number":["R01 LM010098"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 AI116794"],"award-info":[{"award-number":["R01 AI116794"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["UC4 DK112217"],"award-info":[{"award-number":["UC4 DK112217"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["P30 ES013508"],"award-info":[{"award-number":["P30 ES013508"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["UL1 TR001878"],"award-info":[{"award-number":["UL1 
TR001878"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,4,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Many researchers with domain expertise are unable to easily apply machine learning (ML) to their bioinformatics data due to a lack of ML and\/or coding expertise. Methods that have been proposed thus far to automate ML mostly require programming experience as well as expert knowledge to tune and apply the algorithms correctly.\u00a0Here, we study a method of automating biomedical data science using a web-based AI platform to recommend model choices and conduct experiments. We have two goals in mind: first, to make it easy to construct sophisticated models of biomedical processes; and second, to provide a fully automated AI agent that can choose and conduct promising experiments for the user, based on the user\u2019s experiments as well as prior knowledge.\u00a0To validate this framework, we conduct an experiment on 165 classification problems, comparing to state-of-the-art, automated approaches. Finally, we use this tool to develop predictive models of septic shock in critical care patients.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We find that matrix factorization-based recommendation systems outperform metalearning methods for automating ML. This result mirrors the results of earlier recommender systems research in other domains. The proposed AI is competitive with state-of-the-art automated ML methods in terms of choosing optimal algorithm configurations for datasets. 
In our application to prediction of septic shock, the AI-driven analysis produces a competent ML model (AUROC 0.85\u00b10.02) that performs on par with state-of-the-art deep learning results for this task, with much less computational effort.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>PennAI is available free of charge and open-source. It is distributed under the GNU public license (GPL) version 3.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa698","type":"journal-article","created":{"date-parts":[[2020,8,4]],"date-time":"2020-08-04T19:31:10Z","timestamp":1596569470000},"page":"250-256","source":"Crossref","is-referenced-by-count":17,"title":["Evaluating recommender systems for AI-driven biomedical informatics"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1332-2960","authenticated-orcid":false,"given":"William","family":"La Cava","sequence":"first","affiliation":[{"name":"University of Pennsylvania Institute for Biomedical Informatics, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Heather","family":"Williams","sequence":"additional","affiliation":[{"name":"University of Pennsylvania Institute for Biomedical Informatics, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weixuan","family":"Fu","sequence":"additional","affiliation":[{"name":"University of Pennsylvania Institute for Biomedical Informatics, Department of Biostatistics, Epidemiology 
and Informatics, Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Steve","family":"Vitale","sequence":"additional","affiliation":[{"name":"University of Pennsylvania Institute for Biomedical Informatics, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Durga","family":"Srivatsan","sequence":"additional","affiliation":[{"name":"University of Pennsylvania Institute for Biomedical Informatics, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jason H","family":"Moore","sequence":"additional","affiliation":[{"name":"University of Pennsylvania Institute for Biomedical Informatics, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2020,8,7]]},"reference":[{"key":"2023051511003747900_btaa698-B1","volume-title":"Big Data: Creating the Power to Move Heaven and Earth","year":"2014"},{"key":"2023051511003747900_btaa698-B2","first-page":"43","author":"Bell","year":"2007"},{"key":"2023051511003747900_btaa698-B3","author":"Bell","year":"2007"},{"key":"2023051511003747900_btaa698-B4","first-page":"35","author":"Bennett","year":"2007"},{"key":"2023051511003747900_btaa698-B5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. 
Learn"},{"key":"2023051511003747900_btaa698-B6","first-page":"293","author":"Davidson","year":"2010"},{"key":"2023051511003747900_btaa698-B7","first-page":"2962","author":"Feurer","year":"2015"},{"key":"2023051511003747900_btaa698-B8","author":"Feurer","year":"2018"},{"key":"2023051511003747900_btaa698-B9","first-page":"3348","author":"Fusi","year":"2018"},{"key":"2023051511003747900_btaa698-B10","author":"Gorrell","year":"2006"},{"key":"2023051511003747900_btaa698-B11","first-page":"21","author":"Guyon","year":"2016"},{"key":"2023051511003747900_btaa698-B12","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1038\/s41597-019-0103-9","article-title":"Multitask learning and benchmarking with clinical time series data","volume":"6","author":"Harutyunyan","year":"2019","journal-title":"Sci. Data"},{"key":"2023051511003747900_btaa698-B13","doi-asserted-by":"crossref","first-page":"299ra122","DOI":"10.1126\/scitranslmed.aab3719","article-title":"A targeted real-time early warning score (TREWScore) for septic shock","volume":"7","author":"Henry","year":"2015","journal-title":"Sci. Transl. Med"},{"key":"2023051511003747900_btaa698-B14","author":"Hug","year":"2017"},{"key":"2023051511003747900_btaa698-B15","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1007\/978-3-642-25566-3_40","volume-title":"Learning and Intelligent Optimization, Lecture Notes in Computer Science","author":"Hutter","year":"2011"},{"key":"2023051511003747900_btaa698-B16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci. 
Data"},{"key":"2023051511003747900_btaa698-B17","volume-title":"ICML Workshop on AutoML","author":"Komer","year":"2014"},{"key":"2023051511003747900_btaa698-B18","first-page":"826","article-title":"Auto-WEKA 2.0: automatic model selection and hyperparameter optimization in WEKA","volume":"18","author":"Kotthoff","year":"2017","journal-title":"J. Mach. Learn. Res"},{"key":"2023051511003747900_btaa698-B19","first-page":"485","author":"Olson","year":"2016"},{"key":"2023051511003747900_btaa698-B20","author":"Olson","year":"2017"},{"key":"2023051511003747900_btaa698-B21","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1186\/s13040-017-0154-4","article-title":"PMLB: a large benchmark suite for machine learning evaluation and comparison","volume":"10","author":"Olson","year":"2017","journal-title":"BioData Mining"},{"key":"2023051511003747900_btaa698-B22","author":"Olson","year":"2017"},{"key":"2023051511003747900_btaa698-B23","first-page":"93","author":"Pil\u00e1szy","year":"2009"},{"key":"2023051511003747900_btaa698-B24","author":"Real","year":"2018"},{"key":"2023051511003747900_btaa698-B25","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1001\/jama.2016.0287","article-title":"The third international consensus definitions for sepsis and septic shock (Sepsis-3)","volume":"315","author":"Singer","year":"2016","journal-title":"JAMA"},{"key":"2023051511003747900_btaa698-B26","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1109\/MIC.2017.72","article-title":"Two decades of recommender systems at Amazon.com","volume":"21","author":"Smith","year":"2017","journal-title":"IEEE Internet Comput"},{"key":"2023051511003747900_btaa698-B27","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1145\/2641190.2641198","article-title":"OpenML: networked Science in Machine Learning","volume":"15","author":"Vanschoren","year":"2014","journal-title":"SIGKDD Explor. 
Newsl"},{"key":"2023051511003747900_btaa698-B28","first-page":"306","article-title":"A balanced accuracy function for epistasis modeling in imbalanced datasets using multifactor dimensionality reduction","volume":"31","author":"Velez","year":"2007","journal-title":"Genet. Epidemiol. Off. Publ. Int. Genet. Epidemiol. Soc"},{"key":"2023051511003747900_btaa698-B29","first-page":"1173","author":"Yang"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa698\/33979445\/btaa698.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/2\/250\/50321770\/btaa698.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/2\/250\/50321770\/btaa698.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T11:01:22Z","timestamp":1684148482000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/2\/250\/5885079"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2020,8,7]]},"references-count":29,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa698","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,1,15]]},"published":{"date-parts":[[2020,8,7]]}}}