{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T16:47:54Z","timestamp":1770828474410,"version":"3.50.1"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2021,3,4]],"date-time":"2021-03-04T00:00:00Z","timestamp":1614816000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Novartis Pharma AG"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,6,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>To develop a computer model to predict patients with nonalcoholic steatohepatitis (NASH) using machine learning (ML).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>This retrospective study utilized two databases: a) the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) nonalcoholic fatty liver disease (NAFLD) adult database (2004-2009), and b) the Optum\u00ae de-identified Electronic Health Record dataset (2007-2018), a real-world dataset representative of common electronic health records in the United States. We developed an ML model to predict NASH, using confirmed NASH and non-NASH based on liver histology results in the NIDDK dataset to train the model.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Models were trained and tested on NIDDK NAFLD data (704 patients) and the best-performing models evaluated on Optum data (~3,000,000 patients). An eXtreme Gradient Boosting model (XGBoost) consisting of 14 features exhibited high performance as measured by area under the curve (0.82), sensitivity (81%), and precision (81%) in predicting NASH. Slightly reduced performance was observed with an abbreviated feature set of 5 variables (0.79, 80%, 80%, respectively). The full model demonstrated good performance (AUC 0.76) to predict NASH in Optum data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>The proposed model, named NASHmap, is the first ML model developed with confirmed NASH and non-NASH cases as determined through liver biopsy and validated on a large, real-world patient dataset. Both the 14 and 5-feature versions exhibit high performance.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>The NASHmap model is a convenient and high performing tool that could be used to identify patients likely to have NASH in clinical settings, allowing better patient management and optimal allocation of clinical resources.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocab003","type":"journal-article","created":{"date-parts":[[2021,1,14]],"date-time":"2021-01-14T23:44:01Z","timestamp":1610667841000},"page":"1235-1241","source":"Crossref","is-referenced-by-count":46,"title":["Development of a novel machine learning model to predict presence of nonalcoholic steatohepatitis"],"prefix":"10.1093","volume":"28","author":[{"given":"Matt","family":"Docherty","sequence":"first","affiliation":[{"name":"ZS, Princeton, New Jersey, USA"}]},{"given":"Stephane A","family":"Regnier","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"Gorana","family":"Capkun","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"Maria-Magdalena","family":"Balp","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"Qin","family":"Ye","sequence":"additional","affiliation":[{"name":"ZS, Princeton, New Jersey, USA"}]},{"given":"Nico","family":"Janssens","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"Andreas","family":"Tietz","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"J\u00fcrgen","family":"L\u00f6ffler","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"Jennifer","family":"Cai","sequence":"additional","affiliation":[{"name":"Novartis Pharmaceuticals Inc, East Hanover, USA"}]},{"given":"Marcos C","family":"Pedrosa","sequence":"additional","affiliation":[{"name":"Novartis Pharma AG, Basel, Switzerland"}]},{"given":"J\u00f6rn M","family":"Schattenberg","sequence":"additional","affiliation":[{"name":"Metabolic Liver Research Program. I. Department of Medicine, University Medical Center, Mainz, Germany"}]}],"member":"286","published-online":{"date-parts":[[2021,3,4]]},"reference":[{"issue":"1","key":"2021061318593276100_ocab003-B1","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1097\/TP.0000000000002484","article-title":"Epidemiology of nonalcoholic fatty liver disease and nonalcoholic steatohepatitis: implications for liver transplantation","volume":"103","author":"Younossi","year":"2019","journal-title":"Transplantation"},{"issue":"1","key":"2021061318593276100_ocab003-B2","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1146\/annurev-med-051215-031109","article-title":"Nonalcoholic steatohepatitis","volume":"68","author":"Suzuki","year":"2017","journal-title":"Annu Rev Med"},{"issue":"1","key":"2021061318593276100_ocab003-B3","doi-asserted-by":"crossref","first-page":"15080","DOI":"10.1038\/nrdp.2015.80","article-title":"Nonalcoholic fatty liver disease","volume":"1","author":"Brunt","year":"2015","journal-title":"Nat Rev Dis Primers"},{"issue":"1","key":"2021061318593276100_ocab003-B4","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1002\/hep.29367","article-title":"The diagnosis and management of nonalcoholic fatty liver disease: practice guidance from the American Association for the Study of Liver Diseases","volume":"67","author":"Chalasani","year":"2018","journal-title":"Hepatology"},{"issue":"3","key":"2021061318593276100_ocab003-B5","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1002\/hep.22742","article-title":"American Association for the Study of Liver D. Liver biopsy","volume":"49","author":"Rockey","year":"2009","journal-title":"Hepatology"},{"key":"2021061318593276100_ocab003-B6","doi-asserted-by":"crossref","first-page":"154320","DOI":"10.1016\/j.metabol.2020.154320","article-title":"The role of omics in the pathophysiology, diagnosis and treatment of non-alcoholic fatty liver disease","volume":"111","author":"Perakakis","year":"2020","journal-title":"Metabolism"},{"issue":"1","key":"2021061318593276100_ocab003-B7","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/S0933-3657(01)00077-X","article-title":"Machine learning for medical diagnosis: history, state of the art and perspective","volume":"23","author":"Kononenko","year":"2001","journal-title":"Artif Intell Med"},{"issue":"13","key":"2021061318593276100_ocab003-B8","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1001\/jama.2017.18391","article-title":"Big data and machine learning in health care","volume":"319","author":"Beam","year":"2018","journal-title":"JAMA"},{"issue":"7","key":"2021061318593276100_ocab003-B9","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1038\/s41575-020-0315-7","article-title":"NAFLD - sounding the alarm on a silent epidemic","volume":"17","author":"Lazarus","year":"2020","journal-title":"Nat Rev Gastroenterol Hepatol"},{"issue":"S3","key":"2021061318593276100_ocab003-B10","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1186\/s12911-016-0318-z","article-title":"Nearest neighbor imputation algorithms: a critical evaluation","volume":"16","author":"Beretta","year":"2016","journal-title":"BMC Med Inform Decis Mak"},{"issue":"1","key":"2021061318593276100_ocab003-B11","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach Learn"},{"key":"2021061318593276100_ocab003-B12","first-page":"10","author":"Chen","year":"2016"},{"key":"2021061318593276100_ocab003-B13","first-page":"22","author":"Kotsiantis","year":"2007"},{"key":"2021061318593276100_ocab003-B14","author":"Safavian","year":"1990"},{"issue":"6","key":"2021061318593276100_ocab003-B15","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1021\/ci034160g","article-title":"Random forest: a classification and regression tool for compound classification and QSAR modeling","volume":"43","author":"Svetnik","year":"2003","journal-title":"J Chem Inf Comput Sci"},{"key":"2021061318593276100_ocab003-B16","first-page":"118","article-title":"Application of machine learning methods to predict non-alcohol fatty liver disease in Taiwanese high-tech industry workers","author":"Cheng","year":"2017","journal-title":"International Conference on Data Mining"},{"issue":"6","key":"2021061318593276100_ocab003-B17","doi-asserted-by":"crossref","first-page":"e1003149","DOI":"10.1371\/journal.pmed.1003149","article-title":"Predicting and elucidating the etiology of fatty liver disease: a machine learning modeling and validation study in the IMI DIRECT cohorts","volume":"17","author":"Atabaki-Pasdar","year":"2020","journal-title":"PLoS Med"},{"issue":"3","key":"2021061318593276100_ocab003-B18","doi-asserted-by":"crossref","first-page":"e0214436","DOI":"10.1371\/journal.pone.0214436","article-title":"Non-invasive assessment of NAFLD as systemic disease-A machine learning perspective","volume":"14","author":"Canbay","year":"2019","journal-title":"PLoS One"},{"key":"2021061318593276100_ocab003-B19","first-page":"430","article-title":"Application of machine learning methods to predict Non-Alcoholic Steatohepatitis (NASH) in Non-Alcoholic Fatty Liver (NAFL) patients","volume":"2018","author":"Fialoke","year":"2018","journal-title":"AMIA Annu Symp Proc"},{"key":"2021061318593276100_ocab003-B20","doi-asserted-by":"crossref","first-page":"154005","DOI":"10.1016\/j.metabol.2019.154005","article-title":"Non-invasive diagnosis of non-alcoholic steatohepatitis and fibrosis with the use of omics and supervised learning: a proof of concept study","volume":"101","author":"Perakakis","year":"2019","journal-title":"Metabolism"},{"key":"2021061318593276100_ocab003-B21","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1186\/s13040-017-0142-8","article-title":"EFS: an ensemble feature selection tool implemented as R-package and web-application","volume":"10","author":"Neumann","year":"2017","journal-title":"BioData Min"},{"issue":"5","key":"2021061318593276100_ocab003-B22","first-page":"389","article-title":"Non-alcoholic fatty liver disease: a narrative review of genetics","volume":"32","author":"Danford","year":"2018","journal-title":"J Biomed Res"},{"issue":"6","key":"2021061318593276100_ocab003-B23","doi-asserted-by":"crossref","first-page":"1592","DOI":"10.1016\/j.jhep.2020.07.020","article-title":"On the value and limitations of liver histology in assessing non-alcoholic steatohepatitis","volume":"73","author":"Schattenberg","year":"2020","journal-title":"J Hepatol"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/6\/1235\/38615433\/ocab003.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/6\/1235\/38615433\/ocab003.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,6,13]],"date-time":"2021-06-13T19:00:39Z","timestamp":1623610839000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/28\/6\/1235\/6158325"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,4]]},"references-count":23,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2021,3,4]]},"published-print":{"date-parts":[[2021,6,12]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocab003","relation":{},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,6,1]]},"published":{"date-parts":[[2021,3,4]]}}}