{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T13:09:57Z","timestamp":1770815397571,"version":"3.50.1"},"reference-count":81,"publisher":"Institute for Operations Research and the Management Sciences (INFORMS)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Management Science"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:p>Optimizing the treatment regimen is a fundamental medical decision-making problem. This can be thought of as a two-dimensional decision-making problem with a nested structure because it involves determining both the optimal medication and its optimal dose. Identifying the most effective medication for an individual often poses considerable difficulty, and even when a suitable medication is ascertained, dosing it optimally remains a significant challenge. Making these two nested decisions necessitates the adaptive learning of a personalized disease progression control model. To address this problem, we propose a novel contextual multiarmed bandit model under a two-dimensional control with a nested structure. For this model, we develop a new joint contextual learning and optimization algorithm, termed the stochastic subgradient descent atop contextual multiarmed bandit (SGD-MAB) algorithm. It sequentially selects for a patient (i) the best medication based on their contextual information and (ii) the corresponding dose optimized over the prior history of those patients who received the same medication. We prove that it admits a sublinear regret, which is tight up to a logarithmic factor. Our regret analysis leverages the strengths of both contextual bandit approaches and online convex optimization techniques in a seamless fashion. We substantiate the practicality of SGD-MAB using clinical data on patients with hypertension and heightened cardiovascular risks. 
Our analysis indicates that SGD-MAB has the potential to surpass current practices. We benchmark several policies to show the advantages of our approach and offer critical insights. Our framework holds promise for various applications beyond healthcare that require nested decision-making.<\/jats:p>\n<jats:p>This paper was accepted by J. George Shanthikumar, data science.<\/jats:p>\n<jats:p>Funding: This work was supported by the National Science Foundation (CMMI-1548201, CMMI-1634505) and the National Eye Institute (NIH Grant R01EY026641).<\/jats:p>\n<jats:p>Supplemental Material: The online appendix and data files are available at https:\/\/doi.org\/10.1287\/mnsc.2019.03211.<\/jats:p>","DOI":"10.1287\/mnsc.2019.03211","type":"journal-article","created":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T11:07:37Z","timestamp":1746184057000},"page":"10442-10464","source":"Crossref","is-referenced-by-count":3,"title":["Contextual Learning with Online Convex Optimization: Theory and Application to Medical Decision-Making"],"prefix":"10.1287","volume":"71","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9634-3806","authenticated-orcid":false,"given":"Esmaeil","family":"Keyvanshokooh","sequence":"first","affiliation":[{"name":"Information and Operations Management, Mays Business School, Texas A&M University, College Station, Texas 77845"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1174-6102","authenticated-orcid":false,"given":"Mohammad","family":"Zhalechian","sequence":"additional","affiliation":[{"name":"Operations and Decision Technologies, Kelley School of Business, Indiana University, Bloomington, Indiana 47405"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3564-3391","authenticated-orcid":false,"given":"Cong","family":"Shi","sequence":"additional","affiliation":[{"name":"Management, Herbert Business School, University of Miami, Coral Gables, Florida 
33146"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8685-7843","authenticated-orcid":false,"given":"Mark P.","family":"Van Oyen","sequence":"additional","affiliation":[{"name":"Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48105"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2846-3862","authenticated-orcid":false,"given":"Pooyan","family":"Kazemian","sequence":"additional","affiliation":[{"name":"Operations, Weatherhead School of Management, Case Western Reserve University, Cleveland, Ohio 44106"}]}],"member":"109","reference":[{"key":"B1","unstructured":"Abbasi-Yadkori Y, P\u00e1l D, Szepesv\u00e1ri C (2011) Improved algorithms for linear stochastic bandits. Proc. 24th Internat. Conf. Neural Inform. Processing Systems, NIPS\u201911 (Curran Associates Inc. Red Hook, NY), 2312\u20132320."},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1214\/17-EJS1341SI"},{"key":"B3","unstructured":"Agrawal S, Goyal N (2013) Thompson sampling for contextual bandits with linear payoffs. Proc. 30th Internat. Conf. Machine Learn., vol. 28 III, ICML\u201913 (JMLR.org, Atlanta, GA), 1220\u20131228."},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2015.06.077"},{"key":"B5","doi-asserted-by":"publisher","DOI":"10.2337\/dc19-S010"},{"key":"B6","doi-asserted-by":"crossref","unstructured":"Anderer A, Bastani H, Silberholz J (2020) Adaptive clinical trial designs with surrogates: When should we bother? Working paper, University of Michigan, Ann Arbor, MI.","DOI":"10.2139\/ssrn.3397464"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2020.4984"},{"key":"B8","doi-asserted-by":"publisher","DOI":"10.2337\/diacare.27.2007.S65"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.1002\/ana.25280"},{"issue":"11","key":"B10","first-page":"397","volume":"3","author":"Auer P","year":"2002","journal-title":"J. Machine Learn. 
Res."},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.2015.2180"},{"key":"B12","volume":"32","author":"Bai Y","year":"2019","journal-title":"Adv. Neural Inform. Processing Systems"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2019.1902"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.2020.3605"},{"key":"B15","doi-asserted-by":"publisher","DOI":"10.1016\/S2213-8587(17)30221-8"},{"key":"B16","doi-asserted-by":"crossref","unstructured":"Baucum M, Khojandi A, Vasudevan R, Ramdhani R (2023) Optimizing patient-specific medication regimen policies using wearable sensors in Parkinson\u2019s disease. Management Sci. 69(10):5964\u20135982.","DOI":"10.1287\/mnsc.2023.4747"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1016\/S0140-6736(09)60731-5"},{"key":"B18","doi-asserted-by":"crossref","unstructured":"Bertsimas D, Borenstein ARA, Dauvin A, Orfanoudaki A (2021) Ensemble machine learning for personalized antihypertensive treatment. Naval Res. Logist. (NRL) 69(5):669\u2013688.","DOI":"10.1002\/nav.22040"},{"key":"B19","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.2015.2363"},{"key":"B20","doi-asserted-by":"publisher","DOI":"10.1287\/ijoo.2019.0019"},{"key":"B22","unstructured":"Bouneffouf D, Rish I (2019) A survey on practical applications of multi-armed and contextual bandits. Preprint, submitted April 2, https:\/\/arxiv.org\/abs\/1904.10040."},{"issue":"5","key":"B23","first-page":"1655","volume":"12","author":"Bubeck S","year":"2011","journal-title":"J. Machine Learn. Res."},{"key":"B24","unstructured":"Cao J, Gao R, Keyvanshokooh E (2025a) HR-bandit: Human-AI collaborated linear recourse bandit. Internat. Conf. 
Artificial Intelligence Statist. (PMLR, New York)."},{"key":"B25","doi-asserted-by":"crossref","unstructured":"Cao J, Keyvanshokooh E, Liu T (2025b) Safe reinforcement learning with contextual information: Theory and applications. Preprint, submitted September 25, http:\/\/dx.doi.org\/10.2139\/ssrn.4583667.","DOI":"10.2139\/ssrn.4583667"},{"key":"B26","doi-asserted-by":"publisher","DOI":"10.1177\/0272989X17705636"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pmed.1001971"},{"key":"B28","unstructured":"Centers for Disease Control and Prevention (2020) Blood pressure medicines. Accessed June 6, 2022, https:\/\/www.cdc.gov\/bloodpressure\/medicines.htm."},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.1120.1640"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2018.1817"},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.1287\/isre.2021.1002"},{"key":"B33","doi-asserted-by":"crossref","unstructured":"Cheung WC, Simchi-Levi D, Zhu R (2021) Hedging the drift: Learning to optimize under nonstationarity. Management Sci. 68(3):1696\u20131713.","DOI":"10.1287\/mnsc.2021.4024"},{"key":"B34","doi-asserted-by":"crossref","unstructured":"Chick SE, Gans N, Yapar \u00d6 (2021) Bayesian sequential learning for clinical trials of multiple correlated medical interventions. Management Sci. 68(7):4919\u20134938.","DOI":"10.1287\/mnsc.2021.4137"},{"key":"B35","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-med-092012-112310"},{"key":"B36","unstructured":"Chu W, Li L, Reyzin L, Schapire R (2011) Contextual bandits with linear payoff functions. Proc. Fourteenth Internat. Conf. Artificial Intelligence Statist. AISTATS\u201911 (JMLR, Ft. 
Lauderdale, FL), 208\u2013214."},{"key":"B37","unstructured":"Dani V, Hayes TP, Kakade SM (2008) Stochastic linear optimization under bandit feedback. Working paper, University of Chicago, Chicago, IL."},{"key":"B38","doi-asserted-by":"publisher","DOI":"10.1287\/educ.2018.0184"},{"key":"B39","doi-asserted-by":"publisher","DOI":"10.1080\/19488300.2011.619157"},{"key":"B40","unstructured":"Epstein CCL (2014) An analytics approach to hypertension treatment. PhD thesis, Massachusetts Institute of Technology, Cambridge."},{"key":"B41","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2018.1755"},{"key":"B42","unstructured":"Filippi S, Capp\u00e9 O, Garivier A, Szepesv\u00e1ri C (2010) Parametric bandits: The generalized linear case. Proc. 23rd Internat. Conf. Neural Inform. Processing Systems, vol. 1, NIPS\u201910 (Curran Associates Inc. Red Hook, NY), 586\u2013594."},{"key":"B44","unstructured":"Gan K, Keyvanshokooh E, Liu X, Murphy S (2024) Contextual bandits with budgeted information reveal. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 3970\u20133978."},{"key":"B45","doi-asserted-by":"publisher","DOI":"10.1016\/j.jacc.2013.11.005"},{"key":"B46","doi-asserted-by":"publisher","DOI":"10.1287\/11-SSY032"},{"key":"B47","doi-asserted-by":"publisher","DOI":"10.1056\/NEJMoa1001286"},{"key":"B48","doi-asserted-by":"publisher","DOI":"10.1183\/13993003.00547-2020"},{"key":"B49","unstructured":"Hamidi N, Bayati M (2020) A general framework to analyze stochastic linear bandit. 
Working paper, Stanford University, Stanford, CA."},{"key":"B51","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2015.1405"},{"key":"B52","doi-asserted-by":"publisher","DOI":"10.1111\/poms.12891"},{"key":"B53","doi-asserted-by":"publisher","DOI":"10.1200\/JCO.2013.54.6051"},{"key":"B54","doi-asserted-by":"publisher","DOI":"10.1111\/poms.12514"},{"key":"B55","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2013.284427"},{"key":"B56","unstructured":"Keyvanshokooh E (2021) Personalized data-driven learning and optimization: theory and applications to healthcare. PhD thesis, University of Michigan at Ann Arbor, Ann Arbor."},{"key":"B57","doi-asserted-by":"publisher","DOI":"10.1016\/j.phrs.2017.11.003"},{"key":"B58","doi-asserted-by":"crossref","unstructured":"Kleinberg R, Slivkins A, Upfal E (2008) Multi-armed bandits in metric spaces. Proc. Fortieth Annual ACM Sympos. Theory Comput., STOC\u201908 (Association for Computing Machinery, New York), 681\u2013690.","DOI":"10.1145\/1374376.1374475"},{"key":"B59","doi-asserted-by":"publisher","DOI":"10.1017\/9781108571401"},{"key":"B60","doi-asserted-by":"crossref","unstructured":"Law M, Morris J, Wald N (2009) Use of blood pressure lowering drugs in the prevention of cardiovascular disease: Meta-analysis of 147 randomised trials in the context of expectations from prospective epidemiological studies. BMJ 338:1\u201319.","DOI":"10.1136\/bmj.b1665"},{"key":"B61","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1080.0613"},{"key":"B62","doi-asserted-by":"publisher","DOI":"10.1287\/inte.2018.0964"},{"key":"B63","unstructured":"Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Proc. 34th Internat. Conf. Machine Learn., vol. 
70, ICML\u201917 (JMLR.org, Sydney, Australia), 2071\u20132080."},{"key":"B64","unstructured":"Liu X, Gan K, Keyvanshokooh E, Murphy S (2025) Online uniform sampling: Randomized learning-augmented approximation algorithms with application to digital health. Preprint, submitted February 3, https:\/\/arxiv.org\/abs\/2402.01995."},{"key":"B65","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2013.09.018"},{"key":"B66","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2019.1918"},{"issue":"1","key":"B67","first-page":"655","volume":"15","author":"Moore BL","year":"2014","journal-title":"J. Machine Learn. Res."},{"key":"B68","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.2017.2793"},{"key":"B69","doi-asserted-by":"publisher","DOI":"10.1016\/j.mbs.2017.08.004"},{"key":"B70","doi-asserted-by":"publisher","DOI":"10.1016\/j.mbs.2019.01.012"},{"key":"B71","doi-asserted-by":"publisher","DOI":"10.1016\/S0140-6736(14)62459-4"},{"key":"B72","unstructured":"Qiao D, Yin M, Min M, Wang YX (2022) Sample-efficient reinforcement learning with loglog (t) switching cost. Internat. Conf. Machine Learn. (PMLR), 18031\u201318061."},{"key":"B73","doi-asserted-by":"publisher","DOI":"10.1001\/jamacardio.2016.3517"},{"key":"B74","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2021.3224"},{"key":"B75","doi-asserted-by":"publisher","DOI":"10.1287\/moor.1100.0446"},{"key":"B76","doi-asserted-by":"publisher","DOI":"10.1287\/moor.2014.0650"},{"key":"B77","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2020.2011"},{"issue":"3","key":"B79","first-page":"59","volume":"29","author":"Tunc S","year":"2014","journal-title":"IEEE Intell. 
Syst."},{"key":"B80","doi-asserted-by":"publisher","DOI":"10.1161\/CIR.0000000000000950"},{"key":"B83","doi-asserted-by":"publisher","DOI":"10.1016\/j.jacc.2017.11.006"},{"key":"B84","unstructured":"WHO (2020) HEARTS: Technical package for cardiovascular disease management in primary health care."},{"key":"B85","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.1120.1587"},{"key":"B86","volume-title":"Clinical Trial Design: Bayesian and Frequentist Adaptive Methods","author":"Yin G","year":"2012"},{"key":"B87","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.2020.3799"},{"key":"B88","doi-asserted-by":"crossref","unstructured":"Zhou J, Liu J, Narayan VA, Ye J (2012) Modeling disease progression via fused sparse group lasso. Proc. 18th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining, KDD\u201912 (Association for Computing Machinery, New York), 1095\u20131103.","DOI":"10.1145\/2339530.2339702"}],"container-title":["Management Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/pubsonline.informs.org\/doi\/pdf\/10.1287\/mnsc.2019.03211","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T09:42:15Z","timestamp":1764927735000},"score":1,"resource":{"primary":{"URL":"https:\/\/pubsonline.informs.org\/doi\/10.1287\/mnsc.2019.03211"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12]]},"references-count":81,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["10.1287\/mnsc.2019.03211"],"URL":"https:\/\/doi.org\/10.1287\/mnsc.2019.03211","relation":{},"ISSN":["0025-1909","1526-5501"],"issn-type":[{"value":"0025-1909","type":"print"},{"value":"1526-5501","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12]]}}}