{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T04:57:41Z","timestamp":1776747461533,"version":"3.51.2"},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2019,3,6]],"date-time":"2019-03-06T00:00:00Z","timestamp":1551830400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,3,6]],"date-time":"2019-03-06T00:00:00Z","timestamp":1551830400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["KAKENHI 17H00757"],"award-info":[{"award-number":["KAKENHI 17H00757"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["KAKENHI 16H00881"],"award-info":[{"award-number":["KAKENHI 16H00881"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003382","name":"Core Research for Evolutional Science and Technology","doi-asserted-by":"publisher","award":["JPMJCR1662"],"award-info":[{"award-number":["JPMJCR1662"]}],"id":[{"id":"10.13039\/501100003382","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2019,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We consider a novel stochastic multi-armed bandit problem called <jats:italic>good arm identification<\/jats:italic> (GAI), where a good arm is defined as an arm with expected reward greater than or equal to a given threshold. GAI is a pure-exploration problem in which a single agent repeats a process of outputting an arm as soon as it is identified as a good one before confirming the other arms are actually not good. The objective of GAI is to minimize the number of samples for each process. We find that GAI faces a new kind of dilemma, the <jats:italic>exploration-exploitation dilemma of confidence<\/jats:italic>, which is different from the best arm identification. As a result, an efficient design of algorithms for GAI is quite different from that for the best arm identification. We derive a lower bound on the sample complexity of GAI that is tight up to the logarithmic factor <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathrm {O}(\\log \\frac{1}{\\delta })$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:mo>log<\/mml:mo>\n                    <mml:mfrac>\n                      <mml:mn>1<\/mml:mn>\n                      <mml:mi>\u03b4<\/mml:mi>\n                    <\/mml:mfrac>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> for acceptance error rate <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\delta $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u03b4<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. We also develop an algorithm whose sample complexity almost matches the lower bound. We also confirm experimentally that our proposed algorithm outperforms naive algorithms in synthetic settings based on a conventional bandit problem and clinical trial researches for rheumatoid arthritis.<\/jats:p>","DOI":"10.1007\/s10994-019-05784-4","type":"journal-article","created":{"date-parts":[[2019,3,6]],"date-time":"2019-03-06T20:46:11Z","timestamp":1551905171000},"page":"721-745","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Good arm identification via bandit feedback"],"prefix":"10.1007","volume":"108","author":[{"given":"Hideaki","family":"Kano","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junya","family":"Honda","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kentaro","family":"Sakamaki","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kentaro","family":"Matsuura","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Atsuyoshi","family":"Nakamura","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masashi","family":"Sugiyama","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2019,3,6]]},"reference":[{"key":"5784_CR1","unstructured":"Agrawal, S., & Goyal, N. (2012). Analysis of thompson sampling for the multi-armed bandit problem. In Proceedings of the 25th annual conference on learning theory (vol.\u00a023, pp. 39.1\u201339.26)."},{"issue":"2","key":"5784_CR2","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1023\/A:1013689704352","volume":"47","author":"P Auer","year":"2002","unstructured":"Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2), 235\u2013256.","journal-title":"Machine Learning"},{"issue":"3","key":"5784_CR3","first-page":"1516","volume":"41","author":"O Capp\u00e9","year":"2012","unstructured":"Capp\u00e9, O., Garivier, A., Maillard, O. A., Munos, R., & Stoltz, G. (2012). Kullback-leibler upper confidence bounds for optimal sequential allocation. The Annals of Statistics, 41(3), 1516\u20131541.","journal-title":"The Annals of Statistics"},{"issue":"5","key":"5784_CR4","doi-asserted-by":"publisher","first-page":"R132","DOI":"10.1186\/ar4312","volume":"15","author":"E Choy","year":"2013","unstructured":"Choy, E., Bendit, M., McAleer, D., Liu, F., Feeney, M., Brett, S., et al. (2013). Safety, tolerability, pharmacokinetics and pharmacodynamics of an anti- oncostatin m monoclonal antibody in rheumatoid arthritis: Results from phase ii randomized, placebo-controlled trials. Arthritis Research & Therapy, 15(5), R132.","journal-title":"Arthritis Research & Therapy"},{"issue":"10","key":"5784_CR5","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1002\/acr.22606","volume":"67","author":"J Curtis","year":"2015","unstructured":"Curtis, J., Yang, S., Chen, L., Pope, J., Keystone, E., Haraoui, B., et al. (2015). Determining the minimally important difference in the clinical disease activity index for improvement and worsening in early rheumatoid arthritis patients. Arthritis Care & Research, 67(10), 1345\u20131353.","journal-title":"Arthritis Care & Research"},{"key":"5784_CR6","first-page":"1079","volume":"7","author":"E Even-Dar","year":"2006","unstructured":"Even-Dar, E., Mannor, S., & Mansour, Y. (2006). Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7, 1079\u20131105.","journal-title":"Journal of Machine Learning Research"},{"issue":"6","key":"5784_CR7","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1136\/annrheumdis-2012-201601","volume":"72","author":"MC Genovese","year":"2013","unstructured":"Genovese, M. C., Durez, P., Richards, H. B., Supronik, J., Dokoupilova, E., Mazurov, V., et al. (2013). Efficacy and safety of secukinumab in patients with rheumatoid arthritis: a phase ii, dose-finding, double-blind, randomised, placebo controlled study. Annals of the Rheumatic Diseases, 72(6), 863\u2013869.","journal-title":"Annals of the Rheumatic Diseases"},{"issue":"4","key":"5784_CR8","doi-asserted-by":"publisher","first-page":"340","DOI":"10.1191\/1740774505cn094oa","volume":"2","author":"AP Grieve","year":"2005","unstructured":"Grieve, A. P., & Krams, M. (2005). ASTIN: A bayesian adaptive dose-response trial in acute stroke. Clinical Trials, 2(4), 340\u2013351.","journal-title":"Clinical Trials"},{"key":"5784_CR9","unstructured":"Jamieson, K., Malloy, M., Nowak, R., & Bubeck, S. (2014). lil\u2019 ucb : An optimal exploration algorithm for multi-armed bandits. In Proceedings of The 27th conference on learning theory (vol.\u00a035, pp. 423\u2013439)."},{"key":"5784_CR10","unstructured":"Jun, K. S., Jamieson, K., Nowak, R., & Zhu, X. (2016). Top arm identification in multi-armed bandits with batch arm pulls. In Proceedings of the 19th international conference on artificial intelligence and statistics (pp. 139\u2013148)."},{"key":"5784_CR11","unstructured":"Kalyanakrishnan, S., Tewari, A., Auer, P., & Stone, P. (2012). PAC subset selection in stochastic multi-armed bandits. In Proceedings of the 29th international conference on machine learning (pp. 655\u2013662)."},{"issue":"1","key":"5784_CR12","first-page":"1","volume":"17","author":"E Kaufmann","year":"2016","unstructured":"Kaufmann, E., Capp\u00e9, O., & Garivier, A. (2016). On the complexity of best-arm identification in multi-armed bandit models. Journal of Machine Learning Research, 17(1), 1\u201342.","journal-title":"Journal of Machine Learning Research"},{"issue":"1","key":"5784_CR13","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1158\/2159-8274.CD-10-0010","volume":"1","author":"ES Kim","year":"2011","unstructured":"Kim, E. S., Herbst, R. S., Wistuba, I. I., Lee, J. J., Blumenschein, G. R., Tsao, A., et al. (2011). The BATTLE trial: Personalizing therapy for lung cancer. Cancer Discovery, 1(1), 44\u201353.","journal-title":"Cancer Discovery"},{"issue":"3","key":"5784_CR14","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1080\/03610918508812467","volume":"14","author":"LW Koenig","year":"1985","unstructured":"Koenig, L. W., & Law, A. M. (1985). A procedure for selecting a subset of size m containing the l best of k independent normal populations, with applications to simulation. Communications in Statistics\u2013Simulation and Computation, 14(3), 719\u2013734.","journal-title":"Communications in Statistics\u2013Simulation and Computation"},{"issue":"1","key":"5784_CR15","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1016\/0196-8858(85)90002-8","volume":"6","author":"T Lai","year":"1985","unstructured":"Lai, T., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1), 4\u201322.","journal-title":"Advances in Applied Mathematics"},{"issue":"1","key":"5784_CR16","doi-asserted-by":"publisher","first-page":"149","DOI":"10.1186\/s12874-017-0416-3","volume":"17","author":"F Liu","year":"2017","unstructured":"Liu, F., Walters, S. J., & Julious, S. A. (2017). Design considerations and analysis planning of a phase 2a proof of concept study in rheumatoid arthritis in the presence of possible non-monotonicity. BMC Medical Research Methodology, 17(1), 149.","journal-title":"BMC Medical Research Methodology"},{"key":"5784_CR17","unstructured":"Locatelli, A., Gutzeit, M., & Carpentier, A. (2016). An optimal algorithm for the thresholding bandit problem. In Proceedings of the 33rd international conference on machine learning (pp. 1690\u20131698)."},{"key":"5784_CR18","doi-asserted-by":"crossref","unstructured":"Mukherjee, S., Naveen, K. P., Sudarsanam, N., & Ravindran, B. (2017). Thresholding bandits with augmented UCB. In Proceedings of the 26th international joint conference on artificial intelligence (pp. 2515\u20132521).","DOI":"10.24963\/ijcai.2017\/350"},{"key":"5784_CR19","doi-asserted-by":"crossref","unstructured":"Schmidt, C., Branke, J., & Chick, S. E. (2006). Integrating techniques from statistical ranking into evolutionary algorithms. In Applications of evolutionary computing (pp. 752\u2013763). Springer: Heidelberg.","DOI":"10.1007\/11732242_73"},{"key":"5784_CR20","volume-title":"Introduction to reinforcement learning","author":"RS Sutton","year":"1998","unstructured":"Sutton, R. S., & Barto, A. G. (1998). Introduction to reinforcement learning (1st ed.). Cambridge: MIT Press.","edition":"1"},{"key":"5784_CR21","doi-asserted-by":"crossref","unstructured":"Tang, L., Jiang, Y., Li, L., Zeng, C., & Li, T. (2015). Personalized recommendation via parameter-free contextual bandits. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval (pp. 323\u2013332).","DOI":"10.1145\/2766462.2767707"},{"key":"5784_CR22","unstructured":"Zhou, Y., Chen, X., & Li, J. (2014). Optimal pac multiple arm identification with applications to crowdsourcing. In Proceedings of the 31st international conference on machine learning (pp. 217\u2013225)."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-019-05784-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-019-05784-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-019-05784-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,5]],"date-time":"2023-04-05T17:08:20Z","timestamp":1680714500000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-019-05784-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,6]]},"references-count":22,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2019,5,15]]}},"alternative-id":["5784"],"URL":"https:\/\/doi.org\/10.1007\/s10994-019-05784-4","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,6]]},"assertion":[{"value":"19 April 2018","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 January 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 March 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}