{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T14:54:15Z","timestamp":1770044055810,"version":"3.49.0"},"reference-count":14,"publisher":"SAGE Publications","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IFS"],"published-print":{"date-parts":[[2021,9,15]]},"abstract":"<jats:p>Online advertisements are bought through a mechanism called real-time bidding (RTB). In RTB, the ads are auctioned in real-time on every webpage load. The ad auctions can be of two types: second-price or first-price auctions. In second-price auctions, the bidder with the highest bid wins the auction, but they only pay the second-highest bid. This paper focuses on first-price auctions, where the buyer pays the amount that they bid. This research evaluates how multi-armed bandit strategies optimize the bid size in a commercial demand-side platform (DSP) that buys inventory through ad exchanges. First, we analyze seven multi-armed bandit algorithms on two different offline real datasets gathered from real second-price auctions. Then, we test and compare the performance of three algorithms in a production environment. Our results show that real data from second-price auctions can be used successfully to model first-price auctions. Moreover, we found that the trained multi-armed bandit algorithms reduce the bidding costs considerably compared to the baseline (na\u00efve approach) on average 29% and optimize the whole budget by slightly reducing the win rate (on average 7.7%). 
Our findings, tested in a real scenario, show a clear and substantial economic benefit for ad buyers using DSPs.<\/jats:p>","DOI":"10.3233\/jifs-202665","type":"journal-article","created":{"date-parts":[[2021,8,27]],"date-time":"2021-08-27T12:18:32Z","timestamp":1630066712000},"page":"6111-6125","source":"Crossref","is-referenced-by-count":0,"title":["Multi-armed bandits for bid shading in first-price real-time bidding auctions"],"prefix":"10.1177","volume":"41","author":[{"given":"Tuomo","family":"Tilli","sequence":"first","affiliation":[{"name":"ReadPeak Oy, Helsinki, Finland"},{"name":"Department of Business Management and Analytics, Arcada University of Applied Sciences, Helsinki, Finland"}]},{"given":"Leonardo","family":"Espinosa-Leal","sequence":"additional","affiliation":[{"name":"Department of Business Management and Analytics, Arcada University of Applied Sciences, Helsinki, Finland"}]}],"member":"179","reference":[{"issue":"6","key":"10.3233\/JIFS-202665_ref1","doi-asserted-by":"crossref","first-page":"6","DOI":"10.9781\/ijimai.2016.361","article-title":"Operating an advertising programmatic buying platform: A case study","volume":"3","author":"Gonzalvez-Caba\u00f1as","year":"2016","journal-title":"IJIMAI"},{"issue":"2","key":"10.3233\/JIFS-202665_ref3","first-page":"1","article-title":"AdX: a model for ad exchanges","volume":"8","author":"Muthukrishnan","year":"2009","journal-title":"ACM SIGecom Exchanges"},{"issue":"1","key":"10.3233\/JIFS-202665_ref4","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1257\/aer.97.1.242","article-title":"Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords","volume":"97","author":"Edelman","year":"2007","journal-title":"American Economic Review"},{"key":"10.3233\/JIFS-202665_ref8","doi-asserted-by":"crossref","unstructured":"Lattimore T. and Szepesv\u00e1ri C. 
, Bandit algorithms, Cambridge University Press (2020).","DOI":"10.1017\/9781108571401"},{"issue":"282","key":"10.3233\/JIFS-202665_ref14","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1080\/01621459.1958.10501452","article-title":"Nonparametric estimation from incomplete observations","volume":"53","author":"Kaplan","year":"1958","journal-title":"Journal of the American Statistical Association"},{"key":"10.3233\/JIFS-202665_ref15","unstructured":"Sutton R.S. and Barto A.G. , Reinforcement learning: An introduction, MIT press (2018)."},{"issue":"1","key":"10.3233\/JIFS-202665_ref16","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1137\/S0097539701398375","article-title":"The nonstochastic multiarmed bandit problem","volume":"32","author":"Auer","year":"2002","journal-title":"SIAM Journal on Computing"},{"issue":"1998","key":"10.3233\/JIFS-202665_ref28","first-page":"377","article-title":"Optimizing Production Manufacturing Using Reinforcement Learning","volume":"372","author":"Mahadevan","journal-title":"FLAIRS Conference"},{"issue":"7","key":"10.3233\/JIFS-202665_ref29","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1016\/j.engappai.2009.01.014","article-title":"Dynamic scheduling of maintenance tasks in the petroleum industry: A reinforcement approach","volume":"22","author":"Aissani","year":"2009","journal-title":"Engineering Applications of Artificial Intelligence"},{"key":"10.3233\/JIFS-202665_ref30","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.engappai.2019.01.010","article-title":"learning for pricing strategy optimization in the insurance industry","volume":"80","author":"Krasheninnikova","year":"2019","journal-title":"Engineering Applications of Artificial Intelligence"},{"key":"10.3233\/JIFS-202665_ref32","doi-asserted-by":"crossref","unstructured":"Espinosa-Leal L., Chapman A. 
and Westerlund M., Autonomous Industrial Management via Reinforcement Learning, Journal of Intelligent & Fuzzy Systems 39(6) (2020), 8427\u20138439.","DOI":"10.3233\/JIFS-189161"},{"issue":"1","key":"10.3233\/JIFS-202665_ref33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2200000024","article-title":"Regret analysis of stochastic and nonstochastic multi-armed bandit problems","volume":"5","author":"Bubeck","year":"2012","journal-title":"in Machine Learning"},{"issue":"2\u20133","key":"10.3233\/JIFS-202665_ref34","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1023\/A:1013689704352","article-title":"Finite-time analysis of the multiarmed bandit problem","volume":"47","author":"Auer","year":"2002","journal-title":"Machine Learning"},{"key":"10.3233\/JIFS-202665_ref39","doi-asserted-by":"crossref","unstructured":"Nguyen H.T. and Kofod-Petersen A. , Using multi-armed bandit to solve cold-start problems in recommender systems at telco, in: Mining Intelligence and Knowledge Exploration, Springer (2014), 21\u201330.","DOI":"10.1007\/978-3-319-13817-6_3"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JIFS-202665","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T03:24:34Z","timestamp":1770002674000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JIFS-202665"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,15]]},"references-count":14,"journal-issue":{"issue":"6"},"URL":"https:\/\/doi.org\/10.3233\/jifs-202665","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,15]]}}}