{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:14:28Z","timestamp":1760710468961,"version":"3.41.2"},"reference-count":33,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,1,30]],"date-time":"2021-01-30T00:00:00Z","timestamp":1611964800000},"content-version":"vor","delay-in-days":29,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>We study a joint pricing and inventory control problem for perishables with positive lead time in a finite\u2010horizon periodic\u2010review system. Unlike most studies, which assume a continuous density function of demand, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost\u2010sales cases, and our goal is to find a joint ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When there is no fixed ordering cost involved, we design a deep reinforcement learning algorithm to obtain a near\u2010optimal ordering policy and show that the learned policy exhibits some monotonicity properties. We also show that our deep reinforcement learning algorithm achieves better performance than tabular\u2010based Q\u2010learning algorithms. 
When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient, under which the problem of \u201ccurse of dimension\u201d is circumvented.<\/jats:p>","DOI":"10.1155\/2021\/6643131","type":"journal-article","created":{"date-parts":[[2021,1,30]],"date-time":"2021-01-30T19:35:06Z","timestamp":1612035306000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9037-1312","authenticated-orcid":false,"given":"Rui","family":"Wang","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7242-9787","authenticated-orcid":false,"given":"Xianghua","family":"Gan","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1440-428X","authenticated-orcid":false,"given":"Qing","family":"Li","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8770-4531","authenticated-orcid":false,"given":"Xiao","family":"Yan","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2021,1,30]]},"reference":[{"key":"e_1_2_10_1_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-6485-4_15"},{"key":"e_1_2_10_2_2","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2014.1261"},{"key":"e_1_2_10_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2008.03.007"},{"key":"e_1_2_10_4_2","doi-asserted-by":"publisher","DOI":"10.1111\/exsy.12054"},{"key":"e_1_2_10_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2017.08.046"},{"key":"e_1_2_10_6_2","unstructured":"KeJ. XiaoF. YangH. andYeJ. 
Optimizing online matching for ride-sourcing services with multi-agent deep reinforcement learning 2019 http:\/\/arxiv.org\/abs\/1902.06228."},{"key":"e_1_2_10_7_2","unstructured":"ShihabS. A. M. LogemannC. ThomasD.-G. andWeiP. Autonomous airline revenue management: a deep reinforcement learning approach to seat inventory control and overbooking 2019 http:\/\/arxiv.org\/abs\/1902.06824."},{"key":"e_1_2_10_8_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1937-5956.2006.tb00245.x"},{"key":"e_1_2_10_9_2","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1110.1033"},{"key":"e_1_2_10_10_2","doi-asserted-by":"publisher","DOI":"10.1287\/msom.2013.0450"},{"key":"e_1_2_10_11_2","doi-asserted-by":"publisher","DOI":"10.1155\/2017\/1460163"},{"key":"e_1_2_10_12_2","doi-asserted-by":"publisher","DOI":"10.1155\/2018\/6348413"},{"key":"e_1_2_10_13_2","doi-asserted-by":"publisher","DOI":"10.1155\/2018\/1967398"},{"key":"e_1_2_10_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.omega.2018.10.010"},{"key":"e_1_2_10_15_2","doi-asserted-by":"publisher","DOI":"10.1002\/nav.3800200202"},{"key":"e_1_2_10_16_2","doi-asserted-by":"publisher","DOI":"10.1287\/opre.23.4.735"},{"key":"e_1_2_10_17_2","doi-asserted-by":"publisher","DOI":"10.1287\/opre.23.1.46"},{"key":"e_1_2_10_18_2","doi-asserted-by":"publisher","DOI":"10.1287\/opre.30.4.680"},{"key":"e_1_2_10_19_2","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.30.7.777"},{"key":"e_1_2_10_20_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1937-5956.2007.tb00261.x"},{"key":"e_1_2_10_21_2","doi-asserted-by":"publisher","DOI":"10.1155\/2019\/9253605"},{"key":"e_1_2_10_22_2","doi-asserted-by":"publisher","DOI":"10.1287\/msom.1080.0238"},{"key":"e_1_2_10_23_2","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.3060847"},{"key":"e_1_2_10_24_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1937-5956.2011.01288.x"},{"key":"e_1_2_10_25_2","doi-asserted-by":"publisher","DOI":"10.1515\/9783110670905"},{"key":"e_1_2_10_26_2","doi-asserte
d-by":"publisher","DOI":"10.1016\/s0925-5273(00)00156-0"},{"key":"e_1_2_10_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2008.07.036"},{"key":"e_1_2_10_28_2","doi-asserted-by":"publisher","DOI":"10.1080\/10429247.2010.11431878"},{"key":"e_1_2_10_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00170-012-4195-z"},{"key":"e_1_2_10_30_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2014.07.007"},{"key":"e_1_2_10_31_2","unstructured":"WatkinsC. J. C. H. Learning from delayed rewards 1989 Ph.D. thesis."},{"key":"e_1_2_10_32_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_2_10_33_2","unstructured":"DarkenC. ChangJ. andMoodyJ. Learning rate schedules for faster stochastic gradient search Proceedings of the 1992 IEEE Workshop on Neural Networks for Signal Processing II 1992 Helsingoer Denmark IEEE 3\u201312."}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/6643131.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/6643131.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/6643131","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,9]],"date-time":"2024-08-09T22:16:51Z","timestamp":1723241811000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/6643131"}},"subtitle":[],"editor":[{"given":"Abd 
E.I.-Baset","family":"Hassanien","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/6643131"],"URL":"https:\/\/doi.org\/10.1155\/2021\/6643131","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"type":"print","value":"1076-2787"},{"type":"electronic","value":"1099-0526"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2020-10-24","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-01-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-01-30","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"6643131"}}