{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T07:01:48Z","timestamp":1768978908364,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":36,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,11,18]],"date-time":"2020-11-18T00:00:00Z","timestamp":1605657600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,11,18]]},"DOI":"10.1145\/3408308.3427986","type":"proceedings-article","created":{"date-parts":[[2020,11,23]],"date-time":"2020-11-23T03:20:52Z","timestamp":1606101652000},"page":"50-59","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":49,"title":["MB2C"],"prefix":"10.1145","author":[{"given":"Xianzhong","family":"Ding","sequence":"first","affiliation":[{"name":"University of California, Merced"}]},{"given":"Wan","family":"Du","sequence":"additional","affiliation":[{"name":"University of California, Merced"}]},{"given":"Alberto E.","family":"Cerpa","sequence":"additional","affiliation":[{"name":"University of California, Merced"}]}],"member":"320","published-online":{"date-parts":[[2020,11,18]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Ltd. DR International. 2012. 2011 building energy data book. 
https:\/\/openei.org\/doe-opendata\/dataset\/buildings-energy-data-book."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.apenergy.2015.10.036"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2674061.2674072"},{"key":"e_1_3_2_1_4_1","volume-title":"Office: Optimization framework for improved comfort & efficiency","author":"Winkler Daniel A","year":"2020","unstructured":"Daniel A Winkler, Ashish Yadav, Claudia Chitu, and Alberto E Cerpa. Office: Optimization framework for improved comfort & efficiency. In ACM\/IEEE IPSN, 2020."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1099-1514(199601\/03)17:1<71::AID-OCA561>3.0.CO;2-E"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360322.3360857"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276774.3276775"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.4324\/9781315142074-37"},{"key":"e_1_3_2_1_9_1","volume-title":"ACM e-Energy","author":"Park June Young","year":"2020","unstructured":"June Young Park and Zoltan Nagy. Hvaclearn: A reinforcement learning based occupant-centric control for thermostat set-points. 
In ACM e-Energy, 2020."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360322.3360861"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.enbuild.2011.12.005"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8463189"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989202"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360322.3361011"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9197465"},{"key":"e_1_3_2_1_16_1","volume-title":"ACM TOSN","author":"Fraternali Francesco","year":"2020","unstructured":"Francesco Fraternali, Bharathan Balaji, Yuvraj Agarwal, and Rajesh K Gupta. Aces: Automatic configuration of energy harvesting sensors with reinforcement learning. ACM TOSN, 2020."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3421461"},{"key":"e_1_3_2_1_18_1","volume-title":"ACM SenSys","author":"Shen Zhihao","year":"2019","unstructured":"Zhihao Shen, Kang Yang, Wan Du, Xi Zhao, and Jianhua Zou. Deepapp: A deep reinforcement learning framework for mobile application usage prediction. In ACM SenSys, 2019."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSN48538.2019.00018"},{"key":"e_1_3_2_1_20_1","volume-title":"IEEE ICDCS","author":"Liu Miaomiao","year":"2020","unstructured":"Miaomiao Liu, Xianzhong Ding, and Wan Du. 
Continuous, real-time object detection on mobile devices without offloading. In IEEE ICDCS, 2020."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360322.3360849"},{"key":"e_1_3_2_1_22_1","volume-title":"Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017."},{"key":"e_1_3_2_1_23_1","volume-title":"Thermal comfort. Analysis and applications in environmental engineering","author":"Fanger Poul O","year":"1970","unstructured":"Poul O Fanger et al. Thermal comfort. Analysis and applications in environmental engineering, 1970."},{"key":"e_1_3_2_1_24_1","volume-title":"NeurIPS","author":"Lakshminarayanan Balaji","year":"2017","unstructured":"Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In NeurIPS, 2017."},{"key":"e_1_3_2_1_25_1","volume-title":"NeurIPS","author":"Chua Kurtland","year":"2018","unstructured":"
Kurtland Chua, Roberto Calandra, Rowan McAllister, and Sergey Levine. Deep reinforcement learning in a handful of trials using probabilistic dynamics models. In NeurIPS, 2018."},{"key":"e_1_3_2_1_26_1","unstructured":"2019 Sergey Levine. Model-based reinforcement learning. http:\/\/rail.eecs.berkeley.edu\/deeprlcourse\/."},{"key":"e_1_3_2_1_27_1","volume-title":"CoRL","author":"Nagabandi Anusha","year":"2020","unstructured":"Anusha Nagabandi, Kurt Konolige, Sergey Levine, and Vikash Kumar. Deep dynamics models for learning dexterous manipulation. In CoRL, 2020."},{"key":"e_1_3_2_1_28_1","volume-title":"Standard 55--2004-thermal environmental conditions for human occupancy","author":"Standard A.","year":"2004","unstructured":"A. Standard. Standard 55--2004-thermal environmental conditions for human occupancy. ASHRAE Inc, 2004."},{"key":"e_1_3_2_1_29_1","first-page":"400","volume-title":"A stochastic approximation method. The annals of mathematical statistics","author":"Robbins Herbert","year":"1951","unstructured":"Herbert Robbins and Sutton Monro. A stochastic approximation method. The annals of mathematical statistics, pages 400--407, 1951."},{"key":"e_1_3_2_1_30_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 
Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014."},{"key":"e_1_3_2_1_31_1","volume-title":"AISTATS","author":"Glorot Xavier","year":"2010","unstructured":"Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In AISTATS, 2010."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.enbuild.2013.01.008"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1080\/19401493.2010.518631"},{"key":"e_1_3_2_1_34_1","volume-title":"Handbook of statistics","author":"Botev Zdravko I","year":"2013","unstructured":"Zdravko I Botev, Dirk P Kroese, Reuven Y Rubinstein, and Pierre L'Ecuyer. The cross-entropy method for optimization. In Handbook of statistics. 
Elsevier, 2013."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSTCC.2019.8885985"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2993422.2996394"}],"event":{"name":"BuildSys '20: The 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation","location":"Virtual Event Japan","acronym":"BuildSys '20","sponsor":["SIGEnergy ACM Special Interest Group on Energy Systems and Informatics"]},"container-title":["Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3408308.3427986","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3408308.3427986","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:39:02Z","timestamp":1750199942000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3408308.3427986"}},"subtitle":["Model-Based Deep Reinforcement Learning for Multi-zone Building Control"],"short-title":[],"issued":{"date-parts":[[2020,11,18]]},"references-count":36,"alternative-id":["10.1145\/3408308.3427986","10.1145\/3408308"],"URL":"https:\/\/doi.org\/10.1145\/3408308.3427986","relation":{},"subject":[],"published":{"date-parts":[[2020,11,18]]},"assertion":[{"value":"2020-11-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}