{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T16:38:54Z","timestamp":1764175134752,"version":"3.41.0"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,12,31]],"date-time":"2020-12-31T00:00:00Z","timestamp":1609372800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Federal Ministry of Education and Research of Germany","award":["16ES1125"],"award-info":[{"award-number":["16ES1125"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Auton. Adapt. Syst."],"published-print":{"date-parts":[[2020,12,31]]},"abstract":"<jats:p>The size of sensor networks supporting smart cities is ever increasing. Sensor network resiliency becomes vital for critical networks such as emergency response and waste water treatment. One approach is to engineer \u201cself-aware\u201d sensors that can proactively change their component composition in response to changes in work load when critical devices fail. By extension, these devices could anticipate their own termination, such as battery depletion, and offload current tasks onto connected devices. These neighboring devices can then reconfigure themselves to process these tasks, thus avoiding catastrophic network failure. In this article, we compare and contrast two types of self-aware sensors. One set uses Q-learning to develop a policy that guides device reaction to various environmental stimuli, whereas the others use a set of shallow neural networks to select an appropriate reaction. 
The novelty lies in the use of field programmable gate arrays embedded on the sensors that take into account internal system state, configuration, and learned state-action pairs, which guide device decisions to meet system demands. Experiments show that even relatively simple reward functions develop both Q-learning policies and shallow neural networks that yield positive device behaviors in dynamic environments.<\/jats:p>","DOI":"10.1145\/3487920","type":"journal-article","created":{"date-parts":[[2021,12,20]],"date-time":"2021-12-20T19:13:58Z","timestamp":1640027638000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Developing Action Policies with Q-Learning and Shallow Neural Networks on Reconfigurable Embedded Devices"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7537-5665","authenticated-orcid":false,"given":"Alwyn","family":"Burger","sequence":"first","affiliation":[{"name":"University of Duisburg-Essen, Duisburg, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4266-4828","authenticated-orcid":false,"given":"Gregor","family":"Schiele","sequence":"additional","affiliation":[{"name":"University of Duisburg-Essen, Duisburg, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4571-0319","authenticated-orcid":false,"given":"David W.","family":"King","sequence":"additional","affiliation":[{"name":"Air Force Institute of Technology, Dayton, Ohio, USA"}]}],"member":"320","published-online":{"date-parts":[[2021,12,20]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Chloe M. Barnes Anik\u00f3 Ek\u00e1rt and Peter R. Lewis. 2019. Social action in socially situated agents. In Proceedings of the International Conference on Self-Adaptive and Self-Organizing Systems (SASO\u201919) . 97\u2013106. 
https:\/\/doi.org\/10.1109\/SASO.2019.00021","DOI":"10.1109\/SASO.2019.00021"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2013.2293637"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.5555\/554879"},{"key":"e_1_3_2_5_2","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan et al. 2020. Language models are few-shot learners. arxiv:2005.14165 [cs.CL]."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/PerComWorkshops48775.2020.9156260"},{"key":"e_1_3_2_7_2","first-page":"555","article-title":"Demo abstract: Deep learning on an elastic node for the Internet of Things","author":"Burger Alwyn","year":"2018","unstructured":"Alwyn Burger and Gregor Schiele. 2018. Demo abstract: Deep learning on an elastic node for the Internet of Things. In Proceedings of the 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops\u201918). 555\u2013557.","journal-title":"In Proceedings of the 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops\u201918)"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-52794-5_3"},{"key":"e_1_3_2_9_2","first-page":"3","article-title":"An algorithmic description of XCS","volume":"6","author":"Butz Martin","year":"2001","unstructured":"Martin Butz and Stewart W. Wilson. 2001. An algorithmic description of XCS. Soft Computing 6 (2001), 3\u20134.","journal-title":"Soft Computing"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/CloudNet.2016.52"},{"key":"e_1_3_2_11_2","first-page":"1","article-title":"Multi-user multi-task computation offloading in green mobile edge cloud computing","volume":"1374","author":"Chen Weiwei","year":"2018","unstructured":"Weiwei Chen, Dong Wang, and Keqin Li. 2018. Multi-user multi-task computation offloading in green mobile edge cloud computing. 
IEEE Transactions on Services Computing 1374, c (2018), 1\u201313. https:\/\/doi.org\/10.1109\/TSC.2018.2826544","journal-title":"IEEE Transactions on Services Computing"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2014.2316834"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2018.2876279"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.5555\/3086952"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1002\/cplx.10048"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1086\/424968"},{"key":"e_1_3_2_17_2","first-page":"1","article-title":"Energy-efficient dynamic offloading and resource scheduling in mobile cloud computing","author":"Guo Songtao","year":"2016","unstructured":"Songtao Guo, Bin Xiao, Yuanyuan Yang, and Yang Yang. 2016. Energy-efficient dynamic offloading and resource scheduling in mobile cloud computing. Proceedings of IEEE INFOCOM. 1\u20139. https:\/\/doi.org\/10.1109\/INFOCOM.2016.7524497","journal-title":"Proceedings of IEEE INFOCOM."},{"key":"e_1_3_2_18_2","unstructured":"Liang Huang Suzhi Bi and Ying-Jun Angela Zhang. 2018. Deep reinforcement learning for online offloading in wireless powered mobile-edge computing networks. arxiv:1808.01977. http:\/\/arxiv.org\/abs\/1808.01977."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.anbehav.2005.03.004"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2946422"},{"key":"e_1_3_2_21_2","volume-title":"Proceedings of the Artificial Life Conference (ALIFE\u201919)","author":"King David W.","year":"2019","unstructured":"David W. King, Lukas Esterle, and Gilbert Peterson. 2019. Entropy-based team self-organization with signal suppression. 
In Proceedings of the Artificial Life Conference (ALIFE\u201919)."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/SASO.2019.00022"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/SASOW.2011.25"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/2764460"},{"key":"e_1_3_2_25_2","doi-asserted-by":"crossref","unstructured":"Ji Li Hui Gao Tiejun Lv and Yueming Lu. 2018. Deep reinforcement learning based computation offloading and resource allocation for MEC. In Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC \u201918) . 1\u20136. https:\/\/doi.org\/10.1109\/WCNC.2018.8377343","DOI":"10.1109\/WCNC.2018.8377343"},{"key":"e_1_3_2_26_2","doi-asserted-by":"crossref","unstructured":"Juan Liu Yuyi Mao Jun Zhang and Khaled B. Letaief. 2016. Delay-optimal computation task scheduling for mobile-edge computing systems. In Proceedings of the IEEE International Symposium on Information Theory . 1451\u20131455. https:\/\/doi.org\/10.1109\/ISIT.2016.7541539","DOI":"10.1109\/ISIT.2016.7541539"},{"key":"e_1_3_2_27_2","doi-asserted-by":"crossref","unstructured":"Xiao Ma Chuang Lin Xudong Xiang and Congjie Chen. 2015. Game-theoretic analysis of computation offloading for cloudlet-based mobile cloud computing. In Proceedings of the 18th ACM International Conference on Modeling Analysis and Simulation of Wireless and Mobile Systems (MSWiM\u201915) . 271\u2013278. 
https:\/\/doi.org\/10.1145\/2811587.2811598","DOI":"10.1145\/2811587.2811598"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2017.2745201"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2016.2611964"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0134254"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2011.58"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSAIT.2020.2991332"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2019.01.012"},{"key":"e_1_3_2_35_2","volume-title":"Artificial Intelligence: A Modern Approach","author":"Russell Stuart","year":"2003","unstructured":"Stuart Russell and Peter Norvig. 2003. Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall."},{"key":"e_1_3_2_36_2","volume-title":"Proceedings of the 16th Canadian Workshop on Information Theory (CWIT\u201919)","author":"Salmani Mahsa","year":"2019","unstructured":"Mahsa Salmani, Foad Sohrabi, Timothy N. Davidson, and Wei Yu. 2019. Multiple access binary computation offloading via reinforcement learning. In Proceedings of the 16th Canadian Workshop on Information Theory (CWIT\u201919)."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICAC.2019.00020"},{"key":"e_1_3_2_38_2","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton Richard S.","year":"2018","unstructured":"Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2011.05.003"},{"key":"e_1_3_2_40_2","unstructured":"Christopher John Cornish Hellaby Watkins. 1989. Learning from Delayed Rewards. Ph.D. Dissertation. 
King\u2019s College."},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCOMM.2020.3007742"},{"key":"e_1_3_2_42_2","first-page":"1","article-title":"A deep reinforcement learning based framework for power-efficient resource allocation in cloud RANs","author":"Xu Zhiyuan","year":"2017","unstructured":"Zhiyuan Xu, Yanzhi Wang, Jian Tang, Jing Wang, and Mustafa Cenk Gursoy. 2017. A deep reinforcement learning based framework for power-efficient resource allocation in cloud RANs. In Proceedings of the IEEE International Conference on Communications.1\u20136. https:\/\/doi.org\/10.1109\/ICC.2017.7997286","journal-title":"Proceedings of the IEEE International Conference on Communications."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICC.2014.6883978"}],"container-title":["ACM Transactions on Autonomous and Adaptive Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3487920","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3487920","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:11:56Z","timestamp":1750191116000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3487920"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,31]]},"references-count":42,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,12,31]]}},"alternative-id":["10.1145\/3487920"],"URL":"https:\/\/doi.org\/10.1145\/3487920","relation":{},"ISSN":["1556-4665","1556-4703"],"issn-type":[{"type":"print","value":"1556-4665"},{"type":"electronic","value":"1556-4703"}],"subject":[],"published":{"date-parts":[[2020,12,31]]},"assertion":[{"value":"2021-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2021-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-12-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}