{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T07:12:31Z","timestamp":1760080351144,"version":"3.41.0"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2019,12,6]],"date-time":"2019-12-06T00:00:00Z","timestamp":1575590400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["AI Matters"],"published-print":{"date-parts":[[2019,12,6]]},"abstract":"<jats:p>With the rapid pace of advancement in the field of artificial intelligence (AI), this essay aims to accentuate the importance of corrigibility in AI in order to stimulate and catalyze more effort and focus in this research area. We will first introduce the idea of corrigibility with its properties and describe the expected behavior for a corrigible AI. Afterwards, based on the established meaning of corrigibility, we will showcase the importance of corrigibility by going over some modern and near-futuristic examples that are specifically selected to be relatable and foreseeable. Then, we will explore existing methods of establishing corrigibility in agents and their respective limitations, using the reinforcement learning (RL) framework as a proxy framework to artificial general intelligence (AGI). 
Finally, we will identify the central themes of potential research frontiers that we believe would be crucial to boosting quality research output in corrigibility.<\/jats:p>","DOI":"10.1145\/3362077.3362089","type":"journal-article","created":{"date-parts":[[2019,12,9]],"date-time":"2019-12-09T13:35:27Z","timestamp":1575898527000},"page":"77-84","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["The necessary roadblock to artificial general intelligence"],"prefix":"10.1145","volume":"5","author":[{"given":"Yat Long","family":"Lo","sequence":"first","affiliation":[{"name":"University of Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chung Yu","family":"Woo","sequence":"additional","affiliation":[{"name":"University of Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ka Lok","family":"Ng","sequence":"additional","affiliation":[{"name":"University of Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,12,6]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Evans R. Jumper J. Kirkpatrick J. Sifre L. Green T.F.G. Zidek A. Nelson A. Bridgland A. Penedones H. Petersen S. Simonyan K. Crossan S. Jones D.T. Silver D. Kavukcuoglu K. Hassabis D. Senior A.W. (December 2018). De novo structure prediction with deep-learning based scoring. In Thirteenth Critical Assessment of Techniques for Protein Structure Prediction (Abstracts). Retrieved from https:\/\/deepmind.com\/documents\/262\/A7D_AlphaFold.pdf."},{"key":"e_1_2_1_2_1","unstructured":"Zhou L. Gao J. Li D. Shum H. Y. (2018). The Design and Implementation of XiaoIce an Empathetic Social Chatbot. arXiv preprint arXiv:1812.08989."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.11222"},{"key":"e_1_2_1_4_1","unstructured":"Piper K. (2019 09 January). The American public is already worried about AI catastrophe. Retrieved from https:\/\/www.vox.com\/future-perfect\/2019\/1\/9\/18174081\/fhi-govai-ai-safety-american-public-worried-ai-catastrophe."},{"key":"e_1_2_1_5_1","unstructured":"Hern\u00e1ndez-Orallo J. Martinez-Plumed F. Avin S. (n.d.). Surveying Safety-relevant AI Characteristics."},{"key":"e_1_2_1_6_1","unstructured":"Amodei D. Olah C. Steinhardt J. Christiano P. Schulman J. Man\u00e9 D. (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565."},{"volume-title":"Corrigibility. In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.","year":"2015","author":"Soares N.","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Kendall A. Hawke J. Janz D. Mazur P. Reda D. Allen J. M. ... Shah A. (2018). Learning to Drive in a Day. arXiv preprint arXiv:1807.00412.","DOI":"10.1109\/ICRA.2019.8793742"},{"key":"e_1_2_1_9_1","unstructured":"Russell S. LaVictoire P. (2016). Corrigibility in AI systems. Retrieved from https:\/\/intelligence.org\/files\/CorrigibilityAISystems.pdf."},{"key":"e_1_2_1_10_1","unstructured":"OpenAI. (2017 March 20). Faulty Reward Functions in the Wild. Retrieved February 8 2019 from https:\/\/blog.openai.com\/faulty-reward-functions\/"},{"volume-title":"The Surgical Singularity Is Approaching. Retrieved","year":"2018","author":"Panesar S. S.","key":"e_1_2_1_11_1"},{"volume-title":"The Weaponization Of Artificial Intelligence. Retrieved","year":"2019","author":"Pandya J.","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","unstructured":"Sutton R. S. Barto A. G. (2018). Reinforcement learning: An introduction. MIT Press."},{"key":"e_1_2_1_14_1","first-page":"278","volume-title":"Proceedings of the 16th International Conference on Machine Learning","author":"Ng D.","year":"1999"},{"volume-title":"Thirty-Second AAAI Conference on Artificial Intelligence.","year":"2018","author":"Wu Y. H.","key":"e_1_2_1_15_1"},{"key":"e_1_2_1_16_1","unstructured":"Armstrong S. O'Rourke X. (2017). 'Indifference' methods for managing agent rewards. arXiv preprint arXiv:1712.06365."},{"volume-title":"Workshops at the Thirty-First AAAI Conference on Artificial Intelligence.","year":"2017","author":"Hadfield-Menell D.","key":"e_1_2_1_17_1"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Carey R. (2017). Incorrigibility in the CIRL Framework. arXiv preprint arXiv:1709.06275.","DOI":"10.1145\/3278721.3278750"},{"key":"e_1_2_1_19_1","unstructured":"Leike J. Martic M. Krakovna V. Ortega P. A. Everitt T. Lefrancq A. ... Legg S. (2017). AI safety gridworlds. arXiv preprint arXiv:1711.09883."},{"key":"e_1_2_1_20_1","unstructured":"Orseau L. Armstrong M. S. (2016). Safely interruptible agents."},{"volume-title":"AAAI Workshop: AI, Ethics, and Society.","year":"2016","author":"Taylor J.","key":"e_1_2_1_21_1"},{"key":"e_1_2_1_22_1","first-page":"9785","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 
33","author":"Rossi F.","year":"2019"}],"container-title":["AI Matters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3362077.3362089","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3362077.3362089","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:44:54Z","timestamp":1750203894000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3362077.3362089"}},"subtitle":["corrigibility"],"short-title":[],"issued":{"date-parts":[[2019,12,6]]},"references-count":22,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2019,12,6]]}},"alternative-id":["10.1145\/3362077.3362089"],"URL":"https:\/\/doi.org\/10.1145\/3362077.3362089","relation":{},"ISSN":["2372-3483"],"issn-type":[{"type":"electronic","value":"2372-3483"}],"subject":[],"published":{"date-parts":[[2019,12,6]]},"assertion":[{"value":"2019-12-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}