{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T13:38:30Z","timestamp":1753882710508,"version":"3.41.2"},"reference-count":43,"publisher":"World Scientific Pub Co Pte Ltd","issue":"14","funder":[{"name":"Technological Development Program","award":["20200403130SF","20220201139GX","20230402049GH"],"award-info":[{"award-number":["20200403130SF","20220201139GX","20230402049GH"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61902142"],"award-info":[{"award-number":["61902142"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"NSF","award":["CCF-2130688","CCF-1900904"],"award-info":[{"award-number":["CCF-2130688","CCF-1900904"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J CIRCUIT SYST COMP"],"published-print":{"date-parts":[[2023,9,30]]},"abstract":"<jats:p> Simultaneous localization and mapping (SLAM) is a core component in many embedded domains, e.g., robots, augmented and virtual reality. Due to SLAM\u2019s high demand on computation resources, general-purpose graphics processing units (GPGPUs) are often used as its processing engine. Meanwhile, embedded systems usually have strict power constraints. Thus, how to deliver the required performance for SLAM, yet still meet the power limit, is a great challenge faced by GPGPU designers. In this work, we discover the general principles of designing energy-efficient GPGPUs for SLAM as \u201cmany SMs, enough SPs and registers, small caches\u201d, by analyzing the implications of individual design parameters on both performance and power. Then, we conduct large-scale design space exploration and fit the Pareto frontier with a two-term exponential model. 
Further, we construct gradient boosting decision tree (GBDT)-based design models to predict the performance and power given the design parameters. The evaluation shows that our GBDT-based models can achieve a mean average percentage error of [Formula: see text]3%, significantly outperforming other machine learning models. With these models, a kernel\u2019s requirements on hardware resources can be well understood. Based on such knowledge, we introduce design-model-guided power management strategies, including power gating and dynamic frequency and voltage scaling (DFVS). Overall, by combining these two power management strategies, we can improve the energy-delay product by 36%. <\/jats:p>","DOI":"10.1142\/s0218126623502390","type":"journal-article","created":{"date-parts":[[2023,2,24]],"date-time":"2023-02-24T13:18:25Z","timestamp":1677244705000},"source":"Crossref","is-referenced-by-count":0,"title":["Architectural Design Model Guided On-Demand Power Management of Energy-Efficient GPGPU for SLAM"],"prefix":"10.1142","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9154-3294","authenticated-orcid":false,"given":"Kaige","family":"Yan","sequence":"first","affiliation":[{"name":"College of Communication Engineering, Jilin University, 2699 Qianjin Street, Changchun 130012, P. R. China"}]},{"given":"Zhujun","family":"Ma","sequence":"additional","affiliation":[{"name":"College of Communication Engineering, Jilin University, 2699 Qianjin Street, Changchun 130012, P. R. China"}]},{"given":"Caiwei","family":"Li","sequence":"additional","affiliation":[{"name":"College of Communication Engineering, Jilin University, 2699 Qianjin Street, Changchun 130012, P. R. 
China"}]},{"given":"Xin","family":"Fu","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Houston, N308 Engineering Building 1, Houston, TX 77004-4005, USA"}]},{"given":"Jingweijia","family":"Tan","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, 2699 Qianjin Street, Changchun 130012, P. R. China"}]}],"member":"219","published-online":{"date-parts":[[2023,4,21]]},"reference":[{"key":"S0218126623502390BIB001","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1007\/978-0-387-31439-6_280","volume-title":"Computer Vision: A Reference Guide","author":"Perera S.","year":"2014"},{"key":"S0218126623502390BIB002","doi-asserted-by":"crossref","first-page":"2020","DOI":"10.1109\/JPROC.2018.2856739","volume":"106","author":"Saeedi S.","year":"2018","journal-title":"Proc. IEEE"},{"key":"S0218126623502390BIB003","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.1109\/TRO.2016.2624754","volume":"32","author":"Cadena C.","year":"2016","journal-title":"IEEE Trans. Robot."},{"key":"S0218126623502390BIB004","doi-asserted-by":"crossref","first-page":"5783","DOI":"10.1109\/ICRA.2015.7140009","volume-title":"2015 IEEE Int. Conf. Robotics and Automation (ICRA)","author":"Nardi L.","year":"2015"},{"key":"S0218126623502390BIB006","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1109\/ISMAR.2011.6092378","volume-title":"Proc. 2011 10th IEEE Int. Symp. Mixed and Augmented Reality","author":"Newcombe R. A.","year":"2011"},{"key":"S0218126623502390BIB007","doi-asserted-by":"crossref","first-page":"1524","DOI":"10.1109\/ICRA.2014.6907054","volume-title":"2014 IEEE Int. Conf. Robotics and Automation (ICRA)","author":"Handa A.","year":"2014"},{"key":"S0218126623502390BIB008","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","volume":"38","author":"Friedman J. H.","year":"2002","journal-title":"Comput. Stat. 
Data Anal."},{"key":"S0218126623502390BIB009","doi-asserted-by":"crossref","DOI":"10.1109\/9780470544365","volume-title":"Design of High-Performance Microprocessor Circuits","author":"Chandrakasan A.","year":"2000"},{"key":"S0218126623502390BIB010","first-page":"300","volume-title":"2012 Design, Automation & Test in Europe Conf. & Exhibition (DATE)","author":"Wang Y.","year":"2012"},{"key":"S0218126623502390BIB011","doi-asserted-by":"crossref","first-page":"487","DOI":"10.1145\/2508148.2485964","volume":"41","author":"Leng J.","year":"2013","journal-title":"ACM SIGARCH Comput. Archit. News"},{"key":"S0218126623502390BIB012","first-page":"240","volume-title":"Proc. 15th ACM Int. Conf. Computing Frontiers","author":"Dutta B.","year":"2018"},{"key":"S0218126623502390BIB013","first-page":"152","volume-title":"Proc. 36th Annual Int. Symp. Computer Architecture","author":"Hong S.","year":"2009"},{"key":"S0218126623502390BIB014","first-page":"104","volume-title":"2017 Int. Conf. Cloud and Autonomic Computing (ICCAC)","author":"Gianelli S.","year":"2017"},{"key":"S0218126623502390BIB015","first-page":"3","volume-title":"2016 Int. Conf. Parallel Architecture and Compilation Techniques (PACT)","author":"Tan J.","year":"2016"},{"key":"S0218126623502390BIB016","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1016\/j.micpro.2019.06.001","volume":"69","author":"Jooya A.","year":"2019","journal-title":"Microprocess. Microsyst."},{"key":"S0218126623502390BIB017","first-page":"159","volume-title":"2015 IEEE Pacific Rim Conf. Communications, Computers and Signal Processing (PACRIM)","author":"Jooya A.","year":"2015"},{"key":"S0218126623502390BIB018","first-page":"163","volume-title":"2009 IEEE Int. Symp. Performance Analysis of Systems and Software","author":"Bakhoda A.","year":"2009"},{"volume-title":"The Elements of Statistical Learning: Data Mining, Inference, and Prediction","year":"2013","author":"Jerome R. 
T.","key":"S0218126623502390BIB020"},{"key":"S0218126623502390BIB021","first-page":"123","volume-title":"2008 IEEE 14th Int. Symp. High Performance Computer Architecture","author":"Kim W.","year":"2008"},{"key":"S0218126623502390BIB023","first-page":"541","volume-title":"Proc. 41st Annual Int. Symp. Computer Architecture (ISCA\u201914)","author":"Zhu Y.","year":"2014"},{"first-page":"97","volume-title":"2014 ACM\/IEEE 41st Int. Symp. Computer Architecture (ISCA)","author":"Shao Y. S.","key":"S0218126623502390BIB024"},{"key":"S0218126623502390BIB025","doi-asserted-by":"crossref","first-page":"101023","DOI":"10.1016\/j.pmcj.2019.05.004","volume":"58","author":"Yan K.","year":"2019","journal-title":"Pervasive Mob. Comput."},{"key":"S0218126623502390BIB026","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126614300025"},{"key":"S0218126623502390BIB027","doi-asserted-by":"crossref","first-page":"1450021","DOI":"10.1142\/S0218126614500212","volume":"23","author":"Hsiao C.-C.","year":"2014","journal-title":"J. Circuits Syst. Comput."},{"key":"S0218126623502390BIB028","first-page":"1","volume-title":"2018 Ninth Int. Green and Sustainable Computing Conf. (IGSC)","author":"Lopez S.","year":"2018"},{"key":"S0218126623502390BIB029","doi-asserted-by":"crossref","first-page":"891","DOI":"10.1145\/3373376.3378505","volume-title":"Proc. Twenty-Fifth Int. Conf. Architectural Support for Programming Languages and Operating Systems","author":"Peng X.","year":"2020"},{"key":"S0218126623502390BIB030","doi-asserted-by":"crossref","first-page":"2823","DOI":"10.1109\/TPDS.2021.3078254","volume":"32","author":"Zhang X.","year":"2021","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"S0218126623502390BIB031","first-page":"137","volume-title":"2019 IEEE Int. Symp. 
Performance Analysis of Systems and Software (ISPASS)","author":"Karki A.","year":"2019"},{"key":"S0218126623502390BIB032","doi-asserted-by":"crossref","first-page":"847","DOI":"10.1109\/IPDPS49936.2021.00094","volume-title":"2021 IEEE Int. Parallel and Distributed Processing Symp. (IPDPS)","author":"Wang Z.","year":"2021"},{"key":"S0218126623502390BIB033","doi-asserted-by":"crossref","first-page":"1170","DOI":"10.1109\/HPCA53966.2022.00089","volume-title":"2022 IEEE Int. Symp. High-Performance Computer Architecture (HPCA)","author":"Yahya J. H.","year":"2022"},{"key":"S0218126623502390BIB034","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126616500444"},{"key":"S0218126623502390BIB035","doi-asserted-by":"crossref","first-page":"1750041","DOI":"10.1142\/S0218126617500414","volume":"26","author":"Nag A.","year":"2017","journal-title":"J. Circuits Syst. Comput."},{"key":"S0218126623502390BIB036","doi-asserted-by":"crossref","first-page":"102","DOI":"10.5626\/JCSE.2020.14.3.102","volume":"14","author":"Wang X.","year":"2020","journal-title":"J. Comput. Sci. Eng."},{"key":"S0218126623502390BIB037","doi-asserted-by":"crossref","first-page":"2496","DOI":"10.1109\/TPDS.2022.3144614","volume":"33","author":"Nabavinejad S. M.","year":"2022","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"S0218126623502390BIB038","doi-asserted-by":"crossref","first-page":"7806","DOI":"10.1109\/TII.2021.3073066","volume":"17","author":"Cao K.","year":"2021","journal-title":"IEEE Trans. Ind. Inf."},{"key":"S0218126623502390BIB039","doi-asserted-by":"crossref","first-page":"22267","DOI":"10.1109\/JIOT.2021.3102421","volume":"9","author":"Cao K.","year":"2022","journal-title":"IEEE Internet Things J."},{"key":"S0218126623502390BIB040","doi-asserted-by":"crossref","first-page":"2502","DOI":"10.1109\/ICRA.2018.8460664","volume-title":"2018 IEEE Int. Conf. 
Robotics and Automation (ICRA)","author":"Delmerico J.","year":"2018"},{"key":"S0218126623502390BIB041","first-page":"1","volume":"27","author":"Bu T.","year":"2021","journal-title":"ACM Trans. Des. Autom. Electron. Syst. (TODAES)"},{"key":"S0218126623502390BIB042","doi-asserted-by":"crossref","first-page":"5716","DOI":"10.1109\/ICRA.2017.7989673","volume-title":"2017 IEEE Int. Conf. Robotics and Automation (ICRA)","author":"Saeedi S.","year":"2017"},{"key":"S0218126623502390BIB043","doi-asserted-by":"crossref","first-page":"1434","DOI":"10.1109\/IPDPSW.2017.107","volume-title":"2017 IEEE Int. Parallel and Distributed Processing Symp. Workshops (IPDPSW)","author":"Nardi L.","year":"2017"},{"key":"S0218126623502390BIB044","first-page":"57","volume-title":"2016 Int. Conf. Parallel Architecture and Compilation Techniques (PACT)","author":"Bodin B.","year":"2016"},{"key":"S0218126623502390BIB045","doi-asserted-by":"crossref","first-page":"1292","DOI":"10.1109\/ICRA.2016.7487261","volume-title":"2016 IEEE Int. Conf. Robotics and Automation (ICRA)","author":"Zia M. Z.","year":"2016"},{"key":"S0218126623502390BIB047","first-page":"397","volume-title":"2018 IEEE 36th Int. Conf. 
Computer Design (ICCD)","author":"Chen H.","year":"2018"}],"container-title":["Journal of Circuits, Systems and Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218126623502390","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,18]],"date-time":"2023-09-18T09:02:29Z","timestamp":1695027749000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0218126623502390"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,21]]},"references-count":43,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2023,9,30]]}},"alternative-id":["10.1142\/S0218126623502390"],"URL":"https:\/\/doi.org\/10.1142\/s0218126623502390","relation":{},"ISSN":["0218-1266","1793-6454"],"issn-type":[{"type":"print","value":"0218-1266"},{"type":"electronic","value":"1793-6454"}],"subject":[],"published":{"date-parts":[[2023,4,21]]},"article-number":"2350239"}}