{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T19:23:11Z","timestamp":1774120991677,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":56,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Yonsei Signature Research Cluster Program (2022-22-0002)"},{"name":"National Research Foundation of Korea (NRF)","award":["2020R1F1A1069742, 2022R1C1C1008131, 2022R1C1C1011307, R1F1A1062902"],"award-info":[{"award-number":["2020R1F1A1069742, 2022R1C1C1008131, 2022R1C1C1011307, R1F1A1062902"]}]},{"name":"Institute of Information & communications Technology Planning & Evaluation (IITP)","award":["2021-0-00853, 2020-0-01361, 2021-0-02051)"],"award-info":[{"award-number":["2021-0-00853, 2020-0-01361, 2021-0-02051)"]}]},{"name":"Ministry of Education (MOE) of Korea"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,18]]},"DOI":"10.1145\/3470496.3527384","type":"proceedings-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T19:06:01Z","timestamp":1654023961000},"page":"424-436","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["GCoM"],"prefix":"10.1145","author":[{"given":"Jounghoo","family":"Lee","sequence":"first","affiliation":[{"name":"Yonsei University, Seoul, South Korea"}]},{"given":"Yeonan","family":"Ha","sequence":"additional","affiliation":[{"name":"Yonsei University, Seoul, South Korea"}]},{"given":"Suhyun","family":"Lee","sequence":"additional","affiliation":[{"name":"Yonsei University, Seoul, South 
Korea"}]},{"given":"Jinyoung","family":"Woo","sequence":"additional","affiliation":[{"name":"Ajou University, Suwon, South Korea"}]},{"given":"Jinho","family":"Lee","sequence":"additional","affiliation":[{"name":"Yonsei University, Seoul, South Korea"}]},{"given":"Hanhwi","family":"Jang","sequence":"additional","affiliation":[{"name":"Ajou University, Suwon, South Korea"}]},{"given":"Youngsok","family":"Kim","sequence":"additional","affiliation":[{"name":"Yonsei University, Seoul, South Korea"}]}],"member":"320","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"RADEON: Dissecting the Polaris Architecture. https:\/\/www.amd.com\/system\/files\/documents\/polaris-whitepaper.pdf.","author":"Devices Advanced Micro","year":"2016","unstructured":"Advanced Micro Devices, Inc. 2016. RADEON: Dissecting the Polaris Architecture. https:\/\/www.amd.com\/system\/files\/documents\/polaris-whitepaper.pdf."},{"key":"e_1_3_2_1_2_1","volume-title":"Proc. 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).","author":"Alsop Johnathan","unstructured":"Johnathan Alsop, Matthew D. Sinclair, Rakesh Komuravelli, and Sarita V. Adve. 2016. GSI: A GPU Stall Inspector to Characterize the Sources of Memory Stalls for Tightly Coupled GPUs. In Proc.
2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476221"},{"key":"e_1_3_2_1_4_1","volume-title":"PPT-GPU: Scalable GPU Performance Modeling","author":"Arafa Yehia","year":"2019","unstructured":"Yehia Arafa, Abdel-Hameed A. Badawy, Gopinath Chennupati, Nandakishore Santhi, and Stephan Eidenbenz. 2019. PPT-GPU: Scalable GPU Performance Modeling. IEEE Computer Architecture Letters (CAL) 18 (2019)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830780"},{"key":"e_1_3_2_1_6_1","volume-title":"Proc. 54th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Baddouh Cesar A.","unstructured":"Cesar A. Baddouh, Mahmoud Khairy, Roland Green, Mathias Payer, and Timothy G. Rogers. 2021. Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads. In Proc. 54th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1693453.1693470"},{"key":"e_1_3_2_1_8_1","volume-title":"A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels. ACM Transactions on Architecture and Code Optimization (TACO) 18","author":"Braun Lorenz","year":"2020","unstructured":"Lorenz Braun, Sotirios Nikas, Chen Song, Vincent Heuveline, and Holger Fr\u00f6ning. 2020.
A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels. ACM Transactions on Architecture and Code Optimization (TACO) 18 (2020)."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2020.2971677"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2013.6704684"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_3_2_1_12_1","volume-title":"Volta: Performance and Programmability","author":"Choquette Jack","year":"2018","unstructured":"Jack Choquette, Olivier Giroux, and Denis Foley. 2018. Volta: Performance and Programmability. IEEE Micro 38 (2018)."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2019.00061"},{"key":"e_1_3_2_1_14_1","volume-title":"Smith","author":"Eyerman Stijn","year":"2009","unstructured":"Stijn Eyerman, Lieven Eeckhout, Tejas Karkhanis, and James E. Smith. 2009. A Mechanistic Performance Model for Superscalar Out-of-Order Processors. ACM Transactions on Computer Systems 27 (2009)."},{"key":"e_1_3_2_1_15_1","volume":"200","author":"Fields B.A.","unstructured":"B.A. Fields, R.
Bodik, M.D. Hill, and C.J. Newburn. 2003. Using interaction costs for microarchitectural bottleneck analysis. In Proc. 36th IEEE\/ACM International Symposium on Microarchitecture (MICRO).","journal-title":"J. Newburn."},{"key":"e_1_3_2_1_16_1","volume-title":"Proc. 28th IEEE\/ACM International Symposium on Computer Architecture (ISCA).","author":"Fields B.","unstructured":"B. Fields, S. Rubin, and R. Bodik. 2001. Focusing processor policies via critical-path prediction. In Proc. 28th IEEE\/ACM International Symposium on Computer Architecture (ISCA)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2017.37"},{"key":"e_1_3_2_1_18_1","volume-title":"John Nickolls, Joshua Anderson, Jim Hardwick, Scott Morton, Everett Phillips, Yao Zhang, and Vasily Volkov.","author":"Garland Michael","year":"2008","unstructured":"Michael Garland, Scott Le Grand, John Nickolls, Joshua Anderson, Jim Hardwick, Scott Morton, Everett Phillips, Yao Zhang, and Vasily Volkov. 2008. Parallel Computing Experiences with CUDA. IEEE Micro 28 (2008)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372799.3394359"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Scott Grauer-Gray, Lifan Xu, Robert Searles, Sudhee Ayalasomayajula, and John Cavazos. 2012. Auto-tuning a high-level language targeted to GPU codes.
In 2012 Innovative Parallel Computing (InPar).","DOI":"10.1109\/InPar.2012.6339595"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2951218"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00058"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/RTAS48715.2020.000-8"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555775"},{"key":"e_1_3_2_1_25_1","volume-title":"Proc. 47th IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Huang Jen-Cheng","unstructured":"Jen-Cheng Huang, Joo Hwan Lee, Hyesoon Kim, and Hsien-Hsin S. Lee. 2014. GPUMech: GPU Performance Modeling Techniques based on Interval Analysis. In Proc. 47th IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_26_1","volume-title":"Proc. 28th IEEE International Parallel and Distributed Processing Symposium (IPDPS).","author":"Huang Jen-Cheng","unstructured":"Jen-Cheng Huang, Lifeng Nai, Hyesoon Kim, and Hsien-Hsin S. Lee. 2014. TB-Point: Reducing Simulation Time for Large-Scale GPGPU Kernels. In Proc. 28th IEEE International Parallel and Distributed Processing Symposium (IPDPS)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2018.00054"},{"key":"e_1_3_2_1_28_1","volume-title":"Proc. 31st IEEE\/ACM International Symposium on Computer Architecture (ISCA).","author":"Tejas","unstructured":"Tejas S. Karkhanis and James E. Smith. 2004.
A First-Order Superscalar Processor Model. In Proc. 31st IEEE\/ACM International Symposium on Computer Architecture (ISCA)."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3300053.3319418"},{"key":"e_1_3_2_1_30_1","volume-title":"Proc. 47th IEEE\/ACM International Symposium on Computer Architecture (ISCA).","author":"Khairy Mahmoud","unstructured":"Mahmoud Khairy, Zhesheng Shen, Tor M. Aamodt, and Timothy G. Rogers. 2020. Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling. In Proc. 47th IEEE\/ACM International Symposium on Computer Architecture (ISCA)."},{"key":"e_1_3_2_1_31_1","volume-title":"Efficient Cache Performance Modeling in GPUs Using Reuse Distance Analysis. ACM Transactions on Architecture and Code Optimization (TACO) 15","author":"Kiani Mohsen","year":"2018","unstructured":"Mohsen Kiani and Amir Rajabzadeh. 2018. Efficient Cache Performance Modeling in GPUs Using Reuse Distance Analysis. ACM Transactions on Architecture and Code Optimization (TACO) 15 (2018)."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.26"},{"key":"e_1_3_2_1_33_1","volume-title":"Proc.
Machine Learning and Systems 2 (MLSys).","author":"Mattson Peter","year":"2020","unstructured":"Peter Mattson, Christine Cheng, Gregory Diamos, Cody Coleman, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debo Dutta, Udit Gupta, Kim Hazelwood, Andy Hock, Xinyuan Huang, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Deepak Narayanan, Tayo Oguntebi, Gennady Pekhimenko, Lillian Pentecost, Vijay Janapa Reddi, Taylor Robie, Tom St John, Carole-Jean Wu, Lingjie Xu, Cliff Young, and Matei Zaharia. 2020. MLPerf Training Benchmark. In Proc. Machine Learning and Systems 2 (MLSys)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTCHIPS.2009.7478342"},{"key":"e_1_3_2_1_35_1","unstructured":"Sharan Narang. 2016. DeepBench. https:\/\/svail.github.io\/DeepBench\/."},{"key":"e_1_3_2_1_36_1","volume-title":"Proc. 20th IEEE International Symposium on High Performance Computer Architecture (HPCA).","author":"Nugteren Cedric","unstructured":"
Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal, and Henri Bal. 2014. A Detailed GPU Cache Model Based on Reuse Distance Theory. In Proc. 20th IEEE International Symposium on High Performance Computer Architecture (HPCA)."},{"key":"e_1_3_2_1_37_1","unstructured":"NVIDIA Corporation. 2020. Nsight Compute CLI."},{"key":"e_1_3_2_1_38_1","unstructured":"NVIDIA Corporation. 2021. NVIDIA Ampere GA102 GPU Architecture. https:\/\/www.nvidia.com\/content\/PDF\/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf."},{"key":"e_1_3_2_1_39_1","unstructured":"NVIDIA Corporation. 2021. Parallel Thread Execution ISA: Application Guide (v7.4)."},{"key":"e_1_3_2_1_40_1","volume-title":"Comput. Syst. 16","author":"O'neal Kenneth","year":"2017","unstructured":"Kenneth O'neal, Philip Brisk, Ahmed Abousamra, Zack Waters, and Emily Shriver. 2017. GPU Performance Estimation Using Software Rasterization and Machine Learning. ACM Trans. Embed. Comput. Syst. 16 (2017)."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00045"},{"key":"e_1_3_2_1_42_1","volume-title":"Proc. 45th IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Rogers Timothy G.","unstructured":"Timothy G. Rogers, Mike O'Connor, and Tor M. Aamodt. 2012.
Cache-Conscious Wavefront Scheduling. In Proc. 45th IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2008.4510739"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2145816.2145819"},{"key":"e_1_3_2_1_45_1","volume-title":"OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems. Computing in Science & Engineering 12","author":"Stone John E.","year":"2010","unstructured":"John E. Stone, David Gohara, and Guochun Shi. 2010. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems. Computing in Science & Engineering 12 (2010)."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322230"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2017.2684813"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00077"},{"key":"e_1_3_2_1_49_1","volume-title":"Proc. 52nd IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Villa Oreste","unstructured":"Oreste Villa, Mark Stephenson, David Nellans, and Stephen W. Keckler. 2019. NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs. In Proc.
52nd IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00085"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2019.2923618"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2019.00062"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056063"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2465529.2465540"},{"key":"e_1_3_2_1_55_1","volume-title":"GPGPU-MiniBench: Accelerating GPGPU Micro-Architecture Simulation","author":"Yu Zhibin","year":"2015","unstructured":"Zhibin Yu, Lieven Eeckhout, Nilanjan Goswami, Tao Li, Lizy K John, Hai Jin, Chengzhong Xu, and Junmin Wu. 2015. GPGPU-MiniBench: Accelerating GPGPU Micro-Architecture Simulation. IEEE Trans. Comput. 64 (2015)."},{"key":"e_1_3_2_1_56_1","volume-title":"Proc. 17th IEEE International Symposium on High Performance Computer Architecture (HPCA).","author":"Zhang Yao","unstructured":"Yao Zhang and John D. Owens. 2011. A Quantitative Performance Analysis Model for GPU Architectures. In Proc.
17th IEEE International Symposium on High Performance Computer Architecture (HPCA)."}],"event":{"name":"ISCA '22: The 49th Annual International Symposium on Computer Architecture","location":"New York New York","acronym":"ISCA '22","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCAA IEEE CS technical committee on architectural acoustics"]},"container-title":["Proceedings of the 49th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527384","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3527384","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:27Z","timestamp":1750188627000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527384"}},"subtitle":["a detailed GPU core model for accurate analytical modeling of modern GPUs"],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":56,"alternative-id":["10.1145\/3470496.3527384","10.1145\/3470496"],"URL":"https:\/\/doi.org\/10.1145\/3470496.3527384","relation":{},"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"2022-06-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}