{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T05:16:04Z","timestamp":1767849364353,"version":"3.49.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2014,8,25]],"date-time":"2014-08-25T00:00:00Z","timestamp":1408924800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004963","name":"Seventh Framework Programme","doi-asserted-by":"publisher","award":["259295"],"award-info":[{"award-number":["259295"]}],"id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003132","name":"Agentschap voor Innovatie door Wetenschap en Technologie","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003132","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100002418","name":"Intel Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100002418","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2014,10,27]]},"abstract":"<jats:p>Large core counts and complex cache hierarchies are increasing the burden placed on commonly used simulation and modeling techniques. Although analytical models provide fast results, they do not apply to complex, many-core shared-memory systems. In contrast, detailed cycle-level simulation can be accurate but also tends to be slow, which limits the number of configurations that can be evaluated. 
A middle ground is needed that provides for fast simulation of complex many-core processors while still providing accurate results.<\/jats:p>\n          <jats:p>In this article, we explore, analyze, and compare the accuracy and simulation speed of high-abstraction core models as a potential solution to slow cycle-level simulation. We describe a number of enhancements to interval simulation to improve its accuracy while maintaining simulation speed. In addition, we introduce the instruction-window centric (IW-centric) core model, a new mechanistic core model that bridges the gap between interval simulation and cycle-accurate simulation by enabling high-speed simulations with higher levels of detail. We also show that using accurate core models like these is important for memory subsystem studies, and that simple, naive models, like a one-IPC core model, can lead to misleading and incorrect results and conclusions in practical design studies. Validation against real hardware shows good accuracy, with an average single-core error of 11.1% and a maximum of 18.8% for the IW-centric model with a 1.5\u00d7 slowdown compared to interval simulation.<\/jats:p>","DOI":"10.1145\/2629677","type":"journal-article","created":{"date-parts":[[2014,8,29]],"date-time":"2014-08-29T13:03:31Z","timestamp":1409317411000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":260,"title":["An Evaluation of High-Level Mechanistic Core Models"],"prefix":"10.1145","volume":"11","author":[{"given":"Trevor E.","family":"Carlson","sequence":"first","affiliation":[{"name":"Ghent University, Gent, Belgium"}]},{"given":"Wim","family":"Heirman","sequence":"additional","affiliation":[{"name":"Intel, ExaScience Lab, Leuven, Belgium"}]},{"given":"Stijn","family":"Eyerman","sequence":"additional","affiliation":[{"name":"Ghent University, Gent, 
Belgium"}]},{"given":"Ibrahim","family":"Hur","sequence":"additional","affiliation":[{"name":"Intel, ExaScience Lab, Leuven, Belgium"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Gent, Belgium"}]}],"member":"320","published-online":{"date-parts":[[2014,8,25]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"CloudSuite on Flexus. Retrieved","author":"Adileh A.","year":"2014","unstructured":"A. Adileh , C. Kaynak , P. Lotfi-Kamran , and S. Volos . 2012 . CloudSuite on Flexus. Retrieved July 22, 2014 , from http:\/\/parsa.epfl.ch\/simflex\/doc\/CloudSuite-on-Flexus-isca12.pdf. A. Adileh, C. Kaynak, P. Lotfi-Kamran, and S. Volos. 2012. CloudSuite on Flexus. Retrieved July 22, 2014, from http:\/\/parsa.epfl.ch\/simflex\/doc\/CloudSuite-on-Flexus-isca12.pdf."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522340"},{"key":"e_1_2_1_3_1","volume-title":"Retrieved","author":"Beckmann B.","year":"2014","unstructured":"B. Beckmann , N. Binkert , A. Saidi , J. Hestness , G. Black , K. Sewell , and D. Hower . 2011. The gem5 Simulator . Retrieved July 22, 2014 , from http:\/\/www.gem5.org\/dist\/tutorials\/isca_pres_2011.pdf. B. Beckmann, N. Binkert, A. Saidi, J. Hestness, G. Black, K. Sewell, and D. Hower. 2011. The gem5 Simulator. Retrieved July 22, 2014, from http:\/\/www.gem5.org\/dist\/tutorials\/isca_pres_2011.pdf."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2006.82"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 2--12","author":"Carlson T. E.","unstructured":"T. E. Carlson , W. Heirman , K. V. Craeynest , and L. Eeckhout . 2014. BarrierPoint: Sampled simulation of multi-threaded applications . 
In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 2--12 . T. E. Carlson, W. Heirman, K. V. Craeynest, and L. Eeckhout. 2014. BarrierPoint: Sampled simulation of multi-threaded applications. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 2--12."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063454"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS). 2--12","author":"Carlson T. E.","unstructured":"T. E. Carlson , W. Heirman , and L. Eeckhout . 2013. Sampled simulation of multi-threaded applications . In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS). 2--12 . T. E. Carlson, W. Heirman, and L. Eeckhout. 2013. Sampled simulation of multi-threaded applications. In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS). 2--12."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.47"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2019608.2019609"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/1331699.1331723"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the International Symposium on Computer Architecture (ISCA). 76--87","author":"Chou Y.","unstructured":"Y. Chou , B. Fahs , and S. Abraham . 2004. Microarchitecture optimizations for exploiting memory-level parallelism . In Proceedings of the International Symposium on Computer Architecture (ISCA). 76--87 . Y. Chou, B. Fahs, and S. Abraham. 2004. Microarchitecture optimizations for exploiting memory-level parallelism. In Proceedings of the International Symposium on Computer Architecture (ISCA). 
76--87."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1344671.1344684"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA). 350--361","author":"Eeckhout L.","unstructured":"L. Eeckhout , R. H. Bell Jr , B. Stougie , K. De Bosschere , and L. K. John . 2004. Control flow modeling in statistical simulation for accurate and efficient processor design studies . In Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA). 350--361 . L. Eeckhout, R. H. Bell Jr, B. Stougie, K. De Bosschere, and L. K. John. 2004. Control flow modeling in statistical simulation for accurate and efficient processor design studies. In Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA). 350--361."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2003.1240210"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.982918"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/800015.808199"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.413.0215"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168880"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1534909.1534910"},{"key":"e_1_2_1_21_1","volume-title":"AMD and VIA CPUs. Retrieved","author":"Fog A.","year":"2013","unstructured":"A. Fog . 2013 . Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-Operation Breakdowns for Intel , AMD and VIA CPUs. Retrieved July 22, 2014, from http:\/\/www.agner.org\/optimize\/instruction_tables.pdf. A. Fog. 2013. Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-Operation Breakdowns for Intel, AMD and VIA CPUs. 
Retrieved July 22, 2014, from http:\/\/www.agner.org\/optimize\/instruction_tables.pdf."},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA). 307--318","author":"Genbrugge D.","unstructured":"D. Genbrugge , S. Eyerman , and L. Eeckhout . 2010. Interval simulation: Raising the level of abstraction in architectural simulation . In Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA). 307--318 . D. Genbrugge, S. Eyerman, and L. Eeckhout. 2010. Interval simulation: Raising the level of abstraction in architectural simulation. In Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA). 307--318."},{"key":"e_1_2_1_23_1","volume-title":"MARSS: Micro Architectural Systems Simulator. Retrieved","author":"Ghose K.","year":"2014","unstructured":"K. Ghose , A. Patel , F. Afram , H. Zheng , and J. Tringali . 2012 . MARSS: Micro Architectural Systems Simulator. Retrieved July 22, 2014 , from http:\/\/cloud.github.com\/downloads\/avadhpatel\/marss\/Marss_ISCA_2012_tutorial.pdf. K. Ghose, A. Patel, F. Afram, H. Zheng, and J. Tringali. 2012. MARSS: Micro Architectural Systems Simulator. Retrieved July 22, 2014, from http:\/\/cloud.github.com\/downloads\/avadhpatel\/marss\/Marss_ISCA_2012_tutorial.pdf."},{"key":"e_1_2_1_24_1","volume-title":"MLP yes&excl","author":"Glew A.","unstructured":"A. Glew . 1998. MLP yes&excl ; ILP no&excl; In Proceedings of the ASPLOS Wild and Crazy Idea Session . A. Glew. 1998. MLP yes&excl; ILP no&excl; In Proceedings of the ASPLOS Wild and Crazy Idea Session."},{"key":"e_1_2_1_25_1","unstructured":"P. Greenhalgh. 2011. big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7. ARM white paper.  P. Greenhalgh. 2011. big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7. 
ARM white paper."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1054907.1054914"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 4th Annual Workshop on Modeling, Benchmarking and Simulation (MoBS), co-located with ISCA","author":"Jaleel A.","year":"2008","unstructured":"A. Jaleel , R. S. Cohn , C.-K. Luk , and B. Jacob . 2008. CMP&dollar;im: A pin-based on-the-fly multi-core cache simulator . In Proceedings of the 4th Annual Workshop on Modeling, Benchmarking and Simulation (MoBS), co-located with ISCA 2008 . 28--36. A. Jaleel, R. S. Cohn, C.-K. Luk, and B. Jacob. 2008. CMP&dollar;im: A pin-based on-the-fly multi-core cache simulator. In Proceedings of the 4th Annual Workshop on Modeling, Benchmarking and Simulation (MoBS), co-located with ISCA 2008. 28--36."},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA). 338--349","author":"Karkhanis T.","unstructured":"T. Karkhanis and J. E. Smith . 2004. A first-order superscalar processor model . In Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA). 338--349 . T. Karkhanis and J. E. Smith. 2004. A first-order superscalar processor model. In Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA). 338--349."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2007.4380625"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1287\/opre.9.3.383"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201909)","author":"Loh G.","unstructured":"G. Loh , S. Subramaniam , and Y. Xie . 2009. Zesto: A cycle-level simulator for highly detailed microarchitecture exploration . In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201909) . 53--64. G. Loh, S. Subramaniam, and Y. Xie. 2009. 
Zesto: A cycle-level simulator for highly detailed microarchitecture exploration. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201909). 53--64."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065034"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA). 1--12","author":"Miller J. E.","unstructured":"J. E. Miller , H. Kasture , G. Kurian , C. Gruenwald III, N. Beckmann , C. Celio , J. Eastep , and A. Agarwal . 2010. Graphite: A distributed parallel simulator for multicores . In Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA). 1--12 . J. E. Miller, H. Kasture, G. Kurian, C. Gruenwald III, N. Beckmann, C. Celio, J. Eastep, and A. Agarwal. 2010. Graphite: A distributed parallel simulator for multicores. In Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA). 1--12."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/4434.895100"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques (PACT). 15--24","author":"Nussbaum S.","unstructured":"S. Nussbaum and J. E. Smith . 2001. Modeling superscalar processors via statistical simulation . In Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques (PACT). 15--24 . S. Nussbaum and J. E. Smith. 2001. Modeling superscalar processors via statistical simulation. In Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques (PACT). 
15--24."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339656"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024724.2024954"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the International Symposium on High Performance Computer Architecture (HPCA). 406--417","author":"Pellauer M.","unstructured":"M. Pellauer , M. Adler , M. Kinsy , A. Parashar , and J. Emer . 2011. HAsim: FPGA-based high-detail multicore simulation using time-division multiplexing . In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA). 406--417 . M. Pellauer, M. Adler, M. Kinsy, A. Parashar, and J. Emer. 2011. HAsim: FPGA-based high-detail multicore simulation using time-division multiplexing. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA). 406--417."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485963"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605403"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2007.70817"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 207--217","author":"Uzelac V.","unstructured":"V. Uzelac and A. Milenkovic . 2009. Experiment flows and microbenchmarks for reverse engineering of branch predictor structures . In Proceedings of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 207--217 . V. Uzelac and A. Milenkovic. 2009. Experiment flows and microbenchmarks for reverse engineering of branch predictor structures. In Proceedings of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 
207--217."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/223982.223990"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859629"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/369028.369059"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2629677","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2629677","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:13:30Z","timestamp":1750227210000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2629677"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8,25]]},"references-count":45,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2014,10,27]]}},"alternative-id":["10.1145\/2629677"],"URL":"https:\/\/doi.org\/10.1145\/2629677","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,8,25]]},"assertion":[{"value":"2013-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-08-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}