{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:20:08Z","timestamp":1750306808217,"version":"3.41.0"},"reference-count":35,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2013,12,1]],"date-time":"2013-12-01T00:00:00Z","timestamp":1385856000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100002855","name":"Ministry of Science and Technology of the People's Republic of China","doi-asserted-by":"publisher","award":["2012AA010905"],"award-info":[{"award-number":["2012AA010905"]}],"id":[{"id":"10.13039\/501100002855","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004963","name":"Seventh Framework Programme","doi-asserted-by":"publisher","award":["259295"],"award-info":[{"award-number":["259295"]}],"id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2013,12]]},"abstract":"<jats:p>\n            Computer architects rely heavily on microarchitecture simulation to evaluate design alternatives. Unfortunately, cycle-accurate simulation is extremely slow, being at least 4 to 6 orders of magnitude slower than real hardware. This longstanding problem is further exacerbated in the multi-\/many-core era, because single-threaded simulation performance has not improved much, while the design space has expanded substantially. Parallel simulation is a promising approach, yet does not completely solve the simulation challenge. 
Furthermore, existing sampling techniques, which are widely used for single-threaded applications, do not readily apply to multithreaded applications, because thread interaction and synchronization must now be taken into account. This work presents <jats:italic>PCantorSim<\/jats:italic>, a novel Cantor set (a classic fractal)--based sampling scheme to accelerate parallel simulation of multithreaded applications. With the proposed methodology, less than 5% of an application's execution time is simulated in detail. We have implemented our approach in <jats:italic>Sniper<\/jats:italic> (a parallel multicore simulator) and evaluated it by running the PARSEC benchmarks on a simulated 8-core system. The results show that <jats:italic>PCantorSim<\/jats:italic> increases simulation speed over detailed parallel simulation by a factor of 20\u00d7, on average, with an average absolute execution time prediction error of 5.3%.<\/jats:p>","DOI":"10.1145\/2541228.2555305","type":"journal-article","created":{"date-parts":[[2014,1,14]],"date-time":"2014-01-14T13:39:57Z","timestamp":1389706797000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["PCantorSim"],"prefix":"10.1145","volume":"10","author":[{"given":"Chuntao","family":"Jiang","sequence":"first","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhibin","family":"Yu","sequence":"additional","affiliation":[{"name":"Shenzhen Institute of Advanced Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hai","family":"Jin","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chengzhong","family":"Xu","sequence":"additional","affiliation":[{"name":"Shenzhen Institute of Advanced Technology\/Wayne State University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wim","family":"Heirman","sequence":"additional","affiliation":[{"name":"Ghent University, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Trevor E.","family":"Carlson","sequence":"additional","affiliation":[{"name":"Ghent University, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaofei","family":"Liao","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,12]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2006.73"},{"volume-title":"Proceedings of 9th Annual International Symposium on High Performance Computer Architecture (HPCA\u201903)","author":"Alameldeen A. R.","key":"e_1_2_1_2_1","unstructured":"Alameldeen, A. R. and Wood, D. A. 2003. Variability in architectural simulations of multi-threaded workloads. In Proceedings of 9th Annual International Symposium on High Performance Computer Architecture (HPCA\u201903). 
IEEE Computer Society, Washington, DC, 7--18."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522340"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1496909.1496921"},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Aslot V. Domeika M. Eigenmann R. Gaertner G. Jones W. B. and Parady B. 2001. SPEComp: A new benchmark suite for measuring parallel computer performance. Shared Memory Parallel Programming R. Eigenmann and M. Voss Eds. 2104 1--19.","DOI":"10.1007\/3-540-44587-0_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306793"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2006.68"},{"volume-title":"Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904)","author":"Biesbrouck M. V.","key":"e_1_2_1_9_1","unstructured":"Biesbrouck, M. V., Sherwood, T., and Calder, B. 2004. A co-phase matrix to guide simultaneous multi-threading simulation. In Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904). IEEE Computer Society, 45--56."},{"volume-title":"Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201913)","author":"Carlson T. E.","key":"e_1_2_1_10_1","unstructured":"Carlson, T. E., Heirman, W.
, and Eeckhout, L. 2013. Sampled simulation of multi-threaded applications. In Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201913). IEEE Computer Society."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063454"},{"volume-title":"Proceedings of the International Conference on Computer Design (ICCD\u201996)","author":"Conte T. M.","key":"e_1_2_1_12_1","unstructured":"Conte, T. M., Hirsch, M. A., and Menezes, K. N. 1996. Reducing state loss for effective trace sampling of superscalar processors. In Proceedings of the International Conference on Computer Design (ICCD\u201996). IEEE, 468--477."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2005.06.016"},{"key":"e_1_2_1_14_1","unstructured":"Feitelson D. G. 2013. Workload modeling for computer systems performance evaluation. Version 0.41."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2012.121"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1061\/TACEAT.0006518"},{"key":"e_1_2_1_17_1","unstructured":"Jin H. Frumkin M. and Yan J. 1999. The OpenMP implementation of NAS parallel benchmarks and its performance. Tech. rep. 
NASA Ames Research Center."},{"volume-title":"Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904)","author":"Lau J.","key":"e_1_2_1_18_1","unstructured":"Lau, J., Schoenmackers, S., and Calder, B. 2004. Structures for phase classification. In Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904). IEEE Computer Society, 57--67."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065034"},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Mandelbrot B. B. 1983. The fractal geometry of nature. Freeman.","DOI":"10.1119\/1.13295"},{"volume-title":"Proceedings of 16th Annual International Symposium on High Performance Computer Architecture (HPCA\u201910)","author":"Miller J. E.","key":"e_1_2_1_21_1","unstructured":"Miller, J. E., Kasture, H., Kurian, G., Gruenwald III, C., Beckmann, N., Celio, C., Eastep, J., and Agarwal, A. 2010. Graphite: A distributed parallel simulator for multicores. In Proceedings of 16th Annual International Symposium on High Performance Computer Architecture (HPCA\u201910). IEEE Computer Society, 1--12."},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Peitgen H. J\u00fcrgens H. and Saupe D. 2004. Chaos and fractals: New frontiers of science. 
Springer.","DOI":"10.1007\/b97624"},{"volume-title":"Proceedings of IEEE International Parallel and Distributed Processing Symposium (IPDPS\u201906)","author":"Perelman E.","key":"e_1_2_1_23_1","unstructured":"Perelman, E., Polito, M., Bouguet, J.-Y., Sampson, J., Calder, B., and Dulong, C. 2006. Detecting phases in parallel applications on shared memory architectures. In Proceedings of IEEE International Parallel and Distributed Processing Symposium (IPDPS\u201906). IEEE."},{"key":"e_1_2_1_24_1","volume-title":"Retrieved","author":"Renau J.","year":"2013","unstructured":"Renau, J., Basilio, F., Tuck, J., Liu, W., Prvulovic, M., Ceze, L., Sarangi, S., Sack, P., Strauss, K., and Montesinos, P. 2005. SESC: Cycle accurate architectural simulator. Retrieved November 19, 2013 from http:\/\/sesc.sourceforge.net\/."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485963"},{"volume-title":"Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT\u201901)","author":"Sherwood T.","key":"e_1_2_1_26_1","unstructured":"Sherwood, T., Perelman, E., and Calder, B. 2001. Basic block distribution analysis to find periodic behavior and simulation points in applications. 
In Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT\u201901). ACM, New York, 3--14."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605403"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.30852"},{"volume-title":"Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201909)","author":"Uzelac V.","key":"e_1_2_1_29_1","unstructured":"Uzelac, V. and Milenkovic, A. 2009. Experiment flows and microbenchmarks for reverse engineering of branch predictor structures. In Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201909). IEEE Computer Society, 207--217."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.272.0164"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2006.79"},{"volume-title":"Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201906)","author":"Wenisch T. F.","key":"e_1_2_1_32_1","unstructured":"Wenisch, T. F., Wunderlich, R. E., Falsafi, B., and Hoe, J. C. 
2006. Simulation sampling with live-points. In Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201906). IEEE Computer Society, 2--12."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/223982.223990"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859629"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASCOTS.2010.74"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2541228.2555305","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2541228.2555305","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T07:35:01Z","timestamp":1750232101000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2541228.2555305"}},"subtitle":["Accelerating parallel architecture simulation through fractal-based sampling"],"short-title":[],"issued":{"date-parts":[[2013,12]]},"references-count":35,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2013,12]]}},"alternative-id":["10.1145\/2541228.2555305"],"URL":"https:\/\/doi.org\/10.1145\/2541228.2555305","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2013,12]]},"assertion":[{"value":"2013-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2013-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}