{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T01:23:17Z","timestamp":1776993797818,"version":"3.51.4"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2015,3,9]],"date-time":"2015-03-09T00:00:00Z","timestamp":1425859200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National 863 Program of China","award":["2012AA010901"],"award-info":[{"award-number":["2012AA010901"]}]},{"DOI":"10.13039\/501100001809","name":"China National Natural Science Foundation","doi-asserted-by":"crossref","award":["61370081"],"award-info":[{"award-number":["61370081"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2015,3,25]]},"abstract":"<jats:p>Phase analysis, which classifies the set of execution intervals with similar execution behavior and resource requirements, has been widely used in a variety of systems, including dynamic cache reconfiguration, prefetching, race detection, and sampling simulation. Although phase granularity has been a major factor in the accuracy of phase analysis, it has not been well investigated, and most systems usually adopt a fine-grained scheme. However, such a scheme can only take account of recent local phase information and could be frequently interfered by temporary noise due to instant phase changes, which might notably limit the accuracy.<\/jats:p>\n          <jats:p>\n            In this article, we make the first investigation on the potential of multilevel phase analysis (MLPA), where different granularity phase analyses are combined together to improve the overall accuracy. 
The key observation is that the coarse-grained intervals belonging to the same phase usually consist of\n            <jats:italic>stably distributed<\/jats:italic>\n            fine-grained phases. Moreover, the phase of a coarse-grained interval can be accurately identified based on the fine-grained intervals at the beginning of its execution. Based on the observation, we design and implement an MLPA scheme. In such a scheme, a coarse-grained phase is first identified based on the fine-grained intervals at the beginning of its execution. The following fine-grained phases in it are then predicted based on the sequence of fine-grained phases in the coarse-grained phase. Experimental results show that such a scheme can notably improve the prediction accuracy. Using a Markov fine-grained phase predictor as the baseline, MLPA can improve prediction accuracy by 20%, 39%, and 29% for next phase, phase change, and phase length prediction for SPEC2000, respectively, yet incur only about 2% time overhead and 40% space overhead (about 360 bytes in total). To demonstrate the effectiveness of MLPA, we apply it to a dynamic cache reconfiguration system that dynamically adjusts the cache size to reduce the power consumption and access time of the data cache. Experimental results show that MLPA can further reduce the average cache size by 15% compared to the fine-grained scheme.\n          <\/jats:p>\n          <jats:p>Moreover, for MLPA, we also observe that coarse-grained phases can better capture the overall program characteristics with fewer phases and the last representative phase could be classified in a very early program position, leading to fewer execution intervals being functionally simulated. Based on this observation, we also design a multilevel sampling simulation technique that combines both fine- and coarse-grained phase analysis for sampling simulation. 
Such a scheme uses fine-grained simulation points to represent only the selected coarse-grained simulation points instead of the entire program execution; thus, it could further reduce both the functional and detailed simulation time. Experimental results show that MLPA for sampling simulation can achieve a speedup in simulation time of about 8.3X with similar accuracy compared to 10M SimPoint.<\/jats:p>","DOI":"10.1145\/2629594","type":"journal-article","created":{"date-parts":[[2015,3,9]],"date-time":"2015-03-09T19:03:01Z","timestamp":1425927781000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Multilevel Phase Analysis"],"prefix":"10.1145","volume":"14","author":[{"given":"Weihua","family":"Zhang","sequence":"first","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, Software School, and State Key Laboratory of ASIC &amp; System, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiaxin","family":"Li","sequence":"additional","affiliation":[{"name":"Parallel Processing Institute, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi","family":"Li","sequence":"additional","affiliation":[{"name":"Parallel Processing Institute, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haibo","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Software, Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,3,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360153"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/370155.370356"},{"key":"e_1_2_1_3_1","volume-title":"Technical Report 1342. 
Computer Sciences Department","author":"Burger Doug","year":"1997","unstructured":"Doug Burger and Todd M . Austin . 1997 . The SimpleScalar Tool Set, Version 2.0. Technical Report 1342. Computer Sciences Department , University of Wisconsin , Madison, WI . Doug Burger and Todd M. Austin. 1997. The SimpleScalar Tool Set, Version 2.0. Technical Report 1342. Computer Sciences Department, University of Wisconsin, Madison, WI."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2013.6557141"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/237090.237171"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1152154.1152173"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/361268.361281"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the International Symposium on Computer Architecture. 233--244","author":"Ashutosh","unstructured":"Ashutosh S. Dhodapkar and James E. Smith. 2002. Managing multi-configuration hardware via dynamic working set analysis . In Proceedings of the International Symposium on Computer Architecture. 233--244 . Ashutosh S. Dhodapkar and James E. Smith. 2002. Managing multi-configuration hardware via dynamic working set analysis. In Proceedings of the International Symposium on Computer Architecture. 233--244."},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 217","author":"Ashutosh","unstructured":"Ashutosh S. Dhodapkar and James E. Smith. 2003. Comparing program phase detection techniques . In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 217 . Ashutosh S. Dhodapkar and James E. Smith. 2003. Comparing program phase detection techniques. In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 
217."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/942806.943853"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2010.5416636"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1028976.1028999"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/968878.968998"},{"key":"e_1_2_1_15_1","volume-title":"Sweeney","author":"Hind Michael J.","year":"2003","unstructured":"Michael J. Hind , Vadakkedathu T. Rajan , and Peter F . Sweeney . 2003 . Phase Shift Detection: A Problem Classification. Technical Report. IBM, Armonk, NY. Michael J. Hind, Vadakkedathu T. Rajan, and Peter F. Sweeney. 2003. Phase Shift Detection: A Problem Classification. Technical Report. IBM, Armonk, NY."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859637"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1152154.1152172"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/956417.956567"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2006.1598119"},{"key":"e_1_2_1_20_1","volume-title":"Wichern","author":"Johnson Richard A.","year":"2002","unstructured":"Richard A. Johnson and Dean A . Wichern . 2002 . Applied Multivariate Statistical Analysis (5th ed.). Prentice Hall . Richard A. Johnson and Dean A. Wichern. 2002. Applied Multivariate Statistical Analysis (5th ed.). Prentice Hall."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/264107.264207"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2006.32"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2005.1430568"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/1153925.1154588"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2005.39"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 
180--190","author":"Lu Jiwei","year":"2003","unstructured":"Jiwei Lu , Howard Chen , Rao Fu , Wei-Chung Hsu , Bobbie Othmer , Pen-Chung Yew , and Dong-Yuan Chen . 2003 . The performance of runtime data cache prefetching in a dynamic optimization system . In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 180--190 . Jiwei Lu, Howard Chen, Rao Fu, Wei-Chung Hsu, Bobbie Othmer, Pen-Chung Yew, and Dong-Yuan Chen. 2003. The performance of runtime data cache prefetching in a dynamic optimization system. In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 180--190."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 281--297","author":"MacQueen James","year":"1967","unstructured":"James MacQueen . 1967 . Some methods for classification and analysis of multivariate observations . In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 281--297 . James MacQueen. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 281--297."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/344166.344610"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1542476.1542491"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCECE.2006.277819"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the IEEE International Conference on Computer Design. 38--46","author":"Arun","unstructured":"Arun A. Nair and Lizy Joh. 2008. Simulation points for SPEC 2006 . In Proceedings of the IEEE International Conference on Computer Design. 38--46 . Arun A. Nair and Lizy Joh. 2008. Simulation points for SPEC 2006. In Proceedings of the IEEE International Conference on Computer Design. 
38--46."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/942806.943854"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/1898953.1899021"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2005.1430555"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/1370998.1371004"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1024393.1024414"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605403"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859657"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2006.1620799"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.5555\/1153925.1154587"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2011.6114194"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2006.1620785"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859629"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2005.8"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1067915.1067921"}],"container-title":["ACM Transactions on Embedded Computing 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2629594","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2629594","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:13:29Z","timestamp":1750227209000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2629594"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,9]]},"references-count":44,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,3,25]]}},"alternative-id":["10.1145\/2629594"],"URL":"https:\/\/doi.org\/10.1145\/2629594","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,3,9]]},"assertion":[{"value":"2013-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-03-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}