{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T13:50:28Z","timestamp":1775051428560,"version":"3.50.1"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2012,9,1]],"date-time":"2012-09-01T00:00:00Z","timestamp":1346457600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004963","name":"Seventh Framework Programme","doi-asserted-by":"publisher","award":["(FP7\/2007-2013)\/ERC, 259295"],"award-info":[{"award-number":["(FP7\/2007-2013)\/ERC, 259295"]}],"id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004963","name":"Seventh Framework Programme","doi-asserted-by":"publisher","award":["European FP7\/ICT 217068"],"award-info":[{"award-number":["European FP7\/ICT 217068"]}],"id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["60873057, 60921002, and 61033009"],"award-info":[{"award-number":["60873057, 60921002, and 61033009"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100018537","name":"National Science and Technology Major Project","doi-asserted-by":"crossref","award":["2009ZX01036-001-002 and 2011ZX01028-001-002"],"award-info":[{"award-number":["2009ZX01036-001-002 and 2011ZX01028-001-002"]}],"id":[{"id":"10.13039\/501100018537","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004385","name":"Universiteit Gent","doi-asserted-by":"publisher","award":["01J14407 and 01Z04109"],"award-info":[{"award-number":["01J14407 and 01Z04109"]}],"id":[{"id":"10.13039\/501100004385","id-type":"DOI","asserted-by":"publisher"}]},{"name":"FWO","award":["G.0255.08 and G.0179.10"],"award-info":[{"award-number":["G.0255.08 and G.0179.10"]}]},{"DOI":"10.13039\/501100002855","name":"Ministry of Science and Technology of the People's Republic of China","doi-asserted-by":"publisher","award":["2011CB302504"],"award-info":[{"award-number":["2011CB302504"]}],"id":[{"id":"10.13039\/501100002855","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2012,9]]},"abstract":"<jats:p>Iterative optimization is a popular compiler optimization approach that has been studied extensively over the past decade. In this article, we deconstruct iterative optimization by evaluating whether it works across datasets and by analyzing why it works.<\/jats:p>\n          <jats:p>\n            Up to now, most iterative optimization studies are based on a premise which was never truly evaluated: that it is possible to learn the best compiler optimizations across datasets. In this article, we evaluate this question for the first time with a very large number of datasets. We therefore compose KDataSets, a dataset suite with 1000 datasets for 32 programs, which we release to the public. We characterize the diversity of KDataSets, and subsequently use it to evaluate iterative optimization. For all 32 programs, we find that there exists at least one combination of compiler optimizations that achieves at least 83% or more of the best possible speedup across\n            <jats:italic>all<\/jats:italic>\n            datasets on two widely used compilers (Intel's ICC and GNU's GCC). This optimal combination is program-specific and yields speedups up to 3.75\u00d7 (averaged across datasets of a program) over the highest optimization level of the compilers (-O3 for GCC and -fast for ICC). This finding suggests that optimizing programs across datasets might be much easier than previously anticipated.\n          <\/jats:p>\n          <jats:p>In addition, we evaluate the idea of introducing compiler choice as part of iterative optimization. We find that it can further improve the performance of iterative optimization because different programs favor different compilers. We also investigate why iterative optimization works by analyzing the optimal combinations. We find that only a handful optimizations yield most of the speedup. Finally, we show that optimizations interact in a complex and sometimes counterintuitive way through two case studies, which confirms that iterative optimization is an irreplaceable and important compiler strategy.<\/jats:p>","DOI":"10.1145\/2355585.2355594","type":"journal-article","created":{"date-parts":[[2012,10,2]],"date-time":"2012-10-02T13:50:00Z","timestamp":1349185800000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":38,"title":["Deconstructing iterative optimization"],"prefix":"10.1145","volume":"9","author":[{"given":"Yang","family":"Chen","sequence":"first","affiliation":[{"name":"State Key Laboratory of Computer Architecture, ICT, CAS, China"}]},{"given":"Shuangde","family":"Fang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, ICT, CAS, China"}]},{"given":"Yuanjie","family":"Huang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, ICT, CAS, China"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Belgium"}]},{"given":"Grigori","family":"Fursin","sequence":"additional","affiliation":[{"name":"INRIA and Paris South University, France"}]},{"given":"Olivier","family":"Temam","sequence":"additional","affiliation":[{"name":"INRIA, Saclay, France"}]},{"given":"Chengyong","family":"Wu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, ICT, CAS, China"}]}],"member":"320","published-online":{"date-parts":[[2012,10,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2006.37"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 20th International Conference on Very Large Data Bases. 487--499","author":"Agrawal R.","unstructured":"Agrawal , R. and Srikant , R . 1994. Fast algorithms for mining association rules . In Proceedings of the 20th International Conference on Very Large Data Bases. 487--499 . Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases. 487--499."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 251--260","author":"Berube P.","unstructured":"Berube , P. and Amaral , J . 2006. Aestimo: A feedback-directed optimization evaluation tool . In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 251--260 . Berube, P. and Amaral, J. 2006. Aestimo: A feedback-directed optimization evaluation tool. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 251--260."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2007.32"},{"key":"e_1_2_1_6_1","unstructured":"cBench. cBench: Collective Benchmarks. http:\/\/www.cTuning.org\/tools.  cBench. cBench: Collective Benchmarks. http:\/\/www.cTuning.org\/tools."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1806596.1806647"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/314403.314414"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065910.1065921"},{"key":"e_1_2_1_10_1","first-page":"1","article-title":"Quantifying the impact of input data sets on program behavior and its applications.J","volume":"5","author":"Eeckhout L.","year":"2003","unstructured":"Eeckhout , L. , Vandierendonck , H. , and De Bosschere , K. 2003 . Quantifying the impact of input data sets on program behavior and its applications.J . Instruct.-Level Parallel. 5 , 1 -- 33 . Eeckhout, L., Vandierendonck, H., and De Bosschere, K. 2003. Quantifying the impact of input data sets on program behavior and its applications.J. Instruct.-Level Parallel. 5, 1--33.","journal-title":"Instruct.-Level Parallel."},{"key":"e_1_2_1_11_1","unstructured":"EEMBC. EEMBC: The Embedded Microprocessor Benchmark Consortium. http:\/\/www.eembc.org.  EEMBC. EEMBC: The Embedded Microprocessor Benchmark Consortium. http:\/\/www.eembc.org."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065910.1065922"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the International Conference on High Performance Embedded Architectures & Compilers. 245--260","author":"Fursin G.","unstructured":"Fursin , G. , Cavazos , J. , O'Boyle , M. , and Temam , O . 2007. MiDataSets: Creating the conditions for a more realistic evaluation of iterative optimization . In Proceedings of the International Conference on High Performance Embedded Architectures & Compilers. 245--260 . Fursin, G., Cavazos, J., O'Boyle, M., and Temam, O. 2007. MiDataSets: Creating the conditions for a more realistic evaluation of iterative optimization. In Proceedings of the International Conference on High Performance Embedded Architectures & Compilers. 245--260."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1880043.1880047"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/1128020.1128563"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium.","author":"Haneda M.","unstructured":"Haneda , M. , Knijnenburg , P. , and Wijshoff , H . 2006. On the impact of data input sets on statistical compiler tuning . In Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium. Haneda, M., Knijnenburg, P., and Wijshoff, H. 2006. On the impact of data input sets on statistical compiler tuning. In Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the IEEE International Symposium on Workload Characterization. 83--92","author":"Hoste K.","unstructured":"Hoste , K. and Eeckhout , L . 2006. Comparing benchmarks using key microarchitecture-independent characteristics . In Proceedings of the IEEE International Symposium on Workload Characterization. 83--92 . Hoste, K. and Eeckhout, L. 2006. Comparing benchmarks using key microarchitecture-independent characteristics. In Proceedings of the IEEE International Symposium on Workload Characterization. 83--92."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1356058.1356080"},{"key":"e_1_2_1_19_1","volume-title":"InProceedings of the 6th Annual Workshop on Interaction between Compilers and Computer Architectures. 45--53","author":"Hsu W. C.","unstructured":"Hsu , W. C. , Chen , H. , Yew , P. C. , and Chen , D . -Y. 2002. On the predictability of program behavior using different input data sets . InProceedings of the 6th Annual Workshop on Interaction between Compilers and Computer Architectures. 45--53 . Hsu, W. C., Chen, H., Yew, P. C., and Chen, D.-Y. 2002. On the predictability of program behavior using different input data sets. InProceedings of the 6th Annual Workshop on Interaction between Compilers and Computer Architectures. 45--53."},{"key":"e_1_2_1_20_1","volume-title":"Intel 64 and IA-32 Architectures Software Developers Manual","author":"Intel","unstructured":"Intel . 2011. Intel 64 and IA-32 Architectures Software Developers Manual . Vol. 3 . Intel. 2011. Intel 64 and IA-32 Architectures Software Developers Manual. Vol. 3."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772954.1772989"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/996841.996863"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/780732.780735"},{"key":"e_1_2_1_24_1","unstructured":"Loongson. Loongson. Loongson 2f. http:\/\/www.loongson.cn\/EN\/product info.php?id=34.  Loongson. Loongson. Loongson 2f. http:\/\/www.loongson.cn\/EN\/product info.php?id=34."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065034"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859621"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1508293.1508307"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing.","volume":"3","author":"Matteo F.","unstructured":"Matteo , F. and Johnson , S . 1998. FFTW: An adaptive software architecture for the FFT . In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 3 . 1381--1384. Matteo, F. and Johnson, S. 1998. FFTW: An adaptive software architecture for the FFT. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 3. 1381--1384."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1508244.1508275"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2004.47"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2006.38"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1152154.1152182"},{"key":"e_1_2_1_33_1","unstructured":"PAPI. PAPI: A Portable Interface to Hardware Performance Counters. http:\/\/icl.cs.utk.edu\/papi.  PAPI. PAPI: A Portable Interface to Hardware Performance Counters. http:\/\/icl.cs.utk.edu\/papi."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1022969.1022972"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 3rd Workshop on Media and Stream Processor.","author":"Slingerland N.","unstructured":"Slingerland , N. and Smith , A. J . 2001. Performance analysis of instruction set architecture extensions for multimedia . In Proceedings of the 3rd Workshop on Media and Stream Processor. Slingerland, N. and Smith, A. J. 2001. Performance analysis of instruction set architecture extensions for multimedia. In Proceedings of the 3rd Workshop on Media and Stream Processor."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/781131.781141"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the International Symposium on Code Generation and Optimization. 204--215","author":"Triantafyllis S.","unstructured":"Triantafyllis , S. , Vachharajani , M. , Vachharajani , N. , and August , D . 2003. Compiler optimization-space exploration . In Proceedings of the International Symposium on Code Generation and Optimization. 204--215 . Triantafyllis, S., Vachharajani, M., Vachharajani, N., and August, D. 2003. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization. 204--215."},{"key":"e_1_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Whaley R. C. Petitet A. and Dongarra J. 2001. Automated empirical optimization of software and the ATLAS project. Parallel Comput 27.  Whaley R. C. Petitet A. and Dongarra J. 2001. Automated empirical optimization of software and the ATLAS project. Parallel Comput 27.","DOI":"10.1016\/S0167-8191(00)00087-9"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2007.363733"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1552309.1552310"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2355585.2355594","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2355585.2355594","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:01:15Z","timestamp":1750276875000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2355585.2355594"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,9]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2012,9]]}},"alternative-id":["10.1145\/2355585.2355594"],"URL":"https:\/\/doi.org\/10.1145\/2355585.2355594","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,9]]},"assertion":[{"value":"2011-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-10-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}