{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T16:20:31Z","timestamp":1758126031497,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,6,21]],"date-time":"2023-06-21T00:00:00Z","timestamp":1687305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,6,21]]},"DOI":"10.1145\/3577193.3593740","type":"proceedings-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T18:47:05Z","timestamp":1687286825000},"page":"25-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["FLORIA: A Fast and Featherlight Approach for Predicting Cache Performance"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9306-3876","authenticated-orcid":false,"given":"Jun","family":"Xiao","sequence":"first","affiliation":[{"name":"University of Amsterdam, Amsterdam, Netherlands"},{"name":"Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4664-3979","authenticated-orcid":false,"given":"Yaocheng","family":"Xiang","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6951-1613","authenticated-orcid":false,"given":"Xiaolin","family":"Wang","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7903-0717","authenticated-orcid":false,"given":"Yingwei","family":"Luo","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2043-4469","authenticated-orcid":false,"given":"Andy","family":"Pimentel","sequence":"additional","affiliation":[{"name":"University of Amsterdam, Amsterdam, Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0429-4371","authenticated-orcid":false,"given":"Zhenlin","family":"Wang","sequence":"additional","affiliation":[{"name":"Michigan Tech, Houghton, United States of America"}]}],"member":"320","published-online":{"date-parts":[[2023,6,21]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_3_2_1_1_1","DOI":"10.1145\/63404.63407"},{"key":"e_1_3_2_1_2_1","volume-title":"Identifying power-efficient multicore cache hierarchies via reuse distance analysis. ACM Transactions on Computer Systems (TOCS), 34(1):3","author":"Badamo Michael","year":"2016","unstructured":"Michael Badamo, Jeff Casarona, Minshu Zhao, and Donald Yeung. Identifying power-efficient multicore cache hierarchies via reuse distance analysis. ACM Transactions on Computer Systems (TOCS), 34(1):3, 2016."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1109\/HPCA.2016.7446067","volume-title":"2016 IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"Beckmann Nathan","year":"2016","unstructured":"Nathan Beckmann and Daniel Sanchez. Modeling cache performance beyond lru. In 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 225--236. 
IEEE, 2016."},{"key":"e_1_3_2_1_4_1","first-page":"169","volume-title":"ACM SIGMETRICS Performance Evaluation Review","author":"Berg Erik","year":"2005","unstructured":"Erik Berg and Erik Hagersten. Fast data-locality profiling of native execution. In ACM SIGMETRICS Performance Evaluation Review, volume 33, pages 169--180. ACM, 2005."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_5_1","DOI":"10.1109\/ICPP.2015.84"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_6_1","DOI":"10.1145\/1772954.1772963"},{"key":"e_1_3_2_1_7_1","first-page":"308","volume-title":"ACM SIGARCH Computer Architecture News","author":"Cook Henry","year":"2013","unstructured":"Henry Cook, Miquel Moreto, Sarah Bird, Khanh Dao, David A Patterson, and Krste Asanovic. A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness. In ACM SIGARCH Computer Architecture News, volume 41, pages 308--319. ACM, 2013."},{"key":"e_1_3_2_1_8_1","volume-title":"Intel 64 and IA-32 Architectures Software Developer's Manual","author":"Intel Corporation","year":"2010","unstructured":"Intel Corporation. Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2, 2010."},{"unstructured":"Perf developers. 
perf_event_open - Linux man page.","key":"e_1_3_2_1_9_1"},{"key":"e_1_3_2_1_10_1","volume-title":"February","author":"Devices Advanced Micro","year":"2015","unstructured":"Advanced Micro Devices. BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h Models 30h-3Fh Processors, February, 2015."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_11_1","DOI":"10.1145\/781131.781159"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_12_1","DOI":"10.1109\/ISPASS.2010.5452069"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1109\/HPCA.2018.00019","volume-title":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"El-Sayed Nosayba","year":"2018","unstructured":"Nosayba El-Sayed, Anurag Mukkara, Po-An Tsai, Harshad Kasture, Xiaosong Ma, and Daniel Sanchez. Kpart: A hybrid cache partitioning-sharing technique for commodity multicores. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 104--117. IEEE, 2018."},{"key":"e_1_3_2_1_14_1","volume-title":"Proc. MoBS","author":"Hilton Andrew","year":"2009","unstructured":"Andrew Hilton, Neeraj Eswaran, and Amir Roth. Fiesta: A sample-balanced multi-program workload methodology. Proc. MoBS, 2009."},{"key":"e_1_3_2_1_15_1","first-page":"351","volume-title":"Kinetic modeling of data eviction in cache. 
In 2016 {USENIX} Annual Technical Conference ({USENIX}{ATC} 16)","author":"Hu Xiameng","year":"2016","unstructured":"Xiameng Hu, Xiaolin Wang, Lan Zhou, Yingwei Luo, Chen Ding, and Zhenlin Wang. Kinetic modeling of data eviction in cache. In 2016 {USENIX} Annual Technical Conference ({USENIX}{ATC} 16), pages 351--364, 2016."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_16_1","DOI":"10.1109\/DSD.2015.56"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1007\/978-3-642-11970-5_15","volume-title":"International Conference on Compiler Construction","author":"Jiang Yunlian","year":"2010","unstructured":"Yunlian Jiang, Eddy Z Zhang, Kai Tian, and Xipeng Shen. Is reuse distance applicable to data locality analysis on chip multiprocessors? In International Conference on Compiler Construction, pages 264--282. Springer, 2010."},{"key":"e_1_3_2_1_18_1","first-page":"289","volume-title":"Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022","author":"Kaffes Kostis","year":"2022","unstructured":"Kostis Kaffes, Neeraja J. Yadwadkar, and Christos Kozyrakis. Hermod: principled and practical scheduling for serverless functions. 
In Ada Gavrilovska, Deniz Altinb\u00fcken, and Carsten Binnig, editors, Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, San Francisco, California, November 7--11, 2022, pages 289--305. ACM, 2022."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_19_1","DOI":"10.1109\/SP.2015.43"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_20_1","DOI":"10.1109\/ISPASS.2013.6557169"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_21_1","DOI":"10.1145\/2155620.2155650"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_22_1","DOI":"10.1147\/sj.92.0078"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_23_1","DOI":"10.1007\/978-3-319-26362-5_3"},{"key":"e_1_3_2_1_24_1","first-page":"541","volume-title":"Establishing a base of trust with performance counters for enterprise workloads. In 2015 {USENIX} Annual Technical Conference ({USENIX}{ATC} 15)","author":"Nowak Andrzej","year":"2015","unstructured":"Andrzej Nowak, Ahmad Yasin, Avi Mendelson, and Willy Zwaenepoel. Establishing a base of trust with performance counters for enterprise workloads. In 2015 {USENIX} Annual Technical Conference ({USENIX}{ATC} 15), pages 541--548, 2015."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1109\/HPCA.2014.6835955","volume-title":"2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)","author":"Nugteren Cedric","year":"2014","unstructured":"Cedric Nugteren, Gert-Jan Van den Braak, Henk Corporaal, and Henri Bal. 
A detailed gpu cache model based on reuse distance theory. In 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA), pages 37--48. IEEE, 2014."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1109\/MICRO.2006.49","volume-title":"2006 39th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO'06)","author":"Qureshi Moinuddin K","year":"2006","unstructured":"Moinuddin K Qureshi and Yale N Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In 2006 39th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO'06), pages 423--432. IEEE, 2006."},{"key":"e_1_3_2_1_27_1","first-page":"147","volume-title":"2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT)","author":"Rane Ashay","year":"2012","unstructured":"Ashay Rane and James Browne. Enhancing performance optimization of multicore chips and multichip nodes with data structure metrics. In 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 147--156. 
IEEE, 2012."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_28_1","DOI":"10.1145\/2670979.2671007"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_29_1","DOI":"10.1145\/1854273.1854286"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_30_1","DOI":"10.1145\/2259016.2259040"},{"key":"e_1_3_2_1_31_1","first-page":"55","volume-title":"ACM SIGPLAN Notices","author":"Shen Xipeng","year":"2007","unstructured":"Xipeng Shen, Jonathan Shaw, Brian Meeker, and Chen Ding. Locality approximation using time. In ACM SIGPLAN Notices, volume 42, pages 55--61. ACM, 2007."},{"unstructured":"SPEC. SPEC CPU Benchmarks.","key":"e_1_3_2_1_32_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_33_1","DOI":"10.1109\/12.165388"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_34_1","DOI":"10.1145\/2591635.2667181"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_35_1","DOI":"10.1023\/B:SUPE.0000014800.27383.8f"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_36_1","DOI":"10.1145\/2528521.1508259"},{"key":"e_1_3_2_1_37_1","first-page":"95","volume-title":"Conference on File and Storage Technologies ({FAST} 15)","author":"Waldspurger Carl A","year":"2015","unstructured":"Carl A Waldspurger, Nohhyun Park, Alexander Garthwaite, and Irfan Ahmad. Efficient {MRC} construction with {SHARDS}. 
In 13th {USENIX} Conference on File and Storage Technologies ({FAST} 15), pages 95--110, 2015."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1109\/HPCA.2019.00056","volume-title":"2019 IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"Wang Qingsen","year":"2019","unstructured":"Qingsen Wang, Xu Liu, and Milind Chabbi. Featherlight reuse-distance measurement. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 440--453. IEEE, 2019."},{"key":"e_1_3_2_1_39_1","first-page":"335","volume-title":"11th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 14)","author":"Wires Jake","year":"2014","unstructured":"Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas JA Harvey, and Andrew Warfield. Characterizing storage workloads with counter stacks. In 11th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 14), pages 335--349, 2014."},{"unstructured":"H. Wong. 
Intel Ivy Bridge cache replacement policy http:\/\/blog.stuffedcow.net\/2013\/01\/ivb-cache-replacement\/ .","key":"e_1_3_2_1_40_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_41_1","DOI":"10.1145\/2499368.2451153"},{"key":"e_1_3_2_1_42_1","first-page":"13","volume-title":"Proceedings of the Thirteenth EuroSys Conference","author":"Xiang Yaocheng","unstructured":"Yaocheng Xiang, Xiaolin Wang, Zihui Huang, Zeyu Wang, Yingwei Luo, and Zhenlin Wang. Dcaps: dynamic cache allocation with partial sharing. In Proceedings of the Thirteenth EuroSys Conference, page 13. ACM, 2018."},{"key":"e_1_3_2_1_43_1","volume-title":"Proceedings of the 48th International Conference on Parallel Processing, ICPP 2019","author":"Xiao Jun","year":"2019","unstructured":"Jun Xiao, Andy D. Pimentel, and Xu Liu. Cppf: A prefetch aware llc partitioning approach. In Proceedings of the 48th International Conference on Parallel Processing, ICPP 2019, New York, NY, USA, 2019. 
Association for Computing Machinery."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_44_1","DOI":"10.1145\/1519065.1519076"}],"event":{"sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture"],"acronym":"ICS '23","name":"ICS '23: 37th International Conference on Supercomputing","location":"Orlando FL USA"},"container-title":["Proceedings of the 37th International Conference on Supercomputing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3577193.3593740","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:47:32Z","timestamp":1750178852000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3577193.3593740"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,21]]},"references-count":44,"alternative-id":["10.1145\/3577193.3593740","10.1145\/3577193"],"URL":"https:\/\/doi.org\/10.1145\/3577193.3593740","relation":{},"subject":[],"published":{"date-parts":[[2023,6,21]]},"assertion":[{"value":"2023-06-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}