{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:20:57Z","timestamp":1750220457340,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,6,3]],"date-time":"2021-06-03T00:00:00Z","timestamp":1622678400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Key-Area Research and Development Program of Guangdong Province","award":["2020B010164003"],"award-info":[{"award-number":["2020B010164003"]}]},{"name":"the National Natural Science Foundation of China","award":["62090020"],"award-info":[{"award-number":["62090020"]}]},{"name":"the Strategic Priority Research Program of Chinese Academy of Sciences","award":["XDC05030200"],"award-info":[{"award-number":["XDC05030200"]}]},{"name":"Youth Innovation Promotion Association of Chinese Academy of Sciences","award":["2013073, 2020105"],"award-info":[{"award-number":["2013073, 2020105"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,6,3]]},"DOI":"10.1145\/3447818.3460367","type":"proceedings-article","created":{"date-parts":[[2021,6,4]],"date-time":"2021-06-04T15:09:36Z","timestamp":1622819376000},"page":"152-163","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Omegaflow"],"prefix":"10.1145","author":[{"given":"Yaoyang","family":"Zhou","sequence":"first","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zihao","family":"Yu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chuanqi","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yinan","family":"Xu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huizhe","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sa","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences; Institute Of Computing Technology(Nanjing), Chinese Academy Of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ninghui","family":"Sun","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yungang","family":"Bao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology; University of Chinese Academy of Sciences; Peng Cheng Laboratory"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,6,4]]},"reference":[{"volume-title":"SPEC CPU2017","year":"2017","key":"e_1_3_2_1_1_1","unstructured":"[n.d.]. SPEC CPU2017 . https:\/\/www.spec.org\/cpu 2017 . [n.d.]. SPEC CPU2017. https:\/\/www.spec.org\/cpu2017."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339691"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358293"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228584"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2004.65"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/335231.335263"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750407"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750407"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2019.2897782"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/279361.279378"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815966"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2012.51"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155623"},{"key":"e_1_3_2_1_15_1","first-page":"1","article-title":"Simpoint 3.0: Faster and more flexible program phase analysis","volume":"7","author":"Hamerly Greg","year":"2005","unstructured":"Greg Hamerly , Erez Perelman , Jeremy Lau , and Brad Calder . 2005 . Simpoint 3.0: Faster and more flexible program phase analysis . Journal of Instruction Level Parallelism 7 , 4 (2005), 1 -- 28 . Greg Hamerly, Erez Perelman, Jeremy Lau, and Brad Calder. 2005. Simpoint 3.0: Faster and more flexible program phase analysis. Journal of Instruction Level Parallelism 7, 4 (2005), 1--28.","journal-title":"Journal of Instruction Level Parallelism"},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the international symposium on low power electronics and design. IEEE, 196--201","author":"Huang Michael","year":"2002","unstructured":"Michael Huang , Jose Renau , and Josep Torrellas . 2002 . Energy-efficient hybrid wakeup logic . In Proceedings of the international symposium on low power electronics and design. IEEE, 196--201 . Michael Huang, Jose Renau, and Josep Torrellas. 2002. Energy-efficient hybrid wakeup logic. In Proceedings of the international symposium on low power electronics and design. IEEE, 196--201."},{"key":"e_1_3_2_1_17_1","volume-title":"Freeway: Maximizing MLP for Slice-Out-of-Order Execution. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 558--569","author":"Kumar Rakesh","year":"2019","unstructured":"Rakesh Kumar , Mehdi Alipour , and David Black-Schaffer . 2019 . Freeway: Maximizing MLP for Slice-Out-of-Order Execution. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 558--569 . Rakesh Kumar, Mehdi Alipour, and David Black-Schaffer. 2019. Freeway: Maximizing MLP for Slice-Out-of-Order Execution. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 558--569."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2017.59"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750414"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2717764.2717783"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446087"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/31846.42226"},{"key":"e_1_3_2_1_23_1","volume-title":"A tool to model large caches. HP laboratories 27","author":"Muralimanohar Naveen","year":"2009","unstructured":"Naveen Muralimanohar , Rajeev Balasubramonian , and Norman P Jouppi . 2009. CACTI 6.0 : A tool to model large caches. HP laboratories 27 ( 2009 ), 28. Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P Jouppi. 2009. CACTI 6.0: A tool to model large caches. HP laboratories 27 (2009), 28."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080255"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750380"},{"volume-title":"Complexity-effective superscalar processors","author":"Palacharla Subbarao","key":"e_1_3_2_1_26_1","unstructured":"Subbarao Palacharla , Norman P Jouppi , and James E Smith . 1997. Complexity-effective superscalar processors . Vol. 25 . ACM. Subbarao Palacharla, Norman P Jouppi, and James E Smith. 1997. Complexity-effective superscalar processors. Vol. 25. ACM."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2002.1003589"},{"volume-title":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'04)","author":"Ram\u00edrez Marco A","key":"e_1_3_2_1_28_1","unstructured":"Marco A Ram\u00edrez , Adrian Cristal , Alexander V Veidenbaum , Luis Villa , and Mateo Valero . 2004. Direct instruction wakeup for out-of-order processors . In Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'04) . IEEE , 2--9. Marco A Ram\u00edrez, Adrian Cristal, Alexander V Veidenbaum, Luis Villa, and Mateo Valero. 2004. Direct instruction wakeup for out-of-order processors. In Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'04). IEEE, 2--9."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522341"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859667"},{"key":"e_1_3_2_1_31_1","volume-title":"Edward Brekelbaum, Gabriel H Loh, and Bryan Black.","author":"Sassone Peter G","year":"2007","unstructured":"Peter G Sassone , Jeff Rupley II , Edward Brekelbaum, Gabriel H Loh, and Bryan Black. 2007 . Matrix scheduler reloaded. In ACM SIGARCH Computer Architecture News, Vol. 35 . ACM , 335--346. Peter G Sassone, Jeff Rupley II, Edward Brekelbaum, Gabriel H Loh, and Bryan Black. 2007. Matrix scheduler reloaded. In ACM SIGARCH Computer Architecture News, Vol. 35. ACM, 335--346."},{"key":"e_1_3_2_1_32_1","volume-title":"5th JILP Workshop on Computer Architecture Competitions (JWAC-5): Championship Branch Prediction (CBP-5).","author":"Seznec Andr\u00e9","year":"2016","unstructured":"Andr\u00e9 Seznec . 2016 . Tage-sc-l branch predictors again . In 5th JILP Workshop on Computer Architecture Competitions (JWAC-5): Championship Branch Prediction (CBP-5). Andr\u00e9 Seznec. 2016. Tage-sc-l branch predictors again. In 5th JILP Workshop on Computer Architecture Competitions (JWAC-5): Championship Branch Prediction (CBP-5)."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.39"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783752"},{"key":"e_1_3_2_1_35_1","volume-title":"2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 305--316","author":"Suleman M Aater","year":"2012","unstructured":"M Aater Suleman , Milad Hashemi , Chris Wilkerson , Yale N Patt , 2012 . Morphcore: An energy-efficient microarchitecture for high performance ilp and high throughput tlp . In 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 305--316 . M Aater Suleman, Milad Hashemi, Chris Wilkerson, Yale N Patt, et al. 2012. Morphcore: An energy-efficient microarchitecture for high performance ilp and high throughput tlp. In 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 305--316."},{"key":"e_1_3_2_1_36_1","volume-title":"WaveScalar. In Proceedings of the 36th annual IEEE\/ACM International Symposium on Microarchitecture. IEEE Computer Society, 291","author":"Swanson Steven","year":"2003","unstructured":"Steven Swanson , Ken Michelson , Andrew Schwerin , and Mark Oskin . 2003 . WaveScalar. In Proceedings of the 36th annual IEEE\/ACM International Symposium on Microarchitecture. IEEE Computer Society, 291 . Steven Swanson, Ken Michelson, Andrew Schwerin, and Mark Oskin. 2003. WaveScalar. In Proceedings of the 36th annual IEEE\/ACM International Symposium on Microarchitecture. IEEE Computer Society, 291."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183401.1183427"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.21236\/ADA605735"},{"key":"e_1_3_2_1_39_1","volume-title":"On a class of multistage interconnection networks","author":"Wu Chuan-Lin","year":"1980","unstructured":"Chuan-Lin Wu and Tse-Yun Feng . 1980. On a class of multistage interconnection networks . IEEE transactions on Computers 100, 8 ( 1980 ), 694--702. Chuan-Lin Wu and Tse-Yun Feng. 1980. On a class of multistage interconnection networks. IEEE transactions on Computers 100, 8 (1980), 694--702."}],"event":{"name":"ICS '21: 2021 International Conference on Supercomputing","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture"],"location":"Virtual Event USA","acronym":"ICS '21"},"container-title":["Proceedings of the ACM International Conference on Supercomputing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447818.3460367","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3447818.3460367","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:06Z","timestamp":1750193286000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447818.3460367"}},"subtitle":["a high-performance dependency-based architecture"],"short-title":[],"issued":{"date-parts":[[2021,6,3]]},"references-count":39,"alternative-id":["10.1145\/3447818.3460367","10.1145\/3447818"],"URL":"https:\/\/doi.org\/10.1145\/3447818.3460367","relation":{},"subject":[],"published":{"date-parts":[[2021,6,3]]},"assertion":[{"value":"2021-06-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}