{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T15:48:07Z","timestamp":1774712887851,"version":"3.50.1"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T00:00:00Z","timestamp":1774656000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T00:00:00Z","timestamp":1774656000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"\u201cC\u00e1tedra Chip Cantabria\u201d, funded by the Ministry for Digital Transformation and the Civil Service and by the European Union through the Next GenerationEU programme","award":["Project TSI-069100-2023-0011"],"award-info":[{"award-number":["Project TSI-069100-2023-0011"]}]},{"name":"Spanish Science and Technology Commission","award":["PID2022-136454NB-C21"],"award-info":[{"award-number":["PID2022-136454NB-C21"]}]},{"name":"Spanish Science and Technology Commission","award":["PID2022-136454NB-C21"],"award-info":[{"award-number":["PID2022-136454NB-C21"]}]},{"name":"Spanish Science and Technology Commission","award":["PID2022-136454NB-C21"],"award-info":[{"award-number":["PID2022-136454NB-C21"]}]},{"DOI":"10.13039\/501100006365","name":"Universidad de Cantabria","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006365","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Addressing the growing impact of the memory wall is critical to sustain performance in modern vector architectures. 
This work introduces the Bicameral+ Cache, an enhanced version of the Bicameral Cache architecture, which separates scalar and vector memory accesses into distinct cache structures, optimized for their respective locality patterns. Bicameral+ Cache incorporates two key improvements: a transition from a fully associative to a set-associative organization in the vector cache, reducing implementation complexity while preserving performance, and a novel replacement policy based on a configurable write-back threshold (WBT), which improves memory traffic efficiency. Experimental results show speedups of up to 1.59\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mo>\u00d7<\/mml:mo>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    in dense workloads and 1.63\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mo>\u00d7<\/mml:mo>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    in sparse ones, with respect to a conventional cache, when using a 16-way set-associative Bicameral+ Cache configuration. 
These findings, combined with estimations of a sevenfold area reduction and energy savings of one order of magnitude, confirm the practicality and effectiveness of the proposed enhancements for vector processing systems, retaining the benefits of the original Bicameral Cache design at reduced complexity and implementation costs.\n                  <\/jats:p>","DOI":"10.1007\/s11227-026-08457-x","type":"journal-article","created":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T14:48:44Z","timestamp":1774709324000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Bicameral+ Cache: re-assessing split vector and scalar cache designs for increased efficiency"],"prefix":"10.1007","volume":"82","author":[{"given":"Aitor","family":"Echevarr\u00eda","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Susana","family":"Rebolledo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Borja","family":"Perez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jose Luis","family":"Bosque","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Hsu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2026,3,28]]},"reference":[{"key":"8457_CR1","doi-asserted-by":"publisher","unstructured":"Kova\u010devi\u0107 N, Mi\u0161elji\u0107 D, Stojkovi\u0107 A (2022) Risc-v vector processor for acceleration of machine learning algorithms. In: 2022 30th Telecommunications Forum (TELFOR). 
https:\/\/doi.org\/10.1109\/TELFOR56187.2022.9983779","DOI":"10.1109\/TELFOR56187.2022.9983779"},{"issue":"2","key":"8457_CR2","doi-asserted-by":"publisher","first-page":"729","DOI":"10.1007\/S11227-014-1316-5","volume":"71","author":"E Castillo","year":"2015","unstructured":"Castillo E, Camarero C, Borrego A, Bosque JL (2015) Financial applications on multi-cpu and multi-gpu architectures. J Supercomput 71(2):729\u2013739. https:\/\/doi.org\/10.1007\/S11227-014-1316-5","journal-title":"J Supercomput"},{"key":"8457_CR3","doi-asserted-by":"publisher","unstructured":"Robles OD, Bosque JL, Pastor L, Rodr\u00edguez A (2005) Performance analysis of a CBIR system on shared-memory systems and heterogeneous clusters In: Seventh International IEEE Workshop on Computer Architectures for Machine Perception, Italy, pp. 309\u2013314. https:\/\/doi.org\/10.1109\/CAMP.2005.40","DOI":"10.1109\/CAMP.2005.40"},{"key":"8457_CR4","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1016\/J.JPDC.2021.06.003","volume":"157","author":"B P\u00e9rez","year":"2021","unstructured":"P\u00e9rez B, Stafford E, Bosque JL, Beivide R (2021) Sigmoid: an auto-tuned load balancing algorithm for heterogeneous systems. J Parallel Distribut Comput 157:30\u201342. https:\/\/doi.org\/10.1016\/J.JPDC.2021.06.003","journal-title":"J Parallel Distribut Comput"},{"issue":"19","key":"8457_CR5","doi-asserted-by":"publisher","first-page":"2386","DOI":"10.3390\/electronics10192386","volume":"10","author":"R Nozal","year":"2021","unstructured":"Nozal R, Bosque JL (2021) Straightforward heterogeneous computing with the oneapi coexecutor runtime. 
Electronics 10(19):2386","journal-title":"Electronics"},{"issue":"2","key":"8457_CR6","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1109\/MM.2017.35","volume":"37","author":"N Stephens","year":"2017","unstructured":"Stephens N, Biles S, Boettcher M, Eapen J, Eyole M, Gabrielli G, Horsnell M, Magklis G, Martinez A, Premillieu N, Reid A, Rico A, Walker P (2017) The arm scalable vector extension. IEEE Micro 37(2):26\u201339. https:\/\/doi.org\/10.1109\/MM.2017.35","journal-title":"IEEE Micro"},{"key":"8457_CR7","unstructured":"RISC-V V\u2019 Vector Extension. https:\/\/github.com\/riscv\/riscv-v-spec\/"},{"key":"8457_CR8","doi-asserted-by":"publisher","first-page":"79225","DOI":"10.1109\/ACCESS.2020.2990418","volume":"8","author":"AF Furtunato","year":"2020","unstructured":"Furtunato AF, Georgiou K, Eder K, Xavier-De-Souza S (2020) When parallel speedups hit the memory wall. IEEE Access 8:79225\u201379238","journal-title":"IEEE Access"},{"key":"8457_CR9","doi-asserted-by":"publisher","unstructured":"Pohl A, Greese M, Cosenza B, Juurlink B (2019) A performance analysis of vector length agnostic code In: 2019 International Conference on High Performance Computing & Simulation (HPCS), pp. 159\u2013164. https:\/\/doi.org\/10.1109\/HPCS48598.2019.9188238","DOI":"10.1109\/HPCS48598.2019.9188238"},{"key":"8457_CR10","doi-asserted-by":"publisher","unstructured":"Wu H, Nathella K, Pusdesris J, Sunwoo D, Jain A, Lin C (2019) Temporal prefetching without the off-chip metadata In: Proceedings of the 52nd Annual IEEE\/ACM International Symposium on Microarchitecture. MICRO \u201952, pp. 996\u20131008. https:\/\/doi.org\/10.1145\/3352460.3358300","DOI":"10.1145\/3352460.3358300"},{"key":"8457_CR11","doi-asserted-by":"crossref","unstructured":"Wu H, Nathella K, Sunwoo D, Jain A, Lin C (2019) Efficient metadata management for irregular data prefetching In: ACM\/IEEE 46th Annual Int. Symposium on Computer Architecture (ISCA), pp. 
1\u201313","DOI":"10.1145\/3307650.3322225"},{"key":"8457_CR12","doi-asserted-by":"publisher","unstructured":"Bakhshalipour M, Lotfi-Kamran P, Sarbazi-Azad H (2018) Domino temporal data prefetcher. In: IEEE Int Symposium on High Performance Computer Architecture (HPCA). https:\/\/doi.org\/10.1109\/HPCA.2018.00021","DOI":"10.1109\/HPCA.2018.00021"},{"key":"8457_CR13","doi-asserted-by":"publisher","unstructured":"Bakhshalipour M, Shakerinava M, Lotfi-Kamran P, Sarbazi-Azad H (2019) Bingo spatial data prefetcher. In: IEEE Int Symposium on High Performance Computer Architecture (HPCA). https:\/\/doi.org\/10.1109\/HPCA.2019.00053","DOI":"10.1109\/HPCA.2019.00053"},{"key":"8457_CR14","doi-asserted-by":"publisher","unstructured":"Pakalapati S, Panda B (2020) Bouquet of instruction pointers Instruction pointer classifier-based spatial hardware prefetching. In: 2020 ACM\/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 118\u2013131. https:\/\/doi.org\/10.1109\/ISCA45697.2020.00021","DOI":"10.1109\/ISCA45697.2020.00021"},{"key":"8457_CR15","doi-asserted-by":"publisher","unstructured":"Ruiz SR, P\u00e9rez B, Bosque JL, Hsu P (2024) The bicameral cache a split cache for vector architectures In: 30th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2024, Belgrade, Serbia, pp. 728\u2013736. https:\/\/doi.org\/10.1109\/ICPADS63350.2024.00099","DOI":"10.1109\/ICPADS63350.2024.00099"},{"key":"8457_CR16","doi-asserted-by":"publisher","unstructured":"Skadron K, Clark DW (1997) Design issues and tradeoffs for write buffers. In: Proceedings Third International Symposium on High-Performance Computer Architecture. 
https:\/\/doi.org\/10.1109\/HPCA.1997.569650","DOI":"10.1109\/HPCA.1997.569650"},{"issue":"3","key":"8457_CR17","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1109\/TVLSI.2015.2429587","volume":"24","author":"J Lee","year":"2016","unstructured":"Lee J, Kim S (2016) Write buffer-oriented energy reduction in the l1 data cache for embedded systems. IEEE Trans Very Large Scale Integr VLSI Syst 24(3):871\u2013883. https:\/\/doi.org\/10.1109\/TVLSI.2015.2429587","journal-title":"IEEE Trans Very Large Scale Integr VLSI Syst"},{"key":"8457_CR18","doi-asserted-by":"publisher","unstructured":"Chu PP, Gottipati R (1994) Write buffer design for on-chip cache. In: Proceedings 1994 IEEE International Conference on Computer Design: VLSI in Computers and Processors, pp. 311\u2013316. https:\/\/doi.org\/10.1109\/ICCD.1994.331913","DOI":"10.1109\/ICCD.1994.331913"},{"key":"8457_CR19","doi-asserted-by":"publisher","unstructured":"Jouppi NP (1993) Cache write policies and performance In: Proceedings of the 20th Annual International Symposium on Computer Architecture. ISCA \u201993, pp. 191\u2013201. Association for Computing Machinery, New York. https:\/\/doi.org\/10.1145\/165123.165154","DOI":"10.1145\/165123.165154"},{"issue":"2","key":"8457_CR20","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1109\/MM.2020.2974217","volume":"40","author":"D Suggs","year":"2020","unstructured":"Suggs D, Subramony M, Bouvier D (2020) The amd \u201czen 2\u2019\u2019 processor. IEEE Micro 40(2):45\u201352. https:\/\/doi.org\/10.1109\/MM.2020.2974217","journal-title":"IEEE Micro"},{"key":"8457_CR21","doi-asserted-by":"publisher","DOI":"10.1145\/3134437","author":"C Ye","year":"2017","unstructured":"Ye C, Ding C, Luo H, Brock J, Chen D, Jin H (2017) Cache exclusivity and sharing: theory and optimization. ACM Trans Archit Code Optim. 
https:\/\/doi.org\/10.1145\/3134437","journal-title":"ACM Trans Archit Code Optim"},{"issue":"7","key":"8457_CR22","doi-asserted-by":"publisher","first-page":"1822","DOI":"10.1109\/TC.2024.3388896","volume":"73","author":"M Perotti","year":"2024","unstructured":"Perotti M, Cavalcante M, Andri R, Cavigelli L, Benini L (2024) Ara2: exploring single- and multi-core vector processing with an efficient rvv 1.0 compliant open-source processor. IEEE Trans Comput 73(7):1822\u20131836. https:\/\/doi.org\/10.1109\/TC.2024.3388896","journal-title":"IEEE Trans Comput"},{"key":"8457_CR23","doi-asserted-by":"publisher","unstructured":"Chen C, Xiang X, Liu C, Shang Y, Guo R, Liu D, Lu Y, Hao Z, Luo J, Chen Z, Li C, Pu Y, Meng J, Yan X, Xie Y, Qi X (2020) Xuantie-910: a commercial multi-core 12-stage pipeline out-of-order 64-bit high performance risc-v processor with vector extension : Industrial product. In: 2020 ACM\/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 52\u201364. https:\/\/doi.org\/10.1109\/ISCA45697.2020.00016","DOI":"10.1109\/ISCA45697.2020.00016"},{"key":"8457_CR24","doi-asserted-by":"publisher","unstructured":"Balasubramonian R (2019) Innovations in the memory system. Synthesis Lectures on Computer Architecture 14:1\u2013151. https:\/\/doi.org\/10.2200\/S00933ED1V01Y201906CAC048","DOI":"10.2200\/S00933ED1V01Y201906CAC048"},{"key":"8457_CR25","doi-asserted-by":"publisher","unstructured":"Li S, Verdejo RS, Radojkovi\u0107 P, Jacob B (2019) Rethinking cycle accurate dram simulation In: Proceedings of the International Symposium on Memory Systems. MEMSYS \u201919, pp. 184\u2013191. 
https:\/\/doi.org\/10.1145\/3357526.3357539","DOI":"10.1145\/3357526.3357539"},{"key":"8457_CR26","doi-asserted-by":"publisher","DOI":"10.1145\/3422667","author":"C Ram\u00edrez","year":"2020","unstructured":"Ram\u00edrez C, Hern\u00e1ndez CA, Palomar O, Unsal O, Ram\u00edrez MA, Cristal A (2020) A RISC-V simulator and benchmark suite for designing and evaluating vector architectures. ACM Trans Archit Code Optim. https:\/\/doi.org\/10.1145\/3422667","journal-title":"ACM Trans Archit Code Optim"},{"key":"8457_CR27","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663","author":"TA Davis","year":"2011","unstructured":"Davis TA, Hu Y (2011) The university of florida sparse matrix collection. ACM Trans Math Softw. https:\/\/doi.org\/10.1145\/2049662.2049663","journal-title":"ACM Trans Math Softw"},{"key":"8457_CR28","doi-asserted-by":"publisher","DOI":"10.1145\/3085572","author":"R Balasubramonian","year":"2017","unstructured":"Balasubramonian R, Kahng AB, Muralimanohar N, Shafiee A, Srinivas V (2017) Cacti 7: new tools for interconnect exploration in innovative off-chip memories. ACM Trans Archit Code Optim. https:\/\/doi.org\/10.1145\/3085572","journal-title":"ACM Trans Archit Code Optim"},{"key":"8457_CR29","doi-asserted-by":"publisher","unstructured":"Batten C, Krashinsky R, Gerding S, Asanovic K (2004) Cache refill\/access decoupling for vector machines. In: 37th International Symposium on Microarchitecture (MICRO-37\u201904), pp. 331\u2013342. https:\/\/doi.org\/10.1109\/MICRO.2004.9","DOI":"10.1109\/MICRO.2004.9"},{"key":"8457_CR30","doi-asserted-by":"publisher","unstructured":"Gao Y, Shoji N, Egawa R, Takizawa H, Kobayashi H (2013) Design and evaluation of a media-oriented vector processor with a multi-banked cache memory. In: The 11th IEEE Symposium on Embedded Systems for Real-time Multimedia, pp. 78\u201387. 
https:\/\/doi.org\/10.1109\/ESTIMedia.2013.6704506","DOI":"10.1109\/ESTIMedia.2013.6704506"},{"issue":"9","key":"8457_CR31","doi-asserted-by":"publisher","first-page":"565","DOI":"10.2514\/1.I011097","volume":"20","author":"S Di Mascio","year":"2023","unstructured":"Di Mascio S, Menicucci A, Gill E, Monteleone C (2023) Extending the noel-v platform with a risc-v vector processor for space applications. J Aerosp Inf Syst 20(9):565\u2013574. https:\/\/doi.org\/10.2514\/1.I011097","journal-title":"J Aerosp Inf Syst"},{"key":"8457_CR32","doi-asserted-by":"publisher","unstructured":"Musa A, Sato Y, Soga T, Okabe K, Egawa R, Takizawa H, Kobayashi H (2008) A shared cache for a chip multi vector processor In: 9th Workshop on Memory Performance: Dealing with Applications, Systems and Architecture. MEDEA \u201908, pp. 24\u201329. https:\/\/doi.org\/10.1145\/1509084.1509088","DOI":"10.1145\/1509084.1509088"},{"key":"8457_CR33","doi-asserted-by":"publisher","unstructured":"Gonz\u00e1lez A, Aliagas C, Valero M (1995) A data cache with multiple caching strategies tuned to different types of locality. In: 9th International Conference on Supercomputing. ICS \u201995, pp. 338\u2013347. https:\/\/doi.org\/10.1145\/224538.224622","DOI":"10.1145\/224538.224622"},{"key":"8457_CR34","doi-asserted-by":"publisher","unstructured":"Rothman JB, Smith AJ (2000) Sector cache design and performance. In: Proceedings 8th International Sympossium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 124\u2013133. https:\/\/doi.org\/10.1109\/MASCOT.2000.876437","DOI":"10.1109\/MASCOT.2000.876437"},{"issue":"3","key":"8457_CR35","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1145\/115953.115959","volume":"19","author":"JWC Fu","year":"1991","unstructured":"Fu JWC, Patel JH (1991) Data prefetching in multiprocessor vector cache memories. SIGARCH Comput Archit News 19(3):54\u201363. 
https:\/\/doi.org\/10.1145\/115953.115959","journal-title":"SIGARCH Comput Archit News"},{"key":"8457_CR36","doi-asserted-by":"publisher","unstructured":"Sethumurugan S, Yin J, Sartori J (2021) Designing a cost-effective cache replacement policy using machine learning. In: 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 291\u2013303. https:\/\/doi.org\/10.1109\/HPCA51647.2021.00033","DOI":"10.1109\/HPCA51647.2021.00033"},{"key":"8457_CR37","doi-asserted-by":"publisher","unstructured":"Li Y, Gao M (2023) Baryon Efficient hybrid memory management with compression and sub-blocking. In: 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 137\u2013151. https:\/\/doi.org\/10.1109\/HPCA56546.2023.10071115","DOI":"10.1109\/HPCA56546.2023.10071115"},{"key":"8457_CR38","doi-asserted-by":"publisher","unstructured":"Escuin C, Khan AA, Ib\u00e1\u00f1ez P, Monreal T, Castrillon J, Vi\u00f1als V (2023) Compression-aware and performance-efficient insertion policies for long-lasting hybrid llcs. In: 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 179\u2013192. https:\/\/doi.org\/10.1109\/HPCA56546.2023.10070968","DOI":"10.1109\/HPCA56546.2023.10070968"},{"key":"8457_CR39","doi-asserted-by":"publisher","unstructured":"Khan S, Alameldeen AR, Wilkerson C, Mutlu O, Jimenez DA (2014) Improving cache performance using read-write partitioning. In: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA), pp. 452\u2013463. https:\/\/doi.org\/10.1109\/HPCA.2014.6835954","DOI":"10.1109\/HPCA.2014.6835954"},{"issue":"1","key":"8457_CR40","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1109\/MM.2010.102","volume":"31","author":"J Stuecheli","year":"2011","unstructured":"Stuecheli J, Kaseridis D, Daly D, Hunter H, John L (2011) Coordinating dram and last-level-cache policies with the virtual write queue. IEEE Micro 31(1):90\u201398. 
https:\/\/doi.org\/10.1109\/MM.2010.102","journal-title":"IEEE Micro"},{"issue":"2","key":"8457_CR41","doi-asserted-by":"publisher","first-page":"445","DOI":"10.1109\/TCAD.2017.2712666","volume":"37","author":"B Lee","year":"2018","unstructured":"Lee B, Kim K, Chung E-Y (2018) Replacement policy adaptable miss curve estimation for efficient cache partitioning. IEEE Trans Comput Aided Des Integr Circuits Syst 37(2):445\u2013457. https:\/\/doi.org\/10.1109\/TCAD.2017.2712666","journal-title":"IEEE Trans Comput Aided Des Integr Circuits Syst"},{"issue":"2","key":"8457_CR42","doi-asserted-by":"publisher","first-page":"530","DOI":"10.1109\/TVLSI.2019.2950087","volume":"28","author":"M Cavalcante","year":"2020","unstructured":"Cavalcante M, Schuiki F, Zaruba F, Schaffner M, Benini L (2020) Ara: a 1-ghz+ scalable and energy-efficient risc-v vector processor with multiprecision floating-point support in 22-nm fd-soi. IEEE Trans Very Large Scale Integr VLSI Syst 28(2):530\u2013543. https:\/\/doi.org\/10.1109\/TVLSI.2019.2950087","journal-title":"IEEE Trans Very Large Scale Integr VLSI Syst"},{"key":"8457_CR43","doi-asserted-by":"publisher","unstructured":"Minervini F, Palomar O, Unsal O, Reggiani E, Quiroga J, Marimon J, Rojas C, Figueras R, Ruiz A, Gonzalez A, Mendoza J, Vargas I, Hernandez C, Cabre J, Khoirunisya L, Bouhali M, Pavon J, Moll F, Olivieri M, Kovac M, Kovac M, Dragic L, Valero M, Cristal A (2023) Vitruvius+: an area-efficient risc-v decoupled vector coprocessor for high performance computing applications. ACM Trans Archit Code Optim. https:\/\/doi.org\/10.1145\/3575861","DOI":"10.1145\/3575861"},{"key":"8457_CR44","doi-asserted-by":"publisher","unstructured":"P870 high-performance risc-v processor. In: 2023 IEEE Hot Chips 35 Symposium (HCS), pp. 1\u201319 (2023). 
https:\/\/doi.org\/10.1109\/HCS59251.2023.10254712","DOI":"10.1109\/HCS59251.2023.10254712"},{"key":"8457_CR45","doi-asserted-by":"publisher","unstructured":"Cavalcante M, W\u00fcthrich D, Perotti M, Riedel S, Benini L (2022) Spatz: a compact vector processing unit for high-performance and energy-efficient shared-l1 clusters. In: Proceedings of the 41st IEEE\/ACM International Conference on Computer-Aided Design. ICCAD \u201922. Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3508352.3549367","DOI":"10.1145\/3508352.3549367"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-026-08457-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-026-08457-x","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-026-08457-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T14:48:46Z","timestamp":1774709326000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-026-08457-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,28]]},"references-count":45,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2026,4]]}},"alternative-id":["8457"],"URL":"https:\/\/doi.org\/10.1007\/s11227-026-08457-x","relation":{},"ISSN":["1573-0484"],"issn-type":[{"value":"1573-0484","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,28]]},"assertion":[{"value":"15 October 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 March 
2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 March 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"307"}}