{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T19:58:12Z","timestamp":1764100692882,"version":"3.46.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"12","license":[{"start":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T00:00:00Z","timestamp":1764028800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/publication-rights-and-licensing-policy"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Commun. ACM"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:p>As early career researchers supported by modest grants from the National Science Foundation, we modeled the expected effects of semiconductor scaling trends on computer architectures and built compiler infrastructure to optimize for inevitable heterogeneity in computer architecture. This early work inspired the vision for the UT-TRIPS project, which was subsequently funded by the U.S. Department of Defense, the University of Texas at Austin, multiple computer companies, and a private foundation. This partnership enabled our team to develop novel computer architectures and compiler technologies that demonstrated the viability of highly parallel chip designs that were composed of distributed processing and memory systems components. 
This article traces the history and ultimate impact of the project.<\/jats:p>","DOI":"10.1145\/3760436","type":"journal-article","created":{"date-parts":[[2025,11,24]],"date-time":"2025-11-24T17:07:14Z","timestamp":1764004034000},"page":"90-97","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["The TRIPS Project"],"prefix":"10.1145","volume":"68","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-6588-6596","authenticated-orcid":false,"given":"Doug","family":"Burger","sequence":"first","affiliation":[{"name":"Microsoft, Microsoft Research, Redmond, Washington, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6701-6099","authenticated-orcid":false,"given":"Stephen W.","family":"Keckler","sequence":"additional","affiliation":[{"name":"NVIDIA Corp, NVIDIA Research, Santa Clara, California, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7188-2501","authenticated-orcid":false,"given":"Kathryn S.","family":"McKinley","sequence":"additional","affiliation":[{"name":"Google Inc., Seattle, Washington, United States"}]}],"member":"320","published-online":{"date-parts":[[2025,11,25]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"Agarwal V. Hrishikesh M.S. Keckler S. W. and Burger D. Clock rate versus IPC: The end of the road for conventional microarchitectures. In Proceedings of the Intern. Symp. on Computer Architecture \u00a0IEEE (2000) 248\u2013259.","key":"e_1_3_1_2_2","DOI":"10.1145\/339647.339691"},{"doi-asserted-by":"publisher","key":"e_1_3_1_3_2","DOI":"10.1109\/MC.2004.65"},{"doi-asserted-by":"crossref","unstructured":"Coons K.E. et al. A spatial path scheduling algorithm for EDGE architectures. In Proceedings of the Intern. Conf. 
on Architectural Support for Programming Languages and Operating Systems ACM\u00a0(2006) 129\u2013140.","key":"e_1_3_1_4_2","DOI":"10.1145\/1168857.1168875"},{"volume-title":"National Research Council","year":"2012","unstructured":"Continuing Innovation in Information Technology. National Research Council.\u00a0The National Academies Press\u00a0(2012).","key":"e_1_3_1_5_2"},{"doi-asserted-by":"crossref","unstructured":"Gebhart M. et al. An evaluation of the TRIPS computer system. In Proceedings of the Intern. Conf. on Architectural Support for Programming Languages and Operating Systems \u00a0ACM (2009) 1\u201312.","key":"e_1_3_1_6_2","DOI":"10.1145\/1508244.1508246"},{"doi-asserted-by":"crossref","unstructured":"Goldstein S.C. et al. PipeRench: A coprocessor for streaming multimedia acceleration. In Proceedings of the Intern. Symp. on Computer Architecture IEEE (1999) 28\u201339.","key":"e_1_3_1_7_2","DOI":"10.1145\/307338.300982"},{"doi-asserted-by":"crossref","unstructured":"Gratz P. et al. Implementation and evaluation of a dynamically routed processor operand network. In Proceedings of the Intern. Symp. on Networks-on-Chip IEEE (2007) 7\u201317.","key":"e_1_3_1_8_2","DOI":"10.1109\/NOCS.2007.23"},{"doi-asserted-by":"crossref","unstructured":"Hrishikesh M.S. et al. The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays. In Proceedings of the Intern. Symp. on Computer Architecture IEEE (2002) 14\u201324.","key":"e_1_3_1_9_2","DOI":"10.1145\/545214.545218"},{"unstructured":"Huang X. Wang Z. and McKinley K.S. Compiling for an impulse memory controller. In Proceedings of the Intern. Conf. on Parallel Architectures and Compilation Techniques\u00a0(2001) 141\u2013150.","key":"e_1_3_1_10_2"},{"doi-asserted-by":"crossref","unstructured":"Huh J. et al. A NUCA substrate for flexible CMP cache sharing. In Proceedings of the Intern. Conf. 
on Supercomputing ACM (2005) 31\u201340.","key":"e_1_3_1_11_2","DOI":"10.1145\/1088149.1088154"},{"doi-asserted-by":"crossref","unstructured":"Huh J. et al. Author retrospective for a NUCA substrate for flexible CMP cache sharing. In Proceedings of the ACM Intern. Conf. on Supercomputing 25th Anniversary Volume\u00a0(2014) 74\u201376.","key":"e_1_3_1_12_2","DOI":"10.1145\/2591635.2591667"},{"volume-title":"Intel Xeon\u00ae Processor Scalable Family Technical Overview","unstructured":"Intel Xeon\u00ae Processor Scalable Family Technical Overview. Technical Report ID 673025. Intel Corporation.\u00a0Section \u201cSkylake Mesh Architecture\u201d (2022),\u00a025\u201328; https:\/\/tinyurl.com\/28fujm8h.","first-page":"25","key":"e_1_3_1_13_2"},{"doi-asserted-by":"crossref","unstructured":"Kim C. Burger D. and Keckler S.W. An adaptive non-uniform cache structure for wire-delay dominated on-chip caches. In Proceedings of the Intern. Conf. on Architectural Support for Programming Languages and Operating Systems ACM (2002) 211\u2013222.","key":"e_1_3_1_14_2","DOI":"10.1145\/605397.605420"},{"doi-asserted-by":"crossref","unstructured":"Maher B. Smith A. Burger D. and McKinley K.S. Merging head and tail duplication for convergent hyperblock formation. In Proceedings of the Intern. Symp. on Microarchitecture IEEE\/ACM (2006) 65\u201376.","key":"e_1_3_1_15_2","DOI":"10.1109\/MICRO.2006.34"},{"key":"e_1_3_1_16_2","volume-title":"The Scale Compiler","author":"McKinley K.S.","year":"2005","unstructured":"McKinley, K.S. et al. The Scale Compiler. Technical Report. University of Massachusetts, University of Texas (2005)."},{"doi-asserted-by":"publisher","key":"e_1_3_1_17_2","DOI":"10.1007\/BF02577867"},{"doi-asserted-by":"crossref","unstructured":"Nagarajan R. Sankaralingam K. Burger D. and Keckler S.W. A design space evaluation of grid processor architectures. In Proceedings of the Intern. Symp. 
on Microarchitecture IEEE\/ACM (2001) 40\u201351.","key":"e_1_3_1_18_2","DOI":"10.1109\/MICRO.2001.991104"},{"doi-asserted-by":"crossref","unstructured":"Nowatzki T. et al. A general constraint-centric scheduling framework for spatial architectures. In Proceedings of the\u00a0ACM SIGPLAN Conf. on Programming Language Design and Implementation\u00a0(2013) 495\u2013506.","key":"e_1_3_1_19_2","DOI":"10.1145\/2491956.2462163"},{"doi-asserted-by":"crossref","unstructured":"Sankaralingam K. et al. Exploiting ILP TLP and DLP with the Polymorphous TRIPS architecture. In Proceedings of the Intern. Symp. on Computer Architecture\u00a0(2003) 422\u2013433.","key":"e_1_3_1_20_2","DOI":"10.1145\/871656.859667"},{"doi-asserted-by":"crossref","unstructured":"Sankaralingam K. et al. Distributed microarchitectural protocols in the TRIPS prototype processor. In Proceedings of the Intern. Symp. on Microarchitecture IEEE\/ACM (2006) 480\u2013491.","key":"e_1_3_1_21_2","DOI":"10.1109\/MICRO.2006.19"},{"doi-asserted-by":"publisher","key":"e_1_3_1_22_2","DOI":"10.1147\/JRD.2011.2127330"},{"doi-asserted-by":"crossref","unstructured":"Smith A. et al. Compiling for EDGE architectures. In Proceedings of the Intern. Symp. on Code Generation and Optimization IEEE (2006) 185\u2013195.","key":"e_1_3_1_23_2","DOI":"10.1109\/CGO.2006.10"},{"doi-asserted-by":"crossref","unstructured":"Smith A. et al. Dataflow Predication. In Proceedings of the Intern. Symp. on Microarchitecture IEEE\/ACM (2006) 89\u2013102.","key":"e_1_3_1_24_2","DOI":"10.1109\/MICRO.2006.17"},{"doi-asserted-by":"crossref","unstructured":"Stuecheli J. POWER8. In Proceedings of Hot Chips 25: A Symp. of High Performance Chips\u00a0(2013).","key":"e_1_3_1_25_2","DOI":"10.1109\/HOTCHIPS.2013.7478303"},{"doi-asserted-by":"crossref","unstructured":"Velten M. Sch\u00f6ne R. Ilsche T. and Hackenberg D. Memory performance of AMD EPYC Rome and Intel Cascade Lake SP server processors. In Proceedings of the ACM\/SPEC Intern. Conf. 
on Performance Engineering\u00a0(2022) 165\u2013175.","key":"e_1_3_1_26_2","DOI":"10.1145\/3489525.3511689"},{"doi-asserted-by":"publisher","key":"e_1_3_1_27_2","DOI":"10.1109\/2.612254"},{"unstructured":"Wang Z. McKinley K.S. Rosenberg A.L. and Weems C.C. Using the compiler to improve cache replacement decisions. In Proceedings of the Intern. Conf. on Parallel Architectures and Compilation Techniques\u00a0(2002) 199\u2013208.","key":"e_1_3_1_28_2"},{"doi-asserted-by":"crossref","unstructured":"Weaver G.E. McKinley K.S. and Weems C.C. Compiling high-level languages for configurable computers: Applying lessons from heterogeneous processing. In Proceedings of the SPIE Intern. Conf.: Photonics East. (1996) 249\u2013258.","key":"e_1_3_1_29_2","DOI":"10.1117\/12.255822"}],"container-title":["Communications of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3760436","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T19:56:34Z","timestamp":1764100594000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3760436"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,25]]},"references-count":28,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["10.1145\/3760436"],"URL":"https:\/\/doi.org\/10.1145\/3760436","relation":{},"ISSN":["0001-0782","1557-7317"],"issn-type":[{"type":"print","value":"0001-0782"},{"type":"electronic","value":"1557-7317"}],"subject":[],"published":{"date-parts":[[2025,11,25]]},"assertion":[{"value":"2025-06-20","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-11-25","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}