{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T18:55:18Z","timestamp":1771959318587,"version":"3.50.1"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[1999,7,1]],"date-time":"1999-07-01T00:00:00Z","timestamp":930787200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Program. Lang. Syst."],"published-print":{"date-parts":[[1999,7]]},"abstract":"<jats:p>With the ever-widening performance gap between processors and main memory, cache memory, which is used to bridge this gap, is becoming more and more significant. Caches work well for programs that exhibit sufficient locality. Other programs, however, have reference patterns that fail to exploit the cache, thereby suffering heavily from high memory latency. In order to get high cache efficiency and achieve good program performance, efficient memory accessing behavior is necessary. In fact, for many programs, program transformations or source-code changes can radically alter memory access patterns, significantly improving cache performance. Both hand-tuning and compiler optimization techniques are often used to transform codes to improve cache utilization.  Unfortunately, cache conflicts   are difficult to predict and estimate, precluding effective transformations. Hence, effective transformations require detailed knowledge about the frequency and causes of cache misses in the code. This article describes methods for generating and solving Cache Miss Equations (CMEs) that give a detailed representation of cache behavior, including conflict misses, in loop-oriented scientific code. 
Implemented within the SUIF compiler framework, our approach extends traditional compiler reuse analysis to generate linear Diophantine equations that summarize each loop's memory behavior. While solving these equations is in general difficult, we show that it is also unnecessary, as mathematical techniques for manipulating Diophantine equations allow us to relatively easily compute and\/or reduce the number of possible solutions, where each solution corresponds to a potential cache miss. The mathematical precision of CMEs allows us to find true optimal solutions for transformations such as blocking or padding. The generality of CMEs also allows us to reason about interactions between transformations applied in concert. The article also gives examples of their use to determine array padding and offset amounts that minimize cache misses, and to determine optimal blocking factors for tiled code. Overall, these equations represent an analysis framework that offers the generality and precision needed for detailed compiler optimizations.<\/jats:p>","DOI":"10.1145\/325478.325479","type":"journal-article","created":{"date-parts":[[2002,10,7]],"date-time":"2002-10-07T13:52:47Z","timestamp":1033998767000},"page":"703-746","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":190,"title":["Cache miss equations"],"prefix":"10.1145","volume":"21","author":[{"given":"Somnath","family":"Ghosh","sequence":"first","affiliation":[{"name":"Princeton Univ., Princeton, NJ"}]},{"given":"Margaret","family":"Martonosi","sequence":"additional","affiliation":[{"name":"Princeton Univ., Princeton, NJ"}]},{"given":"Sharad","family":"Malik","sequence":"additional","affiliation":[{"name":"Princeton Univ., Princeton, NJ"}]}],"member":"320","published-online":{"date-parts":[[1999,7]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Adler A. and Coury J. E. 1995. 
The Theory of Numbers: A Text and Source Book of Problems. Jones and Bartlett Publishers Boston MA."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/29873.29875"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the IBM Centre for Advanced Studies Conference '94","author":"Bacon D. F."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Banerjee U. 1993. Loop transformations for Restructuring Compilers. Kluwer Academic Publishers Norwell MA.","DOI":"10.1007\/b102311"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the Supercomputing '92 Conference.","author":"Carr S."},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 8th SIAM Conference on Parallel Processing for Scientific Computing.","author":"Carr S."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/237578.237617"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/207110.207162"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 3rd Workshop on Programming Languages and Compilers for Parallel Computing.","author":"Eisenbeis C."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 4th International Workshop on Languages and Compilers for Parallel Computing.","author":"Ferrante J."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/55364.55388"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/0743-7315(88)90014-7"},{"key":"e_1_2_1_14_1","volume-title":"Computer Architecture: A Quantitative Approach","author":"Hennessy J. L.","year":"1996"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Hill M. D. 1987. Aspects of cache memory and instruction buffer performance. Ph.D. thesis Computer Science Dept. 
University of California Berkeley.","DOI":"10.21236\/ADA604007"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.40842"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/73560.73588"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/258915.258946"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/106972.106981"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.318580"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/143365.143541"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/133057.133079"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/233561.233564"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/237090.237161"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/143365.143488"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/181181.181561"},{"key":"e_1_2_1_27_1","unstructured":"Porterfield A. K. 1989. Software methods for improvement of cache performance on supercomputer applications. Ph.D. 
thesis Rice University."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/135226.135233"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/178243.178254"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/277650.277661"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/166955.166974"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/183018.183047"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/169627.169762"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 1990 International Conference on Parallel Processing.","author":"Torrellas J."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/193209.193217"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/113445.113449"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/76263.76337"}],"container-title":["ACM Transactions on Programming Languages and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/325478.325479","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/325478.325479","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:10:35Z","timestamp":1750234235000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/325478.325479"}},"subtitle":["a compiler framework for analyzing and tuning memory 
behavior"],"short-title":[],"issued":{"date-parts":[[1999,7]]},"references-count":36,"journal-issue":{"issue":"4","published-print":{"date-parts":[[1999,7]]}},"alternative-id":["10.1145\/325478.325479"],"URL":"https:\/\/doi.org\/10.1145\/325478.325479","relation":{},"ISSN":["0164-0925","1558-4593"],"issn-type":[{"value":"0164-0925","type":"print"},{"value":"1558-4593","type":"electronic"}],"subject":[],"published":{"date-parts":[[1999,7]]},"assertion":[{"value":"1999-07-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}