{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T02:40:19Z","timestamp":1767926419843,"version":"3.49.0"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2014,12,8]],"date-time":"2014-12-08T00:00:00Z","timestamp":1417996800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100005302","name":"University of Illinois at Urbana-Champaign","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100005302","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000143","name":"Division of Computing and Communication Foundations","doi-asserted-by":"publisher","award":["CCF-1018796 and CCF-1302641"],"award-info":[{"award-number":["CCF-1018796 and CCF-1302641"]}],"id":[{"id":"10.13039\/100000143","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000028","name":"Semiconductor Research Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000028","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007245","name":"Microelectronics Advanced Research Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007245","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100005144","name":"Qualcomm","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100005144","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Intel and Microsoft through the Universal Parallel Computing Research Center (UPCRC) at Illinois"},{"name":"Center for Future Architectures Research"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2015,1,9]]},"abstract":"<jats:p>Cache coherence is an integral part of shared-memory systems but is also widely considered to be one of the most complex parts of such systems. Much prior work has addressed this complexity and the verification techniques to prove the correctness of hardware coherence. Given the new multicore era with increasing number of cores, there is a renewed debate about whether the complexity of hardware coherence has been tamed or whether it should be abandoned in favor of software coherence. This article revisits the complexity of hardware cache coherence by verifying a publicly available, state-of-the-art implementation of the widely used MESI protocol, using the Mur\u03c6 model checking tool. To our surprise, we found six bugs in this protocol, most of which were hard to analyze and took several days to fix. To compare the complexity, we also verified the recently proposed DeNovo protocol, which exploits disciplined software programming models. We found three relatively easy to fix bugs in this less mature protocol. After fixing these bugs, our verification experiments showed that, compared to DeNovo, MESI had 15X more reachable states leading to a 20X increase in verification (model checking) time. Although we were eventually successful in verifying the protocols, the tool required making several simplifying assumptions (e.g., two cores, one address). Our results have several implications: (1) they indicate that hardware coherence protocols remain complex; (2) they reinforce the need for protocol designers to embrace formal verification tools to demonstrate correctness of new protocols and extensions; (3) they reinforce the need for formal verification tools that are both scalable and usable by non-expert; and (4) they show that a system based on hardware-software co-design can offer a simpler approach for cache coherence, thus reducing the overall verification effort and allowing verification of more detailed models and protocol extensions that are otherwise limited by computing resources.<\/jats:p>","DOI":"10.1145\/2663345","type":"journal-article","created":{"date-parts":[[2014,12,8]],"date-time":"2014-12-08T16:17:14Z","timestamp":1418055434000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":21,"title":["Revisiting the Complexity of Hardware Cache Coherence and Some Implications"],"prefix":"10.1145","volume":"11","author":[{"given":"Rakesh","family":"Komuravelli","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Urbana, IL, USA"}]},{"given":"Sarita V.","family":"Adve","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Urbana, IL, USA"}]},{"given":"Ching-Tsun","family":"Chou","sequence":"additional","affiliation":[{"name":"Intel Corporation, Juliette Lane Santa Clara, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2014,12,8]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the Workshop on Complexity-Effective Design held in conjunction with the 27th Annual International Symposium on Computer Architecture (ISCA2000)","author":"Abts Dennis","year":"2000","unstructured":"Dennis Abts , David J. Lilja , and Steve Scott . 2000 . Toward complexity-effective verification: A case study of the Cray SV2 cache coherence protocol . In Proceedings of the Workshop on Complexity-Effective Design held in conjunction with the 27th Annual International Symposium on Computer Architecture (ISCA2000) . Dennis Abts, David J. Lilja, and Steve Scott. 2000. Toward complexity-effective verification: A case study of the Cray SV2 cache coherence protocol. In Proceedings of the Workshop on Complexity-Effective Design held in conjunction with the 27th Annual International Symposium on Computer Architecture (ISCA2000)."},{"key":"e_1_2_1_2_1","volume-title":"Lilja","author":"Abts Dennis","year":"2003","unstructured":"Dennis Abts , Steve Scott , and David J . Lilja . 2003 . So many states, so little time: Verifying memory coherence in the cray X1. In IPDPS. Dennis Abts, Steve Scott, and David J. Lilja. 2003. So many states, so little time: Verifying memory coherence in the cray X1. In IPDPS."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155627"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1594835.1504190"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375600"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629575.1629579"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1640089.1640096"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/PDMC-HiBi.2010.13"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/209937.209958"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 25th European Conference on Object-oriented Programming(ECOOP\u201911)","author":"Robert","unstructured":"Robert L. Bocchino and Vikram S. Adve. 2011. Types, regions, and effects for safe programming with object-oriented parallel frameworks . In Proceedings of the 25th European Conference on Object-oriented Programming(ECOOP\u201911) . Springer-Verlag, Berlin, 306--332. Robert L. Bocchino and Vikram S. Adve. 2011. Types, regions, and effects for safe programming with object-oriented parallel frameworks. In Proceedings of the 25th European Conference on Object-oriented Programming(ECOOP\u201911). Springer-Verlag, Berlin, 306--332."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1640089.1640097"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1926385.1926447"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375591"},{"key":"e_1_2_1_15_1","volume-title":"Keynote at 20th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201911).","author":"Borkar S.","unstructured":"S. Borkar . 2011. The exascale challenge . In Keynote at 20th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201911). S. Borkar. 2011. The exascale challenge. In Keynote at 20th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201911)."},{"key":"e_1_2_1_16_1","unstructured":"Zoran Budimlic Aparna Chandramowlishwaran Kathleen Knobe Geoff Lowney Vivek Sarkar and Leo Treggiari. 2009. Multi-core implementations of the concurrent collections programming model. In IWCPC.  Zoran Budimlic Aparna Chandramowlishwaran Kathleen Knobe Geoff Lowney Vivek Sarkar and Leo Treggiari. 2009. Multi-core implementations of the concurrent collections programming model. In IWCPC."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30579-8_9"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1978.1675013"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2011.21"},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Ching-Tsun Chou Phanindra K. Mannava and Seungjoon Park. 2004. A simple method for parameterized verification of cache coherence protocols. In FMCAD. 382--398.  Ching-Tsun Chou Phanindra K. Mannava and Seungjoon Park. 2004. A simple method for parameterized verification of cache coherence protocols. In FMCAD. 382--398.","DOI":"10.1007\/978-3-540-30494-4_27"},{"key":"e_1_2_1_21_1","volume-title":"Workshop. Springer-Verlag","author":"Clarke Edmund M.","unstructured":"Edmund M. Clarke and E. Allen Emerson . 1982. Design and synthesis of synchronization skeletons using branching-time temporal logic. In Logic of Programs , Workshop. Springer-Verlag , London, 52--71. Edmund M. Clarke and E. Allen Emerson. 1982. Design and synthesis of synchronization skeletons using branching-time temporal logic. In Logic of Programs, Workshop. Springer-Verlag, London, 52--71."},{"key":"e_1_2_1_22_1","volume-title":"ICCD\u201992","author":"Dill David L.","unstructured":"David L. Dill , Andreas J. Drexler , Alan J. Hu , and C. Han Yang . 1992. Protocol verification as a hardware design aid . In ICCD\u201992 . IEEE Computer Society , Washington, DC , 522--525. David L. Dill, Andreas J. Drexler, Alan J. Hu, and C. Han Yang. 1992. Protocol verification as a hardware design aid. In ICCD\u201992. IEEE Computer Society, Washington, DC, 522--525."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362702.1362707"},{"key":"e_1_2_1_24_1","unstructured":"Stein Gjessing Stein Krogdahl and Ellen Munthe-Kaas. 1989. Formal Specification and Verification of SCI Cache Coherence: The Top Layers. http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary&quest;doi=10.1.1.51.8390  Stein Gjessing Stein Krogdahl and Ellen Munthe-Kaas. 1989. Formal Specification and Verification of SCI Cache Coherence: The Top Layers. http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary&quest;doi=10.1.1.51.8390"},{"key":"e_1_2_1_25_1","volume-title":"Axum: Language Overview. Microsoft Language Specification.","author":"Gustafsson Niklas","year":"2009","unstructured":"Niklas Gustafsson . 2009 . Axum: Language Overview. Microsoft Language Specification. Niklas Gustafsson. 2009. Axum: Language Overview. Microsoft Language Specification."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2442516.2442540"},{"key":"e_1_2_1_27_1","volume-title":"Van Der Wijngaart, and T. Mattson","author":"Howard J.","year":"2010","unstructured":"J. Howard , S. Dighe , Y. Hoskote , S. Vangal , D. Finan , G. Ruhl , D. Jenkins , H. Wilson , N. Borkar , G. Schrom , F. Pailet , S. Jain , T. Jacob , S. Yada , S. Marella , P. Salihundam , V. Erraguntla , M. Konow , M. Riepen , G. Droege , J. Lindemann , M. Gries , T. Apel , K. Henriss , T. Lund-Larsen , S. Steibl , S. Borkar , V. De , R. Van Der Wijngaart, and T. Mattson . 2010 . A 48-core IA-32 message-passing processor with DVFS in 45nm CMOS. In ISSCC. 108--109. J. Howard, S. Dighe, Y. Hoskote, S. Vangal, D. Finan, G. Ruhl, D. Jenkins, H. Wilson, N. Borkar, G. Schrom, F. Pailet, S. Jain, T. Jacob, S. Yada, S. Marella, P. Salihundam, V. Erraguntla, M. Konow, M. Riepen, G. Droege, J. Lindemann, M. Gries, T. Apel, K. Henriss, T. Lund-Larsen, S. Steibl, S. Borkar, V. De, R. Van Der Wijngaart, and T. Mattson. 2010. A 48-core IA-32 message-passing processor with DVFS in 45nm CMOS. In ISSCC. 108--109."},{"key":"e_1_2_1_28_1","unstructured":"IntelSCC. 2009. The SCC Platform Overview. Retrieved from http:\/\/communities.intel.com\/servlet\/JiveServlet\/downloadBody\/5 512-102-2-22524\/SCC_Platform_Overview.pdf.  IntelSCC. 2009. The SCC Platform Overview. Retrieved from http:\/\/communities.intel.com\/servlet\/JiveServlet\/downloadBody\/5 512-102-2-22524\/SCC_Platform_Overview.pdf."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.1993.393375"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00625968"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555774"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250734.1250759"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.121510"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2150976.2150979"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1040305.1040336"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2209249.2209269"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105747"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/646705.702177"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the International Conference on Parallel and Distributed Computing. Information Processing Society","author":"McMillan K. L.","unstructured":"K. L. McMillan and Schwalbe J . 1991. Formal verification of the gigamax cache consistency protocol . In Proceedings of the International Conference on Parallel and Distributed Computing. Information Processing Society , Tokyo, Japan, 242--251. K. L. McMillan and Schwalbe J. 1991. Formal verification of the gigamax cache consistency protocol. In Proceedings of the International Conference on Parallel and Distributed Computing. Information Processing Society, Tokyo, Japan, 242--251."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10009-008-0094-x"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/106972.106999"},{"key":"e_1_2_1_42_1","unstructured":"A. K. Nanda and L. N. Bhuyan. 1992. A formal specification and verification technique for cache coherence protocols. In ICPP. I22--I26.  A. K. Nanda and L. N. Bhuyan. 1992. A formal specification and verification technique for cache coherence protocols. In ICPP. I22--I26."},{"key":"e_1_2_1_43_1","volume-title":"Formal Methods in Computer-Aided Design, 2009. FMCAD 2009. IEEE, 172--179","author":"O\u2019Leary J.","year":"2001","unstructured":"J. O\u2019Leary , M. Talupur , and M. R. Tuttle . 2009. Protocol verification using flows: An industrial experience . In Formal Methods in Computer-Aided Design, 2009. FMCAD 2009. IEEE, 172--179 . http:\/\/scholar.google.com\/scholar.bib&quest;q=info:h_Ok5nemssQJ:scholar.google.com\/&output=citation&hl=en&as_sdt= 2001 &as_ylo=1999&ct=citation&cd=1 J. O\u2019Leary, M. Talupur, and M. R. Tuttle. 2009. Protocol verification using flows: An industrial experience. In Formal Methods in Computer-Aided Design, 2009. FMCAD 2009. IEEE, 172--179. http:\/\/scholar.google.com\/scholar.bib&quest;q=info:h_Ok5nemssQJ:scholar.google.com\/&output=citation&hl=en&as_sdt=2001&as_ylo=1999&ct=citation&cd=1"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1508244.1508256"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/800015.808204"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/215399.215413"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.656100"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/248621.248624"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.5555\/647325.721668"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/237502.237573"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451119"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2014.5"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2009.68"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.5555\/2014698.2014902"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541982"},{"key":"e_1_2_1_56_1","volume-title":"Workshop on determinism and correctness in parallel programming.","author":"DET.","year":"2009","unstructured":"Wo DET. 2009 . Workshop on determinism and correctness in parallel programming. WoDET. 2009. Workshop on determinism and correctness in parallel programming."},{"key":"e_1_2_1_57_1","volume-title":"Workshop on determinism and correctness in parallel programming.","author":"DET.","year":"2011","unstructured":"Wo DET. 2011 . Workshop on determinism and correctness in parallel programming. WoDET. 2011. Workshop on determinism and correctness in parallel programming."},{"key":"e_1_2_1_58_1","volume-title":"Workshop on determinism and correctness in parallel programming.","author":"DET.","year":"2012","unstructured":"Wo DET. 2012 . Workshop on determinism and correctness in parallel programming. WoDET. 2012. Workshop on determinism and correctness in parallel programming."},{"key":"e_1_2_1_59_1","volume-title":"Workshop on determinism and correctness in parallel programming.","author":"DET.","year":"2013","unstructured":"Wo DET. 2013 . Workshop on determinism and correctness in parallel programming. WoDET. 2013. Workshop on determinism and correctness in parallel programming."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669166"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2014.6835949"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.11"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485969"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2663345","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2663345","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:12:47Z","timestamp":1750227167000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2663345"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,12,8]]},"references-count":63,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2015,1,9]]}},"alternative-id":["10.1145\/2663345"],"URL":"https:\/\/doi.org\/10.1145\/2663345","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,12,8]]},"assertion":[{"value":"2014-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-12-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}