{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:39:01Z","timestamp":1750307941385,"version":"3.41.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2007,6,1]],"date-time":"2007-06-01T00:00:00Z","timestamp":1180656000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2007,6]]},"abstract":"<jats:p>We develop a microprocessor design that tolerates hard faults, including fabrication defects and in-field faults, by leveraging existing microprocessor redundancy. To do this, we must: detect and correct errors, diagnose hard faults at the field deconfigurable unit (FDU) granularity, and deconfigure FDUs with hard faults. In our reliable microprocessor design, we use DIVA dynamic verification to detect and correct errors. Our new scheme for diagnosing hard faults tracks instructions' core structure occupancy from decode until commit. If a DIVA checker detects an error in an instruction, it increments a small saturating error counter for every FDU used by that instruction, including that DIVA checker. A hard fault in an FDU quickly leads to an above-threshold error counter for that FDU and thus diagnoses the fault. For deconfiguration, we use previously developed schemes for functional units and buffers and present a scheme for deconfiguring DIVA checkers. Experimental results show that our reliable microprocessor quickly and accurately diagnoses each hard fault that is injected and continues to function, albeit with somewhat degraded performance.<\/jats:p>","DOI":"10.1145\/1250727.1250728","type":"journal-article","created":{"date-parts":[[2007,9,14]],"date-time":"2007-09-14T13:44:55Z","timestamp":1189777495000},"page":"8","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["Online diagnosis of hard faults in microprocessors"],"prefix":"10.1145","volume":"4","author":[{"given":"Fred A.","family":"Bower","sequence":"first","affiliation":[{"name":"Duke University and IBM Systems and Technology Group, Durham, NC"}]},{"given":"Daniel J.","family":"Sorin","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC"}]},{"given":"Sule","family":"Ozev","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC"}]}],"member":"320","published-online":{"date-parts":[[2007,6]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"06","article-title":"Software Optimization Guide for AMD64 Processors","volume":"3","author":"AMD.","year":"2005","unstructured":"AMD. 2005 . Software Optimization Guide for AMD64 Processors . Publication 25112, Rev. 3 . 06 (Sept.). AMD. 2005. Software Optimization Guide for AMD64 Processors. Publication 25112, Rev. 3.06 (Sept.).","journal-title":"Publication 25112, Rev."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.982917"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/320080.320111"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2002.805728"},{"key":"e_1_2_1_5_1","first-page":"1","article-title":"The microarchitecture of the Intel Pentium 4 processor on 90nm technology","volume":"8","author":"Boggs D.","year":"2004","unstructured":"Boggs , D. 2004 . The microarchitecture of the Intel Pentium 4 processor on 90nm technology . Intel Technology Journal 8 , 1 . Boggs, D. et al. 2004. The microarchitecture of the Intel Pentium 4 processor on 90nm technology. Intel Technology Journal 8, 1.","journal-title":"Intel Technology Journal"},{"volume-title":"Proceedings of the International Conference on Dependable Systems and Networks (June). 51--60","author":"Bower F. A.","key":"e_1_2_1_6_1","unstructured":"Bower , F. A. , Shealy , P. G. , Ozev , S. , and Sorin , D. J . 2004. Tolerating hard faults in microprocessor array structures . In Proceedings of the International Conference on Dependable Systems and Networks (June). 51--60 . Bower, F. A., Shealy, P. G., Ozev, S., and Sorin, D. J. 2004. Tolerating hard faults in microprocessor array structures. In Proceedings of the International Conference on Dependable Systems and Networks (June). 51--60."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2005.94"},{"volume-title":"Proc. of the International Conference on Computer Design (ICCD). 576--581","author":"Chen T.","key":"e_1_2_1_8_1","unstructured":"Chen , T. and Sunada , G . 1992. An ultra-large capacity single-chip memory architecture with self-testing and self-repairing . In Proc. of the International Conference on Computer Design (ICCD). 576--581 , (Oct.). Chen, T. and Sunada, G. 1992. An ultra-large capacity single-chip memory architecture with self-testing and self-repairing. In Proc. of the International Conference on Computer Design (ICCD). 576--581, (Oct.)."},{"volume-title":"Proceedings of the IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Nov.).","author":"Culbertson W. B.","key":"e_1_2_1_9_1","unstructured":"Culbertson , W. B. , Amerson , R. , Carter , R. J. , Kuekes , P. , and Snider , G . 1996. The teramac custom computer: Extending the limits with defect tolerance . In Proceedings of the IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Nov.). Culbertson, W. B., Amerson, R., Carter, R. J., Kuekes, P., and Snider, G. 1996. The teramac custom computer: Extending the limits with defect tolerance. In Proceedings of the IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Nov.)."},{"key":"e_1_2_1_10_1","unstructured":"Dell T. J. 2002. A white paper on the benefits of chipkill-correct ECC for PC server main memory. IBM Microelectronics Division Whitepaper (Nov.).  Dell T. J. 2002. A white paper on the benefits of chipkill-correct ECC for PC server main memory. IBM Microelectronics Division Whitepaper (Nov.)."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1142\/4880"},{"key":"e_1_2_1_12_1","unstructured":"Hinton G. Sager D. Upton M. Boggs D. Carmean D. Kyker A. and Roussel P. 2001. The microarchitecture of the Pentium 4 processor. Intel Technology Journal (Feb.).  Hinton G. Sager D. Upton M. Boggs D. Carmean D. Kyker A. and Roussel P. 2001. The microarchitecture of the Pentium 4 processor. Intel Technology Journal (Feb.)."},{"key":"e_1_2_1_13_1","unstructured":"Huynh J. 2003. The AMD Athlon XP processor with 512KB L2 cache. AMD White Paper (Feb.).  Huynh J. 2003. The AMD Athlon XP processor with 512KB L2 cache. AMD White Paper (Feb.)."},{"key":"e_1_2_1_14_1","unstructured":"IBM. 1999. Enhancing IBM netfinity server reliability: IBM Chipkill Memory. IBM Whitepaper (Feb.).  IBM. 1999. Enhancing IBM netfinity server reliability: IBM Chipkill Memory. IBM Whitepaper (Feb.)."},{"key":"e_1_2_1_15_1","unstructured":"International Technology Roadmap for Semiconductors. 2003.  International Technology Roadmap for Semiconductors. 2003."},{"key":"e_1_2_1_16_1","unstructured":"JEDEC Solid State Technology Association. 2003. Failure Mechanisms and Models for Semiconductor Devices. JEDEC Publication JEP122-B (Aug.).  JEDEC Solid State Technology Association. 2003. Failure Mechanisms and Models for Semiconductor Devices. JEDEC Publication JEP122-B (Aug.)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/FTCS.1991.146709"},{"volume-title":"Proceedings of the International Test Conference. 833--841","author":"Mazumder P.","key":"e_1_2_1_18_1","unstructured":"Mazumder , P. and Yih , J. S . 1990. A novel built-in self-repair approach to VLSI memory yield enhancement . In Proceedings of the International Test Conference. 833--841 . Mazumder, P. and Yih, J. S. 1990. A novel built-in self-repair approach to VLSI memory yield enhancement. In Proceedings of the International Test Conference. 833--841."},{"volume-title":"Proceedings of the 29th Annual International Symposium on Computer Architecture (May). 99--110","author":"Mukherjee S. S.","key":"e_1_2_1_19_1","unstructured":"Mukherjee , S. S. , Kontz , M. , and Reinhardt , S. K . 2002. Detailed Design and implementation of redundant multhreading alternatives . In Proceedings of the 29th Annual International Symposium on Computer Architecture (May). 99--110 . Mukherjee, S. S., Kontz, M., and Reinhardt, S. K. 2002. Detailed Design and implementation of redundant multhreading alternatives. In Proceedings of the 29th Annual International Symposium on Computer Architecture (May). 99--110."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/996070.1009949"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/50202.50214"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339652"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/795672.796966"},{"volume-title":"Proceedings of the IEEE Custom Integrated Circuits Conference.","author":"Sawada K.","key":"e_1_2_1_24_1","unstructured":"Sawada , K. , Sakurai , T. , Uchino , Y. , and Yamada , K . 1989. Built-in self repair circuit for high density ASMIC . In Proceedings of the IEEE Custom Integrated Circuits Conference. Sawada, K., Sakurai, T., Uchino, Y., and Yamada, K. 1989. Built-in self repair circuit for high density ASMIC. In Proceedings of the IEEE Custom Integrated Circuits Conference."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.44"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605403"},{"volume-title":"Proceedings of the 21st International Conference on Computer Design (Oct.).","author":"Shivakumar P.","key":"e_1_2_1_27_1","unstructured":"Shivakumar , P. , Keckler , S. W. , Moore , C. R. , and Burger , D . 2003. Exploiting microarchitectural redundancy for defect tolerance . In Proceedings of the 21st International Conference on Computer Design (Oct.). Shivakumar, P., Keckler, S. W., Moore, C. R., and Burger, D. 2003. Exploiting microarchitectural redundancy for defect tolerance. In Proceedings of the 21st International Conference on Computer Design (Oct.)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.435.0863"},{"volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture (June).","author":"Srinivasan J.","key":"e_1_2_1_29_1","unstructured":"Srinivasan , J. , Adve , S. V. , Bose , P. , and Rivers , J. A . 2004a. The case for lifetime reliability-aware microprocessors . In Proceedings of the 31st Annual International Symposium on Computer Architecture (June). Srinivasan, J., Adve, S. V., Bose, P., and Rivers, J. A. 2004a. The case for lifetime reliability-aware microprocessors. In Proceedings of the 31st Annual International Symposium on Computer Architecture (June)."},{"volume-title":"Proceedings of the International Conference on Dependable Systems and Networks (June).","author":"Srinivasan J.","key":"e_1_2_1_30_1","unstructured":"Srinivasan , J. , Adve , S. V. , Bose , P. , and Rivers , J. A . 2004b. The impact of technology scaling on lifetime reliability . In Proceedings of the International Conference on Dependable Systems and Networks (June). Srinivasan, J., Adve, S. V., Bose, P., and Rivers, J. A. 2004b. The impact of technology scaling on lifetime reliability. In Proceedings of the International Conference on Dependable Systems and Networks (June)."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.28"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.379247"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/16.491258"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/232973.232993"},{"volume-title":"Proceedings of the 29th Annual International Symposium on Computer Architecture (May). 87--98","author":"Vijaykumar T. N.","key":"e_1_2_1_35_1","unstructured":"Vijaykumar , T. N. , Pomeranz , I. , and Chung , K. K . 2002. Transient fault recovery using simultaneous multithreading . In Proceedings of the 29th Annual International Symposium on Computer Architecture (May). 87--98 . Vijaykumar, T. N., Pomeranz, I., and Chung, K. K. 2002. Transient fault recovery using simultaneous multithreading. In Proceedings of the 29th Annual International Symposium on Computer Architecture (May). 87--98."},{"volume-title":"Proceedings of the International Conference on Dependable Systems and Networks (July). 411--420","author":"Weaver C.","key":"e_1_2_1_36_1","unstructured":"Weaver , C. and Austin , T . 2001. A fault tolerant approach to microprocessor design . In Proceedings of the International Conference on Dependable Systems and Networks (July). 411--420 . Weaver, C. and Austin, T. 2001. A fault tolerant approach to microprocessor design. In Proceedings of the International Conference on Dependable Systems and Networks (July). 411--420."},{"key":"e_1_2_1_37_1","unstructured":"Wilson D. 1985. The stratus computer system. In Resilient Computer Systems. 208--231.   Wilson D. 1985. The stratus computer system. In Resilient Computer Systems. 208--231."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/54.573354"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1250727.1250728","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1250727.1250728","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T14:52:19Z","timestamp":1750258339000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1250727.1250728"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,6]]},"references-count":38,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2007,6]]}},"alternative-id":["10.1145\/1250727.1250728"],"URL":"https:\/\/doi.org\/10.1145\/1250727.1250728","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2007,6]]},"assertion":[{"value":"2007-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}