{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:53:19Z","timestamp":1750308799763,"version":"3.41.0"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2009,12,1]],"date-time":"2009-12-01T00:00:00Z","timestamp":1259625600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Syst."],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:p>Soft errors are an important challenge in contemporary microprocessors. Modern processors have caches and large memory arrays protected by parity or error detection and correction codes. However, today's failure rate is dominated by flip flops, latches, and the increasing sensitivity of combinational logic to particle strikes. Moreover, as Chip Multi-Processors (CMPs) become ubiquitous, meeting the FIT budget for new designs is becoming a major challenge.<\/jats:p>\n          <jats:p>Solutions based on replicating threads have been explored deeply; however, their high cost in performance and energy makes them unsuitable for current designs. Moreover, our studies based on a typical configuration for a modern processor show that focusing on the top 5 most vulnerable structures can provide up to 70% reduction in FIT rate. Therefore, full replication may overprotect the chip by reducing the FIT much below budget.<\/jats:p>\n          <jats:p>\n            We propose\n            <jats:italic>Selective Replication<\/jats:italic>\n            , a lightweight-reconfigurable mechanism that achieves a high FIT reduction by protecting the most vulnerable instructions with minimal performance and energy impact. 
Low performance degradation is achieved by not requiring additional issue slots and reissuing instructions only during the time window between when they become retirable and when they actually retire. Coverage can be reconfigured online by replicating only a subset of the instructions (the most vulnerable ones). Instructions' vulnerability is estimated based on the area they occupy and the time they spend in the issue queue. By changing the vulnerability threshold, we can adjust the trade-off between coverage and performance loss.\n          <\/jats:p>\n          <jats:p>Results for an out-of-order processor configured similarly to Intel\u00ae Core\u2122 Micro-Architecture show that our scheme can achieve over 65% FIT reduction with less than 4% performance degradation and with small area and complexity overhead.<\/jats:p>","DOI":"10.1145\/1658357.1658359","type":"journal-article","created":{"date-parts":[[2012,10,15]],"date-time":"2012-10-15T19:22:23Z","timestamp":1350328943000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":23,"title":["Selective replication"],"prefix":"10.1145","volume":"27","author":[{"given":"Xavier","family":"Vera","sequence":"first","affiliation":[{"name":"Intel Barcelona Research Center, Intel Labs - UPC, Barcelona, Spain"}]},{"given":"Jaume","family":"Abella","sequence":"additional","affiliation":[{"name":"Intel Barcelona Research Center, Intel Labs - UPC, Barcelona, Spain"}]},{"given":"Javier","family":"Carretero","sequence":"additional","affiliation":[{"name":"Intel Barcelona Research Center, Intel Labs - UPC, Barcelona, Spain"}]},{"given":"Antonio","family":"Gonz\u00e1lez","sequence":"additional","affiliation":[{"name":"Intel Barcelona Research Center, Intel Labs - UPC, Barcelona, 
Spain"}]}],"member":"320","published-online":{"date-parts":[[2010,1]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/320080.320111"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/MDT.2005.69"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.18"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.888701"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2003.1225959"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.401.0119"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASCOTS.2006.18"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859631"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.38"},{"volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA).","author":"Kumar S.","key":"e_1_2_1_10_1","unstructured":"Kumar, S. and Aggarwal, A. 2006. Reducing resource redundancy for concurrent error detection techniques in high performance microprocessors. In Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA)."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.9"},
{"key":"e_1_2_1_12_1","first-page":"1","article-title":"Hyper-Threading technology architecture and microarchitecture","volume":"6","author":"Marr D.","year":"2002","unstructured":"Marr, D., Binns, F., Hill, D., Hinton, G., Koufaty, D., Miller, J., and Upton, M. 2002. Hyper-Threading technology architecture and microarchitecture. Intel Tech. J. 6, 1.","journal-title":"Intel Tech. J."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2005.181"},{"volume-title":"International Reliability Physics Symposium. IEEE Computer Society","author":"Ming Z.","key":"e_1_2_1_14_1","unstructured":"Ming, Z. and Shanbhag, N. 2005. A CMOS design style for logic circuit hardening. In International Reliability Physics Symposium. IEEE Computer Society, Los Alamitos, CA, 223--229."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2005.70"},{"volume-title":"Proceedings of the 2nd Workshop on System Effects of Logic Soft Errors (SELSE).","author":"Mitra S.","key":"e_1_2_1_16_1","unstructured":"Mitra, S., Zhang, M., Waqas, S., Seifert, N., Gill, B., and Kim, K. 2006. Combinational logic soft error correction. In Proceedings of the 2nd Workshop on System Effects of Logic Soft Errors (SELSE)."},{"volume-title":"Proceedings of the 29th Annual International Symposium on Computer Architecture (ISCA).","author":"Mukherjee S.","key":"e_1_2_1_17_1","unstructured":"Mukherjee, S., Kontz, M., and Reinhardt, S. 2002. Detailed design and evaluation of redundant multithreading alternatives. In Proceedings of the 29th Annual International Symposium on Computer Architecture (ISCA)."},
{"volume-title":"Proceedings of the 36th International Symposium on Microarchitecture (MICRO). ACM Press","author":"Mukherjee S.","key":"e_1_2_1_18_1","unstructured":"Mukherjee, S., Weaver, C., Emer, J., Reinhardt, S., and Austin, T. 2003. A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor. In Proceedings of the 36th International Symposium on Microarchitecture (MICRO). ACM Press, New York, NY."},{"volume-title":"Proceedings of the Reliability Physics Symposium. 60--70","author":"Nguyen H.","key":"e_1_2_1_19_1","unstructured":"Nguyen, H. and Yagil, Y. 2003. A systematic approach to SER estimation and solutions. In Proceedings of the Reliability Physics Symposium. 60--70."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2005.62"},{"volume-title":"Proceedings of the 34th International Symposium on Microarchitecture (MICRO). 214--224","author":"Ray J.","key":"e_1_2_1_21_1","unstructured":"Ray, J., Hoe, J., and Falsafi, B. 2001. Dual use of superscalar datapath for transient-fault detection and recovery. In Proceedings of the 34th International Symposium on Microarchitecture (MICRO). 214--224."},
{"volume-title":"Proceedings of the International Conference on Computer Design (ICCD). 362--369","author":"Reddy V.","key":"e_1_2_1_22_1","unstructured":"Reddy, V., Al-Zawawi, A., and Rotenberg, E. 2007. Assertion-Based microarchitecture design for improved fault tolerance. In Proceedings of the International Conference on Computer Design (ICCD). 362--369."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2007.59"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168869"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339652"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2005.34"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.21"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/795672.796966"},{"volume-title":"Proceedings of the International Reliability Physics Symposium. IEEE Computer Society","author":"Seifert N.","key":"e_1_2_1_29_1","unstructured":"Seifert, N., Slankard, P., Kirsch, M., Narasimham, B., Zia, V., Brookreson, C., Vo, A., Mitra, S., Gill, B., and Maiz, J. 2006. Radiation-Induced soft error rates of advanced CMOS bulk devices. In Proceedings of the International Reliability Physics Symposium. IEEE Computer Society, Los Alamitos, CA, 217--225."},
{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TDMR.2004.831993"},{"volume-title":"Proceedings of the International Conference on Dependable Systems and Network (DSN'02)","author":"Shivakumar P.","key":"e_1_2_1_31_1","unstructured":"Shivakumar, P., Kistler, M., Keckler, S., Burger, D., and Alvisi, L. 2002. Modeling the effect of technology trends on the soft error rate of combinational logic. In Proceedings of the International Conference on Dependable Systems and Network (DSN'02), 389."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2004.19"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250725"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.435.0863"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360155"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/232973.232993"},{"volume-title":"Proceedings of the 29th International Symposium on Computer Architecture (ISCA).","author":"Vijaykumar T.","key":"e_1_2_1_37_1","unstructured":"Vijaykumar, T., Pomeranz, I., and Cheng, K. 2002. Transient-Fault recovery using simultaneous multithreading. In Proceedings of the 29th International Symposium on Computer Architecture (ISCA)."},
{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250726"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2005.82"},{"volume-title":"Proceedings of the 31st International Symposium on Computer Architecture (ISCA). IEEE Computer Society","author":"Weaver C.","key":"e_1_2_1_40_1","unstructured":"Weaver, C., Emer, J., Mukherjee, S., and Reinhardt, S. 2004. Techniques to reduce the soft error rate of a high-performance microprocessor. In Proceedings of the 31st International Symposium on Computer Architecture (ISCA). IEEE Computer Society, Los Alamitos, CA."}],"container-title":["ACM Transactions on Computer Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1658357.1658359","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1658357.1658359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:25:56Z","timestamp":1750278356000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1658357.1658359"}},"subtitle":["A lightweight technique for soft 
errors"],"short-title":[],"issued":{"date-parts":[[2009,12]]},"references-count":40,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["10.1145\/1658357.1658359"],"URL":"https:\/\/doi.org\/10.1145\/1658357.1658359","relation":{},"ISSN":["0734-2071","1557-7333"],"issn-type":[{"type":"print","value":"0734-2071"},{"type":"electronic","value":"1557-7333"}],"subject":[],"published":{"date-parts":[[2009,12]]},"assertion":[{"value":"2008-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-01-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}