{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:27:22Z","timestamp":1758270442218,"version":"3.38.0"},"reference-count":33,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2017,2,3]],"date-time":"2017-02-03T00:00:00Z","timestamp":1486080000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2018,11]]},"abstract":"<jats:p> We introduce a novel algorithm-based fault-tolerance scheme to detect and repair soft transient faults (silent data corruption, bitflips) in multigrid solvers: by applying the full approximation scheme (FAS) variant of multigrid to linear systems, we prove invariants that enable fault detection and correction, and ultimately lead to a black-box protection of the smoothing stage. A statistical analysis for a wide range of prototypical problems demonstrates the efficiency of our approach, especially compared with full checksum protection. In particular, the overhead of our new method is negligible in the fault-free case, since we only employ readily available quantities. <\/jats:p>","DOI":"10.1177\/1094342016684006","type":"journal-article","created":{"date-parts":[[2017,2,3]],"date-time":"2017-02-03T10:18:19Z","timestamp":1486117099000},"page":"897-912","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":10,"title":["Soft fault detection and correction for multigrid"],"prefix":"10.1177","volume":"32","author":[{"given":"Mirco","family":"Altenbernd","sequence":"first","affiliation":[{"name":"University of Stuttgart, Germany"}]},{"given":"Dominik","family":"G\u00f6ddeke","sequence":"additional","affiliation":[{"name":"University of Stuttgart, Germany"}]}],"member":"179","published-online":{"date-parts":[[2017,2,3]]},"reference":[{"key":"bibr1-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-39929-4_24"},{"key":"bibr2-1094342016684006","unstructured":"Altenbernd M (2015) Konzepte f\u00fcr fehlertolerante Mehrgitterverfahren. Master\u2019s Thesis, TU Dortmund, Fakult\u00e4t f\u00fcr Mathematik, Lehrstuhl 3 f\u00fcr Angewandte Mathematik und Numerik."},{"key":"bibr3-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-14313-2_45"},{"key":"bibr4-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2008.12.002"},{"key":"bibr5-1094342016684006","volume-title":"Finite Elements \u2013 Theory, Fast Solvers and Applications in Solid Mechanics","author":"Braess D","year":"2001","edition":"2"},{"key":"bibr6-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1090\/S0025-5718-1977-0431719-X"},{"key":"bibr7-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719505"},{"key":"bibr8-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304590"},{"key":"bibr9-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/1996130.1996142"},{"key":"bibr10-1094342016684006","doi-asserted-by":"publisher","DOI":"10.2172\/1149042"},{"key":"bibr11-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.123"},{"key":"bibr12-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2012.56"},{"key":"bibr13-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063443"},{"key":"bibr14-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.49"},{"key":"bibr15-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2015.07.003"},{"key":"bibr16-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-02427-0"},{"key":"bibr17-1094342016684006","unstructured":"Hoemmen M, Heroux MA (2011) Fault-tolerant iterative methods via selective reliability. Available at: http:\/\/www.sandia.gov\/~maherou\/docs\/FTGMRES.pdf."},{"key":"bibr18-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1984.1676475"},{"key":"bibr19-1094342016684006","unstructured":"Huber M, Gmeiner B, R\u00fcde U, Wohlmuth B (2015a) Resilience for exascale enabled multigrid methods. arXiv preprint 1501.07400. Available at: http:\/\/epubs.siam.org\/doiabs\/10.1137\/15M1026122."},{"key":"bibr20-1094342016684006","unstructured":"Huber M, Gmeiner B, R\u00fcde U, Wohlmuth B (2015b) Resilience for multigrid software at the extreme scale. arXiv preprint 1506.06185. Available at: http:\/\/epubs.siam.org\/doiabs\/10.1137\/15M1026122."},{"key":"bibr21-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/MSPEC.2010.5605876"},{"key":"bibr22-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503226"},{"key":"bibr23-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503266"},{"key":"bibr24-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503271"},{"key":"bibr25-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2530268.2530272"},{"key":"bibr26-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/DFT.2015.7315136"},{"key":"bibr27-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2012.6263938"},{"key":"bibr28-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2013.6575309"},{"key":"bibr29-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1177\/1094342014522573"},{"key":"bibr30-1094342016684006","doi-asserted-by":"publisher","DOI":"10.2172\/1011662"},{"key":"bibr31-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1145\/2642769.2642774"},{"volume-title":"Multigrid","year":"2001","author":"Trottenberg U","key":"bibr32-1094342016684006"},{"key":"bibr33-1094342016684006","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1584"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016684006","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342016684006","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016684006","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,28]],"date-time":"2025-02-28T18:51:07Z","timestamp":1740768667000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342016684006"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,2,3]]},"references-count":33,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2018,11]]}},"alternative-id":["10.1177\/1094342016684006"],"URL":"https:\/\/doi.org\/10.1177\/1094342016684006","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2017,2,3]]}}}