{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:16:39Z","timestamp":1763468199466,"version":"3.38.0"},"reference-count":24,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2014,4,25]],"date-time":"2014-04-25T00:00:00Z","timestamp":1398384000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2015,11]]},"abstract":"<jats:p> Errors due to hardware or low-level software problems, if detected, can be fixed by various schemes, such as recomputation from a checkpoint. Silent errors are errors in application state that have escaped low-level error detection. At extreme scale, where machines can perform astronomically many operations per second, silent errors threaten the validity of computed results. <\/jats:p><jats:p> We propose a new paradigm for detecting silent errors at the application level. Our central idea is to frequently compare computed values to those provided by a cheap checking computation, and to build error detectors based on the difference between the two output sequences. Numerical analysis provides us with usable checking computations for the solution of initial-value problems in ODEs and PDEs, arguably the most common problems in computational science. Here, we provide, optimize, and test methods based on Runge\u2013Kutta and linear multistep methods for ODEs, and on implicit and explicit finite difference schemes for PDEs. We take the heat equation and Navier\u2013Stokes equations as examples. In tests with artificially injected errors, this approach effectively detects almost all meaningful errors, without significant slowdown. <\/jats:p>","DOI":"10.1177\/1094342014532297","type":"journal-article","created":{"date-parts":[[2014,4,26]],"date-time":"2014-04-26T03:32:42Z","timestamp":1398483162000},"page":"403-421","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":29,"title":["Silent error detection in numerical time-stepping schemes"],"prefix":"10.1177","volume":"29","author":[{"given":"Austin R","family":"Benson","sequence":"first","affiliation":[{"name":"Institute for Computational and Mathematical Engineering, Stanford University, CA, USA"},{"name":"HP Labs, Palo Alto, CA, USA"}]},{"given":"Sven","family":"Schmit","sequence":"additional","affiliation":[{"name":"Institute for Computational and Mathematical Engineering, Stanford University, CA, USA"}]},{"given":"Robert","family":"Schreiber","sequence":"additional","affiliation":[{"name":"HP Labs, Palo Alto, CA, USA"}]}],"member":"179","published-online":{"date-parts":[[2014,4,25]]},"reference":[{"key":"bibr1-1094342014532297","doi-asserted-by":"crossref","unstructured":"Balay S, Brown J, Buschelman K, (2013) PETSc users manual, revision 3.4. Available at: http:\/\/www.mcs.anl.gov\/petsc\/petsc-current\/docs\/manual.pdf (accessed 24 March 2014).","DOI":"10.2172\/1178104"},{"key":"bibr2-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9991(84)90073-1"},{"key":"bibr3-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1145\/1375527.1375552"},{"key":"bibr4-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1177\/1094342009347767"},{"key":"bibr5-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304590"},{"key":"bibr6-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1177\/1094342010391989"},{"key":"bibr7-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1016\/0771-050X(80)90013-3"},{"key":"bibr8-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2012.04.023"},{"key":"bibr9-1094342014532297","unstructured":"Fehlberg E (1969) Low-order classical Runge-Kutta formulas with stepsize control and their application to some heat transfer problems. Technical report R-315, National Aeronautics and Space Administration. Available at: http:\/\/ntrs.nasa.gov\/archive\/nasa\/casi.ntrs.nasa.gov\/19690021375.pdf"},{"journal-title":"Provocative ideas session, ASPLOS","year":"2013","author":"Fujita H","key":"bibr10-1094342014532297"},{"key":"bibr11-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1515\/9780691218632"},{"key":"bibr12-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1145\/1089014.1089021"},{"key":"bibr13-1094342014532297","unstructured":"Hoemmen M, Heroux MA (2011) Fault-tolerant iterative methods via selective reliability. Technical report SAND2011-3915 C, Sandia National Laboratories."},{"key":"bibr14-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1984.1676475"},{"key":"bibr15-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1145\/882082.882086"},{"key":"bibr16-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1145\/1966445.1966477"},{"key":"bibr17-1094342014532297","unstructured":"Rinard M (2013) Parallel synchronization-free approximate data structure construction. In: Proceedings of the 5th USENIX workshop on hot topics in parallelism. Available at: https:\/\/www.usenix.org\/conference\/hotpar13\/parallel-synchronization-free-approximate-data-structure-construction (accessed 1 August 2013)."},{"key":"bibr18-1094342014532297","unstructured":"Seibold B (2008) A compact and fast Matlab code solving the incompressible Navier-Stokes equations on rectangular domains. Available at: http:\/\/math.mit.edu\/cse\/codes\/mit18086_navierstokes.pdf. (accessed 15 January 2014)"},{"volume-title":"Proceedings of the symposium on application accelerators in high-performance computing","year":"2009","author":"Shi G","key":"bibr19-1094342014532297"},{"key":"bibr20-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1177\/1094342014522573"},{"volume-title":"Computational Science and Engineering","year":"2007","author":"Strang G","key":"bibr21-1094342014532297"},{"volume-title":"Finite Difference Schemes and Partial Differential Equations","year":"2007","author":"Strikwerda J","key":"bibr22-1094342014532297"},{"key":"bibr23-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1021\/ct400489c"},{"key":"bibr24-1094342014532297","doi-asserted-by":"publisher","DOI":"10.1109\/DSNW.2012.6264677"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342014532297","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342014532297","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342014532297","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T16:29:32Z","timestamp":1741019372000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342014532297"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,4,25]]},"references-count":24,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2015,11]]}},"alternative-id":["10.1177\/1094342014532297"],"URL":"https:\/\/doi.org\/10.1177\/1094342014532297","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2014,4,25]]}}}