{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T02:06:30Z","timestamp":1776996390293,"version":"3.51.4"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","license":[{"start":{"date-parts":[[2017,10,10]],"date-time":"2017-10-10T00:00:00Z","timestamp":1507593600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100006132","name":"DOE Office of Science","doi-asserted-by":"crossref","award":["DE-AC52-07NA27344"],"award-info":[{"award-number":["DE-AC52-07NA27344"]}],"id":[{"id":"10.13039\/100006132","id-type":"DOI","asserted-by":"crossref"}]},{"name":"DOE ASCR","award":["LLNL-CONF-656877"],"award-info":[{"award-number":["LLNL-CONF-656877"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2017,10,31]]},"abstract":"<jats:p>Modern embedded systems are becoming more reliant on real-valued arithmetic as they employ mathematically complex vision algorithms and sensor signal processing. Double-precision floating point is the most commonly used precision in computer vision algorithm implementations. A single-precision floating point can provide a performance boost due to less memory transfers, less cache occupancy, and relatively faster mathematical operations on some architectures. However, adopting it can result in loss of accuracy. Identifying which parts of the program can run in single-precision floating point with low impact on error is a manual and tedious process. In this paper, we propose an automatic approach to identify parts of the program that have a low impact on error using shadow-value analysis. 
Our approach provides the user with a performance\/error tradeoff, using which the user can decide how much accuracy can be sacrificed in return for performance improvement. We illustrate the impact of the approach using a well known implementation of Apriltag detection used in robotics vision. We demonstrate that an average 1.3x speedup can be achieved with no impact on tag detection, and a 1.7x speedup with only 4% false negatives.<\/jats:p>","DOI":"10.1145\/3126519","type":"journal-article","created":{"date-parts":[[2017,10,12]],"date-time":"2017-10-12T12:52:50Z","timestamp":1507812770000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Managing the Performance\/Error Tradeoff of Floating-point Intensive Applications"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9385-5862","authenticated-orcid":false,"given":"Ramy","family":"Medhat","sequence":"first","affiliation":[{"name":"University of Waterloo, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael O.","family":"Lam","sequence":"additional","affiliation":[{"name":"James Madison University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Barry L.","family":"Rountree","sequence":"additional","affiliation":[{"name":"Lawrence Livermore National Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Borzoo","family":"Bonakdarpour","sequence":"additional","affiliation":[{"name":"McMaster University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sebastian","family":"Fischmeister","sequence":"additional","affiliation":[{"name":"University of Waterloo, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,10,10]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Tensorflow: Large-scale machine learning on heterogeneous distributed systems. 
arXiv preprint arXiv:1603.04467","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, and others. 2016. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016)."},{"key":"e_1_2_1_2_1","volume-title":"https:\/\/april.eecs.umich.edu\/software\/apriltag.html. (2017). {Online","author":"University of Michigan APRIL Robotics Laboratory. 2017. AprilTags.","year":"2017","unstructured":"University of Michigan APRIL Robotics Laboratory. 2017. AprilTags. https:\/\/april.eecs.umich.edu\/software\/apriltag.html. (2017). {Online; accessed 03-April-2017}."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1809028.1806620"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488873"},{"key":"e_1_2_1_5_1","volume-title":"Training deep neural networks with low precision multiplications. arXiv preprint arXiv:1412.7024","author":"Courbariaux Matthieu","year":"2014","unstructured":"Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2014. Training deep neural networks with low precision multiplications. 
arXiv preprint arXiv:1412.7024 (2014)."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ETS.2013.6569370"},{"key":"e_1_2_1_7_1","volume-title":"Lulesh programming model and performance ports overview. Lawrence Livermore National Laboratory (LLNL)","author":"Karlin Ian","year":"2012","unstructured":"Ian Karlin, Abhinav Bhatele, Bradford L. Chamberlain, Jonathan Cohen, Zachary Devito, Maya Gokhale, Riyaz Haque, Rich Hornung, Jeff Keasler, Dan Laney, and others. 2012. Lulesh programming model and performance ports overview. Lawrence Livermore National Laboratory (LLNL), Livermore, CA, Tech. Rep (2012)."},{"key":"e_1_2_1_8_1","volume-title":"Hollingsworth","author":"Lam Michael O.","year":"2016","unstructured":"Michael O. Lam and Jeffrey K. Hollingsworth. 2016. Fine-Grained Floating-Point Precision Analysis. International Journal of High Performance Computing Applications (jun 2016), 1094342016652462."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2464996.2465018"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2012.08.002"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/3018823.3018826"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the conference on Design, Automation & Test in Europe. European Design and Automation Association, 95","author":"Liu Cong","year":"2014","unstructured":"Cong Liu, Jie Han, and Fabrizio Lombardi. 2014. 
A low-power, high-performance approximate multiplier with configurable partial error recovery. In Proceedings of the conference on Design, Automation & Test in Europe. European Design and Automation Association, 95."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1064978.1065034"},{"key":"e_1_2_1_14_1","volume-title":"Microsoft Cognitive Toolkit. https:\/\/www.microsoft.com\/en-us\/research\/product\/cognitive-toolkit\/. (2017). {Online","year":"2017","unstructured":"Microsoft. 2017. Microsoft Cognitive Toolkit. https:\/\/www.microsoft.com\/en-us\/research\/product\/cognitive-toolkit\/. (2017). {Online; accessed 03-April-2017}."},{"key":"e_1_2_1_15_1","volume-title":"Workshop on Approximate Computing Across the System Stack (WACAS).","author":"Mishra Asit K.","year":"2014","unstructured":"Asit K. Mishra, Rajkishore Barik, and Somnath Paul. 2014. iACT: A software-hardware framework for understanding the scope of approximate computing. In Workshop on Approximate Computing Across the System Stack (WACAS)."},{"key":"e_1_2_1_16_1","volume-title":"Matos","author":"Neves Ricardo","year":"2013","unstructured":"Ricardo Neves and Anibal C. Matos. 2013. 
Raspberry PI based stereo vision for small size ASVs. In Oceans-San Diego, 2013. IEEE, 1--6."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2011.5979561"},{"key":"e_1_2_1_18_1","volume-title":"PennCOSYVIO Data Set. https:\/\/daniilidis-group.github.io\/penncosyvio\/. (2017). {Online","author":"Pfrommer Bernd","year":"2017","unstructured":"Bernd Pfrommer. 2017. PennCOSYVIO Data Set. https:\/\/daniilidis-group.github.io\/penncosyvio\/. (2017). {Online; accessed 03-April-2017}."},{"key":"e_1_2_1_19_1","unstructured":"Quanser. 2017. QUARC Real-Time Control Software. http:\/\/www.quanser.com\/Products\/quarc. (2017). {Online; accessed 03-April-2017}."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884850"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503296"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993316.1993518"},{"key":"e_1_2_1_23_1","volume-title":"https:\/\/01.org\/powertop. (2017). {Online","author":"Source Intel Open","year":"2017","unstructured":"Intel Open Source. 2017. Powertop. https:\/\/01.org\/powertop. (2017). {Online; accessed 03-April-2017}."},{"key":"e_1_2_1_24_1","volume-title":"https:\/\/01.org\/rapl-power-meter. (2017). {Online","author":"Source Intel Open","year":"2017","unstructured":"Intel Open Source. 2017. RAPL. https:\/\/01.org\/rapl-power-meter. (2017). 
{Online; accessed 03-April-2017}."}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3126519","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3126519","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3126519","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:05:01Z","timestamp":1750273501000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3126519"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,10]]},"references-count":24,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2017,10,31]]}},"alternative-id":["10.1145\/3126519"],"URL":"https:\/\/doi.org\/10.1145\/3126519","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,10,10]]},"assertion":[{"value":"2017-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}