{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T23:16:37Z","timestamp":1771024597599,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":65,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,10,27]],"date-time":"2019-10-27T00:00:00Z","timestamp":1572134400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,10,27]]},"DOI":"10.1145\/3341301.3359645","type":"proceedings-article","created":{"date-parts":[[2019,10,21]],"date-time":"2019-10-21T13:34:22Z","timestamp":1571664862000},"page":"114-130","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["CrashTuner"],"prefix":"10.1145","author":[{"given":"Jie","family":"Lu","sequence":"first","affiliation":[{"name":"University of Chinese Academy of Sciences, China"}]},{"given":"Chen","family":"Liu","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, China"}]},{"given":"Lian","family":"Li","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, China"}]},{"given":"Xiaobing","family":"Feng","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, China"}]},{"given":"Feng","family":"Tan","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Jun","family":"Yang","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Liang","family":"You","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]}],"member":"320","published-online":{"date-parts":[[2019,10,27]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Java bytecode engineering toolkit 
since","year":"1999","unstructured":"1999. Java bytecode engineering toolkit since 1999 . https:\/\/www.javassist.org\/. 1999. Java bytecode engineering toolkit since 1999. https:\/\/www.javassist.org\/."},{"key":"e_1_3_2_1_2_1","unstructured":"2012. Downtime costs per Hour. http:\/\/iwgcr.org\/?p=404.  2012. Downtime costs per Hour. http:\/\/iwgcr.org\/?p=404."},{"key":"e_1_3_2_1_3_1","unstructured":"2012. MapReduce bug 3858. https:\/\/jira.apache.org\/jira\/browse\/MAPREDUCE-3858.  2012. MapReduce bug 3858. https:\/\/jira.apache.org\/jira\/browse\/MAPREDUCE-3858."},{"key":"e_1_3_2_1_4_1","unstructured":"2015. Understanding HDFS Recovery Processes. https:\/\/blog.cloudera.com\/blog\/2015\/02\/understanding-hdfs-recovery-processes-part-1\/.  2015. Understanding HDFS Recovery Processes. https:\/\/blog.cloudera.com\/blog\/2015\/02\/understanding-hdfs-recovery-processes-part-1\/."},{"key":"e_1_3_2_1_5_1","unstructured":"2015. WALA Home page. http:\/\/wala.sourceforge.net\/wiki\/index.php\/Main_Page\/.  2015. WALA Home page. http:\/\/wala.sourceforge.net\/wiki\/index.php\/Main_Page\/."},{"key":"e_1_3_2_1_6_1","unstructured":"2016. Fault Injection Framework and Development Guide. https:\/\/hadoop.apache.org\/docs\/r2.7.2\/hadoop-project-dist\/hadoop-hdfs\/FaultInjectFramework.html.  2016. Fault Injection Framework and Development Guide. https:\/\/hadoop.apache.org\/docs\/r2.7.2\/hadoop-project-dist\/hadoop-hdfs\/FaultInjectFramework.html."},{"key":"e_1_3_2_1_7_1","unstructured":"2016. Scheduling of opportunistic containers. https:\/\/issues.apache.org\/jira\/browse\/YARN-5542.  2016. Scheduling of opportunistic containers. https:\/\/issues.apache.org\/jira\/browse\/YARN-5542."},{"key":"e_1_3_2_1_8_1","unstructured":"2016. YARN bug 5918. https:\/\/jira.apache.org\/jira\/browse\/YARN-5918.  2016. YARN bug 5918. https:\/\/jira.apache.org\/jira\/browse\/YARN-5918."},{"key":"e_1_3_2_1_9_1","unstructured":"2018. Lloyd's Estimates the Impact of a U.S. 
Cloud Outage at $19 Billion. https:\/\/www.eweek.com\/cloud\/lloyd-s-estimates-the-impact-of-a-u.s.-cloud-outage-at-19-billion.  2018. Lloyd's Estimates the Impact of a U.S. Cloud Outage at $19 Billion. https:\/\/www.eweek.com\/cloud\/lloyd-s-estimates-the-impact-of-a-u.s.-cloud-outage-at-19-billion."},{"key":"e_1_3_2_1_10_1","unstructured":"2019. Apache log4j a logging library for Java. http:\/\/logging.apache.org\/log4j\/2.x\/.  2019. Apache log4j a logging library for Java. http:\/\/logging.apache.org\/log4j\/2.x\/."},{"key":"e_1_3_2_1_11_1","unstructured":"2019. Centralize Transform & Stash Your Data. https:\/\/www.elastic.co\/products\/logstash  2019. Centralize Transform & Stash Your Data. https:\/\/www.elastic.co\/products\/logstash"},{"key":"e_1_3_2_1_12_1","unstructured":"2019. Go is an open source programming language that makes it easy to build simple reliable and efficient software. https:\/\/golang.org\/  2019. Go is an open source programming language that makes it easy to build simple reliable and efficient software. https:\/\/golang.org\/"},{"key":"e_1_3_2_1_13_1","unstructured":"2019. HintedHandoff. https:\/\/wiki.apache.org\/cassandra\/HintedHandof.  2019. HintedHandoff. https:\/\/wiki.apache.org\/cassandra\/HintedHandof."},{"key":"e_1_3_2_1_14_1","unstructured":"2019. Simple logging facade for Java (SLF4J). http:\/\/www.slf4j.org\/.  2019. Simple logging facade for Java (SLF4J). http:\/\/www.slf4j.org\/."},{"key":"e_1_3_2_1_15_1","unstructured":"2019. What is Kubernetes. https:\/\/kubernetes.io\/docs\/concepts\/overview\/what-is-kubernetes\/  2019. What is Kubernetes. https:\/\/kubernetes.io\/docs\/concepts\/overview\/what-is-kubernetes\/"},{"key":"e_1_3_2_1_16_1","unstructured":"2019. Write Ahead Log (WAL). http:\/\/hbase.apache.org\/book.html#wal.  2019. Write Ahead Log (WAL). http:\/\/hbase.apache.org\/book.html#wal."},{"key":"e_1_3_2_1_17_1","unstructured":"2019. ZooKeeperSmoketest. https:\/\/github.com\/phunt\/zk-smoketest.  2019. 
ZooKeeperSmoketest. https:\/\/github.com\/phunt\/zk-smoketest."},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15)","author":"Alvaro Peter","unstructured":"Peter Alvaro , Joshua Rosen , and Joseph M. Hellerstein . 2015. Lineage-driven Fault Injection . In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15) . ACM, New York, NY, USA, 331--346. Peter Alvaro, Joshua Rosen, and Joseph M. Hellerstein. 2015. Lineage-driven Fault Injection. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, New York, NY, USA, 331--346."},{"key":"e_1_3_2_1_19_1","unstructured":"Dhruba Borthakur et al. 2008. HDFS architecture guide. Hadoop Apache Project 53 (2008).  Dhruba Borthakur et al. 2008. HDFS architecture guide. Hadoop Apache Project 53 (2008)."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3236024.3236030"},{"key":"e_1_3_2_1_22_1","volume-title":"HBase: the definitive guide: random access to your planet-size data. \"O'Reilly Media","author":"George Lars","unstructured":"Lars George . 2011. HBase: the definitive guide: random access to your planet-size data. \"O'Reilly Media , Inc .\". Lars George. 2011. HBase: the definitive guide: random access to your planet-size data. \"O'Reilly Media, Inc.\"."},{"key":"e_1_3_2_1_23_1","volume-title":"Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI '11)","author":"Gunawi Haryadi S.","year":"2011","unstructured":"Haryadi S. Gunawi , Thanh Do , Pallavi Joshi , Peter Alvaro , Joseph M. Hellerstein , Andrea C. Arpaci-Dusseau , Remzi H. Arpaci-Dusseau , Koushik Sen , and Dhruba Borthakur . 2011 . FATE and DESTINI: A Framework for Cloud Recovery Testing . 
In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI '11) . USENIX Association, Berkeley, CA, USA, 238--252. Haryadi S. Gunawi, Thanh Do, Pallavi Joshi, Peter Alvaro, Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Koushik Sen, and Dhruba Borthakur. 2011. FATE and DESTINI: A Framework for Cloud Recovery Testing. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI '11). USENIX Association, Berkeley, CA, USA, 238--252."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/1924908.1924914"},{"key":"e_1_3_2_1_25_1","volume-title":"Proceedings of the ACM Symposium on Cloud Computing (SoCC '14)","author":"Gunawi Haryadi S.","unstructured":"Haryadi S. Gunawi , Mingzhe Hao , Tanakorn Leesatapornwongsa , Tiratat Patana-anake, Thanh Do , Jeffry Adityatama , Kurnia J. Eliazar , Agung Laksono , Jeffrey F. Lukman , Vincentius Martin , and Anang D. Satria . 2014. What Bugs Live in the Cloud? A Study of 3000+ Issues in Cloud Systems . In Proceedings of the ACM Symposium on Cloud Computing (SoCC '14) . ACM, New York, NY, USA, Article 7, 14 pages. Haryadi S. Gunawi, Mingzhe Hao, Tanakorn Leesatapornwongsa, Tiratat Patana-anake, Thanh Do, Jeffry Adityatama, Kurnia J. Eliazar, Agung Laksono, Jeffrey F. Lukman, Vincentius Martin, and Anang D. Satria. 2014. What Bugs Live in the Cloud? A Study of 3000+ Issues in Cloud Systems. In Proceedings of the ACM Symposium on Cloud Computing (SoCC '14). ACM, New York, NY, USA, Article 7, 14 pages."},{"key":"e_1_3_2_1_26_1","volume-title":"Proceedings of the Seventh ACM Symposium on Cloud Computing (SoCC '16)","author":"Gunawi Haryadi S.","unstructured":"Haryadi S. Gunawi , Mingzhe Hao , Riza O. Suminto , Agung Laksono , Anang D. Satria , Jeffry Adityatama , and Kurnia J. Eliazar . 2016. Why Does the Cloud Stop Computing?: Lessons from Hundreds of Service Outages . 
In Proceedings of the Seventh ACM Symposium on Cloud Computing (SoCC '16) . ACM, New York, NY, USA, 1--16. Haryadi S. Gunawi, Mingzhe Hao, Riza O. Suminto, Agung Laksono, Anang D. Satria, Jeffry Adityatama, and Kurnia J. Eliazar. 2016. Why Does the Cloud Stop Computing?: Lessons from Hundreds of Service Outages. In Proceedings of the Seventh ACM Symposium on Cloud Computing (SoCC '16). ACM, New York, NY, USA, 1--16."},{"key":"e_1_3_2_1_27_1","first-page":"23","article-title":"Fail-slow at scale: Evidence of hardware performance faults in large production systems","volume":"14","author":"Gunawi Haryadi S","year":"2018","unstructured":"Haryadi S Gunawi , Riza O Suminto , Russell Sears , Casey Golliher , Swaminathan Sundararaman , Xing Lin , Tim Emami , Weiguang Sheng , Nematollah Bidokhti , Caitie McCaffrey , 2018 . Fail-slow at scale: Evidence of hardware performance faults in large production systems . ACM Transactions on Storage (TOS) 14 , 3 (2018), 23 . Haryadi S Gunawi, Riza O Suminto, Russell Sears, Casey Golliher, Swaminathan Sundararaman, Xing Lin, Tim Emami, Weiguang Sheng, Nematollah Bidokhti, Caitie McCaffrey, et al. 2018. Fail-slow at scale: Evidence of hardware performance faults in large production systems. ACM Transactions on Storage (TOS) 14, 3 (2018), 23.","journal-title":"ACM Transactions on Storage (TOS)"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2043556.2043582"},{"key":"e_1_3_2_1_29_1","volume-title":"Proceedings of the 34rd ACM\/IEEE International Conference on Automated Software Engineering (ASE '19)","author":"Haicheng Chen","year":"2019","unstructured":"Chen Haicheng , Dou Wensheng , Jiang Yanyan , and Qin Feng . 2019 . Understanding Exception-Related Bugs in Large-Scale Cloud Systems . In Proceedings of the 34rd ACM\/IEEE International Conference on Automated Software Engineering (ASE '19) . ACM. Chen Haicheng, Dou Wensheng, Jiang Yanyan, and Qin Feng. 2019. 
Understanding Exception-Related Bugs in Large-Scale Cloud Systems. In Proceedings of the 34rd ACM\/IEEE International Conference on Automated Software Engineering (ASE '19). ACM."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/3291168.3291170"},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC '10). USENIX Association","author":"Hunt Patrick","year":"2010","unstructured":"Patrick Hunt , Mahadev Konar , Flavio P. Junqueira , and Benjamin Reed . 2010 . ZooKeeper: Wait-free Coordination for Internet-scale Systems . In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC '10). USENIX Association , Berkeley, CA, USA, 11--11. Patrick Hunt, Mahadev Konar, Flavio P. Junqueira, and Benjamin Reed. 2010. ZooKeeper: Wait-free Coordination for Internet-scale Systems. In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC '10). USENIX Association, Berkeley, CA, USA, 11--11."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2524211.2524217"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2048066.2048082"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2523616.2523622"},{"key":"e_1_3_2_1_35_1","volume-title":"Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI'07)","author":"Killian Charles","year":"2007","unstructured":"Charles Killian , James W. Anderson , Ranjit Jhala , and Amin Vahdat . 2007 . Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code . In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI'07) . USENIX Association, Berkeley, CA, USA, 18--18. Charles Killian, James W. Anderson, Ranjit Jhala, and Amin Vahdat. 2007. Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code. 
In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI'07). USENIX Association, Berkeley, CA, USA, 18--18."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1773912.1773922"},{"key":"e_1_3_2_1_37_1","volume-title":"Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA '15)","author":"Leesatapornwongsa Tanakorn","unstructured":"Tanakorn Leesatapornwongsa and Haryadi S. Gunawi . 2015. SAMC: A Fast Model Checker for Finding Heisenbugs in Distributed Systems (Demo) . In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA '15) . ACM, New York, NY, USA, 423--427. Tanakorn Leesatapornwongsa and Haryadi S. Gunawi. 2015. SAMC: A Fast Model Checker for Finding Heisenbugs in Distributed Systems (Demo). In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA '15). ACM, New York, NY, USA, 423--427."},{"key":"e_1_3_2_1_38_1","volume-title":"Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14)","author":"Leesatapornwongsa Tanakorn","unstructured":"Tanakorn Leesatapornwongsa , Mingzhe Hao , Pallavi Joshi , Jeffrey F. Lukman , and Haryadi S. Gunawi . 2014. SAMC: Semantic-aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems . In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14) . USENIX Association, Berkeley, CA, USA, 399--414. Tanakorn Leesatapornwongsa, Mingzhe Hao, Pallavi Joshi, Jeffrey F. Lukman, and Haryadi S. Gunawi. 2014. SAMC: Semantic-aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14). 
USENIX Association, Berkeley, CA, USA, 399--414."},{"key":"e_1_3_2_1_39_1","volume-title":"Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '16)","author":"Leesatapornwongsa Tanakorn","unstructured":"Tanakorn Leesatapornwongsa , Jeffrey F. Lukman , Shan Lu , and Haryadi S. Gunawi . 2016. TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems . In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '16) . ACM, New York, NY, USA, 517--530. Tanakorn Leesatapornwongsa, Jeffrey F. Lukman, Shan Lu, and Haryadi S. Gunawi. 2016. TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '16). ACM, New York, NY, USA, 517--530."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2025113.2025160"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491894.2466483"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037735"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173162.3177161"},{"key":"e_1_3_2_1_44_1","volume-title":"5th USENIX Symposium on Networked Systems Design & Implementation (NSDI '08)","author":"Liu Xuezheng","year":"2008","unstructured":"Xuezheng Liu , Zhenyu Guo , Xi Wang , Feibo Chen , Xiaochen Lian , Jian Tang , Ming Wu , M. Frans Kaashoek , and Zheng Zhang . 2008 . D3S: Debugging Deployed Distributed Systems . In 5th USENIX Symposium on Networked Systems Design & Implementation (NSDI '08) . USENIX Association, 423--437. Xuezheng Liu, Zhenyu Guo, Xi Wang, Feibo Chen, Xiaochen Lian, Jian Tang, Ming Wu, M. Frans Kaashoek, and Zheng Zhang. 2008. D3S: Debugging Deployed Distributed Systems. 
In 5th USENIX Symposium on Networked Systems Design & Implementation (NSDI '08). USENIX Association, 423--437."},{"key":"e_1_3_2_1_45_1","volume-title":"Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI '07)","author":"Liu Xuezheng","year":"2007","unstructured":"Xuezheng Liu , Wei Lin , Aimin Pan , and Zheng Zhang . 2007 . WiDS Checker: Combating Bugs in Distributed Systems . In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI '07) . USENIX Association, Berkeley, CA, USA, 19--19. Xuezheng Liu, Wei Lin, Aimin Pan, and Zheng Zhang. 2007. WiDS Checker: Combating Bugs in Distributed Systems. In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI '07). USENIX Association, Berkeley, CA, USA, 19--19."},{"key":"e_1_3_2_1_46_1","volume-title":"Understanding Node Change Bugs for Distributed Systems. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 399--410","author":"Lu Jie","year":"2019","unstructured":"Jie Lu , Liu Chen , Lian Li , and Xiaobing Feng . 2019 . Understanding Node Change Bugs for Distributed Systems. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 399--410 . Jie Lu, Liu Chen, Lian Li, and Xiaobing Feng. 2019. Understanding Node Change Bugs for Distributed Systems. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 399--410."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3236024.3236071"},{"key":"e_1_3_2_1_48_1","volume-title":"Proceedings of the Fourteenth EuroSys Conference 2019 (EuroSys '19)","author":"Lukman Jeffrey F.","unstructured":"Jeffrey F. Lukman , Huan Ke , Cesar A. Stuardo , Riza O. Suminto , Daniar H. 
Kurniawan , Dikaimin Simon , Satria Priambada , Chen Tian , Feng Ye , Tanakorn Leesatapornwongsa , Aarti Gupta , Shan Lu , and Haryadi S. Gunawi . 2019. FlyMC: Highly Scalable Testing of Complex Interleavings in Distributed Systems . In Proceedings of the Fourteenth EuroSys Conference 2019 (EuroSys '19) . ACM, New York, NY, USA, 1--16. Jeffrey F. Lukman, Huan Ke, Cesar A. Stuardo, Riza O. Suminto, Daniar H. Kurniawan, Dikaimin Simon, Satria Priambada, Chen Tian, Feng Ye, Tanakorn Leesatapornwongsa, Aarti Gupta, Shan Lu, and Haryadi S. Gunawi. 2019. FlyMC: Highly Scalable Testing of Complex Interleavings in Distributed Systems. In Proceedings of the Fourteenth EuroSys Conference 2019 (EuroSys '19). ACM, New York, NY, USA, 1--16."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICWR.2017.7959305"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.5555\/3291168.3291172"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.5555\/2228298.2228334"},{"key":"e_1_3_2_1_52_1","unstructured":"Biswaranjan Panda Deepthi Srinivasan Huan Ke Karan Gupta Vinayak Khot and Haryadi S Gunawi. 2019. {IASO}: A Fail-Slow Detection and Mitigation Framework for Distributed Storage Services. In 2019 {USENIX} Annual Technical Conference ({USENIX}{ATC} 19). 47--62.  Biswaranjan Panda Deepthi Srinivasan Huan Ke Karan Gupta Vinayak Khot and Haryadi S Gunawi. 2019. {IASO}: A Fail-Slow Detection and Mitigation Framework for Distributed Storage Services. In 2019 { USENIX } Annual Technical Conference ( { USENIX }{ ATC } 19). 47--62."},{"key":"e_1_3_2_1_53_1","volume-title":"All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14)","author":"Pillai Thanumalayan Sankaranarayana","unstructured":"Thanumalayan Sankaranarayana Pillai , Vijay Chidambaram , Ramnatthan Alagappan , Samer Al-Kiswany , Andrea C. 
Arpaci-Dusseau , and Remzi H . Arpaci-Dusseau. 2014 . All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14) . USENIX Association, 433--448. Thanumalayan Sankaranarayana Pillai, Vijay Chidambaram, Ramnatthan Alagappan, Samer Al-Kiswany, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2014. All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14). USENIX Association, 433--448."},{"key":"e_1_3_2_1_54_1","volume-title":"5th International Workshop on Systems Software Verification (SSV '10)","author":"Simsa Jir\u00ed","unstructured":"Jir\u00ed Simsa , Randy Bryant , and Garth A. Gibson . 2010. dBug: Systematic Evaluation of Distributed Systems . In 5th International Workshop on Systems Software Verification (SSV '10) . USENIX Association, 1--8. Jir\u00ed Simsa, Randy Bryant, and Garth A. Gibson. 2010. dBug: Systematic Evaluation of Distributed Systems. In 5th International Workshop on Systems Software Verification (SSV '10). USENIX Association, 1--8."},{"key":"e_1_3_2_1_55_1","volume-title":"Proceedings of the 17th USENIX Conference on File and Storage Technologies (FAST'19)","author":"Stuardo Cesar A.","unstructured":"Cesar A. Stuardo , Tanakorn Leesatapornwongsa , Riza O. Suminto , Huan Ke , Jeffrey F. Lukman , Wei-Chiu Chuang , Shan Lu , and Haryadi S. Gunawi . 2019. Scalecheck: A Single-machine Approach for Discovering Scalability Bugs in Large Distributed Systems . In Proceedings of the 17th USENIX Conference on File and Storage Technologies (FAST'19) . USENIX Association, Berkeley, CA, USA, 359--373. Cesar A. Stuardo, Tanakorn Leesatapornwongsa, Riza O. Suminto, Huan Ke, Jeffrey F. Lukman, Wei-Chiu Chuang, Shan Lu, and Haryadi S. Gunawi. 2019. 
Scalecheck: A Single-machine Approach for Discovering Scalability Bugs in Large Distributed Systems. In Proceedings of the 17th USENIX Conference on File and Storage Technologies (FAST'19). USENIX Association, Berkeley, CA, USA, 359--373."},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950296"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2523616.2523633"},{"key":"e_1_3_2_1_58_1","volume-title":"Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP '09)","author":"Xu Wei","unstructured":"Wei Xu , Ling Huang , Armando Fox , David Patterson , and Michael I. Jordan . 2009. Detecting Large-scale System Problems by Mining Console Logs . In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP '09) . ACM, New York, NY, USA, 117--132. Wei Xu, Ling Huang, Armando Fox, David Patterson, and Michael I. Jordan. 2009. Detecting Large-scale System Problems by Mining Console Logs. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP '09). ACM, New York, NY, USA, 117--132."},{"key":"e_1_3_2_1_59_1","volume-title":"Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI '09)","author":"Yang Junfeng","year":"2009","unstructured":"Junfeng Yang , Tisheng Chen , Ming Wu , Zhilei Xu , Xuezheng Liu , Haoxiang Lin , Mao Yang , Fan Long , Lintao Zhang , and Lidong Zhou . 2009 . MODIST: Transparent Model Checking of Unmodified Distributed Systems . In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI '09) . USENIX Association, Berkeley, CA, USA, 213--228. Junfeng Yang, Tisheng Chen, Ming Wu, Zhilei Xu, Xuezheng Liu, Haoxiang Lin, Mao Yang, Fan Long, Lintao Zhang, and Lidong Zhou. 2009. MODIST: Transparent Model Checking of Unmodified Distributed Systems. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI '09). 
USENIX Association, Berkeley, CA, USA, 213--228."},{"key":"e_1_3_2_1_60_1","volume-title":"Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation -","volume":"7","author":"Yang Junfeng","year":"2006","unstructured":"Junfeng Yang , Can Sar , and Dawson Engler . 2006 . EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors . In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7 (OSDI '06). USENIX Association, Berkeley, CA, USA, 1--10. Junfeng Yang, Can Sar, and Dawson Engler. 2006. EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7 (OSDI '06). USENIX Association, Berkeley, CA, USA, 1--10."},{"key":"e_1_3_2_1_61_1","volume-title":"Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14)","author":"Yuan Ding","year":"2014","unstructured":"Ding Yuan , Yu Luo , Xin Zhuang , Guilherme Renna Rodrigues , Xu Zhao , Yongle Zhang , Pranay U. Jain , and Michael Stumm . 2014 . Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-intensive Systems . In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14) . USENIX Association, Berkeley, CA, USA, 249--265. Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Renna Rodrigues, Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm. 2014. Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-intensive Systems. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14). 
USENIX Association, Berkeley, CA, USA, 249--265."},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132747.3132768"},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.5555\/3026877.3026924"},{"key":"e_1_3_2_1_64_1","volume-title":"Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14)","author":"Zhao Xu","year":"2014","unstructured":"Xu Zhao , Yongle Zhang , David Lion , Muhammad Faizan Ullah , Yu Luo , Ding Yuan , and Michael Stumm . 2014 . Lprof: A Non-intrusive Request Flow Profiler for Distributed Systems . In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14) . USENIX Association, Berkeley, CA, USA, 629--644. Xu Zhao, Yongle Zhang, David Lion, Muhammad Faizan Ullah, Yu Luo, Ding Yuan, and Michael Stumm. 2014. Lprof: A Non-intrusive Request Flow Profiler for Distributed Systems. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14). 
USENIX Association, Berkeley, CA, USA, 629--644."},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243176.3243206"}],"event":{"name":"SOSP '19: ACM SIGOPS 27th Symposium on Operating Systems Principles","location":"Huntsville Ontario Canada","acronym":"SOSP '19","sponsor":["SIGOPS ACM Special Interest Group on Operating Systems","USENIX Assoc USENIX Assoc"]},"container-title":["Proceedings of the 27th ACM Symposium on Operating Systems Principles"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3341301.3359645","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3341301.3359645","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:12:56Z","timestamp":1750201976000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3341301.3359645"}},"subtitle":["detecting crash-recovery bugs in cloud systems via meta-info analysis"],"short-title":[],"issued":{"date-parts":[[2019,10,27]]},"references-count":65,"alternative-id":["10.1145\/3341301.3359645","10.1145\/3341301"],"URL":"https:\/\/doi.org\/10.1145\/3341301.3359645","relation":{},"subject":[],"published":{"date-parts":[[2019,10,27]]},"assertion":[{"value":"2019-10-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}