{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,28]],"date-time":"2025-09-28T04:15:06Z","timestamp":1759032906544,"version":"3.41.0"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA","license":[{"start":{"date-parts":[[2017,10,12]],"date-time":"2017-10-12T00:00:00Z","timestamp":1507766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CCF-1422471,CCF-1223850,CCF-1218344,CCF-1116289,CCF-0954024,IIS-1617157"],"award-info":[{"award-number":["CCF-1422471,CCF-1223850,CCF-1218344,CCF-1116289,CCF-0954024,IIS-1617157"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006831","name":"U.S. Air Force","doi-asserted-by":"publisher","award":["FA8750-15-2-0075"],"award-info":[{"award-number":["FA8750-15-2-0075"]}],"id":[{"id":"10.13039\/100006831","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006112","name":"Microsoft Research","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006112","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2017,10,12]]},"abstract":"<jats:p>Localizing type errors is challenging in languages with global type inference, as the type checker must make assumptions about what the programmer intended to do. We introduce Nate, a data-driven approach to error localization based on supervised learning. Nate analyzes a large corpus of training data -- pairs of ill-typed programs and their \"fixed\" versions -- to automatically learn a model of where the error is most likely to be found. 
Given a new ill-typed program, Nate executes the model to generate a list of potential blame assignments ranked by likelihood. We evaluate Nate by comparing its precision to the state of the art on a set of over 5,000 ill-typed OCaml programs drawn from two instances of an introductory programming course. We show that when the top-ranked blame assignment is considered, Nate's data-driven model is able to correctly predict the exact sub-expression that should be changed 72% of the time, 28 points higher than OCaml and 16 points higher than the state-of-the-art SHErrLoc tool. Furthermore, Nate's accuracy surpasses 85% when we consider the top two locations and reaches 91% if we consider the top three.<\/jats:p>","DOI":"10.1145\/3138818","type":"journal-article","created":{"date-parts":[[2017,10,13]],"date-time":"2017-10-13T15:15:45Z","timestamp":1507907745000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":21,"title":["Learning to blame: localizing novice type errors with data-driven diagnosis"],"prefix":"10.1145","volume":"1","author":[{"given":"Eric L.","family":"Seidel","sequence":"first","affiliation":[{"name":"University of California at San Diego, USA"}]},{"given":"Huma","family":"Sibghat","sequence":"additional","affiliation":[{"name":"University of California at San Diego, USA"}]},{"given":"Kamalika","family":"Chaudhuri","sequence":"additional","affiliation":[{"name":"University of California at San Diego, USA"}]},{"given":"Westley","family":"Weimer","sequence":"additional","affiliation":[{"name":"University of Virginia, USA"}]},{"given":"Ranjit","family":"Jhala","sequence":"additional","affiliation":[{"name":"University of California at San Diego, 
USA"}]}],"member":"320","published-online":{"date-parts":[[2017,10,12]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/PRDC.2006.18"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAIC.PART.2007.13"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/176454.176460"},{"key":"e_1_2_2_5_1","volume-title":"Proceedings of the 33rd International Conference on Machine Learning (ICML \u201916)","author":"Bielik Pavol","year":"2016","unstructured":"Pavol Bielik, Veselin Raychev, and Martin Vechev. 2016. PHOG: Probabilistic Model for Code. In Proceedings of the 33rd International Conference on Machine Learning (ICML \u201916)."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"volume-title":"Classification and regression trees","author":"Breiman Leo","key":"e_1_2_2_7_1","unstructured":"Leo Breiman, Jerome Friedman, Charles J Stone, and Richard A Olshen. 1984. Classification and regression trees. CRC press."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2002.1029005"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2535838.2535863"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-07151-0_3"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/507635.507659"},{"key":"e_1_2_2_12_1","volume-title":"Preproceedings of the 15th Symposium on Trends in Functional Programming.","author":"Christiansen David Raymond","year":"2014","unstructured":"David Raymond Christiansen. 2014. Reflect on your mistakes! Lightweight domain-specific error messages. In Preproceedings of the 15th Symposium on Trends in Functional Programming."},
{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/512950.512973"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-6423(95)00007-0"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/231379.231387"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0031619"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1882291.1882315"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/11431664_5"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-36575-3_20"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74130-5_12"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2009.36"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-84858-7"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/944705.944707"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/2337223.2337322"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227135"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/581339.581397"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0956796800000599"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993316.1993550"},{"key":"e_1_2_2_29_1","volume-title":"Adam: A Method for Stochastic Optimization","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. (22 Dec. 2014).
arXiv: 1412.6980"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2931037.2931051"},{"key":"e_1_2_2_31_1","first-page":"249","article-title":"Supervised Machine Learning: A Review of Classification Techniques","volume":"31","author":"Kotsiantis S B","year":"2007","unstructured":"S B Kotsiantis. 2007. Supervised Machine Learning: A Review of Classification Techniques. Informatica 31, 3 (2007), 249\u2013268.","journal-title":"Informatica"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44898-5_16"},{"key":"e_1_2_2_33_1","volume-title":"Content Analysis: An Introduction to Its Methodology","author":"Krippendorff K","year":"2012","unstructured":"K Krippendorff. 2012. Content Analysis: An Introduction to Its Methodology. SAGE Publications."},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.2307\/2529310"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/291891.291892"},{"key":"e_1_2_2_36_1","doi-asserted-by":"crossref","unstructured":"Eelco Lempsink. 2009. Generic type-safe diff and patch for families of datatypes. Master\u2019s thesis.
Universiteit Utrecht.","DOI":"10.1145\/1596614.1596624"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250734.1250783"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983990.2983994"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177730491"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-48515-5_9"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/3104322.3104425"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/357073.357079"},{"volume-title":"Neural Networks and Deep Learning","author":"Nielsen Michael A","key":"e_1_2_2_43_1","unstructured":"Michael A Nielsen. 2015. Neural Networks and Deep Learning. Determination Press."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2660193.2660230"},{"key":"e_1_2_2_45_1","unstructured":"John Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann."},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2676726.2677009"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2594291.2594321"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-32037-8_1"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.806813"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2951913.2951915"},{"key":"e_1_2_2_52_1","volume-title":"Learning to Blame: Localizing Novice Type Errors with Data-Driven Diagnosis","author":"Seidel Eric L.","year":"2017","unstructured":"Eric L.
Seidel, Huma Sibghat, Kamalika Chaudhuri, Westley Weimer, and Ranjit Jhala. 2017. Learning to Blame: Localizing Novice Type Errors with Data-Driven Diagnosis. (Aug. 2017). arXiv: 1708.07583"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-49498-1_26"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/366378.366379"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/512644.512648"},{"volume-title":"Selected Papers from the 1st Scottish Functional Programming Workshop (SFP \u201999)","author":"Yang Jun","key":"e_1_2_2_57_1","unstructured":"Jun Yang. 1999. Explaining Type Errors by Finding the Source of a Type Conflict. In Selected Papers from the 1st Scottish Functional Programming Workshop (SFP \u201999). Intellect Books, Exeter, UK, 59\u201367."},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491509.2491513"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2535838.2535870"}],"container-title":["Proceedings of the ACM on Programming
Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3138818","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3138818","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3138818","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:04:33Z","timestamp":1750273473000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3138818"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,12]]},"references-count":56,"journal-issue":{"issue":"OOPSLA","published-print":{"date-parts":[[2017,10,12]]}},"alternative-id":["10.1145\/3138818"],"URL":"https:\/\/doi.org\/10.1145\/3138818","relation":{},"ISSN":["2475-1421"],"issn-type":[{"type":"electronic","value":"2475-1421"}],"subject":[],"published":{"date-parts":[[2017,10,12]]},"assertion":[{"value":"2017-10-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}