{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T21:01:24Z","timestamp":1760043684286,"version":"3.40.3"},"publisher-location":"Cham","reference-count":19,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031435126"},{"type":"electronic","value":"9783031435133"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,14]],"date-time":"2023-09-14T00:00:00Z","timestamp":1694649600000},"content-version":"vor","delay-in-days":256,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We introduce a machine-learning-based tool for the Lean proof assistant that suggests relevant premises for theorems being proved by a user. The design principles for the tool are (1) tight integration with the proof assistant, (2) ease of use and installation, (3) a lightweight and fast approach. For this purpose, we designed a custom version of the random forest model, trained in an online fashion. It is implemented directly in Lean, which was possible thanks to the rich and efficient metaprogramming features of Lean 4. The random forest is trained on data extracted from  \u2013 Lean\u2019s mathematics library. We experiment with various options for producing training features and labels. The advice from a trained model is accessible to the user via the \n\"Image missing\"\n tactic which can be called in an editor while constructing a proof interactively.<\/jats:p>","DOI":"10.1007\/978-3-031-43513-3_10","type":"book-chapter","created":{"date-parts":[[2023,9,13]],"date-time":"2023-09-13T14:02:36Z","timestamp":1694613756000},"page":"175-186","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Machine-Learned Premise Selection for\u00a0Lean"],"prefix":"10.1007","author":[{"given":"Bartosz","family":"Piotrowski","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ramon Fern\u00e1ndez","family":"Mir","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Edward","family":"Ayers","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,9,14]]},"reference":[{"key":"10_CR1","doi-asserted-by":"publisher","unstructured":"Alama, J., Heskes, T., K\u00fchlwein, D., Tsivtsivadze, E., Urban, J.: Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reason. 52(2), 191\u2013213 (2014). https:\/\/doi.org\/10.1007\/s10817-013-9286-5","DOI":"10.1007\/s10817-013-9286-5"},{"key":"10_CR2","doi-asserted-by":"publisher","unstructured":"Blanchette, J.C., Greenaway, D., Kaliszyk, C., K\u00fchlwein, D., Urban, J.: A learning-based fact selector for Isabelle\/HOL. J. Autom. Reason. 57(3), 219\u2013244 (2016). https:\/\/doi.org\/10.1007\/s10817-016-9362-8","DOI":"10.1007\/s10817-016-9362-8"},{"key":"10_CR3","doi-asserted-by":"publisher","unstructured":"Blanchette, J.C., Kaliszyk, C., Paulson, L.C., Urban, J.: Hammering towards QED. J. Formaliz. Reason. 9(1), 101\u2013148 (2016). https:\/\/doi.org\/10.6092\/issn.1972-5787\/4593","DOI":"10.6092\/issn.1972-5787\/4593"},{"issue":"1","key":"10_CR4","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman, L.: Random forests. Mach. Learn. 45(1), 5\u201332 (2001)","journal-title":"Mach. Learn."},{"key":"10_CR5","doi-asserted-by":"publisher","unstructured":"The mathlib Community. The lean mathematical library. In: Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 367\u2013381. CPP 2020, Association for Computing Machinery, New York, NY, USA (2020). https:\/\/doi.org\/10.1145\/3372885.3373824","DOI":"10.1145\/3372885.3373824"},{"key":"10_CR6","series-title":"Springer Series in Statistics","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The Elements of Statistical Learning","author":"T Hastie","year":"2009","unstructured":"Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. SSS, Springer, New York (2009). https:\/\/doi.org\/10.1007\/978-0-387-84858-7"},{"key":"10_CR7","unstructured":"Irving, G., Szegedy, C., Alemi, A.A., E\u00e9n, N., Chollet, F., Urban, J.: DeepMath - deep sequence models for premise selection. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016(December), pp. 5\u201310, 2016. Barcelona, Spain, pp. 2235\u20132243 (2016), https:\/\/proceedings.neurips.cc\/paper\/2016\/hash\/f197002b9a0853eca5e046d9ca4663d5-Abstract.html"},{"key":"10_CR8","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"292","DOI":"10.1007\/978-3-319-62075-6_20","volume-title":"Intelligent Computer Mathematics","author":"J Jakub\u016fv","year":"2017","unstructured":"Jakub\u016fv, J., Urban, J.: ENIGMA: efficient learning-based inference guiding machine. In: Geuvers, H., England, M., Hasan, O., Rabe, F., Teschke, O. (eds.) CICM 2017. LNCS (LNAI), vol. 10383, pp. 292\u2013302. Springer, Cham (2017). https:\/\/doi.org\/10.1007\/978-3-319-62075-6_20"},{"key":"10_CR9","doi-asserted-by":"publisher","unstructured":"Kaliszyk, C., Urban, J.: Learning-assisted automated reasoning with Flyspeck. J. Autom. Reason. 53(2), 173\u2013213 (2014). https:\/\/doi.org\/10.1007\/s10817-014-9303-3","DOI":"10.1007\/s10817-014-9303-3"},{"key":"10_CR10","doi-asserted-by":"publisher","unstructured":"Kaliszyk, C., Urban, J.: MizAR 40 for Mizar 40. J. Autom. Reason. 55(3), 245\u2013256 (2015). https:\/\/doi.org\/10.1007\/s10817-015-9330-8","DOI":"10.1007\/s10817-015-9330-8"},{"key":"10_CR11","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1007\/978-3-642-31365-3_30","volume-title":"Automated Reasoning","author":"D K\u00fchlwein","year":"2012","unstructured":"K\u00fchlwein, D., van Laarhoven, T., Tsivtsivadze, E., Urban, J., Heskes, T.: Overview and evaluation of premise selection techniques for large theory mathematics. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012. LNCS (LNAI), vol. 7364, pp. 378\u2013392. Springer, Heidelberg (2012). https:\/\/doi.org\/10.1007\/978-3-642-31365-3_30"},{"key":"10_CR12","doi-asserted-by":"publisher","unstructured":"Mikula, M., et al.: Magnushammer: A transformer-based approach to premise selection. CoRR abs\/2303.04488 (2023).https:\/\/doi.org\/10.48550\/arXiv.2303.04488,https:\/\/doi.org\/10.48550\/arXiv.2303.04488","DOI":"10.48550\/arXiv.2303.04488"},{"key":"10_CR13","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"625","DOI":"10.1007\/978-3-030-79876-5_37","volume-title":"Automated Deduction \u2013 CADE 28","author":"L Moura","year":"2021","unstructured":"Moura, L., Ullrich, S.: The lean 4 theorem prover and programming language. In: Platzer, A., Sutcliffe, G. (eds.) CADE 2021. LNCS (LNAI), vol. 12699, pp. 625\u2013635. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-79876-5_37"},{"key":"10_CR14","unstructured":"Nawrocki, W., Ayers, E.W., Ebner, G.: An extensible user interface for Lean 4. In: 14th International Conference on Interactive Theorem Proving, ITP 2023, July 31-August 4, 2023, Bia\u0142ystok, Poland. Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik (2023)"},{"key":"10_CR15","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"566","DOI":"10.1007\/978-3-319-94205-6_37","volume-title":"Automated Reasoning","author":"B Piotrowski","year":"2018","unstructured":"Piotrowski, B., Urban, J.: ATPboost: Learning Premise Selection in Binary Setting with ATP Feedback. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) IJCAR 2018. LNCS (LNAI), vol. 10900, pp. 566\u2013574. Springer, Cham (2018). https:\/\/doi.org\/10.1007\/978-3-319-94205-6_37"},{"key":"10_CR16","doi-asserted-by":"publisher","unstructured":"Piotrowski, B., Urban, J.: Stateful premise selection by recurrent neural networks. In: Albert, E., Kov\u00e1cs, L. (eds.) LPAR 2020: 23rd International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Alicante, Spain, 22\u201327 May 2020. EPiC Series in Computing, vol. 73, pp. 409\u2013422. EasyChair (2020). https:\/\/doi.org\/10.29007\/j5hd","DOI":"10.29007\/j5hd"},{"key":"10_CR17","doi-asserted-by":"publisher","unstructured":"Pit-Claudel, C.: Untangling mechanized proofs. In: Proceedings of the 13th ACM SIGPLAN International Conference on Software Language Engineering, pp. 155\u2013174. SLE 2020, Association for Computing Machinery, New York, NY, USA (2020). https:\/\/doi.org\/10.1145\/3426425.3426940,https:\/\/pit-claudel.fr\/clement\/papers\/alectryon-SLE20.pdf","DOI":"10.1145\/3426425.3426940"},{"key":"10_CR18","unstructured":"Tworkowski, S., et al.: Formal premise selection with language models. In: The 7th Conference on Artificial Intelligence and Theorem Proving, AITP 2022(September), pp. 4\u20139, 2022. Aussois and online, France (2022). http:\/\/aitp-conference.org\/2022\/abstract\/AITP_2022_paper_32.pdf"},{"key":"10_CR19","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1007\/978-3-030-81097-9_5","volume-title":"Intelligent Computer Mathematics","author":"L Zhang","year":"2021","unstructured":"Zhang, L., Blaauwbroek, L., Piotrowski, B., \u010cern\u1ef3, P., Kaliszyk, C., Urban, J.: Online machine learning techniques for coq: a comparison. In: Kamareddine, F., Sacerdoti Coen, C. (eds.) CICM 2021. LNCS (LNAI), vol. 12833, pp. 67\u201383. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-81097-9_5"}],"container-title":["Lecture Notes in Computer Science","Automated Reasoning with Analytic Tableaux and Related Methods"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-43513-3_10","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,13]],"date-time":"2023-09-13T14:04:17Z","timestamp":1694613857000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-43513-3_10"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031435126","9783031435133"],"references-count":19,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-43513-3_10","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"14 September 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"TABLEAUX","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Automated Reasoning with Analytic Tableaux and Related Methods","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Prague","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Czech Republic","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2023","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"18 September 2023","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"21 September 2023","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"32","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"tableaux2023","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"http:\/\/tableaux2023.tableaux-ar.org\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Single-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Easychair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"43","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"20","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"5","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"47% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3 (2.92)","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Yes","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}