{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T14:43:40Z","timestamp":1770993820679,"version":"3.50.1"},"reference-count":37,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,2,28]],"date-time":"2025-02-28T00:00:00Z","timestamp":1740700800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>We present an automatic approach for generating learning problems for teaching introductory programming in different programming languages. The current implementation allows input and output in the three most popular programming languages for teaching introductory programming courses: C++, Java, and Python. The generator stores learning problems using the \u201cmeaning tree\u201d, a language-independent representation of a syntax tree. During this study, we generated a bank of 1,428,899 learning problems focused on the order of expression evaluation. They were generated in about 16 h. The learning problems were classified for further use with the used concepts, possible domain-rule violations, and required skills; they covered a wide range of difficulties and topics. The problems were validated by automatically solving them in an intelligent tutoring system that recorded the actual skills used and violations made. The generated problems were favorably assessed by 10 experts: teachers and teaching assistants in introductory programming courses. They noted that the problems are ready for use without further manual improvement and that the classification system is flexible enough to receive problems with desirable properties. The proposed approach combines the advantages of different state-of-the-art methods. It combines the diversity of learning problems generated by restricted randomization and large language models with full correctness and a natural look of template-based problems, which makes it a good fit for large-scale learning problem generation.<\/jats:p>","DOI":"10.3390\/bdcc9030057","type":"journal-article","created":{"date-parts":[[2025,2,28]],"date-time":"2025-02-28T06:45:42Z","timestamp":1740725142000},"page":"57","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Mass Generation of Programming Learning Problems from Public Code Repositories"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7296-2538","authenticated-orcid":false,"given":"Oleg","family":"Sychev","sequence":"first","affiliation":[{"name":"Software Engineering Department, Volgograd State Technical University, Lenin Ave. 28, 400005 Volgograd, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-5159-5011","authenticated-orcid":false,"given":"Dmitry","family":"Shashkov","sequence":"additional","affiliation":[{"name":"Software Engineering Department, Volgograd State Technical University, Lenin Ave. 28, 400005 Volgograd, Russia"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"143","DOI":"10.15388\/infedu.2015.09","article-title":"Programming Language Use in US Academia and Industry","volume":"14","author":"Cohen","year":"2015","journal-title":"Inform. Educ."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Siegfried, R.M., Herbert-Berger, K.G., Leune, K., and Siegfried, J.P. (2021, January 17\u201321). Trends of Commonly Used Programming Languages in CS1 And CS2 Learning. Proceedings of the 2021 16th International Conference on Computer Science & Education (ICCSE), Lancaster, UK.","DOI":"10.1109\/ICCSE51940.2021.9569444"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ali, S., and Qayyum, S. (2021). A Pragmatic Comparison of Four Different Programming Languages. ScienceOpen Prepr.","DOI":"10.14293\/S2199-1006.1.SOR-.PP5RV1O.v1"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Figueiredo, J., and Garc\u00eda-Pe\u00f1alvo, F.J. (2020, January 21\u201323). Intelligent Tutoring Systems approach to Introductory Programming Courses. Proceedings of the Eighth International Conference on Technological Ecosystems for Enhancing Multiculturality, Salamanca, Spain. TEEM\u201920.","DOI":"10.1145\/3434780.3436614"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Oli, P., Banjade, R., Lekshmi Narayanan, A.B., Brusilovsky, P., and Rus, V. (2024, January 8\u201312). Exploring The Effectiveness of Reading vs. Tutoring For Enhancing Code Comprehension For Novices. Proceedings of the 39th ACM\/SIGAPP Symposium on Applied Computing, Avila, Spain. SAC \u201924.","DOI":"10.1145\/3605098.3636007"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1007\/BF00168958","article-title":"Intelligent tutoring systems: An overview","volume":"4","author":"Nwana","year":"1990","journal-title":"Artif. Intell. Rev."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1055","DOI":"10.3923\/itj.2008.1055.1060","article-title":"Developing an Intelligent Tutoring System For Students Learning To Program in C++","volume":"7","year":"2008","journal-title":"Inf. Technol. J."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"528","DOI":"10.3923\/itj.2008.528.532","article-title":"JEE-Tutor: An Intelligent Tutoring System For Java Expressions Evaluation","volume":"7","author":"Naser","year":"2008","journal-title":"Inf. Technol. J."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Martin, B., and Mitrovic, A. (2002). Automatic Problem Generation in Constraint-Based Tutors. Proceedings of the Intelligent Tutoring Systems, Springer.","DOI":"10.1007\/3-540-47987-2_42"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Kumar, A.N. (2013, January 23\u201326). Using problets for problem-solving exercises in introductory C++\/Java\/C# courses. Proceedings of the 2013 IEEE Frontiers in Education Conference (FIE), Oklahoma City, OK, USA.","DOI":"10.1109\/FIE.2013.6684774"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Fabic, G.V.F., Mitrovic, A., and Neshatian, K. (2018, January 8\u201311). Adaptive Problem Selection in a Mobile Python Tutor. Proceedings of the Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization, ACM, Singapore. UMAP \u201918.","DOI":"10.1145\/3213586.3225235"},{"key":"ref_12","first-page":"4384","article-title":"Comparative Analysis of Python and Java for Beginners","volume":"7","author":"Khoirom","year":"2020","journal-title":"Int. Res. J. Eng. Technol."},{"key":"ref_13","unstructured":"Farooq, M.S., and zaman Khan, T. (2023). Comparative Analysis of Widely use Object-Oriented Languages. arXiv."},{"key":"ref_14","first-page":"84","article-title":"A Data-Driven Approach to Compare the Syntactic Difficulty of Programming Languages","volume":"34","author":"Lokkila","year":"2023","journal-title":"J. Inf. Syst. Educ."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sadigh, D., Seshia, S.A., and Gupta, M. (2012, January 12). Automating exercise generation: A step towards meeting the MOOC challenge for embedded systems. Proceedings of the Workshop on Embedded and Cyber-Physical Systems Education, ACM, Tampere, Finland. ESWEEK\u201912.","DOI":"10.1145\/2530544.2530546"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/j.eij.2023.03.001","article-title":"Synthesis of nested loop exercises for practice in introductory programming","volume":"24","author":"Chinedu","year":"2023","journal-title":"Egypt. Inform. J."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Ade-Ibijola, A. (2018). Syntactic Generation of Practice Novice Programs in Python. Proceedings of the ICT Education, Springer International Publishing.","DOI":"10.1007\/978-3-030-05813-5_11"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Gulwani, S. (2012, January 26\u201329). Synthesis from Examples: Interaction Models and Algorithms. Proceedings of the 2012 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, Timisoara, Romania.","DOI":"10.1109\/SYNASC.2012.69"},{"key":"ref_19","unstructured":"Polozov, O., O\u2019Rourke, E., Smith, A.M., Zettlemoyer, L., Gulwani, S., and Popovic, Z. (2015, January 25\u201331). Personalized mathematical word problem generation. Proceedings of the 24th International Conference on Artificial Intelligence, Buenos Aires, Argentina. IJCAI\u201915."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"O\u2019Rourke, E., Butler, E., D\u00edaz Tolentino, A., and Popovi\u0107, Z. (2019). Automatic Generation of Problems and Explanations for an Intelligent Algebra Tutor. Proceedings of the Artificial Intelligence in Education, Springer International Publishing.","DOI":"10.1007\/978-3-030-23204-7_32"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Shuster, K., Poff, S., Chen, M., Kiela, D., and Weston, J. (2021). Retrieval Augmentation Reduces Hallucination in Conversation. arXiv.","DOI":"10.18653\/v1\/2021.findings-emnlp.320"},{"key":"ref_22","unstructured":"Austin, J., Odena, A., Nye, M.I., Bosma, M., Michalewski, H., Dohan, D., Jiang, E., Cai, C.J., Terry, M., and Le, Q.V. (2021). Program Synthesis with Large Language Models. arXiv."},{"key":"ref_23","unstructured":"Chen, M., Tworek, J., Jun, H., Yuan, Q., de Oliveira Pinto, H.P., Kaplan, J., Edwards, H., Burda, Y., Joseph, N., and Brockman, G. (2021). Evaluating Large Language Models Trained on Code. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1092","DOI":"10.1126\/science.abq1158","article-title":"Competition-level code generation with AlphaCode","volume":"378","author":"Li","year":"2022","journal-title":"Science"},{"key":"ref_25","unstructured":"Xue, T., Li, X., Azim, T., Smirnov, R., Yu, J., Sadrieh, A., and Pahlavan, B. (2024). Multi-Programming Language Ensemble for Code Generation in Large Language Model. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Pan, R., Ibrahimzada, A.R., Krishna, R., Sankar, D., Wassi, L.P., Merler, M., Sobolev, B., Pavuluri, R., Sinha, S., and Jabbarvand, R. (2024, January 14\u201320). Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code. Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering, Lisbon, Portugal. ICSE \u201924.","DOI":"10.1145\/3597503.3639226"},{"key":"ref_27","unstructured":"Eniser, H.F., Zhang, H., David, C., Wang, M., Christakis, M., Paulsen, B., Dodds, J., and Kroening, D. (2024). Towards Translating Real-World Code with LLMs: A Study of Translating to Rust. arXiv."},{"key":"ref_28","unstructured":"Macedo, M., Tian, Y., Nie, P., Cogo, F.R., and Adams, B. (2024). InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1002\/smi.2768","article-title":"Extending knowledge of illegitimate tasks: Student satisfaction, anxiety, and emotional exhaustion","volume":"34","author":"Fila","year":"2018","journal-title":"Stress Health"},{"key":"ref_30","unstructured":"Kumar, A. (2005, January 18\u201322). Rule-based adaptive problem generation in programming tutors and its evaluation. Proceedings of the Workshop on Adaptive Systems for Web-Based Education: Tools and Reusability, 12th International Conference on Artificial Intelligence in Education (AI-ED 2005), Amsterdam, The Netherlands."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Sychev, O. (2022, January 1\u20134). From Question Generation to Problem Mining and Classification. Proceedings of the International Conference on Advanced Learning Technologies, ICALT 2022, Bucharest, Romania.","DOI":"10.1109\/ICALT55010.2022.00097"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Sychev, O., Penskoy, N., and Prokudin, A. (2022, January 1\u20134). Generating Expression Evaluation Learning Problems from Existing Program Code. Proceedings of the 2022 International Conference on Advanced Learning Technologies (ICALT), Bucharest, Romania.","DOI":"10.1109\/ICALT55010.2022.00061"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1145\/3487019.3487022","article-title":"Static Analysis at GitHub: An experience report","volume":"19","author":"Clem","year":"2021","journal-title":"Queue"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Sychev, O., Penskoy, N., Anikin, A., Denisov, M., and Prokudin, A. (2021). Improving Comprehension: Intelligent Tutoring System Explaining the Domain Rules When Students Break Them. Educ. Sci., 11.","DOI":"10.3390\/educsci11110719"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"101261","DOI":"10.1016\/j.cogsys.2024.101261","article-title":"Educational models for cognition: Methodology of modeling intellectual skills for intelligent tutoring systems","volume":"87","author":"Sychev","year":"2024","journal-title":"Cogn. Syst. Res."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Burd, H., Bell, A., Hemberg, E., and O\u2019Reilly, U.M. (2020, January 12\u201314). Analyzing Pre-Existing Knowledge and Performance in a Programming MOOC. Proceedings of the Seventh ACM Conference on Learning @ Scale, Virtual Event, USA. L@S \u201920.","DOI":"10.1145\/3386527.3406728"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Duran, R., Haaranen, L., and Hellas, A. (2020, January 11\u201314). Gender Differences in Introductory Programming: Comparing MOOCs and Local Courses. Proceedings of the 51st ACM Technical Symposium on Computer Science Education, Portland, OR, USA. SIGCSE \u201920.","DOI":"10.1145\/3328778.3366852"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/3\/57\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:44:20Z","timestamp":1760028260000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/3\/57"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,28]]},"references-count":37,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["bdcc9030057"],"URL":"https:\/\/doi.org\/10.3390\/bdcc9030057","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,28]]}}}