{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T18:53:29Z","timestamp":1771959209603,"version":"3.50.1"},"reference-count":28,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2024,6,13]],"date-time":"2024-06-13T00:00:00Z","timestamp":1718236800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Comput. Sci."],"abstract":"<jats:p>Mechanisms for plagiarism detection play a crucial role in maintaining academic integrity, acting both to penalize wrongdoing and to serve as a preemptive deterrent for bad behavior. This manuscript proposes a customized plagiarism detection algorithm tailored to detect source code plagiarism in the Python programming language. Our approach combines textual and syntactic techniques, employing a support vector machine (SVM) to effectively combine various indicators of similarity and calculate the resulting similarity scores. The algorithm was trained and tested using a sample of code submissions for 4 coding problems each from 45 volunteers; 15 of these were original submissions while the other 30 were plagiarized samples. The submissions for two of the questions were used for training and those for the other two for testing, using the leave-p-out cross-validation strategy to avoid overfitting. 
We compare the performance of the proposed method with two widely used tools, MOSS and JPlag, and find that the proposed method results in a small but significant improvement in accuracy compared to JPlag, while significantly outperforming MOSS in flagging plagiarized samples.<\/jats:p>","DOI":"10.3389\/fcomp.2024.1393723","type":"journal-article","created":{"date-parts":[[2024,6,13]],"date-time":"2024-06-13T04:31:51Z","timestamp":1718253111000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["A Support Vector Machine based approach for plagiarism detection in Python code submissions in undergraduate settings"],"prefix":"10.3389","volume":"6","author":[{"given":"Nandini","family":"Gandhi","sequence":"first","affiliation":[]},{"given":"Kaushik","family":"Gopalan","sequence":"additional","affiliation":[]},{"given":"Prajish","family":"Prasad","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2024,6,13]]},"reference":[{"key":"B1","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1109\/NGCT.2016.7877421","article-title":"A state of art on source code plagiarism detection","volume-title":"2016 2nd International Conference on Next Generation Computing Technologies (NGCT)","author":"Agrawal","year":"2016"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1145\/3286960.3286974","article-title":"A comparison of three popular source code similarity tools for detecting student plagiarism","author":"Ahadi","year":"2019","journal-title":"Proceedings of the Twenty-First Australasian Computing Education Conference"},{"key":"B3","unstructured":"Aiken A. 
A System for Detecting Software Similarity2023"},{"key":"B4","volume-title":"Issues Related to the Detection of Source Code Plagiarism in Students Assignments","author":"Alsmadi","year":"2014"},{"key":"B5","doi-asserted-by":"publisher","first-page":"177","DOI":"10.36548\/jaicn.2020.3.005","article-title":"Plagiarism detection in programming assignments using machine learning","volume":"2","author":"Awale","year":"2020","journal-title":"J. Artif. Intellig. Capsule Netw"},{"key":"B6","first-page":"012027","article-title":"Python to learn programming","volume-title":"Journal of Physics: Conference Series","author":"Bogdanchikov","year":"2013"},{"key":"B7","doi-asserted-by":"publisher","first-page":"1255","DOI":"10.1007\/s11948-012-9370-y","article-title":"An analysis of student privacy rights in the use of plagiarism detection systems","volume":"19","author":"Brinkman","year":"2013","journal-title":"Sci. Eng. Ethics"},{"key":"B8","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1145\/953049.800955","article-title":"A plagiarism detection system","volume":"13","author":"Donaldson","year":"1981","journal-title":"SIGCSE Bull"},{"key":"B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/ICAECC54045.2022.9716671","article-title":"Source code plagiarism detection: a machine intelligence approach","volume-title":"2022 IEEE Fourth International Conference on Advances in Electronics, Computers and Communications (ICAECC)","author":"Eppa","year":"2022"},{"key":"B10","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1145\/3511861.3511863","article-title":"The robots are coming: Exploring the implications of openai codex on introductory programming","volume-title":"Australasian Computing Education Conference","author":"Finnie-Ansley","year":"2022"},{"key":"B11","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1145\/3162087.3162101","article-title":"A quantitative comparison of program plagiarism detection 
tools","author":"Heres","year":"2017","journal-title":"Proceedings of the 6th Computer Science Education Research Conference"},{"key":"B12","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1109\/ICAICA50127.2020.9182389","article-title":"Code plagiarism detection method based on code similarity and student behavior characteristics","volume-title":"2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA)","author":"Huang","year":"2020"},{"key":"B13","doi-asserted-by":"publisher","first-page":"86","DOI":"10.11120\/ital.2011.10010086","article-title":"Python for teaching introductory programming: a quantitative evaluation","volume":"10","author":"Jayal","year":"2011","journal-title":"Innovat. Teach. Learn. Informat. Comp. Sci"},{"key":"B14","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1093\/oxfordjournals.pan.a004868","article-title":"Logistic regression in rare events data","volume":"9","author":"King","year":"2001","journal-title":"Polit. Analy"},{"key":"B15","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1109\/IPTC.2010.90","article-title":"The source code plagiarism detection using ast","volume-title":"2010 International Symposium on Intelligence Information Processing and Trusted Computing","author":"Li","year":"2010"},{"key":"B16","doi-asserted-by":"publisher","first-page":"511","DOI":"10.2190\/EC.43.4.e","article-title":"Automatic student plagiarism detection: future perspectives","volume":"43","author":"Mozgovoy","year":"2010","journal-title":"J. Educ. Comp. Res"},{"key":"B17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3313290","article-title":"Source-code similarity detection and detection tools used in academia: a systematic review","volume":"19","author":"Novak","year":"2019","journal-title":"ACM Trans. Comp. Educ. 
(TOCE)"},{"key":"B18","article-title":"Plagiarism detection software","author":"Noynaert","year":"2005","journal-title":"Midwest Instruction and Computing Symposium"},{"key":"B19","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1109\/13.28038","article-title":"Computer algorithms for plagiarism detection","volume":"32","author":"Parker","year":"1989","journal-title":"IEEE Trans. Educ"},{"key":"B20","doi-asserted-by":"crossref","first-page":"738","DOI":"10.1109\/ICSESS.2013.6615411","article-title":"Ast-based multi-language plagiarism detection method","volume-title":"2013 IEEE 4th International Conference on Software Engineering and Service Science","author":"ping Zhang","year":"2013"},{"key":"B21","doi-asserted-by":"publisher","first-page":"1016","DOI":"10.5445\/IR\/542000","article-title":"Finding plagiarisms among a set of programs with jplag","volume":"8","author":"Prechelt","year":"2002","journal-title":"J. Univers. Comput. Sci"},{"key":"B22","first-page":"1","article-title":"Plagiarism detection tool \u201cparikshak\u201d","volume-title":"2015 International Conference on Communication, Information & Computing Technology (ICCICT)","author":"Sharma","year":"2015"},{"key":"B23","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1145\/2716560","article-title":"Python for beginners","volume":"58","author":"Shein","year":"2015","journal-title":"Commun. ACM"},{"key":"B24","doi-asserted-by":"publisher","first-page":"166","DOI":"10.1002\/cae.22066","article-title":"Es-plag: efficient and sensitive source code plagiarism detection tool for academic environment","volume":"27","author":"Sulistiani","year":"2019","journal-title":"Comp. Appl. Eng. Educ"},{"key":"B25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3152894","article-title":"A controlled experiment on python vs c for an introductory programming course: Students outcomes","volume":"18","author":"Wainer","year":"2018","journal-title":"ACM Trans. Comp. Educ. 
(TOCE)"},{"key":"B26","doi-asserted-by":"publisher","first-page":"2683","DOI":"10.23940\/ijpe.19.10.p14.26832691","article-title":"Code similarity detection using ast and textual information","volume":"15","author":"Wen","year":"2019","journal-title":"Int. J. Performab. Eng"},{"key":"B27","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1080\/07294360701310813","article-title":"First-year university science and engineering students understanding of plagiarism","volume":"26","author":"Yeo","year":"2007","journal-title":"High Educ. Res. Dev"},{"key":"B28","first-page":"178","article-title":"An ast-based code plagiarism detection algorithm","volume-title":"2015 10th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA)","author":"Zhao","year":"2015"}],"container-title":["Frontiers in Computer Science"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fcomp.2024.1393723\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,21]],"date-time":"2024-11-21T18:52:12Z","timestamp":1732215132000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fcomp.2024.1393723\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,13]]},"references-count":28,"alternative-id":["10.3389\/fcomp.2024.1393723"],"URL":"https:\/\/doi.org\/10.3389\/fcomp.2024.1393723","relation":{},"ISSN":["2624-9898"],"issn-type":[{"value":"2624-9898","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,13]]},"article-number":"1393723"}}