{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T21:00:50Z","timestamp":1773090050473,"version":"3.50.1"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T00:00:00Z","timestamp":1590969600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["61972339 and 61902344"],"award-info":[{"award-number":["61972339 and 61902344"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Australian Research Council\u2019s Discovery Early Career Researcher Award","award":["DE200100021"],"award-info":[{"award-number":["DE200100021"]}]},{"name":"ANU-Data61 Collaborative Research","award":["CO19314"],"award-info":[{"award-number":["CO19314"]}]},{"name":"National Key Research and Development Program of China","award":["2018YFB1003904"],"award-info":[{"award-number":["2018YFB1003904"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2020,7,31]]},"abstract":"<jats:p>Programming screencasts have become a pervasive resource on the Internet, which help developers learn new programming technologies or skills. The source code in programming screencasts is important and valuable information for developers. But the streaming nature of programming screencasts (i.e., a sequence of screen-captured images) limits the ways that developers can interact with the source code in the screencasts. 
Many studies use the Optical Character Recognition (OCR) technique to convert screen images (also referred to as video frames) into textual content, which can then be indexed and searched easily. However, noisy screen images significantly affect the quality of source code extracted by OCR, for example, no-code frames (e.g., PowerPoint slides, web pages of API specification), non-code regions (e.g., Package Explorer view, Console view), and noisy code regions with code in completion suggestion popups. Furthermore, due to the code characteristics (e.g., long compound identifiers like ItemListener), even professional OCR tools cannot extract source code without errors from screen images. The noisy OCRed source code will negatively affect the downstream applications, such as the effective search and navigation of the source code content in programming screencasts.<\/jats:p>\n          <jats:p>\n            In this article, we propose an approach named\n            <jats:italic>psc2code<\/jats:italic>\n            to denoise the process of extracting source code from programming screencasts. First,\n            <jats:italic>psc2code<\/jats:italic>\n            leverages the Convolutional Neural Network (CNN) based image classification to remove non-code and noisy-code frames. Then,\n            <jats:italic>psc2code<\/jats:italic>\n            performs edge detection and clustering-based image segmentation to detect sub-windows in a code frame, and based on the detected sub-windows, it identifies and crops the screen region that is most likely to be a code editor. 
Finally,\n            <jats:italic>psc2code<\/jats:italic>\n            calls the API of a professional OCR tool to extract source code from the cropped code regions and leverages the OCRed cross-frame information in the programming screencast and the statistical language model of a large corpus of source code to correct errors in the OCRed source code.\n          <\/jats:p>\n          <jats:p>\n            We conduct an experiment on 1,142 programming screencasts from YouTube. We find that our CNN-based image classification technique can effectively remove the non-code and noisy-code frames, which achieves an F1-score of 0.95 on the valid code frames. We also find that\n            <jats:italic>psc2code<\/jats:italic>\n            can significantly improve the quality of the OCRed source code by truly correcting about half of incorrectly OCRed words. Based on the source code denoised by\n            <jats:italic>psc2code<\/jats:italic>\n            , we implement two applications: (1) a programming screencast search engine; (2) an interaction-enhanced programming screencast watching tool. Based on the source code extracted from the 1,142 collected programming screencasts, our experiments show that our programming screencast search engine achieves the precision@5, 10, and 20 of 0.93, 0.81, and 0.63, respectively. We also conduct a user study of our interaction-enhanced programming screencast watching tool with 10 participants. 
This user study shows that our interaction-enhanced watching tool can help participants learn the knowledge in the programming video more efficiently and effectively.\n          <\/jats:p>","DOI":"10.1145\/3392093","type":"journal-article","created":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T16:19:41Z","timestamp":1591028381000},"page":"1-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["psc2code"],"prefix":"10.1145","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1846-0921","authenticated-orcid":false,"given":"Lingfeng","family":"Bao","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, China and Ningbo Research Institute, Zhejiang University, China and PengCheng Laboratory, China"}]},{"given":"Zhenchang","family":"Xing","sequence":"additional","affiliation":[{"name":"Australian National University, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6302-3256","authenticated-orcid":false,"given":"Xin","family":"Xia","sequence":"additional","affiliation":[{"name":"Monash University, Australia"}]},{"given":"David","family":"Lo","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore"}]},{"given":"Minghui","family":"Wu","sequence":"additional","affiliation":[{"name":"Zhejiang University City College, China"}]},{"given":"Xiaohu","family":"Yang","sequence":"additional","affiliation":[{"name":"Zhejiang University, China"}]}],"member":"320","published-online":{"date-parts":[[2020,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3273934.3273935"},{"key":"e_1_2_1_2_1","volume-title":"et\u00a0al","author":"Baeza-Yates Ricardo","year":"1999","unstructured":"Ricardo Baeza-Yates, Berthier Ribeiro-Neto, et\u00a0al. 1999. 
Modern Information Retrieval. Vol. 463. ACM Press, New York, NY."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2380116.2380129"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the IEEE 22nd International Conference on Software Analysis, Evolution and Reengineering (SANER\u201915)","author":"Bao Lingfeng","year":"2015","unstructured":"Lingfeng Bao, Jing Li, Zhenchang Xing, Xinyu Wang, and Bo Zhou. 2015. Reverse engineering time-series interaction data from screen-captured videos. In Proceedings of the IEEE 22nd International Conference on Software Analysis, Evolution and Reengineering (SANER\u201915). IEEE, 399\u2013408."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2018.2802916"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2015.90"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1986.4767851"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1753326.1753554"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the Knowledge Discovery and Data Mining (KDD\u201996)","volume":"96","author":"Ester Martin","year":"1996","unstructured":"Martin Ester, Hans-Peter Kriegel, J\u00f6rg Sander, Xiaowei Xu, et\u00a0al. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Knowledge Discovery and Data Mining (KDD\u201996), Vol. 
96. 226\u2013231."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0031619"},{"key":"e_1_2_1_11_1","unstructured":"GoogleVision 2018. Google Vision API. Retrieved from https:\/\/cloud.google.com\/vision\/."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2556325.2566239"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 5th ACM Conference on Learning @ Scale. ACM, 57","author":"Khandwala Kandarp","unstructured":"Kandarp Khandwala and Philip J. Guo. 2018. Codemotion: Expanding the design space of learner interactions with computer programming tutorial videos. In Proceedings of the 5th ACM Conference on Learning @ Scale. ACM, 57."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2015.19"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1006\/cviu.1999.0831"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1139\u20131148","author":"Keith Palma Monserrat Toni-Jan","year":"2013","unstructured":"Toni-Jan Keith Palma Monserrat, Shengdong Zhao, Kevin McGee, and Anshul Vikram Pandey. 2013. NoteVideo: Facilitating navigation of blackboard-style lecture videos. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 
ACM, 1139\u20131148."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3196398.3196439"},{"key":"e_1_2_1_18_1","unstructured":"OpenCV 2018. OpenCV. Retrieved from https:\/\/opencv.org\/."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3196398.3196402"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3196321.3196359"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-019-0198-z"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884824"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2779479"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1985441.1985451"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818123"},{"key":"e_1_2_1_27_1","unstructured":"Snagit. 2018. Snagit. Retrieved from https:\/\/opencv.org\/."},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ACM, 365\u2013375","author":"Tamrawi Ahmed","unstructured":"Ahmed Tamrawi, Tung Thanh Nguyen, Jafar M. Al-Kofahi, and Tien N. Nguyen. 2011. Fuzzy set and cache-based approach for bug triaging. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ACM, 365\u2013375."},{"key":"e_1_2_1_29_1","unstructured":"Tesseract 2018. Tesseract. 
Retrieved from https:\/\/github.com\/tesseract-ocr\/tesseract."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.2307\/3001968"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1049\/ip-vis:20000104"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10515-016-0204-z"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10515-014-0162-2"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2986012.2986021"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 22nd ACM Symposium on User Interface Software and Technology. ACM, 183\u2013192","author":"Yeh Tom","unstructured":"Tom Yeh, Tsung-Hsiang Chang, and Robert C. Miller. 2009. Sikuli: Using GUI screenshots for search and automation. In Proceedings of the 22nd ACM Symposium on User Interface Software and Technology. ACM, 183\u2013192."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227210"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the European Conference on Computer Vision. Springer, 391\u2013405","author":"Lawrence Zitnick C.","year":"2014","unstructured":"C. Lawrence Zitnick and Piotr Doll\u00e1r. 2014. Edge boxes: Locating object proposals from edges. In Proceedings of the European Conference on Computer Vision. 
Springer, 391\u2013405."}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3392093","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3392093","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:48Z","timestamp":1750199928000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3392093"}},"subtitle":["Denoising Code Extraction from Programming Screencasts"],"short-title":[],"issued":{"date-parts":[[2020,6]]},"references-count":37,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2020,7,31]]}},"alternative-id":["10.1145\/3392093"],"URL":"https:\/\/doi.org\/10.1145\/3392093","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6]]},"assertion":[{"value":"2019-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}