{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,28]],"date-time":"2025-09-28T04:05:19Z","timestamp":1759032319336,"version":"3.37.3"},"reference-count":67,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2020,7,4]],"date-time":"2020-07-04T00:00:00Z","timestamp":1593820800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,4]],"date-time":"2020-07-04T00:00:00Z","timestamp":1593820800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["298020","328058"],"award-info":[{"award-number":["298020","328058"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018948","name":"Infotech Oulu","doi-asserted-by":"crossref","award":["-"],"award-info":[{"award-number":["-"]}],"id":[{"id":"10.13039\/501100018948","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Software Qual J"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Self-admitted technical debt refers to sub-optimal development solutions that are expressed in written code comments or commits. We reproduce and improve on a prior work by Yan et al. (2018) on detecting commits that introduce self-admitted technical debt. We use multiple natural language processing methods: Bag-of-Words, topic modeling, and word embedding vectors. We study 5 open-source projects. Our NLP approach uses logistic Lasso regression from Glmnet to automatically select best predictor words. 
A manually labeled dataset from prior work that identified self-admitted technical debt from code level commits serves as ground truth. Our approach achieves +\u20090.15 better area under the ROC curve performance than a prior work, when comparing only commit message features, and +\u20090.03 better result overall when replacing manually selected features with automatically selected words. In both cases, the improvement was statistically significant (<jats:italic>p<\/jats:italic>&lt; 0.0001). Our work has four main contributions, which are comparing different NLP techniques for SATD detection, improved results over previous work, showing how to generate generalizable predictor words when using multiple repositories, and producing a list of words correlating with SATD. As a concrete result, we release a list of the predictor words that correlate positively with SATD, as well as our used datasets and scripts to enable replication studies and to aid in the creation of future classifiers.<\/jats:p>","DOI":"10.1007\/s11219-020-09520-3","type":"journal-article","created":{"date-parts":[[2020,7,4]],"date-time":"2020-07-04T09:02:52Z","timestamp":1593853372000},"page":"1551-1579","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Predicting technical debt from commit contents: reproduction and extension with automated feature selection"],"prefix":"10.1007","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0258-8904","authenticated-orcid":false,"given":"Leevi","family":"Rantala","sequence":"first","affiliation":[]},{"given":"Mika","family":"M\u00e4ntyl\u00e4","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,7,4]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"AlOmar, E., Mkaouer, M.W., & Ouni, A. (2019). Can refactoring be self-affirmed? 
an exploratory study on how developers document their refactoring activities in commit messages. In 2019 IEEE\/ACM 3Rd international workshop on refactoring(IWoR) (pp. 51\u201358). IEEE.","key":"9520_CR1","DOI":"10.1109\/IWoR.2019.00017"},{"issue":"3","key":"9520_CR2","doi-asserted-by":"publisher","first-page":"619","DOI":"10.1007\/s10664-012-9231-y","volume":"19","author":"A Barua","year":"2014","unstructured":"Barua, A., Thomas, S.W., & Hassan, A.E. (2014). What are developers talking about? an analysis of topics and trends in stack overflow. Empirical Software Engineering, 19(3), 619\u2013654.","journal-title":"Empirical Software Engineering"},{"doi-asserted-by":"crossref","unstructured":"Bavota, G., & Russo, B. (2016). A large-scale empirical study on self-admitted technical debt. In Proceedings of the 13th international conference on mining software repositories, pp 315\u2013326.","key":"9520_CR3","DOI":"10.1145\/2901739.2901742"},{"doi-asserted-by":"crossref","unstructured":"Besker, T., Martini, A., & Bosch, J. (2018). Technical debt cripples software developer productivity: a longitudinal study on developers\u2019 daily software development work. In Proceedings of the 2018 International Conference on Technical Debt, pp 105\u2013114.","key":"9520_CR4","DOI":"10.1145\/3194164.3194178"},{"issue":"Jan","key":"9520_CR5","first-page":"993","volume":"3","author":"DM Blei","year":"2003","unstructured":"Blei, D.M., Ng, A.Y., & Jordan, M.I. (2003). Latent dirichlet allocation. Journal of machine Learning research, 3(Jan), 993\u20131022.","journal-title":"Journal of machine Learning research"},{"unstructured":"Bouchet-Valat M., & Bouchet-Valat M.M. (2015). Package \u2019snowballc\u2019.","key":"9520_CR6"},{"unstructured":"Bouma, G. (2009). Normalized (pointwise) mutual information in collocation extraction. 
In Proceedings of GSCL, pp 31\u201340.","key":"9520_CR7"},{"issue":"2","key":"9520_CR8","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1109\/MCI.2014.2307227","volume":"9","author":"E Cambria","year":"2014","unstructured":"Cambria, E., & White, B. (2014). Jumping nlp curves: a review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48\u201357.","journal-title":"IEEE Computational Intelligence Magazine"},{"issue":"11","key":"9520_CR9","doi-asserted-by":"publisher","first-page":"1044","DOI":"10.1109\/TSE.2017.2654244","volume":"43","author":"E da Silva Maldonado","year":"2017","unstructured":"da Silva Maldonado, E., Shihab, E., & Tsantalis, N. (2017). Using natural language processing to automatically detect self-admitted technical debt. IEEE Transactions on Software Engineering, 43(11), 1044\u20131062.","journal-title":"IEEE Transactions on Software Engineering"},{"doi-asserted-by":"crossref","unstructured":"DeLong, E.R., DeLong, D.M., & Clarke-Pearson, D.L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, pp 837\u2013845.","key":"9520_CR10","DOI":"10.2307\/2531595"},{"issue":"23","key":"9520_CR11","doi-asserted-by":"publisher","first-page":"2577","DOI":"10.1002\/sim.5328","volume":"31","author":"OV Demler","year":"2012","unstructured":"Demler, O.V., Pencina, M.J., & D\u2019Agostino, Sr R.B. (2012). Misuse of delong test to compare aucs for nested models. Statistics in Medicine, 31(23), 2577\u20132587.","journal-title":"Statistics in Medicine"},{"doi-asserted-by":"crossref","unstructured":"Efstathiou, V., Chatzilenas, C., & Spinellis, D. (2018). Word embeddings for the software engineering domain. In Proceedings of the 15th international conference on mining software repositories (pp. 38\u201341). ACM.","key":"9520_CR12","DOI":"10.1145\/3196398.3196448"},{"unstructured":"Fan, Y., Xia, X., Lo, D., & Hassan AE. (2018). 
Chaff from the wheat: characterizing and determining valid bug reports. IEEE Transactions on Software Engineering.","key":"9520_CR13"},{"issue":"3","key":"9520_CR14","doi-asserted-by":"publisher","first-page":"1143","DOI":"10.1007\/s10664-015-9378-4","volume":"21","author":"FA Fontana","year":"2016","unstructured":"Fontana, F.A., M\u00e4ntyl\u00e4, M.V., Zanoni, M., & Marino, A. (2016). Comparing and experimenting machine learning techniques for code smell detection. Empirical Software Engineering, 21(3), 1143\u20131191.","journal-title":"Empirical Software Engineering"},{"issue":"1","key":"9520_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v033.i01","volume":"33","author":"J Friedman","year":"2010","unstructured":"Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1.","journal-title":"Journal of Statistical Software"},{"key":"9520_CR16","doi-asserted-by":"publisher","first-page":"369","DOI":"10.1016\/j.infsof.2014.05.017","volume":"57","author":"Y Fu","year":"2015","unstructured":"Fu, Y., Yan, M., Zhang, X., Xu, L., Yang, D., & Kymer, J.D. (2015). Automated classification of software change messages by semi-supervised latent dirichlet allocation. Information and Software Technology, 57, 369\u2013377.","journal-title":"Information and Software Technology"},{"issue":"3","key":"9520_CR17","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1198\/004017007000000245","volume":"49","author":"A Genkin","year":"2007","unstructured":"Genkin, A., Lewis, D.D., & Madigan, D. (2007). Large-scale bayesian logistic regression for text categorization. Technometrics, 49(3), 291\u2013304.","journal-title":"Technometrics"},{"doi-asserted-by":"crossref","unstructured":"Guzman, E., Az\u00f3car, D., & Li, Y. (2014). Sentiment analysis of commit comments in github: an empirical study. 
In Proceedings of the 11th working conference on mining software repositories, pp 352\u2013355.","key":"9520_CR18","DOI":"10.1145\/2597073.2597118"},{"issue":"6","key":"9520_CR19","doi-asserted-by":"publisher","first-page":"1276","DOI":"10.1109\/TSE.2011.103","volume":"38","author":"T Hall","year":"2012","unstructured":"Hall, T., Beecham, S., Bowes, D., Gray, D., & Counsell, S. (2012). A systematic literature review on fault prediction performance in software engineering. IEEE Transactions on Software Engineering, 38(6), 1276\u20131304.","journal-title":"IEEE Transactions on Software Engineering"},{"issue":"2-3","key":"9520_CR20","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1080\/00437956.1954.11659520","volume":"10","author":"ZS Harris","year":"1954","unstructured":"Harris, Z.S. (1954). Distributional structure. Word, 10(2-3), 146\u2013162.","journal-title":"Word"},{"issue":"2","key":"9520_CR21","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1007\/s10515-011-0090-3","volume":"19","author":"Z He","year":"2012","unstructured":"He, Z., Shu, F., Yang, Y., Li, M., & Wang, Q. (2012). An investigation on the feasibility of cross-project defect prediction. Automated Software Engineering, 19(2), 167\u2013199.","journal-title":"Automated Software Engineering"},{"issue":"6","key":"9520_CR22","doi-asserted-by":"publisher","first-page":"e1609","DOI":"10.1002\/stvr.1609","volume":"27","author":"H Hemmati","year":"2017","unstructured":"Hemmati, H., Fang, Z., M\u00e4ntyl\u00e4, M.V., & Adams, B. (2017). Prioritizing manual test cases in rapid release environments. Software Testing, Verification and Reliability, 27(6), e1609.","journal-title":"Software Testing, Verification and Reliability"},{"doi-asserted-by":"crossref","unstructured":"Herzig, K., Just, S., & Zeller, A. (2013). It\u2019s not a bug, it\u2019s a feature: how misclassification impacts bug prediction. In Proceedings of the 2013 international conference on software engineering (pp. 
392\u2013401). Piscataway: IEEE Press.","key":"9520_CR23","DOI":"10.1109\/ICSE.2013.6606585"},{"issue":"1","key":"9520_CR24","doi-asserted-by":"publisher","first-page":"418","DOI":"10.1007\/s10664-017-9522-4","volume":"23","author":"Q Huang","year":"2018","unstructured":"Huang, Q., Shihab, E., Xia, X., Lo, D., & Li, S. (2018). Identifying self-admitted technical debt in open source projects using text mining. Empirical Software Engineering, 23(1), 418\u2013451.","journal-title":"Empirical Software Engineering"},{"unstructured":"Kamei, Y., Maldonado, E.D.S., Shihab, E, & Ubayashi, N. (2016). Using analytics to quantify interest of self-admitted technical debt. In QuASoQ\/TDA@ APSEC, pp 68\u201371.","key":"9520_CR25"},{"doi-asserted-by":"crossref","unstructured":"Kanakaraj, M., & Guddeti, RMR. (2015). Nlp based sentiment analysis on twitter data using ensemble classifiers. In 2015 3Rd international conference on signal processing, communication and networking(ICSCN) (pp. 1\u20135). IEEE.","key":"9520_CR26","DOI":"10.1109\/ICSCN.2015.7219856"},{"unstructured":"Laitila, V. (2019). Technical debt analysis. http:\/\/help.softagram.com\/articles\/2887068-technical-debt-analysis.","key":"9520_CR27"},{"issue":"4","key":"9520_CR28","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1109\/TSE.2008.35","volume":"34","author":"S Lessmann","year":"2008","unstructured":"Lessmann, S., Baesens, B., Mues, C., & Pietsch, S. (2008). Benchmarking classification models for software defect prediction: a proposed framework and novel findings. IEEE Transactions on Software Engineering, 34(4), 485\u2013496.","journal-title":"IEEE Transactions on Software Engineering"},{"doi-asserted-by":"crossref","unstructured":"Levy, O., & Goldberg, Y. (2014). Dependency-based word embeddings. In Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 2: short papers), (Vol. 2 pp. 
302\u2013308).","key":"9520_CR29","DOI":"10.3115\/v1\/P14-2050"},{"doi-asserted-by":"crossref","unstructured":"Liu, Y., Liu, Z., Chua, T.S., & Sun, M. (2015). Topical word embeddings. In Twenty-Ninth AAAI conference on artificial intelligence.","key":"9520_CR30","DOI":"10.1609\/aaai.v29i1.9522"},{"doi-asserted-by":"crossref","unstructured":"Loper, E., & Bird, S. (2002). Nltk: The natural language toolkit. In Proceedings of the ACL workshop on effective tools and methodologies for teaching natural language processing and computational linguistics, pp 62\u201369.","key":"9520_CR31","DOI":"10.3115\/1118108.1118117"},{"doi-asserted-by":"crossref","unstructured":"Maldonado E.D.S., & Shihab, E. (2015). Detecting and quantifying different types of self-admitted technical debt. In 2015 IEEE 7Th international workshop on managing technical debt(MTD) (pp. 9\u201315). IEEE.","key":"9520_CR32","DOI":"10.1109\/MTD.2015.7332619"},{"doi-asserted-by":"crossref","unstructured":"Maldonado, E.D.S., Abdalkareem, R., Shihab, E., & Serebrenik, A. (2017). An empirical study on the removal of self-admitted technical debt. In Software maintenance and evolution (ICSME), 2017 IEEE international conference on (pp. 238\u2013248). IEEE.","key":"9520_CR33","DOI":"10.1109\/ICSME.2017.8"},{"doi-asserted-by":"crossref","unstructured":"Mann, H.B., & Whitney, D.R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, pp 50\u201360.","key":"9520_CR34","DOI":"10.1214\/aoms\/1177730491"},{"doi-asserted-by":"crossref","unstructured":"Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., & McClosky, D. (2014). The stanford corenlp natural language processing toolkit. 
In Proceedings of 52nd annual meeting of the association for computational linguistics: system demonstrations, pp 55\u201360.","key":"9520_CR35","DOI":"10.3115\/v1\/P14-5010"},{"doi-asserted-by":"crossref","unstructured":"Mantyla M.V., Claes M., & Farooq U. (2018). Measuring lda topic stability from clusters of replicated runs. In Proceedings of the 12th ACM\/IEEE international symposium on empirical software engineering and measurement (p. 49). ACM.","key":"9520_CR36","DOI":"10.1145\/3239235.3267435"},{"unstructured":"Mensah, S., Keung, J., Bosu, M.F., & Bennin, K.E. (2016). Rework effort estimation of self-admitted technical debt.","key":"9520_CR37"},{"unstructured":"MITRE. (2019). Common weakness enumeration. http:\/\/cwe.mitre.org\/data\/definitions\/546.html.","key":"9520_CR38"},{"unstructured":"Moss, H.B., Leslie, D.S., & Rayson, P. (2018). Using jk fold cross validation to reduce variance when tuning nlp models. In Proceedings of the 27th international conference on computational linguistics, pp 2978\u20132989.","key":"9520_CR39"},{"unstructured":"Movshovitz-Attias, D., & Cohen, W.W. (2013). Natural language models for predicting programming comments. In Proceedings of the 51st annual meeting of the association for computational linguistics (Volume 2: Short Papers), (Vol. 2 pp. 35\u201340).","key":"9520_CR40"},{"unstructured":"Mullen, K.M., Ardia, D., Gil, D.L., Windover, D., & Cline, J. (2009). Deoptim: An r package for global optimization by differential evolution.","key":"9520_CR41"},{"unstructured":"Newman, D., Bonilla, E.V., & Buntine, W. (2011). Improving topic coherence with regularized topic models. In Advances in neural information processing systems, pp 496\u2013504.","key":"9520_CR42"},{"doi-asserted-by":"crossref","unstructured":"Oliveira, N., Cortez, P., & Areal, N. (2014). Automatic creation of stock market lexicons for sentiment analysis using stocktwits data. 
In Proceedings of the 18th international database engineering & applications symposium (pp. 115\u2013123). ACM.","key":"9520_CR43","DOI":"10.1145\/2628194.2628235"},{"doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., & Manning, C. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532\u20131543.","key":"9520_CR44","DOI":"10.3115\/v1\/D14-1162"},{"doi-asserted-by":"crossref","unstructured":"Potdar, A., & Shihab, E. (2014). An exploratory study on self-admitted technical debt. In Software maintenance and evolution (ICSME), 2014 IEEE international conference on software maintenance and evolution(ICSME) (pp. 91\u2013100). IEEE.","key":"9520_CR45","DOI":"10.1109\/ICSME.2014.31"},{"issue":"11","key":"9520_CR46","doi-asserted-by":"publisher","first-page":"1601","DOI":"10.1109\/TKDE.2011.59","volume":"23","author":"RC Prati","year":"2011","unstructured":"Prati, R.C., Batista, G., & Monard, M.C. (2011). A survey on graphical methods for classification predictive performance evaluation. IEEE Transactions on Knowledge and Data Engineering, 23(11), 1601\u20131618.","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"9520_CR47","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1186\/1471-2105-12-77","volume":"12","author":"X Robin","year":"2011","unstructured":"Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.C., & M\u00fcller, M. (2011). Proc: an open-source package for r and s+ to analyze and compare roc curves. BMC Bioinformatics, 12, 77.","journal-title":"BMC Bioinformatics"},{"doi-asserted-by":"crossref","unstructured":"R\u00f6der, M., Both, A., & Hinneburg, A. (2015). Exploring the space of topic coherence measures. In Proceedings of the eighth ACM international conference on Web search and data mining (pp. 399\u2013408). 
ACM.","key":"9520_CR48","DOI":"10.1145\/2684822.2685324"},{"doi-asserted-by":"crossref","unstructured":"Sas, D., & Avgeriou, P. (2019). Quality attribute trade-offs in the embedded systems industry: an exploratory case study. Software Quality Journal, 1\u201330.","key":"9520_CR49","DOI":"10.1007\/s11219-019-09478-x"},{"issue":"2","key":"9520_CR50","doi-asserted-by":"publisher","first-page":"93","DOI":"10.14257\/ijseia.2016.10.2.08","volume":"10","author":"M Seok","year":"2016","unstructured":"Seok, M., Song, H.J., Park, C.Y., Kim, J.D., & Ys, Kim. (2016). Named entity recognition using word embedding as a feature. International Journal Software Engineering Application, 10(2), 93\u2013104.","journal-title":"International Journal Software Engineering Application"},{"doi-asserted-by":"crossref","unstructured":"Sinha, V., Lazar, A., & Sharif, B. (2016). Analyzing developer sentiment in commit logs. In Proceedings of the 13th international conference on mining software repositories, pp 520\u2013523.","key":"9520_CR51","DOI":"10.1145\/2901739.2903501"},{"unstructured":"SonarQube. (2019). Rules explorer. https:\/\/rules.sonarsource.com\/.","key":"9520_CR52"},{"doi-asserted-by":"crossref","unstructured":"Song, Y., Pan, S., Liu, S., Zhou, M.X., & Qian, W. (2009). Topic and keyword re-ranking for lda-based topic modeling. In Proceedings of the 18th ACM conference on information and knowledge management (pp. 1757\u20131760). ACM.","key":"9520_CR53","DOI":"10.1145\/1645953.1646223"},{"issue":"11","key":"9520_CR54","doi-asserted-by":"publisher","first-page":"1389","DOI":"10.1109\/LSP.2014.2337313","volume":"21","author":"X Sun","year":"2014","unstructured":"Sun, X., & Xu, W. (2014). Fast implementation of delong\u2019s algorithm for comparing the areas under correlated receiver operating characteristic curves. 
IEEE Signal Processing Letters, 21(11), 1389\u20131393.","journal-title":"IEEE Signal Processing Letters"},{"doi-asserted-by":"crossref","unstructured":"Sun, X., Liu, X., Hu, J., & Zhu, J. (2014). Empirical studies on the nlp techniques for source code data preprocessing. In Proceedings of the 2014 3rd international workshop on evidential assessment of software technologies (pp. 32\u201339). ACM.","key":"9520_CR55","DOI":"10.1145\/2627508.2627514"},{"doi-asserted-by":"crossref","unstructured":"Thomas, S.W., Adams, B., Hassan, A.E., & Blostein, D. (2010). Validating the use of topic models for software evolution. In 2010 10th IEEE working conference on source code analysis and manipulation (pp. 55\u201364): IEEE.","key":"9520_CR56","DOI":"10.1109\/SCAM.2010.13"},{"issue":"1","key":"9520_CR57","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1007\/s10664-012-9219-7","volume":"19","author":"SW Thomas","year":"2014","unstructured":"Thomas, S.W., Hemmati, H., Hassan, A.E., & Blostein, D. (2014). Static test case prioritization using topic models. Empirical Software Engineering, 19(1), 182\u2013212.","journal-title":"Empirical Software Engineering"},{"issue":"6","key":"9520_CR58","doi-asserted-by":"publisher","first-page":"1498","DOI":"10.1016\/j.jss.2012.12.052","volume":"86","author":"E Tom","year":"2013","unstructured":"Tom, E., Aurum, A., & Vidgen, R. (2013). An exploration of technical debt. Journal of Systems and Software, 86(6), 1498\u20131516.","journal-title":"Journal of Systems and Software"},{"doi-asserted-by":"crossref","unstructured":"Treude, C., & Wagner, M. (2019). Predicting good configurations for github and stack overflow topic models. In Proceedings of the 16th international conference on mining software repositories (pp. 84\u201395). IEEE Press.","key":"9520_CR59","DOI":"10.1109\/MSR.2019.00022"},{"unstructured":"Turian, J., Ratinov, L., & Bengio, Y. (2010). Word representations: a simple and general method for semi-supervised learning. 
In Proceedings of the 48th annual meeting of the association for computational linguistics, association for computational linguistics, pp 384\u2013394.","key":"9520_CR60"},{"unstructured":"Wehaibi, S., Shihab, E., & Guerrouj, L. (2016). Examining technical debt on software quality. In 2016 IEEE 23Rd international conference on software analysis, evolution, and reengineering (SANER), (Vol. 1 pp. 179\u2013188). IEEE.","key":"9520_CR61"},{"doi-asserted-by":"crossref","unstructured":"Wu, X., Zheng, W., Pu, M., Chen, J., & Mu, D. (2020). Invalid bug reports complicate the software aging situation. Software Quality Journal, 1\u201326.","key":"9520_CR62","DOI":"10.1007\/s11219-019-09481-2"},{"doi-asserted-by":"crossref","unstructured":"Yan, M., Xia, X., Shihab E., Lo, D., Yin, J., & Yang, X. (2018). Automating change-level self-admitted technical debt determination. IEEE Transactions on Software Engineering.","key":"9520_CR63","DOI":"10.1109\/TSE.2018.2831232"},{"doi-asserted-by":"crossref","unstructured":"Yang, X., Lo, D., Xia, X., Bao, L., & Sun, J. (2016). Combining word embedding with information retrieval to recommend similar bug reports. In 2016 IEEE 27Th international symposium on software reliability engineering(ISSRE) (pp. 127\u2013137). IEEE.","key":"9520_CR64","DOI":"10.1109\/ISSRE.2016.33"},{"doi-asserted-by":"crossref","unstructured":"Ye, X., Shen, H., Ma, X., Bunescu, R., & Liu, C. (2016). From word embeddings to document similarities for improved information retrieval in software engineering. In Proceedings of the 38th international conference on software engineering (pp. 404\u2013415). ACM.","key":"9520_CR65","DOI":"10.1145\/2884781.2884862"},{"doi-asserted-by":"crossref","unstructured":"Zhao, W., Chen, J.J., Perkins, R., Liu, Z., Ge, W., Ding, Y., & Zou, W. (2015). A heuristic approach to determine an appropriate number of topics in topic modeling. In BMC bioinformatics, biomed central, (Vol. 16 p. 
S8).","key":"9520_CR66","DOI":"10.1186\/1471-2105-16-S13-S8"},{"doi-asserted-by":"crossref","unstructured":"Zhou Y, & Sharma A. (2017). Automated identification of security issues from commit messages and bug reports. In Proceedings of the 2017 11th joint meeting on foundations of software engineering, pp 914\u2013919.","key":"9520_CR67","DOI":"10.1145\/3106237.3117771"}],"container-title":["Software Quality Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11219-020-09520-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11219-020-09520-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11219-020-09520-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,1]],"date-time":"2022-11-01T02:55:18Z","timestamp":1667271318000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11219-020-09520-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,4]]},"references-count":67,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["9520"],"URL":"https:\/\/doi.org\/10.1007\/s11219-020-09520-3","relation":{},"ISSN":["0963-9314","1573-1367"],"issn-type":[{"type":"print","value":"0963-9314"},{"type":"electronic","value":"1573-1367"}],"subject":[],"published":{"date-parts":[[2020,7,4]]},"assertion":[{"value":"4 July 2020","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with Ethical Standards"}},{"value":"The authors declare that they have no conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}