{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T15:31:25Z","timestamp":1774539085764,"version":"3.50.1"},"reference-count":27,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,9,7]],"date-time":"2023-09-07T00:00:00Z","timestamp":1694044800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,7]],"date-time":"2023-09-07T00:00:00Z","timestamp":1694044800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Domain name generation algorithm (DGA) classification is an essential but challenging problem. Both feature-extracting machine learning (ML) methods and deep learning (DL) models such as convolutional neural networks and long short-term memory have been developed. However, the performance of these approaches varies with different types of DGAs. Most features in the ML methods can characterize random-looking DGAs better than word-looking DGAs. To improve the classification performance on word-looking DGAs, subword tokenization is employed for the DL models. Our experimental results proved that the subword tokenization can provide excellent classification performance on the word-looking DGAs. We then propose an integrated scheme that chooses an appropriate method for DGA classification depending on the nature of the DGAs. Results show that the integrated scheme outperformed existing ML and DL methods, and also the subword DL methods.<\/jats:p>","DOI":"10.1186\/s42400-023-00183-8","type":"journal-article","created":{"date-parts":[[2023,9,7]],"date-time":"2023-09-07T02:01:43Z","timestamp":1694052103000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Use of subword tokenization for domain generation algorithm classification"],"prefix":"10.1186","volume":"6","author":[{"given":"Sea Ran Cleon","family":"Liew","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9809-5110","authenticated-orcid":false,"given":"Ngai Fong","family":"Law","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,9,7]]},"reference":[{"key":"183_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2020.101787","volume":"93","author":"AO Almashhadani","year":"2020","unstructured":"Almashhadani AO, Kaiiali M, Carlin D, Sezer S (2020) MaldomDetector: a system for detecting algorithmically generated domain names with machine learning. Comput Secur 93:101787","journal-title":"Comput Secur"},{"key":"183_CR2","unstructured":"Antonakakis M, Perdisci R, Nadji Y, Vasiloglou N, Abu-Nimeh S, Lee W, Dagon D (2012) From throw-away traffic to bots: detecting the rise of DGA-based malware. In: USENIX security symposium, p 24"},{"key":"183_CR3","doi-asserted-by":"publisher","first-page":"157","DOI":"10.3390\/info10050157","volume":"10","author":"DS Berman","year":"2019","unstructured":"Berman DS (2019) DGA CapsNet: 1D application of capsule networks to DGA detection. Information 10:157","journal-title":"Information"},{"key":"183_CR4","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1145\/2584679","volume":"16","author":"L Bilge","year":"2014","unstructured":"Bilge L, \u015een S, Balzarotti D, Kirda E, Kr\u00fcgel C (2014) Exposure: a passive DNS analysis service to detect and report malicious domains. ACM Trans Inf Syst Secur 16:14","journal-title":"ACM Trans Inf Syst Secur"},{"key":"183_CR5","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2020.114551","volume":"170","author":"A Cucchiarelli","year":"2021","unstructured":"Cucchiarelli A, Morbidoni C, Spalazzi L, Baldi M (2021) Algorithmically generated malicious domain names detection based on n-grams features. Expert Syst Appl 170:114551","journal-title":"Expert Syst Appl"},{"issue":"6","key":"183_CR6","first-page":"67","volume":"6","author":"Z Feng","year":"2017","unstructured":"Feng Z, Shuo C, Xiaochuan W (2017) Classification for DGA-based malicious domain names with deep learning architectures. Int J Intell Inf Syst 6(6):67\u201371","journal-title":"Int J Intell Inf Syst"},{"issue":"4","key":"183_CR7","doi-asserted-by":"publisher","first-page":"441","DOI":"10.1080\/19393555.2021.1934198","volume":"31","author":"XD Hoang","year":"2022","unstructured":"Hoang XD, Vu XH (2022) An improved model for detecting DGA botnets using random forest algorithm. Inf Secur J Glob Perspect 31(4):441\u2013450","journal-title":"Inf Secur J Glob Perspect"},{"key":"183_CR8","doi-asserted-by":"publisher","DOI":"10.1201\/9780429329913","volume-title":"Botnets: architectures, countermeasures and challenges","author":"G Kambourakis","year":"2019","unstructured":"Kambourakis G, Anagnostopoulos M, Meng W, Zhou P (2019) Botnets: architectures, countermeasures and challenges. CRC Press"},{"key":"183_CR9","unstructured":"Liew SRC, Law NF (2022) BEAM\u2014an algorithm for detecting phishing link. APSIPA ASC"},{"key":"183_CR10","doi-asserted-by":"crossref","unstructured":"Mac H, Tran D, Tong V, Nguyen LG, Tran HA (2017) DGA botnet detection using supervised learning methods. In: Proceedings of the eighth international symposium on information and communication technology, pp 211\u2013218","DOI":"10.1145\/3155133.3155166"},{"issue":"4\u20136","key":"183_CR11","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1080\/19393555.2015.1075629","volume":"24","author":"N Negash","year":"2015","unstructured":"Negash N, Che X (2015) An overview of modern botnets. Inf Secur J Glob Perspect 24(4\u20136):127\u2013132","journal-title":"Inf Secur J Glob Perspect"},{"key":"183_CR12","doi-asserted-by":"publisher","first-page":"4205","DOI":"10.3390\/app9204205","volume":"9","author":"Y Qiao","year":"2019","unstructured":"Qiao Y, Zhang B, Zhang W, Sangaiah AK, Wu H (2019) DGA domain name classification method based on long short-term memory with attention mechanism. Appl Sci 9:4205","journal-title":"Appl Sci"},{"key":"183_CR13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s42400-020-00046-6","volume":"3","author":"F Ren","year":"2020","unstructured":"Ren F, Jiang Z, Wang X, Liu J (2020) A DGA domain names detection modeling method based on integrating an attention mechanism and deep neural network. Cybersecurity 3:1\u201313","journal-title":"Cybersecurity"},{"key":"183_CR14","doi-asserted-by":"crossref","unstructured":"Saeed AMH, Wang D, Alnedhari HAM, Mei K, Wang J (2022) A survey of machine learning and deep learning based DGA detection techniques. In: Qiu M, Gai K, Qiu H (eds) Smart computing and communications. SmartCom 2021. Lecture notes in computer science, vol 13202","DOI":"10.1007\/978-3-030-97774-0_12"},{"key":"183_CR15","doi-asserted-by":"publisher","first-page":"126446","DOI":"10.1109\/ACCESS.2021.3111307","volume":"9","author":"JP Selvi","year":"2021","unstructured":"Selvi JP, Rodr\u00edguez RJ, Soria-Olivas E (2021) Toward optimal LSTM neural networks for detecting algorithmically generated domain names. IEEE Access 9:126446\u2013126456","journal-title":"IEEE Access"},{"key":"183_CR16","doi-asserted-by":"crossref","unstructured":"Vij P, Nikam SD, Bhatia A (2020) Detection of algorithmically generated domain names using LSTM. In: International conference on communication systems and networks (COMSNETS), pp 1\u20136","DOI":"10.1109\/COMSNETS48256.2020.9027342"},{"key":"183_CR17","doi-asserted-by":"publisher","first-page":"2768","DOI":"10.1109\/COMST.2017.2749442","volume":"19","author":"G Vormayr","year":"2017","unstructured":"Vormayr G, Zseby T, Fabini J (2017) Botnet communication patterns. IEEE Commun Surv Tutor 19:2768\u20132796","journal-title":"IEEE Commun Surv Tutor"},{"key":"183_CR18","doi-asserted-by":"publisher","first-page":"414","DOI":"10.3390\/electronics11030414","volume":"11","author":"HP Vranken","year":"2022","unstructured":"Vranken HP, Alizadeh H (2022) Detection of DGA-generated domain names with TF-IDF. Electronics 11:414","journal-title":"Electronics"},{"key":"183_CR19","volume":"61","author":"Z Wang","year":"2021","unstructured":"Wang Z, Guo Y (2021) Neural networks based domain name generation. J Inf Secur Appl 61:102948","journal-title":"J Inf Secur Appl"},{"issue":"4","key":"183_CR20","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1080\/19393555.2020.1834650","volume":"30","author":"T Wang","year":"2020","unstructured":"Wang T, Chen L, Genc Y (2020) A dictionary-based method for detecting machine-generated domains. Inf Secur J Glob Perspect 30(4):205\u2013218","journal-title":"Inf Secur J Glob Perspect"},{"key":"183_CR21","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2022.107841","volume":"100","author":"Z Wang","year":"2022","unstructured":"Wang Z, Guo Y, Montgomery D (2022) Machine learning-based algorithmically generated domain detection. Comput Electr Eng 100:107841","journal-title":"Comput Electr Eng"},{"key":"183_CR22","unstructured":"Woodbridge J, Anderson H, Ahuja A, Grant D (2016) Predicting domain generation algorithms with long short-term memory networks. arXiv: abs\/1611.00791"},{"key":"183_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2022.105342","volume":"144","author":"L Xu","year":"2022","unstructured":"Xu L, Magar R, Farimani AB (2022) Forecasting COVID-19 new cases using deep learning methods. Comput Biol Med 144:105342","journal-title":"Comput Biol Med"},{"key":"183_CR24","doi-asserted-by":"publisher","first-page":"209","DOI":"10.3390\/fi14070209","volume":"14","author":"C Yang","year":"2022","unstructured":"Yang C, Lu T, Yan S, Zhang J, Yu K (2022) N-trans: parallel detection algorithm for DGA domain names. Future Internet 14:209","journal-title":"Future Internet"},{"key":"183_CR25","doi-asserted-by":"crossref","unstructured":"Yu B, Gray DL, Pan J, Cock MD, Nascimento AC (2017) Inline DGA detection with deep networks. In: 2017 IEEE international conference on data mining workshops (ICDMW), pp 683\u2013692","DOI":"10.1109\/ICDMW.2017.96"},{"key":"183_CR26","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2020.101719","volume":"92","author":"M Zago","year":"2020","unstructured":"Zago M, P\u00e9rez MG, P\u00e9rez GM (2020a) UMUDGA: a dataset for profiling DGA-based botnet. Comput Secur 92:101719","journal-title":"Comput Secur"},{"key":"183_CR27","doi-asserted-by":"publisher","DOI":"10.1016\/j.dib.2020.105400","volume":"30","author":"M Zago","year":"2020","unstructured":"Zago M, Gil P\u00e9rez M, Mart\u00ednez P\u00e9rez G (2020b) UMUDGA: a dataset for profiling algorithmically generated domain names in botnet detection. Data Brief 30:105400","journal-title":"Data Brief"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-023-00183-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-023-00183-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-023-00183-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,19]],"date-time":"2023-11-19T05:16:30Z","timestamp":1700370990000},"score":1,"resource":{"primary":{"URL":"https:\/\/cybersecurity.springeropen.com\/articles\/10.1186\/s42400-023-00183-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,7]]},"references-count":27,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["183"],"URL":"https:\/\/doi.org\/10.1186\/s42400-023-00183-8","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,7]]},"assertion":[{"value":"30 May 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 August 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 September 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declared that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"49"}}