{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T14:58:18Z","timestamp":1777733898127,"version":"3.51.4"},"reference-count":59,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T00:00:00Z","timestamp":1613952000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T00:00:00Z","timestamp":1613952000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["SN COMPUT. SCI."],"published-print":{"date-parts":[[2021,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Botnets and malware continue to avoid detection by static rule engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the \u201cbagging\u201d model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our unique architecture is found to be the most consistent in performance in terms of AUC,<jats:inline-formula><jats:alternatives><jats:tex-math>$$F_1$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>F<\/mml:mi><mml:mn>1<\/mml:mn><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>score, and accuracy when generalising across different dictionary DGA classification tasks compared to current state-of-the-art deep learning architectures. We validate using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large enterprise. In 4 h of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag.<\/jats:p>","DOI":"10.1007\/s42979-021-00507-w","type":"journal-article","created":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T02:02:56Z","timestamp":1613959376000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":60,"title":["Real-Time Detection of Dictionary DGA Network Traffic Using Deep Learning"],"prefix":"10.1007","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4752-9334","authenticated-orcid":false,"given":"Kate","family":"Highnam","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Domenic","family":"Puzio","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Song","family":"Luo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nicholas R.","family":"Jennings","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,2,22]]},"reference":[{"key":"507_CR1","unstructured":"Plohmann D, Yakdan K, Klatt M, Bader J, Gerhards-Padilla E. A comprehensive measurement study of domain generating malware. 2016"},{"key":"507_CR2","doi-asserted-by":"crossref","unstructured":"Oprea A, Li Z, Yen T-F, Chin SH, Alrwais S . Detection of early-stage enterprise infection by mining large-scale log data. 2015","DOI":"10.1109\/DSN.2015.14"},{"key":"507_CR3","unstructured":"Unit 42. Threat brief: understanding domain generation algorithms (dga). 2019"},{"key":"507_CR4","doi-asserted-by":"crossref","unstructured":"Lever C, Walls R, Nadji Y, Dagon D, McDaniel P, Antonakakis M. Domain-z: 28 registrations later; measuring the exploitation of residual trust in domains. 2016","DOI":"10.1109\/SP.2016.47"},{"key":"507_CR5","doi-asserted-by":"crossref","unstructured":"Yadav S, Krishna Reddy AK, Reddy AL, Ranjan S. Detecting algorithmically generated malicious domain names. 2010","DOI":"10.1145\/1879141.1879148"},{"key":"507_CR6","unstructured":"Fraunhofer FKIE. Dgarchive. 2017"},{"key":"507_CR7","unstructured":"Chaz L, Platon K, Davide B, Juan C, Antonakakis M. A lustrum of malware network communication: evolution and insights. 2017"},{"key":"507_CR8","unstructured":"Marc K, Christian R, Holz T. Paint it black: evaluating the effectiveness of malware blacklists. 2014"},{"key":"507_CR9","doi-asserted-by":"crossref","unstructured":"Ahluwalia A, Traore I, Ganame K, Agarwal N. Detecting broad length algorithmically generated domains. 2017","DOI":"10.1007\/978-3-319-69155-8_2"},{"key":"507_CR10","doi-asserted-by":"crossref","unstructured":"Yu B, Gray DL, Pan J, Cock MD, Nascimento ACA. Inline dga detection with deep networks. 2017","DOI":"10.1109\/ICDMW.2017.96"},{"key":"507_CR11","unstructured":"Woodbridge J, Anderson HS, Ahuja A, Grant D. Predicting domain generation algorithms with long short-term memory networks. 2016"},{"key":"507_CR12","unstructured":"Manos A, Roberto P, Yacin N, Nikolaos V, Saeed A-N, Wenke L, Dagon D. From throw-away traffic to bots: detecting the rise of dga-based malware. 2012"},{"key":"507_CR13","doi-asserted-by":"crossref","unstructured":"Pereira M, Coleman S, Yu B, DeCock M, Nascimento A. Dictionary extraction and detection of algorithmically generated domain names in passive dns traffic. 2018","DOI":"10.1007\/978-3-030-00470-5_14"},{"key":"507_CR14","doi-asserted-by":"crossref","unstructured":"Tran D, Mac H, Tong V, Tran HA, Nguyen LG. A lstm based framework for handling multiclass imbalance in dga botnet detection. 2018","DOI":"10.1016\/j.neucom.2017.11.018"},{"key":"507_CR15","doi-asserted-by":"crossref","unstructured":"Akarsh S, Sriram S, Poornachandran P, Menon VK, Soman KP. Deep learning framework for domain generation algorithms prediction using long short-term memory. 2019","DOI":"10.1109\/ICACCS.2019.8728544"},{"key":"507_CR16","doi-asserted-by":"crossref","unstructured":"Vinayakumar R, Soman KP, Poornachandran P. Detecting malicious domain names using deep learning approaches at scale. 2018","DOI":"10.3233\/JIFS-169431"},{"key":"507_CR17","unstructured":"Lison P, Mavroeidis V. Automatic detection of malware-generated domains with recurrent neural models. 2017"},{"key":"507_CR18","unstructured":"Saxe J, Berlin K. Expose: a character-level convolutional neural network with embeddings for detecting malicious urls, file paths and registry keys. 2017"},{"key":"507_CR19","unstructured":"Shaofang Z, Lanfen L, Yuan J, Wang F, Ling Z, Cui J. Cnn-based dga detection with high coverage. 2019."},{"key":"507_CR20","doi-asserted-by":"crossref","unstructured":"Yu B, Pan J, Hu J, Nascimento A, Cock MD. Character level based detection of dga domain names, 2018.","DOI":"10.1109\/IJCNN.2018.8489147"},{"key":"507_CR21","unstructured":"Royal P. Analysis of the kraken botnet. 2008."},{"key":"507_CR22","unstructured":"Porras P, Saidi H, Yegneswaran V. An analysis of conficker\u2019s logic and rendezvous points. 2009."},{"key":"507_CR23","doi-asserted-by":"crossref","unstructured":"Porras P. Inside risks reflections on conficker. 2009.","DOI":"10.1145\/1562764.1562777"},{"key":"507_CR24","unstructured":"Sergei S. Domain name generator for murofet. 2010."},{"key":"507_CR25","doi-asserted-by":"crossref","unstructured":"Yadav S, Krishna Reddy AK, Reddy ALN, Ranjan S. Detecting algorithmically generated domain-flux attacks with dns traffic analysis. 2012.","DOI":"10.1109\/TNET.2012.2184552"},{"key":"507_CR26","doi-asserted-by":"crossref","unstructured":"Anderson HS, Woodbridge J, Filar B. Deepdga: Adversarially-tuned domain generation and detection. 2016.","DOI":"10.1145\/2996758.2996767"},{"key":"507_CR27","unstructured":"Bitdefender Labs. Tracking rovnix. 2014."},{"key":"507_CR28","unstructured":"Skuratovich S. Matsnu technical report. 2015."},{"key":"507_CR29","unstructured":"Geffner J. End-to-end analysis of a domain generating algorithm malware family. 2013."},{"key":"507_CR30","unstructured":"Srinivas K, Teryl T, Fabian M, McHugh J. Crossing the threshold: Detecting network malfeasance via sequential hypothesis testing. 2013"},{"key":"507_CR31","doi-asserted-by":"crossref","unstructured":"Raghuram J, Miller DJ, Kesidis G. Unsupervised, low latency anomaly detection of algorithmically generated domain names by generative probabilistic modeling. 2014.","DOI":"10.1016\/j.jare.2014.01.001"},{"key":"507_CR32","doi-asserted-by":"crossref","unstructured":"Schiavoni S, Maggi F, Cavallaro L, Zanero S. Phoenix: Dga-based botnet tracking and intelligence. 2014.","DOI":"10.1007\/978-3-319-08509-8_11"},{"key":"507_CR33","unstructured":"Zhou Y, Li QS, Miao Q, Yim K. Dga-based botnet detection using dns traffic. 2013."},{"key":"507_CR34","unstructured":"Samuel S, Dominik T, Patrick H, Meyer U. Fanci: Feature-based automated nxdomain classification and intelligence. 2018."},{"key":"507_CR35","doi-asserted-by":"crossref","unstructured":"Verma R, Dyer K. On the character of phishing urls: accurate and robust statistical learning classifiers. 2015.","DOI":"10.1145\/2699026.2699115"},{"key":"507_CR36","doi-asserted-by":"crossref","unstructured":"Yang L, Liu G, Zhai J, Dai Y, Yan Z, Zou Y, Huang W. A novel detection method for word-based dga. 2018.","DOI":"10.1007\/978-3-030-00009-7_43"},{"key":"507_CR37","doi-asserted-by":"crossref","unstructured":"Curtin RR, Gardner AB, Grzonkowski S, Kleymenov A, Mosquera A. Detecting dga domains with recurrent neural networks and side information. 2019.","DOI":"10.1145\/3339252.3339258"},{"key":"507_CR38","doi-asserted-by":"crossref","unstructured":"Johnson R, Zhang T. Effective use of word order for text categorization with convolutional neural networks. 2014.","DOI":"10.3115\/v1\/N15-1011"},{"key":"507_CR39","doi-asserted-by":"crossref","unstructured":"Kim Y. Convolutional neural networks for sentence classification. 2014.","DOI":"10.3115\/v1\/D14-1181"},{"key":"507_CR40","unstructured":"Zhang X, Zhao J, LeCun Y. Character-level convolutional networks for text classification. 2015."},{"key":"507_CR41","unstructured":"Yin W, Kann K, Yu M, Schutze H. Comparative study of cnn and rnn for natural language processing. 2017."},{"key":"507_CR42","doi-asserted-by":"crossref","unstructured":"Chen G, Ye D, Cambria E, Chen J, Xing Z. Ensemble application of convolutional and recurrent neural networks for multi-label text categorization. 2017.","DOI":"10.1109\/IJCNN.2017.7966144"},{"key":"507_CR43","doi-asserted-by":"crossref","unstructured":"Kim Y, Jernite Y, Sontag D, Rush AM. Character-aware neural language models. 2016.","DOI":"10.1609\/aaai.v30i1.10362"},{"key":"507_CR44","doi-asserted-by":"crossref","unstructured":"Mohan VS, Kp VRS, Poornachandran P. S.p.o.o.f net: Syntactic patterns for identification of ominous online factors. May 2018.","DOI":"10.1109\/SPW.2018.00041"},{"key":"507_CR45","doi-asserted-by":"crossref","unstructured":"Mac H, Tran D, Tong V, Nguyen LG, Tran HA. Dga botnet detection using supervised learning methods. 2017.","DOI":"10.1145\/3155133.3155166"},{"key":"507_CR46","doi-asserted-by":"crossref","unstructured":"Berman DS, Buczak AL, Chavis JS, Corbett CL. A survey of deep learning methods for cyber security. 2019.","DOI":"10.3390\/info10040122"},{"key":"507_CR47","unstructured":"Raaghavi S, Choudhary C, Yu B, Tymchenko V, Nascimento A, De Cock M. An evaluation of dga classifiers. 2018."},{"key":"507_CR48","doi-asserted-by":"crossref","unstructured":"Feng, Shuo C, Xiaochuan W. Classification for dga-based malicious domain names with deep learning architectures. 2017.","DOI":"10.11648\/j.ijiis.20170606.11"},{"key":"507_CR49","doi-asserted-by":"crossref","unstructured":"J Koh, Rhodes B. Inline detection of domain generation algorithms with context-sensitive word embeddings. 2018.","DOI":"10.1109\/BigData.2018.8622066"},{"key":"507_CR50","doi-asserted-by":"crossref","unstructured":"Kumar AD, Thodupunoori H, Vinayakumar R, Soman KP, Poornachandran P, Alazab M, Venkatraman S. Enhanced domain generating algorithm detection based on deep neural networks. 2019.","DOI":"10.1007\/978-3-030-13057-2_7"},{"key":"507_CR51","unstructured":"Goodfellow I, Bengio Y, Courville A. Deep learning. 2016."},{"key":"507_CR52","doi-asserted-by":"crossref","unstructured":"Vosoughi S, Vijayaraghavan P, Roy D. Tweet2vec: Learning tweet embeddings using character-level cnn-lstm encoder-decoder. 2016.","DOI":"10.1145\/2911451.2914762"},{"key":"507_CR53","unstructured":"Alexa A. Csv with alexa top 1 million sites, directly from the server. 2013."},{"key":"507_CR54","unstructured":"Gomaa WH, Fahmy AA. A survey of text similarity approaches. 2013."},{"key":"507_CR55","unstructured":"KERAS\u00a0Development Team. Keras: deep learning library for theano and tensorflow. 2016."},{"key":"507_CR56","unstructured":"Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, t\u00a0al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. 2016."},{"key":"507_CR57","unstructured":"gRPC Authors. grpc. 2013."},{"key":"507_CR58","unstructured":"Chronicle Security. Virustotal\u2014free online virus, malware, and url scanner. 2017."},{"key":"507_CR59","unstructured":"AlienVault. Threatcrowd\u2014a search engine for threats. 2017."}],"container-title":["SN Computer Science"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-021-00507-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s42979-021-00507-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-021-00507-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,18]],"date-time":"2022-12-18T12:40:39Z","timestamp":1671367239000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s42979-021-00507-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,22]]},"references-count":59,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4]]}},"alternative-id":["507"],"URL":"https:\/\/doi.org\/10.1007\/s42979-021-00507-w","relation":{},"ISSN":["2662-995X","2661-8907"],"issn-type":[{"value":"2662-995X","type":"print"},{"value":"2661-8907","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,22]]},"assertion":[{"value":"1 August 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 February 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with Ethical Standards"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"The training and testing data were acquired from DGArchive [] and may be requested from the distributors. The application data for enterprise traffic are owned by Capital One Bank and we are unable to share it.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Data availability"}},{"value":"The model architectures can be found at the following Github repository:.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}],"article-number":"110"}}