{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T20:29:35Z","timestamp":1771619375134,"version":"3.50.1"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,9,14]],"date-time":"2022-09-14T00:00:00Z","timestamp":1663113600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,14]],"date-time":"2022-09-14T00:00:00Z","timestamp":1663113600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>An internet protocol (IP) address is the foundation of the Internet, allowing connectivity between people, servers, Internet of Things, and services across the globe. Knowing what is connecting to what and where connections are initiated is crucial to accurately assess a company\u2019s or individual\u2019s security posture. IP reputation assessment can be quite complex because of the numerous services that may be hosted on that IP address. For example, an IP might be serving millions of websites from millions of different companies like web hosting companies often do, or it could be a large email system sending and receiving emails for millions of independent entities. The heterogeneous nature of an IP address typically makes it challenging to interpret the security risk. To make matters worse, adversaries understand this complexity and leverage the ambiguous nature of the IP reputation to exploit further unsuspecting Internet users or devices connected to the Internet. 
In addition, traditional techniques like dirty-listing cannot react quickly enough to changes in the security climate, nor can they scale large enough to detect new exploits that may be created and disappear in minutes. In this paper, we introduce the use of cross-protocol analysis and graph neural networks (GNNs) in semi-supervised learning to address the speed and scalability of assessing IP reputation. In the cross-protocol supervised approach, we combine features from the web, email, and domain name system (DNS) protocols to identify those that are most useful in discriminating between suspicious and benign IPs. In our second experiment, we leverage the most discriminant features and incorporate them into the graph as nodes\u2019 features. We use GNNs to pass messages from node to node, propagating the signal to the neighbors while also gaining the benefit of having the originating nodes influenced by neighboring nodes. Thanks to the relational graph structure, we can use only a small portion of labeled data and train the algorithm in a semi-supervised approach. Our dataset represents real-world data that is sparse and contains only a small percentage of IPs with verified clean or suspicious labels, but the IPs are connected. 
The experimental results demonstrate that the system can achieve <jats:inline-formula><jats:alternatives><jats:tex-math>$$85.28\\%$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>85.28<\/mml:mn>\n                    <mml:mo>%<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> accuracy in detecting malicious IP addresses at scale with only <jats:inline-formula><jats:alternatives><jats:tex-math>$$5\\%$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>5<\/mml:mn>\n                    <mml:mo>%<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> of labeled data.<\/jats:p>","DOI":"10.1007\/s40747-022-00838-y","type":"journal-article","created":{"date-parts":[[2022,9,14]],"date-time":"2022-09-14T07:02:45Z","timestamp":1663138965000},"page":"3857-3869","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Graph neural networks and cross-protocol analysis for detecting malicious IP 
addresses"],"prefix":"10.1007","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7706-5262","authenticated-orcid":false,"given":"Yonghong","family":"Huang","sequence":"first","affiliation":[]},{"given":"Joanna","family":"Negrete","sequence":"additional","affiliation":[]},{"given":"John","family":"Wagener","sequence":"additional","affiliation":[]},{"given":"Celeste","family":"Fralick","sequence":"additional","affiliation":[]},{"given":"Armando","family":"Rodriguez","sequence":"additional","affiliation":[]},{"given":"Eric","family":"Peterson","sequence":"additional","affiliation":[]},{"given":"Adam","family":"Wosotowsky","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,14]]},"reference":[{"key":"838_CR1","unstructured":"Statista (2021) Number of internet users worldwide from 2005 to 2018 (in millions) [Online]. https:\/\/www.statista.com\/statistics\/617136\/digital-population-worldwide\/"},{"key":"838_CR2","unstructured":"Levine L (2008) DNS blacklists and whitelists [Online]. https:\/\/tools.ietf.org\/html\/draft-irtf-asrg-dnsbl-08"},{"key":"838_CR3","unstructured":"DNSWL (2017) Whitelisting DKIM-signed domains [Online]. https:\/\/www.dnswl.org\/"},{"key":"838_CR4","unstructured":"Greylisting (2016) Greylisting explained [Online]. https:\/\/www.greylisting.org\/"},{"key":"838_CR5","unstructured":"Berkeley Security Information Office (2019) Aggressive IP Distribution List [Online]. https:\/\/security.berkeley.edu\/services\/aggressive-ip-distribution-aid-list\/"},{"key":"838_CR6","unstructured":"Zhang J, Porras PA, Ullrich J (2008) Highly predictive blacklisting. In: Proc of USENIX security symposium, pp. 107\u2013122"},{"key":"838_CR7","doi-asserted-by":"crossref","unstructured":"Soldo F, Le A, Markopoulou A (2010) Predictive blacklisting as an implicit recommendation system. In: Proc of IEEE INFOCOM, pp. 
1\u20139","DOI":"10.1109\/INFCOM.2010.5461982"},{"key":"838_CR8","doi-asserted-by":"crossref","unstructured":"Renjan A, Joshi KP, Narayanan SN, Joshi A (2018) DAbR: dynamic attribute-based reputation scoring for malicious IP address detection. In: Proc of IEEE Intl conf on intelligence and security informatics","DOI":"10.1109\/ISI.2018.8587342"},{"issue":"5","key":"838_CR9","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1109\/MSP.2009.130","volume":"7","author":"DK McGrath","year":"2009","unstructured":"McGrath DK, Kalafut A, Gupta M (2009) Phishing infrastructure fluxes all the way. IEEE Secur Priv 7(5):21\u201328","journal-title":"IEEE Secur Priv"},{"key":"838_CR10","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1016\/j.eswa.2016.01.028","volume":"53","author":"M Moghimi","year":"2016","unstructured":"Moghimi M, Varjani AY (2016) New rule-based phishing detection method. Expert Syst Appl 53:231\u2013242","journal-title":"Expert Syst Appl"},{"key":"838_CR11","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1145\/2584679","volume":"16","author":"L Bilge","year":"2014","unstructured":"Bilge L, Sen S, Balzarotti D, Kirda E, Kruegel C (2014) EXPOSURE: a passive DNS analysis service to detect and report malicious domains. ACM Trans Inf Syst Secur (TISSEC) 16:4\u201314","journal-title":"ACM Trans Inf Syst Secur (TISSEC)"},{"key":"838_CR12","doi-asserted-by":"crossref","unstructured":"Esquivel H, Akella A, Mori T (2010) On the effectiveness of IP reputation for spam filtering. In: Proc of intl conf on communication systems and networks","DOI":"10.1109\/COMSNETS.2010.5431981"},{"key":"838_CR13","doi-asserted-by":"crossref","unstructured":"Chiba D, Tobe K, Mori T, Goto S (2012) Detecting malicious websites by learning IP address features. 
In: Proc of IEEE intl symposium on applications and the internet","DOI":"10.1109\/SAINT.2012.14"},{"key":"838_CR14","doi-asserted-by":"crossref","unstructured":"Bajaj KS, Egbufor F, Pieprzyk J (2011) Critical analysis of spam prevention techniques. In: Proc of intl workshop on security and communication networks","DOI":"10.1109\/IWSCN.2011.6827721"},{"key":"838_CR15","doi-asserted-by":"crossref","unstructured":"Pagani F, De Astis M, Graziano M, Lanzi A, Balzarotti D (2016) Measuring the role of grey listing and no listing in fighting spam. In: Proc of IEEE intl conf on on dependable systems and networks","DOI":"10.1109\/DSN.2016.57"},{"key":"838_CR16","unstructured":"Antonakakis M, Perdisci R, Dagon D, Lee W, Feamster N (2011) Building a dynamic reputation system for DNS. In: Proc of USENIX security symposium"},{"key":"838_CR17","doi-asserted-by":"crossref","unstructured":"Porenta J, Ciglaric M (2011) Empirical comparison of IP reputation databases. In: Proc of annual collaboration, electronic messaging, anti-abuse and spam conference","DOI":"10.1145\/2030376.2030402"},{"issue":"6","key":"838_CR18","doi-asserted-by":"publisher","first-page":"1406","DOI":"10.1109\/TIFS.2017.2663333","volume":"12","author":"B Coskun","year":"2017","unstructured":"Coskun B (2017) (Un)wisdom of crowds: accurately spotting malicious IP clusters using not-so-accurate IP blacklists. IEEE Trans Inf Forens Secur 12(6):1406\u20131417","journal-title":"IEEE Trans Inf Forens Secur"},{"key":"838_CR19","doi-asserted-by":"crossref","unstructured":"Moura GC, Sadre R, Pras A (2011) Internet bad neighborhoods: the spam case. In: Proc of IEEE intl conf of network and service management (CNSM), pp. 1\u20138","DOI":"10.1109\/NOMS.2012.6211917"},{"key":"838_CR20","doi-asserted-by":"crossref","unstructured":"Collins M, Shimeall TJ, Faber S, Janies J, Weaver R, Shon MD, Kadane J (2007) Using uncleanliness to predict future botnet addresses. In: Proc of the ACM SIGCOMM conf on internet measurement, pp. 
93\u2013104","DOI":"10.1145\/1298306.1298319"},{"key":"838_CR21","doi-asserted-by":"crossref","unstructured":"Stone-Gross B, Kruegel C, Almeroth K, Moser A, Kirda E (2009) Fire: finding rogue networks. In: Proc of the IEEE computer security applications Conf, pp. 231\u2013240","DOI":"10.1109\/ACSAC.2009.29"},{"key":"838_CR22","doi-asserted-by":"crossref","unstructured":"Huang Y, Greve P (2015) Large scale graph mining for web reputation inference. In: Proc of IEEE mach. learn. & sig. proc. workshop","DOI":"10.1109\/MLSP.2015.7324374"},{"key":"838_CR23","doi-asserted-by":"crossref","unstructured":"Huang Y, Negrete J, Wosotowsky A, Wagener J, Peterson E, Rodriguez A, Fralick C (2019) Detect malicious IP address using cross-protocol analysis. In: Proc of IEEE symposium series on computational intelligence","DOI":"10.1109\/SSCI44817.2019.9003003"},{"issue":"1","key":"838_CR24","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","volume":"20","author":"F Scarselli","year":"2009","unstructured":"Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2009) The graph neural network model. IEEE Trans Neural Netw 20(1):61\u201380","journal-title":"IEEE Trans Neural Netw"},{"key":"838_CR25","unstructured":"Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS (2019) A comprehensive survey on graph neural networks. arXiv:1901.00596"},{"key":"838_CR26","doi-asserted-by":"crossref","unstructured":"Zhou J, Cui G, Zhang Z, Yang C, Liu Z, Sun M (2019) Graph neural networks: a review of methods and applications. arXiv:1812.08434","DOI":"10.1016\/j.aiopen.2021.01.001"},{"key":"838_CR27","unstructured":"Battaglia PW, Hamrick JB, Bapst V, Sanchez-Gonzalez A Zambaldi V, Malinowski M, Tacchetti A, Raposo D, Santoro A, Faulkner et al (2018) Relational inductive biases, deep learning, and graph networks. 
arXiv:1806.01261"},{"key":"838_CR28","doi-asserted-by":"crossref","unstructured":"Cai H, Zheng VW, Chang K (2018) A comprehensive survey of graph embedding: problems, techniques and applications. In: IEEE transactions on knowledge and data engineering","DOI":"10.1109\/TKDE.2018.2807452"},{"key":"838_CR29","unstructured":"Kazemi SM, Goel R, Jain K, Kobyzev I, Sethi A, Forsyth P, Poupart P (2019) Relational representation learning for dynamic (knowledge) graphs: a survey. arXiv:1905.11485"},{"key":"838_CR30","doi-asserted-by":"crossref","unstructured":"Halcrow J, Mosoi A, Ruth S, Perozzi B (2020) Grale: designing networks for graph learning. In: Proc of ACM SIGKDD intl conf on knowledge discovery & data mining, pp 2523\u20132532","DOI":"10.1145\/3394486.3403302"},{"key":"838_CR31","unstructured":"Kapoor A, Ben X, Liu L, Perozzi B, Barnes M, Blais M, O\u2019Banion S (2020) Examing COVID-19 forecasting using spatio-temporal graph neural networks. arXiv:2007.03113"},{"key":"838_CR32","doi-asserted-by":"crossref","unstructured":"Epasto A, Perozzi B (2019) Is a single embedding enough? Learning node representations that capture multiple social contexts. In: Proc of the world wide web conference, pp 394-404","DOI":"10.1145\/3308558.3313660"},{"key":"838_CR33","doi-asserted-by":"crossref","unstructured":"Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: online learning of social representations. In: Proc of ACM SIGKDD intl conf on knowledge discovery & data mining, pp. 701\u2013710","DOI":"10.1145\/2623330.2623732"},{"key":"838_CR34","unstructured":"Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. In: Proc of ICML"},{"key":"838_CR35","unstructured":"Abu-El-Haija S, Kapoor A, Perozzi B, Lee J (2019), N-GCN: multi-scale graph convolution for semi-supervised node classification. 
In: Proc of conf on uncertainty in AI"},{"key":"838_CR36","doi-asserted-by":"crossref","unstructured":"Bojchevski A, Klicpera J, Perozzi B, Kapoor A, Blais M, R\u00f3zemberczki B, Lukasik M, G\u00fcnnemann S (2020) Scaling graph neural networks with approximate pagerank. In: Proc of ACM SIGKDD","DOI":"10.1145\/3394486.3403296"},{"key":"838_CR37","doi-asserted-by":"crossref","unstructured":"Palowitch J, Perozzi B (2020) MONET: debiasing graph embeddings via the metadata-orthogonal training unit. In: Proc of IEEE\/ACM intl conf on advances in social networks analysis and mining","DOI":"10.1109\/ASONAM49781.2020.9381348"},{"issue":"1","key":"838_CR38","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1186\/s40649-019-0069-y","volume":"6","author":"S Zhang","year":"2019","unstructured":"Zhang S, Tong H, Xu J, Maciejewski R (2019) Graph convolutional networks: a comprehensive review. Comput Soc Netw 6(1):11","journal-title":"Comput Soc Netw"},{"key":"838_CR39","unstructured":"Bruna J, Zaremba W, Szlam A, LeCun Y (2014) Spectral networks and locally connected networks on graphs. In: Proc of ICLR"},{"key":"838_CR40","unstructured":"Henaff M, Bruna J, LeCun Y (2015) Deep convolutional networks on graph structured data. arXiv:1506.05163"},{"key":"838_CR41","unstructured":"Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: Proc of NIPS, pp 3844\u20133852"},{"key":"838_CR42","unstructured":"Kipf T, Willing M (2017) Semi-supervised classification with graph convolutional networks. In: Proc of ICLR"},{"key":"838_CR43","unstructured":"Duvenaud D, Maclaurin D, Aguilera-Iparraguirre J, Gomez-Bombarelli R, Hirzel T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks on graphs for learning molecular fingerprints. In: Proc of NIPS"},{"key":"838_CR44","unstructured":"Kipf TN, Welling M (2016) Variational graph auto-encoders. 
In: NIPS workshop on Bayesian deep learning"},{"issue":"1","key":"838_CR45","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45(1):5\u201332","journal-title":"Mach Learn"},{"issue":"2","key":"838_CR46","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1016\/j.acha.2010.04.005","volume":"30","author":"DK Hammond","year":"2011","unstructured":"Hammond DK, Vandergheyst P, Gribonval R (2011) Wavelets on graphs via spectral graph theory. Appl Comput Harmonic Anal 30(2):129\u2013150","journal-title":"Appl Comput Harmonic Anal"},{"key":"838_CR47","unstructured":"Hamilton WL, Ying R, Leskovec J (2017) Inductive representation learning on large graphs. In: Proc of NIPS"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00838-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-022-00838-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00838-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,27]],"date-time":"2023-07-27T13:15:18Z","timestamp":1690463718000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-022-00838-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,14]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,8]]}},"alternative-id":["838"],"URL":"https:\/\/doi.org\/10.1007\/s40747-022-00838-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","t
ype":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,14]]},"assertion":[{"value":"10 June 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 June 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 September 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}