{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T01:31:46Z","timestamp":1772501506379,"version":"3.50.1"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T00:00:00Z","timestamp":1724544000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T00:00:00Z","timestamp":1724544000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100012165","name":"Key Technologies Research and Development Program","doi-asserted-by":"publisher","award":["2023YFC2206402"],"award-info":[{"award-number":["2023YFC2206402"]}],"id":[{"id":"10.13039\/501100012165","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004739","name":"Youth Innovation Promotion Association of the Chinese Academy of Sciences","doi-asserted-by":"publisher","award":["2021156"],"award-info":[{"award-number":["2021156"]}],"id":[{"id":"10.13039\/501100004739","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the Strategic Priority Research Program of the Chinese Academy of Sciences","award":["XDC02040100"],"award-info":[{"award-number":["XDC02040100"]}]},{"name":"Foundation Strengthening Program Technical Area Fund","award":["021-JCJQ-JJ-0908"],"award-info":[{"award-number":["021-JCJQ-JJ-0908"]}]},{"DOI":"10.13039\/501100013148","name":"Science and Technology Foundation of State Grid Corporation of 
China","doi-asserted-by":"publisher","award":["SG270000YXJS2311060"],"award-info":[{"award-number":["SG270000YXJS2311060"]}],"id":[{"id":"10.13039\/501100013148","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Advanced Persistent Threats (APTs) penetrate internal networks through multiple methods, making it difficult to detect attack clues solely through boundary defense measures. To address this challenge, some research has proposed threat detection methods based on provenance graphs, which leverage relationships among entities such as processes, files, and sockets found in host audit logs. However, these methods are generally inefficient, especially when faced with massive audit logs and the computationally intensive nature of graph algorithms. Effectively and economically extracting APT attack clues from massive system audit logs remains a significant challenge. To tackle this problem, this paper introduces the ProcSAGE method, which detects threats based on abnormal behavior patterns, offering high accuracy, low cost, and independence from expert knowledge. ProcSAGE focuses on processes or threads in host audit logs during the graph construction phase to effectively control the scale of provenance graphs and reduce performance overhead. Additionally, in the feature extraction phase, ProcSAGE considers information about the processes or threads themselves and their neighboring nodes to accurately characterize them and enhance model accuracy. To verify the effectiveness of the ProcSAGE method, this study conducted a comprehensive evaluation on the StreamSpot dataset. 
The experimental results show that the ProcSAGE method can significantly reduce the time and memory consumption in the threat detection process while improving the accuracy, and the optimization effect becomes more significant as the data size expands.<\/jats:p>","DOI":"10.1186\/s42400-024-00240-w","type":"journal-article","created":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T01:01:56Z","timestamp":1724547716000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["ProcSAGE: an efficient host threat detection method based on graph representation learning"],"prefix":"10.1186","volume":"7","author":[{"given":"Boyuan","family":"Xu","sequence":"first","affiliation":[]},{"given":"Yiru","family":"Gong","sequence":"additional","affiliation":[]},{"given":"Xiaoyu","family":"Geng","sequence":"additional","affiliation":[]},{"given":"Yun","family":"Li","sequence":"additional","affiliation":[]},{"given":"Cong","family":"Dong","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5812-8902","authenticated-orcid":false,"given":"Song","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Yuling","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Bo","family":"Jiang","sequence":"additional","affiliation":[]},{"given":"Zhigang","family":"Lu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,8,25]]},"reference":[{"key":"240_CR1","unstructured":"Alsaheel A, Nan Y, Ma S, Yu L, Walkup G, Celik ZB, Zhang X, Xu D (2021) A sequence-based learning approach for attack investigation. 
In: Proceedings of the 30th security symposium"},{"issue":"2","key":"240_CR2","doi-asserted-by":"publisher","first-page":"1851","DOI":"10.1109\/COMST.2019.2891891","volume":"21","author":"A Alshamrani","year":"2019","unstructured":"Alshamrani A, Myneni S, Chowdhary A, Huang D (2019) A survey on advanced persistent threats: techniques, solutions, challenges, and research opportunities. IEEE Commun Surv Tutor 21(2):1851\u20131877","journal-title":"IEEE Commun Surv Tutor"},{"key":"240_CR3","doi-asserted-by":"crossref","unstructured":"Altinisik E, Deniz F, Sencar HT (2023) Provg-searcher: a graph representation learning approach for efficient provenance graph search. In: Proceedings of the 2023 ACM SIGSAC conference on computer and communications security, pp 2247\u20132261","DOI":"10.1145\/3576915.3623187"},{"key":"240_CR4","unstructured":"Binde B, McRee R, O\u2019Connor TJ (2011) Assessing outbound traffic to uncover advanced persistent threat. SANS Institute. Whitepaper 16"},{"key":"240_CR5","unstructured":"Cheng Z, Lv Q, Liang J, Wang Y, Sun D, Pasquier T, Han X (2023) Kairos: practical intrusion detection and investigation using whole-system provenance. arXiv preprint arXiv:2308.05034"},{"key":"240_CR6","unstructured":"Fang P, Gao P, Liu C, Ayday E, Jee K, Wang T, Ye YF, Liu Z, Xiao X (2022) Back-Propagating system dependency impact for attack investigation. In: 31st USENIX security symposium (USENIX Security 22), pp 2461\u20132478"},{"key":"240_CR7","unstructured":"Fei P, Li Z, Wang Z, Yu X, Li D, Jee K (2021) SEAL: Storage-efficient causality analysis on enterprise logs with query-friendly compression. In: 30th USENIX security symposium (USENIX Security 21), pp 2987\u20133004"},{"key":"240_CR8","unstructured":"Forrest S, Hofmeyr SA, Somayaji A, Longstaff TA (1996) A sense of self for unix processes. In: Proceedings 1996 IEEE symposium on security and privacy, pp 120\u2013128. 
IEEE"},{"key":"240_CR9","unstructured":"Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Adv Neural Inf Process Syst 30"},{"key":"240_CR10","doi-asserted-by":"crossref","unstructured":"Han X, Pasquier TFJ-, Bates A, Mickens J, Seltzer MI (2020) Unicorn: runtime provenance-based detector for advanced persistent threats. In: 27th Annual network and distributed system security symposium, NDSS 2020, San Diego, California, USA, February 23-26, 2020. The Internet Society. https:\/\/www.ndss-symposium.org\/ndss-paper\/unicorn-runtime-provenance-based-detector-for-advanced-persistent-threats\/","DOI":"10.14722\/ndss.2020.24046"},{"key":"240_CR11","doi-asserted-by":"crossref","unstructured":"Hassan WU, Bates A, Marino D (2020) Tactical provenance analysis for endpoint detection and response systems. In: 2020 IEEE symposium on security and privacy (SP), pp 1172\u20131189. IEEE","DOI":"10.1109\/SP40000.2020.00096"},{"key":"240_CR12","doi-asserted-by":"crossref","unstructured":"Hassan WU, Guo S, Li D, Chen Z, Jee K, Li Z, Bates A (2019) Nodoze: combatting threat alert fatigue with automated provenance triage. In: Network and distributed systems security symposium","DOI":"10.14722\/ndss.2019.23349"},{"key":"240_CR13","doi-asserted-by":"crossref","unstructured":"Hassan WU, Li D, Jee K, Yu X, Zou K, Wang D, Chen Z, Li Z, Rhee J, Gui J, et al (2020) This is why we can\u2019t cache nice things: lightning-fast threat hunting using suspicion-based hierarchical storage. In: Annual computer security applications conference, pp 165\u2013178","DOI":"10.1145\/3427228.3427255"},{"key":"240_CR14","unstructured":"Hossain MN, Milajerdi SM, Wang J, Eshete B, Gjomemo R, Sekar R, Stoller S, Venkatakrishnan V (2017) SLEUTH: real-time attack scenario reconstruction from COTS audit data. 
In: 26th USENIX security symposium (USENIX Security 17), pp 487\u2013504"},{"key":"240_CR15","doi-asserted-by":"crossref","unstructured":"Hossain MN, Sheikhi S, Sekar R (2020) Combating dependence explosion in forensic analysis using alternative tag propagation semantics. In: 2020 IEEE symposium on security and privacy (SP), pp 1139\u20131155. IEEE","DOI":"10.1109\/SP40000.2020.00064"},{"key":"240_CR16","unstructured":"Hossain MN, Wang J, Weisse O, Sekar R, Genkin D, He B, Stoller SD, Fang G, Piessens F, Downing E, et al (2018) Dependence-Preserving data compaction for scalable forensic analysis. In: 27th USENIX security symposium (USENIX Security 18), pp 1723\u20131740"},{"key":"240_CR17","doi-asserted-by":"crossref","unstructured":"King ST, Chen PM (2003) Backtracking intrusions. In: Proceedings of the nineteenth ACM symposium on operating systems principles, pp 223\u2013236","DOI":"10.1145\/1165389.945467"},{"key":"240_CR18","unstructured":"Kruegel C, Kirda E, Mutz D, Robertson W, Vigna G (2005) Automating mimicry attacks using static binary analysis. In: USENIX security symposium, vol 14, pp 11\u201311"},{"key":"240_CR19","doi-asserted-by":"crossref","unstructured":"Lee KH, Zhang X, Xu D (2013) Loggc: garbage collecting audit log. In: Proceedings of the 2013 ACM SIGSAC conference on computer & communications security, pp 1005\u20131016","DOI":"10.1145\/2508859.2516731"},{"key":"240_CR20","first-page":"1","volume":"2021","author":"Z Li","year":"2021","unstructured":"Li Z, Cheng X, Sun L, Zhang J, Chen B (2021) A hierarchical approach for advanced persistent threat detection with attention-based graph neural networks. Secur Commun Netw 2021:1\u201314","journal-title":"Secur Commun Netw"},{"key":"240_CR21","doi-asserted-by":"crossref","unstructured":"Liu F, Wen Y, Zhang D, Jiang X, Xing X, Meng D (2019) Log2vec: a heterogeneous graph embedding based approach for detecting cyber threats within enterprise. 
In: Proceedings of the 2019 ACM SIGSAC conference on computer and communications security, pp 1777\u20131794","DOI":"10.1145\/3319535.3363224"},{"key":"240_CR22","doi-asserted-by":"crossref","unstructured":"Liu Y, Zhang M, Li D, Jee K, Li Z, Wu Z, Rhee J, Mittal P (2018) Towards a timely causality analysis for enterprise security. In: NDSS","DOI":"10.14722\/ndss.2018.23254"},{"key":"240_CR23","doi-asserted-by":"crossref","unstructured":"Manzoor E, Milajerdi SM, Akoglu L (2016a) Fast memory-efficient anomaly detection in streaming heterogeneous graphs. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1035\u20131044","DOI":"10.1145\/2939672.2939783"},{"key":"240_CR24","doi-asserted-by":"crossref","unstructured":"Manzoor E, Milajerdi SM, Akoglu L (2016b) Fast memory-efficient anomaly detection in streaming heterogeneous graphs. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1035\u20131044","DOI":"10.1145\/2939672.2939783"},{"key":"240_CR25","doi-asserted-by":"crossref","unstructured":"Milajerdi SM, Eshete B, Gjomemo R, Venkatakrishnan V (2019b) Poirot: aligning attack behavior with kernel audit records for cyber threat hunting. In: Proceedings of the 2019 ACM SIGSAC conference on computer and communications security, pp 1795\u20131812","DOI":"10.1145\/3319535.3363217"},{"key":"240_CR26","doi-asserted-by":"crossref","unstructured":"Milajerdi SM, Gjomemo R, Eshete B, Sekar R, Venkatakrishnan V (2019a) Holmes: real-time apt detection through correlation of suspicious information flows. In: 2019 IEEE symposium on security and privacy (SP), pp 1137\u20131152. IEEE","DOI":"10.1109\/SP.2019.00026"},{"key":"240_CR27","doi-asserted-by":"crossref","unstructured":"Pasquier T, Han X, Goldstein M, Moyer T, Eyers D, Seltzer M, Bacon J (2017) Practical whole-system provenance capture. 
In: Proceedings of the 2017 symposium on cloud computing, pp 405\u2013418","DOI":"10.1145\/3127479.3129249"},{"key":"240_CR28","doi-asserted-by":"crossref","unstructured":"Pei K, Gu Z, Saltaformaggio B, Ma S, Wang F, Zhang Z, Si L, Zhang X, Xu D (2016) Hercule: attack story reconstruction via community discovery on correlated log graph. In: Proceedings of the 32nd annual conference on computer security applications, pp 583\u2013595","DOI":"10.1145\/2991079.2991122"},{"issue":"10","key":"240_CR29","doi-asserted-by":"publisher","first-page":"2506","DOI":"10.1109\/TIFS.2018.2821095","volume":"13","author":"X Sun","year":"2018","unstructured":"Sun X, Dai J, Liu P, Singhal A, Yen J (2018) Using bayesian networks for probabilistic identification of zero-day attack paths. IEEE Trans Inf Forensics Secur 13(10):2506\u20132521","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"240_CR30","doi-asserted-by":"crossref","unstructured":"Tan KM, Maxion RA (2002) \u201cwhy 6?\u201d defining the operational limits of stide, an anomaly-based intrusion detector. In: Proceedings 2002 IEEE symposium on security and privacy, pp 188\u2013201. IEEE","DOI":"10.1109\/SECPRI.2002.1004371"},{"issue":"20","key":"240_CR31","first-page":"10","volume":"1050","author":"P Velickovic","year":"2017","unstructured":"Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y et al (2017) Graph attention networks. Stat 1050(20):10\u201348550","journal-title":"Stat"},{"key":"240_CR32","doi-asserted-by":"crossref","unstructured":"Wagner D, Soto P (2002) Mimicry attacks on host-based intrusion detection systems. 
In: Proceedings of the 9th ACM conference on computer and communications security, pp 255\u2013264","DOI":"10.1145\/586110.586145"},{"key":"240_CR33","doi-asserted-by":"publisher","first-page":"3972","DOI":"10.1109\/TIFS.2022.3208815","volume":"17","author":"S Wang","year":"2022","unstructured":"Wang S, Wang Z, Zhou T, Sun H, Yin X, Han D, Zhang H, Shi X, Yang J (2022) Threatrace: detecting and tracing host-based threats in node level through provenance graph learning. IEEE Trans Inf Forensics Secur 17:3972\u20133987","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"240_CR34","doi-asserted-by":"crossref","unstructured":"Wang Q, Hassan WU, Li D, Jee K, Yu X, Zou K, Rhee J, Chen Z, Cheng W, Gunter CA, et al (2020) You are what you do: Hunting stealthy malware via data provenance analysis. In: NDSS","DOI":"10.14722\/ndss.2020.24167"},{"key":"240_CR35","doi-asserted-by":"crossref","unstructured":"Warrender C, Forrest S, Pearlmutter B (1999) Detecting intrusions using system calls: alternative data models. In: Proceedings of the 1999 IEEE symposium on security and privacy (Cat. No. 99CB36344), pp 133\u2013145. IEEE","DOI":"10.1109\/SECPRI.1999.766910"},{"issue":"1","key":"240_CR36","doi-asserted-by":"publisher","first-page":"551","DOI":"10.1109\/TDSC.2020.2971484","volume":"19","author":"C Xiong","year":"2020","unstructured":"Xiong C, Zhu T, Dong W, Ruan L, Yang R, Cheng Y, Chen Y, Cheng S, Chen X (2020) Conan: a practical real-time apt detection system with high accuracy and efficiency. IEEE Trans Dependable Secure Comput 19(1):551\u2013565","journal-title":"IEEE Trans Dependable Secure Comput"},{"key":"240_CR37","doi-asserted-by":"crossref","unstructured":"Xu Z, Fang P, Liu C, Xiao X, Wen Y, Meng D (2022) Depcomm: Graph summarization on system audit logs for attack investigation. In: 2022 IEEE symposium on security and privacy (SP), pp 540\u2013557. 
IEEE","DOI":"10.1109\/SP46214.2022.9833632"},{"key":"240_CR38","doi-asserted-by":"crossref","unstructured":"Xu Z, Wu Z, Li Z, Jee K, Rhee J, Xiao X, Xu F, Wang H, Jiang G (2016) High fidelity data reduction for big data security dependency analyses. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pp 504\u2013516","DOI":"10.1145\/2976749.2978378"},{"key":"240_CR39","doi-asserted-by":"crossref","unstructured":"Zeng J, Wang X, Liu J, Chen Y, Liang Z, Chua T-S, Chua ZL (2022) Shadewatcher: Recommendation-guided cyber threat analysis using system audit records. In: 2022 IEEE symposium on security and privacy (SP), pp 489\u2013506. IEEE","DOI":"10.1109\/SP46214.2022.9833669"},{"key":"240_CR40","doi-asserted-by":"publisher","first-page":"3312","DOI":"10.1109\/TIFS.2021.3076288","volume":"16","author":"T Zhu","year":"2021","unstructured":"Zhu T, Wang J, Ruan L, Xiong C, Yu J, Li Y, Chen Y, Lv M, Chen T (2021) General, efficient, and real-time data compaction strategy for apt forensic analysis. IEEE Trans Inf Forensics Secur 16:3312\u20133325","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"240_CR41","doi-asserted-by":"publisher","first-page":"501","DOI":"10.1016\/j.future.2020.01.032","volume":"106","author":"A Zimba","year":"2020","unstructured":"Zimba A, Chen H, Wang Z, Chishimba M (2020) Modeling and detection of the multi-stages of advanced persistent threats attacks based on semi-supervised learning and complex networks characteristics. 
Future Gen Comput Syst 106:501\u2013517","journal-title":"Future Gen Comput Syst"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-024-00240-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-024-00240-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-024-00240-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,27]],"date-time":"2024-11-27T05:47:05Z","timestamp":1732686425000},"score":1,"resource":{"primary":{"URL":"https:\/\/cybersecurity.springeropen.com\/articles\/10.1186\/s42400-024-00240-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,25]]},"references-count":41,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["240"],"URL":"https:\/\/doi.org\/10.1186\/s42400-024-00240-w","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,25]]},"assertion":[{"value":"21 January 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 April 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 August 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no Conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing 
interests"}}],"article-number":"51"}}