{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T16:26:48Z","timestamp":1754152008689,"version":"3.41.2"},"reference-count":106,"publisher":"Association for Computing Machinery (ACM)","issue":"6","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2025,7,31]]},"abstract":"<jats:p>Graph classification aims to categorize graphs based on their structural and attribute features, with applications in diverse fields such as social network analysis and bioinformatics. Among the methods proposed to solve this task, those relying on patterns (i.e., subgraphs) provide good explainability, as the patterns used for classification can be directly interpreted. To identify meaningful patterns, a standard approach is to use a quality measure, i.e., a function that evaluates the discriminative power of each pattern. However, the literature provides tens of such measures, making it difficult to select the most appropriate for a given application. Only a handful of surveys try to provide some insight by comparing these measures, and none of them specifically focuses on graphs. This typically results in the systematic use of the most widespread measures, without thorough evaluation. To address this issue, we present a comparative analysis of 38 quality measures from the literature. We characterize them theoretically, based on four mathematical properties. We leverage publicly available datasets to constitute a benchmark, and propose a method to elaborate a gold standard ranking of the patterns. We exploit these resources to perform an empirical comparison of the measures, both in terms of pattern ranking and classification performance. Moreover, we propose a clustering-based preprocessing step, which groups patterns appearing in the same graphs to enhance classification performance. Our experimental results demonstrate the effectiveness of this step, reducing the number of patterns to be processed while achieving comparable performance. Additionally, we show that some popular measures widely used in the literature are not associated with the best results.<\/jats:p>","DOI":"10.1145\/3743143","type":"journal-article","created":{"date-parts":[[2025,6,11]],"date-time":"2025-06-11T09:40:08Z","timestamp":1749634808000},"page":"1-49","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Pattern-Based Graph Classification: Comparison of Quality Measures and Importance of Preprocessing"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9053-6604","authenticated-orcid":false,"given":"Lucas","family":"Potin","sequence":"first","affiliation":[{"name":"Laboratoire Informatique d\u2019Avignon, Avignon Universit\u00e9, Avignon, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0344-2686","authenticated-orcid":false,"given":"Rosa","family":"Figueiredo","sequence":"additional","affiliation":[{"name":"Laboratoire Informatique d\u2019Avignon, Avignon Universit\u00e9, Avignon, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2619-2835","authenticated-orcid":false,"given":"Vincent","family":"Labatut","sequence":"additional","affiliation":[{"name":"Laboratoire Informatique d\u2019Avignon, Avignon Universit\u00e9, Avignon, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1059-4095","authenticated-orcid":false,"given":"Christine","family":"Largeron","sequence":"additional","affiliation":[{"name":"Laboratoire Hubert Curien, Universite Jean Monnet, Saint-Etienne, France"}]}],"member":"320","published-online":{"date-parts":[[2025,7,21]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2016.01.030"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-07821-2_2"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/170036.170072"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1142\/s0219649204000869"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-75765-6_2"},{"key":"e_1_3_2_7_2","first-page":"115","volume-title":"3rd International Conference on Knowledge Discovery and Data Mining","author":"Ali K.","year":"1997","unstructured":"K. Ali, S. Manganaris, and R. Srikant. 1997. Partial classification using association rules. In 3rd International Conference on Knowledge Discovery and Data Mining, 115\u2013118. Retrieved from https:\/\/dl.acm.org\/doi\/10.5555\/3001392.3001412"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-64575-6_68"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-48061-7_59"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1111\/0824-7935.00154"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1177\/0972150921988955"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.123667"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/312129.312263"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1201\/9781315139470"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/253262.253325"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/1961189.1961199"},{"key":"e_1_3_2_17_2","unstructured":"Y. Chen W. Gan Y. Wu and P. S. Yu. 2022. Contrast pattern mining: A survey. arXiv:2209.13556. Retrieved from https:\/\/arxiv.org\/abs\/2209.13556"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3473042"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/bf00994018"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.5555\/3495724.3497168"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1021\/jm00106a046"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-021-02550-9"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/s0022-2836(03)00628-4"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1201\/b12986"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/312129.312191"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1137\/s0895480102412856"},{"key":"e_1_3_2_28_2","unstructured":"G. Fang W. Wang B. Oatley B. Van Ness M. Steinbach and V. Kumar. 2011. Characterizing discriminative patterns. arXiv:1102.4104. Retrieved from https:\/\/arxiv.org\/abs\/1102.4104"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1469-1809.1936.tb02137.x"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-37188-3_13"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46131-1_8"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/access.2021.3119110"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/tfuzz.2020.2992849"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-41822-8_39"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2010.04.008"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.2307\/2532178"},{"key":"e_1_3_2_37_2","unstructured":"R. Gras S. Ag Almouloud M. Bailleul A. Larher M. Polo H. Ratsimba-Rajohn and A. Totohasina. 1996. L\u2019implication statistique nouvelle m\u00e9thode exploratoire de donn\u00e9es. La Pens\u00e9e Sauvage. Retrieved from https:\/\/cir.nii.ac.jp\/crid\/1130282269046733952"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505614"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1515\/comp-2018-0018"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/cvpr.2007.383049"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1137\/1008024"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.123012"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2016.06.004"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/icdm.2003.1250974"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1017\/s0269888912000331"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1002\/minf.201800155"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dam.2018.02.018"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/tse.2021.3069978"},{"issue":"1","key":"e_1_3_2_49_2","first-page":"373","article-title":"Semantic malware detection by deploying graph mining","volume":"9","author":"Karbalaie F.","year":"2012","unstructured":"F. Karbalaie, A. Sami, and M. Ahmadi. 2012. Semantic malware detection by deploying graph mining. International Journal of Computer Science Issues 9, 1 (2012), 373. Retrieved from https:\/\/www.researchgate.net\/publication\/257351721","journal-title":"International Journal of Computer Science Issues"},{"key":"e_1_3_2_50_2","first-page":"321","volume-title":"20th International Conference on Machine Learning","author":"Kashima H.","year":"2003","unstructured":"H. Kashima, K. Tsuda, and A. H. Inokuchi. 2003. Marginalized kernels between labeled graphs. In 20th International Conference on Machine Learning, 321\u2013328. Retrieved from https:\/\/cdn.aaai.org\/ICML\/2003\/ICML03-044.pdf"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/30.1-2.81"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-58547-0_12"},{"key":"e_1_3_2_53_2","first-page":"249","volume-title":"Explora: A Multipattern and Multistrategy Discovery Assistant","author":"Kl\u00f6sgen W.","year":"1996","unstructured":"W. Kl\u00f6sgen. 1996. Explora: A Multipattern and Multistrategy Discovery Assistant. American Association for Artificial Intelligence, 249\u2013271. Retrieved from https:\/\/dl.acm.org\/doi\/10.5555\/257938.257965"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44673-7_1"},{"key":"e_1_3_2_55_2","first-page":"1623","volume-title":"30th International Conference on Neural Information Processing Systems","author":"Kriege N. M.","unstructured":"N. M. Kriege, P. L. Giscard, and R. Wilson. 2016. On valid optimal assignment kernels and applications to graph classification. In 30th International Conference on Neural Information Processing Systems, 1623\u20131631. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2016\/hash\/0efe32849d230d7f53049ddc4a4b0c60-Abstract.html"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1007\/s41109-019-0195-3"},{"key":"e_1_3_2_57_2","first-page":"26598","article-title":"Shapley residuals: Quantifying the limits of the Shapley value for explanations","volume":"34","author":"Kumar I.","year":"2021","unstructured":"I. Kumar, C. Scheidegger, S. Venkatasubramanian, and S. Friedler. 2021. Shapley residuals: Quantifying the limits of the Shapley value for explanations. In Advances in Neural Information Processing Systems, Vol. 34, 26598\u201326608. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2021\/file\/dfc6aa246e88ab3e32caeaaecf433550-Paper.pdf","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-48751-4_17"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.5555\/1005332.1005338"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-44918-8_3"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45571-x_29"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-19-8043-5_26"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10723-020-09526-y"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.3233\/ida-140705"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-019-0138-9"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295230"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-020-01523-7"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/1168149.1168167"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3487553.3524258"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.5555\/2832747.2832773"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1098\/rsta.1896.0007"},{"key":"e_1_3_2_72_2","first-page":"229","volume-title":"Knowledge Discovery in Databases","author":"Piatetsky-Shapiro G.","year":"1991","unstructured":"G. Piatetsky-Shapiro. 1991. Discovery, analysis and presentation of strong rules. In Knowledge Discovery in Databases. G. Piatetsky-Shapiro and W. Frawley (Eds.), AAAI Press, Chapter 13, 229\u2013248. Retrieved from https:\/\/mitpress.mit.edu\/9780262660709\/knowledge-discovery-in-databases\/"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1145\/380995.381018"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-43427-3_5"},{"key":"e_1_3_2_75_2","first-page":"1409","article-title":"A statistical method for evaluating systematic relationships","volume":"38","author":"Sokal R. R.","year":"1958","unstructured":"R. R. Sokal and C. D. Michener. 1958. A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin 38 (1958), 1409\u20131438. Retrieved from http:\/\/www.citeulike.org\/user\/BioNica\/article\/5845721","journal-title":"University of Kansas Science Bulletin"},{"key":"e_1_3_2_76_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11280-006-0012-7"},{"key":"e_1_3_2_77_2","first-page":"5448","volume-title":"36th International Conference on Machine Learning. Proceedings of Machine Learning Research","volume":"97","author":"Rieck B.","year":"2019","unstructured":"B. Rieck, C. Bock, and K. Borgwardt. 2019. A persistent Weisfeiler-Lehman procedure for graph classification. In 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, Vol. 97, 5448\u20135458. Retrieved from https:\/\/proceedings.mlr.press\/v97\/rieck19a.html"},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-89689-0_33"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/p15-1164"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159678"},{"key":"e_1_3_2_81_2","first-page":"28","volume-title":"European Knowledge Acquisition Workshop","volume":"88","author":"Sebag M.","year":"1988","unstructured":"M. Sebag and M. Schoenauer. 1988. Generation of rules with certainty and confidence factors from incomplete and incoherent learning bases. In European Knowledge Acquisition Workshop, Vol. 88, 28. Retrieved from https:\/\/publica.fraunhofer.de\/entities\/event\/b65b8403-dcf9-4edd-9ccd-2b066d3b6608\/details"},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1515\/9781400881970-018"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1016\/s0306-4379(03)00072-3"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-023-02010-5"},{"key":"e_1_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972795.92"},{"key":"e_1_3_2_87_2","doi-asserted-by":"publisher","DOI":"10.1002\/sam.10084"},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972764.76"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btg130"},{"key":"e_1_3_2_90_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-6045-0_11"},{"key":"e_1_3_2_91_2","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630300621"},{"key":"e_1_3_2_92_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-33858-3_2"},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2022.102097"},{"key":"e_1_3_2_94_2","doi-asserted-by":"publisher","DOI":"10.1109\/icdm.2006.39"},{"key":"e_1_3_2_95_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-005-0255-4"},{"key":"e_1_3_2_96_2","doi-asserted-by":"publisher","DOI":"10.1145\/1852102.1852106"},{"key":"e_1_3_2_97_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.09.008"},{"key":"e_1_3_2_98_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2022.109849"},{"key":"e_1_3_2_99_2","doi-asserted-by":"publisher","DOI":"10.1109\/tcyb.2014.2327111"},{"key":"e_1_3_2_100_2","doi-asserted-by":"publisher","DOI":"10.1109\/tnnls.2020.2978386"},{"key":"e_1_3_2_101_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2002.1184038"},{"key":"e_1_3_2_102_2","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783417"},{"key":"e_1_3_2_103_2","doi-asserted-by":"publisher","DOI":"10.5555\/645526.657137"},{"key":"e_1_3_2_104_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972733.40"},{"key":"e_1_3_2_105_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-46027-6"},{"key":"e_1_3_2_106_2","doi-asserted-by":"publisher","DOI":"10.1002\/widm.10"},{"key":"e_1_3_2_107_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aiopen.2021.01.001"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3743143","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T11:02:53Z","timestamp":1753095773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3743143"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,21]]},"references-count":106,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,7,31]]}},"alternative-id":["10.1145\/3743143"],"URL":"https:\/\/doi.org\/10.1145\/3743143","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"type":"print","value":"1556-4681"},{"type":"electronic","value":"1556-472X"}],"subject":[],"published":{"date-parts":[[2025,7,21]]},"assertion":[{"value":"2024-10-03","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-26","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}