{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,22]],"date-time":"2025-12-22T05:43:40Z","timestamp":1766382220487,"version":"3.41.0"},"reference-count":58,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2022,1,8]],"date-time":"2022-01-08T00:00:00Z","timestamp":1641600000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2022,8,31]]},"abstract":"<jats:p>Risk patterns are crucial in biomedical research and have served as an important factor in precision health and disease prevention. Despite recent development in parallel and high-performance computing, existing risk pattern mining methods still struggle with problems caused by large-scale datasets, such as redundant candidate generation, inability to discover long significant patterns, and prolonged post pattern filtering. In this article, we propose a novel dynamic tree structure, Risk Hierarchical Pattern Tree (RHPTree), and a top-down search method, RHPSearch, which are capable of efficiently analyzing a large volume of data and overcoming the limitations of previous works. The dynamic nature of the RHPTree avoids costly tree reconstruction for the iterative search process and dataset updates. We also introduce two specialized search methods, the extended target search (RHPSearch-TS) and the parallel search approach (RHPSearch-SD), to further speed up the retrieval of certain items of interest. 
Experiments on both UCI machine learning datasets and sampled datasets of the Simons Foundation Autism Research Initiative (SFARI)\u2014Simons Simplex Collection (SSC) datasets demonstrate that our method is not only faster but also more effective in identifying comprehensive long risk patterns than existing works. Moreover, the proposed new tree structure is generic and applicable to other pattern mining problems.<\/jats:p>","DOI":"10.1145\/3488380","type":"journal-article","created":{"date-parts":[[2022,1,8]],"date-time":"2022-01-08T20:51:00Z","timestamp":1641675060000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["RHPTree\u2014Risk Hierarchical Pattern Tree for Scalable Long Pattern Mining"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8525-0436","authenticated-orcid":false,"given":"Danlu","family":"Liu","sequence":"first","affiliation":[{"name":"University of Missouri-Columbia, Columbia, Missouri"}]},{"given":"Yu","family":"Li","sequence":"additional","affiliation":[{"name":"University of Missouri-Columbia, Columbia, Missouri"}]},{"given":"William","family":"Baskett","sequence":"additional","affiliation":[{"name":"University of Missouri-Columbia, Columbia, Missouri"}]},{"given":"Dan","family":"Lin","sequence":"additional","affiliation":[{"name":"University of Missouri-Columbia, Columbia, Missouri"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9197-9522","authenticated-orcid":false,"given":"Chi-Ren","family":"Shyu","sequence":"additional","affiliation":[{"name":"University of Missouri-Columbia, Columbia, 
Missouri"}]}],"member":"320","published-online":{"date-parts":[[2022,1,8]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.5555\/2677098"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2018.03.041"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.5555\/645806.756628"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-019-0081-5"},{"key":"e_1_3_1_6_2","first-page":"D832\u2013D836","article-title":"AutDB: A gene reference resource for autism research","volume":"37","author":"Basu Saumyendra N.","year":"2008","unstructured":"Saumyendra N. Basu, Ravi Kollu, and Sharmila Banerjee-Basu. 2008. AutDB: A gene reference resource for autism research. Nucleic Acids Research 37, Supplement 1 (Nov. 2008), D832\u2013D836. DOI:DOI:https:\/\/doi.org\/10.1093\/nar\/gkn835","journal-title":"Nucleic Acids Research"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-020-01899-7"},{"key":"e_1_3_1_8_2","article-title":"Parallel Collections Overview","author":"Campos Lu\u00eds","year":"2013","unstructured":"Lu\u00eds Campos, Aaron Hawley, Olivier Blanvillain, and Heather Miller. 2013. Parallel Collections Overview. Retrieved September 30, 2010 from https:\/\/docs.scala-lang.org\/overviews\/parallel-collections\/overview.html.","journal-title":"https:\/\/docs.scala-lang.org\/overviews\/parallel-collections\/overview.html"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.31887\/DCNS.2012.14.3\/pchaste"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1001\/jamainternmed.2015.7444"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1001\/archpediatrics.2009.31"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2015.03.004"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/312129.312191"},{"key":"e_1_3_1_14_2","unstructured":"Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository. 
(2017). Retrieved 01 May 2020 from http:\/\/archive.ics.uci.edu\/ml."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2006.95"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-018-03202-2"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuron.2010.10.006"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2750353"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.105241"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-53917-6_9"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-53917-6_9"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2017.04.038"},{"issue":"66","key":"e_1_3_1_23_2","first-page":"E66","article-title":"Defining and measuring chronic conditions: Imperatives for research, policy, program, and practice","volume":"10","author":"Goodman Richard A.","year":"2013","unstructured":"Richard A. Goodman, Samuel F. Posner, Elbert S. Huang, Anand K. Parekh, and Howard K. Koh. 2013. Defining and measuring chronic conditions: Imperatives for research, policy, program, and practice. Preventing Chronic Disease 10, E66 (2013), E66. 
DOI:DOI:https:\/\/doi.org\/10.5888\/pcd10.120239","journal-title":"Preventing Chronic Disease"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335372"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150473"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/69.634757"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-18038-0_56"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.07.043"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1245290"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-34624-8_6"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-34624-8_6"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.5555\/1236480.1236481"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/1065167.1065215"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/1454008.1454027"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/1281192.1281240"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITB.2007.891163"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2010.12.082"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2876531"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2904220"},{"key":"e_1_3_1_40_2","doi-asserted-by":"crossref","unstructured":"JoAnn E. Manson Nancy R. Cook I.-Min Lee William Christen Shari S. Bassuk Samia Mora Heike Gibson Christine M. Albert David Gordon and Trisha Copeland. 2019. Marine n-3 fatty acids and prevention of cardiovascular disease and cancer. 
380 1 (2019) 23\u201332.","DOI":"10.1056\/NEJMoa1811403"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2015.1086"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330929"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2979289"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1038\/nrneurol.2016.187"},{"key":"e_1_3_1_45_2","volume-title":"Distributed Frequent Hierarchical Pattern Mining for Robust and Efficient Large-Scale Association Discovery","author":"Phinney Michael","year":"2017","unstructured":"Michael Phinney. 2017. Distributed Frequent Hierarchical Pattern Mining for Robust and Efficient Large-Scale Association Discovery. Thesis."},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2013.10.013"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-020-01464-1"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2016.2637914"},{"key":"e_1_3_1_49_2","article-title":"A guided FP-Growth algorithm for mining multitude-targeted item-sets and class association rules in imbalanced data","author":"Shabtay Lior","year":"2020","unstructured":"Lior Shabtay, Philippe Fournier-Viger, Rami Yaari, and Itai Dattner. 2020. A guided FP-Growth algorithm for mining multitude-targeted item-sets and class association rules in imbalanced data. Information Sciences 553, (2020), 353\u2013375. DOI:DOI:https:\/\/doi.org\/10.1016\/j.ins.2020.10.020","journal-title":"Information Sciences"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41525-020-0128-1"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/2677855.2677914"},{"issue":"3","key":"e_1_3_1_52_2","first-page":"227","article-title":"Explaining odds ratios","volume":"19","author":"Szumilas Magdalena","year":"2010","unstructured":"Magdalena Szumilas. 2010. Explaining odds ratios. 
Journal of the Canadian Academy of Child and Adolescent Psychiatry 19, 3 (2010), 227\u2013229.","journal-title":"Journal of the Canadian Academy of Child and Adolescent Psychiatry"},{"key":"e_1_3_1_53_2","doi-asserted-by":"crossref","unstructured":"Sotirios Tsimikas Ewa Karwatowska-Prokopczuk Ioanna Gouni-Berthold Jean-Claude Tardif Seth J. Baum Elizabeth Steinhagen-Thiessen Michael D. Shapiro Erik S. Stroes Patrick M. Moriarty and B\u00f8rge G. Nordestgaard. 2020. Lipoprotein (a) reduction in persons with cardiovascular disease. 382 3 (2020) 244\u2013255.","DOI":"10.1056\/NEJMoa1905239"},{"key":"e_1_3_1_54_2","article-title":"LCM ver. 2: Efficient mining algorithms for frequent\/closed\/maximal itemsets","author":"Uno Takeaki","year":"2004","unstructured":"Takeaki Uno, Masashi Kiyomi, and Hiroki Arimura. 2004. LCM ver. 2: Efficient mining algorithms for frequent\/closed\/maximal itemsets. In Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations.","journal-title":"Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1001\/jamanetworkopen.2019.14718"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-18038-0_38"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1007\/11564126_75"},{"key":"e_1_3_1_58_2","doi-asserted-by":"crossref","unstructured":"Yuan Yuan Sihong Xie Chun-Ta Lu Jie Tang and Philip S. Yu. 2016. Interpretable and Effective Opinion Spam Detection Via Temporal Patterns Mining Across Websites. In Proceedings of the 2016 IEEE International Conference on Big Data 96\u2013105 pages. 
DOI:DOI:https:\/\/doi.org\/10.1109\/BigData.2016.7840593","DOI":"10.1109\/BigData.2016.7840593"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2018.12.029"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3488380","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3488380","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:24Z","timestamp":1750188624000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3488380"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,8]]},"references-count":58,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,8,31]]}},"alternative-id":["10.1145\/3488380"],"URL":"https:\/\/doi.org\/10.1145\/3488380","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"type":"print","value":"1556-4681"},{"type":"electronic","value":"1556-472X"}],"subject":[],"published":{"date-parts":[[2022,1,8]]},"assertion":[{"value":"2021-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-01-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}