{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T03:33:09Z","timestamp":1762659189816,"version":"3.41.0"},"reference-count":27,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2012,6,1]],"date-time":"2012-06-01T00:00:00Z","timestamp":1338508800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000145","name":"Division of Information and Intelligent Systems","doi-asserted-by":"publisher","award":["IIS-0325838DMI-0600384"],"award-info":[{"award-number":["IIS-0325838DMI-0600384"]}],"id":[{"id":"10.13039\/100000145","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000143","name":"Division of Computing and Communication Foundations","doi-asserted-by":"publisher","award":["CNS-0721491CCF-0915922"],"award-info":[{"award-number":["CNS-0721491CCF-0915922"]}],"id":[{"id":"10.13039\/100000143","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000006","name":"Office of Naval Research","doi-asserted-by":"publisher","award":["N000140610607"],"award-info":[{"award-number":["N000140610607"]}],"id":[{"id":"10.13039\/100000006","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000144","name":"Division of Computer and Network Systems","doi-asserted-by":"publisher","award":["CNS-0721491CCF-0915922"],"award-info":[{"award-number":["CNS-0721491CCF-0915922"]}],"id":[{"id":"10.13039\/100000144","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-0325838DMI-0600384"],"award-info":[{"award-number":["IIS-0325838DMI-0600384"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004963","name":"Seventh Framework 
Programme","doi-asserted-by":"publisher","award":["15964 AEOLUS"],"award-info":[{"award-number":["15964 AEOLUS"]}],"id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. ACM"],"published-print":{"date-parts":[[2012,6]]},"abstract":"<jats:p>As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In this work, we address significance in the context of frequent itemset mining. Specifically, we develop a novel methodology to identify a meaningful support threshold <jats:italic>s<\/jats:italic><jats:sup>*<\/jats:sup> for a dataset, such that the number of itemsets with support at least <jats:italic>s<\/jats:italic><jats:sup>*<\/jats:sup> represents a substantial deviation from what would be expected in a random dataset with the same number of transactions and the same individual item frequencies. These itemsets can then be flagged as statistically significant with a small false discovery rate. 
We present extensive experimental results to substantiate the effectiveness of our methodology.<\/jats:p>","DOI":"10.1145\/2220357.2220359","type":"journal-article","created":{"date-parts":[[2012,7,10]],"date-time":"2012-07-10T16:40:44Z","timestamp":1341938444000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["An Efficient Rigorous Approach for Identifying Statistically Significant Frequent Itemsets"],"prefix":"10.1145","volume":"59","author":[{"given":"Adam","family":"Kirsch","sequence":"first","affiliation":[{"name":"Harvard University"}]},{"given":"Michael","family":"Mitzenmacher","sequence":"additional","affiliation":[{"name":"Harvard University"}]},{"given":"Andrea","family":"Pietracaprina","sequence":"additional","affiliation":[{"name":"University of Padova"}]},{"given":"Geppino","family":"Pucci","sequence":"additional","affiliation":[{"name":"University of Padova"}]},{"given":"Eli","family":"Upfal","sequence":"additional","affiliation":[{"name":"Brown University"}]},{"given":"Fabio","family":"Vandin","sequence":"additional","affiliation":[{"name":"Brown University"}]}],"member":"320","published-online":{"date-parts":[[2012,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/275487.275490"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/170035.170072"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1214\/ss\/1177012015"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1995.tb02031.x"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1013699998"},{"key":"e_1_2_1_6_1","first-page":"36","article-title":"Determining hit rate in pattern search. In Proceedings of Pattern Detection and Discovery","volume":"2447","author":"Bolton R.","year":"2002","unstructured":"Bolton , R. , Hand , D. , and Adams , N. 2002 . Determining hit rate in pattern search. 
In Proceedings of Pattern Detection and Discovery . Lecture Notes in Artificial Intelligence , vol. 2447 , 36 -- 48 . Bolton, R., Hand, D., and Adams, N. 2002. Determining hit rate in pattern search. In Proceedings of Pattern Detection and Discovery. Lecture Notes in Artificial Intelligence, vol. 2447, 36--48.","journal-title":"Lecture Notes in Artificial Intelligence"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1214\/ss\/1056397487"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1080\/00031305.1999.10474456","article-title":"Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system","volume":"53","author":"DuMouchel W.","year":"1999","unstructured":"DuMouchel , W. 1999 . Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system . Amer. Statistician 53 , 177 -- 202 . DuMouchel, W. 1999. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. Amer. Statistician 53, 177--202.","journal-title":"Amer. Statistician"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/502512.502526"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150424"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 1st Workshop on Frequent Itemset Mining Implementations (FIMI03)","volume":"90","author":"Goethals B.","unstructured":"Goethals , B. , and Zaki , M. J. , Eds . 2003 . Proceedings of the 1st Workshop on Frequent Itemset Mining Implementations (FIMI03) . Vol. 90 , CEUR-WS Workshop Online Proceedings. Goethals, B., and Zaki, M. J., Eds. 2003. Proceedings of the 1st Workshop on Frequent Itemset Mining Implementations (FIMI03). Vol. 
90, CEUR-WS Workshop Online Proceedings."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 2nd Workshop on Frequent Itemset Mining Implementations (FIMI04)","volume":"126","author":"Goethals B.","unstructured":"Goethals , B. , Bayardo , R. , and Zaki , M. J. , Eds . 2004 . Proceedings of the 2nd Workshop on Frequent Itemset Mining Implementations (FIMI04) . Vol. 126 , CEUR-WS Workshop Online Proceedings. Goethals, B., Bayardo, R., and Zaki, M. J., Eds. 2004. Proceedings of the 2nd Workshop on Frequent Itemset Mining Implementations (FIMI04). Vol. 126, CEUR-WS Workshop Online Proceedings."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.144"},{"key":"e_1_2_1_14_1","volume-title":"Data Mining: Concepts and Techniques. Morgan Kaufmann","author":"Han J.","year":"2001","unstructured":"Han , J. , and Kamber , M . 2001 . Data Mining: Concepts and Techniques. Morgan Kaufmann , San Mateo, CA . Han, J., and Kamber, M. 2001. Data Mining: Concepts and Techniques. Morgan Kaufmann, San Mateo, CA."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-006-0059-1"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081887"},{"volume-title":"Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining. 274--278","author":"Megiddo N.","key":"e_1_2_1_17_1","unstructured":"Megiddo , N. and Srikant , R . 1998. Discovering predictive association rules . In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining. 274--278 . Megiddo, N. and Srikant, R. 1998. Discovering predictive association rules. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining. 274--278."},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Mitzenmacher M. and Upfal E. 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press Cambridge UK. Mitzenmacher M. and Upfal E. 
2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis . Cambridge University Press Cambridge UK.","DOI":"10.1017\/CBO9780511813603"},{"volume-title":"Proceedings of the 7th International Conference on Database Theory. 398--416","author":"Pasquier N.","key":"e_1_2_1_19_1","unstructured":"Pasquier , N. , Bastide , Y. , Taouil , R. , and Lakhal , L . 1999. Discovering frequent closed itemsets for association rules . In Proceedings of the 7th International Conference on Database Theory. 398--416 . Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. 1999. Discovering frequent closed itemsets for association rules. In Proceedings of the 7th International Conference on Database Theory. 398--416."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1137\/S0097539703422881"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1133905.1133912"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.553165"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009713703947"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/233269.233311"},{"key":"e_1_2_1_25_1","unstructured":"Tan P. Steinbach M. and Kumar V. 2006. Introduction to Data Mining. Addison Wesley. Tan P. Steinbach M. and Kumar V. 2006. Introduction to Data Mining . Addison Wesley."},{"volume-title":"Proceedings of the 31st Very Large Data Base Conference. 709--720","author":"Xin D.","key":"e_1_2_1_26_1","unstructured":"Xin , D. , Han , J. , Yan , X. , and Cheng , H . 2005. Mining compressed frequent-pattern sets . In Proceedings of the 31st Very Large Data Base Conference. 709--720 . Xin, D., Han, J., Yan, X., and Cheng, H. 2005. Mining compressed frequent-pattern sets. In Proceedings of the 31st Very Large Data Base Conference. 
709--720."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014094"}],"container-title":["Journal of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2220357.2220359","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2220357.2220359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:00:46Z","timestamp":1750276846000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2220357.2220359"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,6]]},"references-count":27,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2012,6]]}},"alternative-id":["10.1145\/2220357.2220359"],"URL":"https:\/\/doi.org\/10.1145\/2220357.2220359","relation":{},"ISSN":["0004-5411","1557-735X"],"issn-type":[{"type":"print","value":"0004-5411"},{"type":"electronic","value":"1557-735X"}],"subject":[],"published":{"date-parts":[[2012,6]]},"assertion":[{"value":"2009-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}