{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,15]],"date-time":"2026-02-15T09:16:16Z","timestamp":1771146976704,"version":"3.50.1"},"reference-count":23,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2005,6,1]],"date-time":"2005-06-01T00:00:00Z","timestamp":1117584000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGKDD Explor. Newsl."],"published-print":{"date-parts":[[2005,6]]},"abstract":"<jats:p>The burgeoning amount of textual data in distributed sources combined with the obstacles involved in creating and maintaining central repositories motivates the need for effective distributed information extraction and mining techniques. Recently, as the need to mine patterns across distributed databases has grown, Distributed Association Rule Mining (D-ARM) algorithms have been developed. These algorithms, however, assume that the databases are either horizontally or vertically distributed. In the special case of databases populated from information extracted from textual data, existing D-ARM algorithms cannot discover rules based on higher-order associations between items in distributed textual documents that are neither vertically nor horizontally distributed, but rather a hybrid of the two. In this article we present D-HOTM, a framework for Distributed Higher Order Text Mining. D-HOTM is a hybrid approach that combines information extraction and distributed data mining. We employ a novel information extraction technique to extract meaningful entities from unstructured text in a distributed environment. The information extracted is stored in local databases and a mapping function is applied to identify globally unique keys. 
Based on the extracted information, a novel distributed association rule mining algorithm is applied to discover higher-order associations between items (i.e., entities) in records fragmented across the distributed databases using the keys. Unlike existing algorithms, D-HOTM requires neither knowledge of a global schema nor that the distribution of data be horizontal or vertical. Evaluation methods are proposed to incorporate the performance of the mapping function into the traditional support metric used in ARM evaluation. An example application of the algorithm on distributed law enforcement data demonstrates the relevance of D-HOTM in the fight against terrorism.<\/jats:p>","DOI":"10.1145\/1089815.1089820","type":"journal-article","created":{"date-parts":[[2007,1,17]],"date-time":"2007-01-17T18:32:02Z","timestamp":1169058722000},"page":"26-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Distributed higher order association rule mining using information extracted from textual data"],"prefix":"10.1145","volume":"7","author":[{"given":"Shenzhi","family":"Li","sequence":"first","affiliation":[{"name":"Lehigh University, Bethlehem, PA"}]},{"given":"Tianhao","family":"Wu","sequence":"additional","affiliation":[{"name":"Lehigh University, Bethlehem, PA"}]},{"given":"William M.","family":"Pottenger","sequence":"additional","affiliation":[{"name":"Lehigh University, Bethlehem, PA"}]}],"member":"320","published-online":{"date-parts":[[2005,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/170035.170072"},{"key":"e_1_2_1_2_1","first-page":"307","volume-title":"Advances in Knowledge Discovery and Data Mining","author":"Agrawal R.","year":"1996"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.553164"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/MDSO.2004.1285877"},{"key":"e_1_2_1_5_1","unstructured":"Boyd D. 
Director of the Department of Homeland Security's new Office of Interoperability and Compatibility in a presentation at the Technologies for Public Safety in Critical Incident Response Conference and Exposition 2004 New Orleans LA September."},{"key":"e_1_2_1_6_1","first-page":"31","volume-title":"Proc. Parallel and Distributed Information Systems, IEEE CS Press","author":"Cheung D. W.","year":"1996"},{"key":"e_1_2_1_7_1","volume-title":"Retrieved","author":"Dean M.","year":"2004"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/645484.656235"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775080"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/253260.253400"},{"key":"e_1_2_1_11_1","volume-title":"Retrieved","author":"GJXDM.","year":"2004"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/88.242438"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218213004001843"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/1034914.1034926"},{"key":"e_1_2_1_15_1","unstructured":"Papakonstantinou Y. and Vassalos V. Architecture and Implementation of an Xquery-based Information Integration Platform. IEEE Data Engineering Bulletin vol 25 n. 
1 pg 18--26 2002."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007780100057"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/375663.375728"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775142"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/971617.971618"},{"key":"e_1_2_1_20_1","volume-title":"Morgan Kaufmann","author":"Witten I.","year":"2000"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/1059502.1059506"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/4434.806975"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1008694.1008698"}],"container-title":["ACM SIGKDD Explorations Newsletter"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1089815.1089820","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1089815.1089820","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:08:16Z","timestamp":1750262896000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1089815.1089820"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,6]]},"references-count":23,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2005,6]]}},"alternative-id":["10.1145\/1089815.1089820"],"URL":"https:\/\/doi.org\/10.1145\/1089815.1089820","relation":{},"ISSN":["1931-0145","1931-0153"],"issn-type":[{"value":"1931-0145","type":"print"},{"value":"1931-0153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,6]]},"assertion":[{"value":"2005-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}