{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:12:33Z","timestamp":1763467953780,"version":"3.41.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2010,1,1]],"date-time":"2010-01-01T00:00:00Z","timestamp":1262304000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2010,1]]},"abstract":"<jats:p>In this article, we propose a methodology for identifying predictive physiological patterns in the absence of prior knowledge. We use the principle of conservation to identify activity that consistently precedes an outcome in patients, and describe a two-stage process that allows us to efficiently search for such patterns in large datasets. This involves first transforming continuous physiological signals from patients into symbolic sequences, and then searching for patterns in these reduced representations that are strongly associated with an outcome.<\/jats:p>\n          <jats:p>Our strategy of identifying conserved activity that is unlikely to have occurred purely by chance in symbolic data is analogous to the discovery of regulatory motifs in genomic datasets. We build upon existing work in this area, generalizing the notion of a regulatory motif and enhancing current techniques to operate robustly on non-genomic data. We also address two significant considerations associated with motif discovery in general: computational efficiency and robustness in the presence of degeneracy and noise. To deal with these issues, we introduce the concept of active regions and new subset-based techniques such as a two-layer Gibbs sampling algorithm. 
These extensions allow for a framework for information inference, where precursors are identified as approximately conserved activity of arbitrary complexity preceding multiple occurrences of an event.<\/jats:p>\n          <jats:p>We evaluated our solution on a population of patients who experienced sudden cardiac death and attempted to discover electrocardiographic activity that may be associated with the endpoint of death. To assess the predictive patterns discovered, we compared likelihood scores for motifs in the sudden death population against control populations of normal individuals and those with non-fatal supraventricular arrhythmias. Our results suggest that predictive motif discovery may be able to identify clinically relevant information even in the absence of significant prior knowledge.<\/jats:p>","DOI":"10.1145\/1644873.1644875","type":"journal-article","created":{"date-parts":[[2010,1,12]],"date-time":"2010-01-12T20:23:07Z","timestamp":1263327787000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":20,"title":["Motif discovery in physiological datasets"],"prefix":"10.1145","volume":"4","author":[{"given":"Zeeshan","family":"Syed","sequence":"first","affiliation":[{"name":"University of Michigan"}]},{"given":"Collin","family":"Stultz","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"given":"Manolis","family":"Kellis","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"given":"Piotr","family":"Indyk","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"given":"John","family":"Guttag","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]}],"member":"320","published-online":{"date-parts":[[2010,1,18]]},"reference":[{"volume-title":"Proceedings of the International Conference on Intelligence Systems in Molecular Biology. 
21--29","author":"Bailey T.","key":"e_1_2_1_1_1","unstructured":"Bailey, T. and Elkan, C. 1995. The value of prior knowledge in discovering motifs with MEME. In Proceedings of the International Conference on Intelligence Systems in Molecular Biology. 21--29."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956808"},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Cesa-Bianchi, N. and Lugosi, G. 2006. Prediction, Learning, and Games. Cambridge University Press, Cambridge, UK.","DOI":"10.1017\/CBO9780511546921"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1101\/gr.849004"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1063\/1.1531823"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Durbin, R., Eddy, S., Krogh, A., and Mitchison, G. 1998. Biological Sequence Analysis. Cambridge University Press, Cambridge, UK.","DOI":"10.1017\/CBO9780511790492"},{"volume-title":"Data Mining and Knowledge Discovery with Evolutionary Algorithms","author":"Freitas A.","key":"e_1_2_1_7_1","unstructured":"Freitas, A. 2001. Data Mining and Knowledge Discovery with Evolutionary Algorithms. 
Springer-Verlag, Berlin, Germany."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1089\/10665270252935566"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010884214864"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150424"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1161\/01.CIR.101.23.e215"},{"volume-title":"Proceedings of the 13th International Symposium on Foundation of Intelligent Systems.","author":"Harms S.","key":"e_1_2_1_12_1","unstructured":"Harms, S., Deogun, J. and Tadesse, T. 2002. Discovering sequential association rules with constraints and time lags in multiple sequences. In Proceedings of the 13th International Symposium on Foundation of Intelligent Systems."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007396710653"},{"volume-title":"Proceedings of the 3rd International Conference on Intelligent Data Engineering and Automated Learning.","author":"Jin X.","key":"e_1_2_1_14_1","unstructured":"Jin, X., Wang, L., Lu, Y., and Shi, C. 2002. Indexing and mining of the local patterns in sequence database. In Proceedings of the 3rd International Conference on Intelligent Data Engineering and Automated Learning."},{"key":"e_1_2_1_15_1","unstructured":"Jones, N. and Pevzner, P. 2004. An Introduction to Bioinformatics Algorithms. The MIT Press. 
Cambridge, MA."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature01644"},{"volume-title":"Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.","author":"Lin J.","key":"e_1_2_1_17_1","unstructured":"Lin, J., Keogh, E., Lonardi, S., and Patel, P. 2002. Finding motifs in time series. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/882082.882086"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009748302351"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/312129.312281"},{"volume-title":"Proceedings of the International Conference on Data Mining.","author":"Patel P.","key":"e_1_2_1_21_1","unstructured":"Patel, P., Keogh, E., Lin, J., and Lonardi, S. 2002. Mining motifs in massive time series databases. 
In Proceedings of the International Conference on Data Mining."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1162\/089976601750264965"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.86.4.1183"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1155\/2007\/67938"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1644873.1644875","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1644873.1644875","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:41:18Z","timestamp":1750250478000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1644873.1644875"}},"subtitle":["A methodology for inferring predictive elements"],"short-title":[],"issued":{"date-parts":[[2010,1]]},"references-count":24,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,1]]}},"alternative-id":["10.1145\/1644873.1644875"],"URL":"https:\/\/doi.org\/10.1145\/1644873.1644875","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"type":"print","value":"1556-4681"},{"type":"electronic","value":"1556-472X"}],"subject":[],"published":{"date-parts":[[2010,1]]},"assertion":[{"value":"2008-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-01-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}