{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T17:50:24Z","timestamp":1777485024369,"version":"3.51.4"},"reference-count":29,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2005,12,1]],"date-time":"2005-12-01T00:00:00Z","timestamp":1133395200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2005,12]]},"abstract":"<jats:p>\n            Graphs have become increasingly important in modelling complicated structures and schemaless data such as chemical compounds, proteins, and XML documents. Given a\n            <jats:italic>graph query<\/jats:italic>\n            , it is desirable to retrieve graphs quickly from a large database via indices. In this article, we investigate the issues of indexing graphs and propose a novel indexing model based on\n            <jats:italic>discriminative frequent structures<\/jats:italic>\n            that are identified through a graph mining process. We show that the compact index built under this model can achieve better performance in processing graph queries. Since discriminative frequent structures capture the intrinsic characteristics of the data, they are relatively stable to database updates, thus facilitating sampling-based feature extraction and incremental index maintenance. Our approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining. Furthermore, the concepts developed here can be generalized and applied to indexing sequences, trees, and other complicated structures as well.\n          <\/jats:p>","DOI":"10.1145\/1114244.1114248","type":"journal-article","created":{"date-parts":[[2006,5,8]],"date-time":"2006-05-08T16:09:20Z","timestamp":1147104560000},"page":"960-993","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":78,"title":["Graph indexing based on discriminative frequent structure analysis"],"prefix":"10.1145","volume":"30","author":[{"given":"Xifeng","family":"Yan","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Urbana, IL"}]},{"given":"Philip S.","family":"Yu","sequence":"additional","affiliation":[{"name":"IBM T. J. Watson Research Center, Hawthorne, NY"}]},{"given":"Jiawei","family":"Han","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Urbana, IL"}]}],"member":"320","published-online":{"date-parts":[[2005,12]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 1994 International Conference on Very Large Data Bases (VLDB'94)","author":"Agrawal R.","unstructured":"Agrawal , R. and Srikant , R . 1994. Fast algorithms for mining association rules . In Proceedings of the 1994 International Conference on Very Large Data Bases (VLDB'94) (Santiago, Chile). 487--499.]] Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules. In Proceedings of the 1994 International Conference on Very Large Data Bases (VLDB'94) (Santiago, Chile). 487--499.]]"},{"key":"e_1_2_1_2_1","unstructured":"Alon N. and Spencer J. 1992. The Probabilistic Method. Wiley New York NY.]]  Alon N. and Spencer J. 1992. The Probabilistic Method. Wiley New York NY.]]"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.954600"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 2002 International Conference on Data Mining (ICDM'02)","author":"Borgelt C.","unstructured":"Borgelt , C. and Berthold , M . 2002. Mining molecular fragments: Finding relevant substructures of molecules . In Proceedings of the 2002 International Conference on Data Mining (ICDM'02) (Maebashi, Japan). 211--218.]] Borgelt, C. and Berthold, M. 2002. Mining molecular fragments: Finding relevant substructures of molecules. In Proceedings of the 2002 International Conference on Data Mining (ICDM'02) (Maebashi, Japan). 211--218.]]"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872776"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564706"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/800157.805047"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 2001 International Conference on Very Large Data Bases (VLDB'01)","author":"Cooper B.","unstructured":"Cooper , B. , Sample , N. , Franklin , M. , Hjaltason , G. , and Shadmon , M . 2001. A fast index for semistructured data . In Proceedings of the 2001 International Conference on Very Large Data Bases (VLDB'01) (Roma, Italy). 341--350.]] Cooper, B., Sample, N., Franklin, M., Hjaltason, G., and Shadmon, M. 2001. A fast index for semistructured data. In Proceedings of the 2001 International Conference on Very Large Data Bases (VLDB'01) (Roma, Italy). 341--350.]]"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 1997 International Conference on Very Large Data Bases (VLDB'97)","author":"Goldman R.","unstructured":"Goldman , R. and Widom , J . 1997. Dataguides: Enabling query formulation and optimization in semistructured databases . In Proceedings of the 1997 International Conference on Very Large Data Bases (VLDB'97) (Athens, Greece). 436--445.]] Goldman, R. and Widom, J. 1997. Dataguides: Enabling query formulation and optimization in semistructured databases. In Proceedings of the 1997 International Conference on Very Large Data Bases (VLDB'97) (Athens, Greece). 436--445.]]"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the AAAI'94 Workshop on Knowledge Discovery in Databases (KDD'94)","author":"Holder L.","unstructured":"Holder , L. , Cook , D. , and Djoko , S . 1994. Substructure discovery in the subdue system . In Proceedings of the AAAI'94 Workshop on Knowledge Discovery in Databases (KDD'94) (Seattle, WA). 169--180.]] Holder, L., Cook, D., and Djoko, S. 1994. Substructure discovery in the subdue system. In Proceedings of the AAAI'94 Workshop on Knowledge Discovery in Databases (KDD'94) (Seattle, WA). 169--180.]]"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 2000 European Symposium Principle of Data Mining and Knowledge Discovery (PKDD'00)","author":"Inokuchi A.","unstructured":"Inokuchi , A. , Washio , T. , and Motoda , H . 2000. An apriori-based algorithm for mining frequent substructures from graph data . In Proceedings of the 2000 European Symposium Principle of Data Mining and Knowledge Discovery (PKDD'00) (Lyon, France). 13--23.]] Inokuchi, A., Washio, T., and Motoda, H. 2000. An apriori-based algorithm for mining frequent substructures from graph data. In Proceedings of the 2000 European Symposium Principle of Data Mining and Knowledge Discovery (PKDD'00) (Lyon, France). 13--23.]]"},{"key":"e_1_2_1_12_1","unstructured":"James C. Weininger D. and Delany J. 2003. Daylight theory manual daylight version 4.82. Daylight Chemical Information Systems Inc.]]  James C. Weininger D. and Delany J. 2003. Daylight theory manual daylight version 4.82. Daylight Chemical Information Systems Inc.]]"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 2000 International Conference on Data Engineering (ICDE'02)","author":"Kaushik R.","unstructured":"Kaushik , R. , Shenoy , P. , Bohannon , P. , and Gudes , E . 2002. Exploiting local similarity for efficient indexing of paths in graph structured data . In Proceedings of the 2000 International Conference on Data Engineering (ICDE'02) (San Jose, CA). 129--140.]] Kaushik, R., Shenoy, P., Bohannon, P., and Gudes, E. 2002. Exploiting local similarity for efficient indexing of paths in graph structured data. In Proceedings of the 2000 International Conference on Data Engineering (ICDE'02) (San Jose, CA). 129--140.]]"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 2001 International Conference on Data Mining (ICDM'01)","author":"Kuramochi M.","unstructured":"Kuramochi , M. and Karypis , G . 2001. Frequent subgraph discovery . In Proceedings of the 2001 International Conference on Data Mining (ICDM'01) (San Jose, CA). 313--320.]] Kuramochi, M. and Karypis, G. 2001. Frequent subgraph discovery. In Proceedings of the 2001 International Conference on Data Mining (ICDM'01) (San Jose, CA). 313--320.]]"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 2004 SIAM International Conference on Data Mining (SDM'04)","author":"Kuramochi M.","unstructured":"Kuramochi , M. and Karypis , G . 2004. Finding frequent patterns in a large sparse graph . In Proceedings of the 2004 SIAM International Conference on Data Mining (SDM'04) (Orlando, FL).]] Kuramochi, M. and Karypis, G. 2004. Finding frequent patterns in a large sparse graph. In Proceedings of the 2004 SIAM International Conference on Data Mining (SDM'04) (Orlando, FL).]]"},{"key":"e_1_2_1_16_1","first-page":"289","article-title":"Threading a database of protein cores","volume":"3","author":"Madej T.","year":"1995","unstructured":"Madej , T. , Gibrat , J. , and Bryant , S. 1995 . Threading a database of protein cores . Proteins 3-2 , 289 -- 306 .]] Madej, T., Gibrat, J., and Bryant, S. 1995. Threading a database of protein cores. Proteins 3-2, 289--306.]]","journal-title":"Proteins"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Milo T. and Suciu D. 1999. Index Structures for Path Expressions. Lecture Notes in Computer Science vol. 1540. Springer-Verlag New York 277--295.]]   Milo T. and Suciu D. 1999. Index Structures for Path Expressions. Lecture Notes in Computer Science vol. 1540. Springer-Verlag New York 277--295.]]","DOI":"10.1007\/3-540-49257-7_18"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.599932"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 1995 International Conference Very Large Data Bases (VLDB'95)","author":"Savasere A.","unstructured":"Savasere , A. , Omiecinski , E. , and Navathe , S . 1995. An efficient algorithm for mining association rules in large databases . In Proceedings of the 1995 International Conference Very Large Data Bases (VLDB'95) (Zurich, Switzerland). 432--443.]] Savasere, A., Omiecinski, E., and Navathe, S. 1995. An efficient algorithm for mining association rules in large databases. In Proceedings of the 1995 International Conference Very Large Data Bases (VLDB'95) (Zurich, Switzerland). 432--443.]]"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/543613.543620"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'99)","author":"Shokoufandeh A.","unstructured":"Shokoufandeh , A. , Dickinson , S. , Siddiqi , K. , and Zucker , S . 1999. Indexing using a spectral encoding of topological structure . In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'99) (Fort Collins, CO). IEEE Computer Society Press, Los Alamitos, CA, 2491--2497.]] Shokoufandeh, A., Dickinson, S., Siddiqi, K., and Zucker, S. 1999. Indexing using a spectral encoding of topological structure. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'99) (Fort Collins, CO). IEEE Computer Society Press, Los Alamitos, CA, 2491--2497.]]"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 2003 International Conference on Very Large Data Bases (VLDB'03)","author":"Srinivasa S.","unstructured":"Srinivasa , S. and Kumar , S . 2003. A platform based on the multidimensional data model for analysis of bio-molecular structures . In Proceedings of the 2003 International Conference on Very Large Data Bases (VLDB'03) (Berlin, Germany). 975--986.]] Srinivasa, S. and Kumar, S. 2003. A platform based on the multidimensional data model for analysis of bio-molecular structures. In Proceedings of the 2003 International Conference on Very Large Data Bases (VLDB'03) (Berlin, Germany). 975--986.]]"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 1996 International Conference on Very Large Data Bases (VLDB'96)","author":"Toivonen H.","year":"1996","unstructured":"Toivonen , H. 1996 . Sampling large databases for association rules . In Proceedings of the 1996 International Conference on Very Large Data Bases (VLDB'96) (Bombay, India). 134--145.]] Toivonen, H. 1996. Sampling large databases for association rules. In Proceedings of the 1996 International Conference on Very Large Data Bases (VLDB'96) (Bombay, India). 134--145.]]"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 2002 International Conference on Data Mining (ICDM'02)","author":"Vanetik N.","unstructured":"Vanetik , N. , Gudes , E. , and Shimony , S. E . 2002. Computing frequent graph patterns from semistructured data . In Proceedings of the 2002 International Conference on Data Mining (ICDM'02) (Maebashi, Japan). 458--465.]] Vanetik, N., Gudes, E., and Shimony, S. E. 2002. Computing frequent graph patterns from semistructured data. In Proceedings of the 2002 International Conference on Data Mining (ICDM'02) (Maebashi, Japan). 458--465.]]"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/959242.959249"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/99.641604"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 2002 International Conference on Data Mining (ICDM'02)","author":"Yan X.","unstructured":"Yan , X. and Han , J . 2002. gSpan: Graph-based substructure pattern mining . In Proceedings of the 2002 International Conference on Data Mining (ICDM'02) (Maebashi, Japan). 721--724.]] Yan, X. and Han, J. 2002. gSpan: Graph-based substructure pattern mining. In Proceedings of the 2002 International Conference on Data Mining (ICDM'02) (Maebashi, Japan). 721--724.]]"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956784"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956788"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1114244.1114248","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1114244.1114248","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:08:37Z","timestamp":1750262917000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1114244.1114248"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,12]]},"references-count":29,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2005,12]]}},"alternative-id":["10.1145\/1114244.1114248"],"URL":"https:\/\/doi.org\/10.1145\/1114244.1114248","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"value":"0362-5915","type":"print"},{"value":"1557-4644","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,12]]},"assertion":[{"value":"2005-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}