{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T06:48:12Z","timestamp":1774421292570,"version":"3.50.1"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2008,7,1]],"date-time":"2008-07-01T00:00:00Z","timestamp":1214870400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","award":["250960-06"],"award-info":[{"award-number":["250960-06"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2008,7]]},"abstract":"<jats:p>\n            Attribute data and relationship data are two principal types of data, representing the intrinsic and extrinsic properties of entities. While attribute data have been the main source of data for cluster analysis, relationship data such as social networks or metabolic networks are becoming increasingly available. It is also common to observe both data types carry complementary information such as in market segmentation and community identification, which calls for a joint cluster analysis of both data types so as to achieve better results. In this article, we introduce the novel Connected\n            <jats:italic>k<\/jats:italic>\n            -Center (\n            <jats:italic>CkC<\/jats:italic>\n            ) problem, a clustering model taking into account attribute data as well as relationship data. We analyze the complexity of the problem and prove its NP-hardness. Therefore, we analyze the approximability of the problem and also present a constant factor approximation algorithm. For the special case of the\n            <jats:italic>CkC<\/jats:italic>\n            problem where the relationship data form a tree structure, we propose a dynamic programming method giving an optimal solution in polynomial time. We further present NetScan, a heuristic algorithm that is efficient and effective for large real databases. Our extensive experimental evaluation on real datasets demonstrates the meaningfulness and accuracy of the NetScan results.\n          <\/jats:p>","DOI":"10.1145\/1376815.1376816","type":"journal-article","created":{"date-parts":[[2008,7,22]],"date-time":"2008-07-22T13:04:05Z","timestamp":1216731845000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":34,"title":["Joint cluster analysis of attribute data and relationship data"],"prefix":"10.1145","volume":"2","author":[{"given":"Rong","family":"Ge","sequence":"first","affiliation":[{"name":"Simon Fraser University, Burnaby, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Martin","family":"Ester","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Byron J.","family":"Gao","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zengjian","family":"Hu","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Binay","family":"Bhattacharya","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Boaz","family":"Ben-Moshe","sequence":"additional","affiliation":[{"name":"Ariel University Center, Ariel, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2008,7,24]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-001-0110-y"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304187"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0378-4371(02)00736-7"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/380752.380754"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014062"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btg363"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 11th Annual European Symposium on Algorithms","author":"Brandes U.","unstructured":"Brandes , U. , Gaertler , M. , and Wagner , D . 2003. Experiments on graph clustering algorithms . In Proceedings of the 11th Annual European Symposium on Algorithms ( Budapest, Hungary). Springer-Verlag, Berlin\/Heidelberg, Germany, 568--579.]] Brandes, U., Gaertler, M., and Wagner, D. 2003. Experiments on graph clustering algorithms. In Proceedings of the 11th Annual European Symposium on Algorithms (Budapest, Hungary). Springer-Verlag, Berlin\/Heidelberg, Germany, 568--579.]]"},{"key":"e_1_2_1_8_1","volume-title":"On the complexity of clustering problems","author":"Brucker P.","unstructured":"Brucker , P. 1977. On the complexity of clustering problems . In Optimization and Operations Research, R. Hehn, B. Korte, and W. Oettli, Eds. Springer-Verlag, Berlin , Germany , 45--54.]] Brucker, P. 1977. On the complexity of clustering problems. In Optimization and Operations Research, R. Hehn, B. Korte, and W. Oettli, Eds. Springer-Verlag, Berlin, Germany, 45--54.]]"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/43.310898"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.2002.1882"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2003.07.0014"},{"key":"e_1_2_1_12_1","unstructured":"CiteSeer. 2006. Scientific literature digital library. http:\/\/citeseer.ist.psu.edu\/.]]  CiteSeer. 2006. Scientific literature digital library. http:\/\/citeseer.ist.psu.edu\/.]]"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 5th SIAM International Conference on Data Mining","author":"Davidson I.","unstructured":"Davidson , I. and Ravi , S. S . 2005. Clustering with constraints: Feasibility issues and the k-means algorithm . In Proceedings of the 5th SIAM International Conference on Data Mining ( Newport Beach, CA). Society for Industrial and Applied Mathematics, Philadelphia, PA, 138--149.]] Davidson, I. and Ravi, S. S. 2005. Clustering with constraints: Feasibility issues and the k-means algorithm. In Proceedings of the 5th SIAM International Conference on Data Mining (Newport Beach, CA). Society for Industrial and Applied Mathematics, Philadelphia, PA, 138--149.]]"},{"key":"e_1_2_1_14_1","unstructured":"DBLP. Computer science bibliography. http:\/\/www.informatik.uni-trier.de\/~ley\/db\/index.html.]]  DBLP. Computer science bibliography. http:\/\/www.informatik.uni-trier.de\/~ley\/db\/index.html.]]"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.1115"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 7th Scandinavian Workshop on Algorithm Theory","author":"Doddi S.","unstructured":"Doddi , S. , Marathe , M. V. , Ravi , S. S. , Taylor , D. , and Widmayer , P . 2000. Approximation algorithms for clustering to minimize the sum of diameters . In Proceedings of the 7th Scandinavian Workshop on Algorithm Theory ( Bergen, Norway). Springer-Verlag, Berlin\/Heidelberg, Germany, 237--250.]] Doddi, S., Marathe, M. V., Ravi, S. S., Taylor, D., and Widmayer, P. 2000. Approximation algorithms for clustering to minimize the sum of diameters. In Proceedings of the 7th Scandinavian Workshop on Algorithm Theory (Bergen, Norway). Springer-Verlag, Berlin\/Heidelberg, Germany, 237--250.]]"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-6377(85)90002-1"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 6th SIAM Conference on Data Mining","author":"Ester M.","unstructured":"Ester , M. , Ge , R. , Gao , B. J. , Hu , Z. , and Ben-Moshe , B . 2006. Joint cluster analysis of attribute data and relationship data: the connected k-center problem . In Proceedings of the 6th SIAM Conference on Data Mining ( Bethesda, MD). Society for Industrial and Applied Mathematics, Philadelphia, PA, 246--257.]] Ester, M., Ge, R., Gao, B. J., Hu, Z., and Ben-Moshe, B. 2006. Joint cluster analysis of attribute data and relationship data: the connected k-center problem. In Proceedings of the 6th SIAM Conference on Data Mining (Bethesda, MD). Society for Industrial and Applied Mathematics, Philadelphia, PA, 246--257.]]"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining","author":"Ester M.","unstructured":"Ester , M. , Kriegel , H. , Sander , J. , and Xu , X . 1996. A density-based algorithm for discovering clusters in large spatial databases with noise . In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining ( Portland, OR). AAAI Press, 226--231.]] Ester, M., Kriegel, H., Sander, J., and Xu, X. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (Portland, OR). AAAI Press, 226--231.]]"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/62212.62255"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 13th Annual Conference on Information Science and Systems","author":"Frederickson G. N.","unstructured":"Frederickson , G. N. and Johnson , D. B . 1979. Optimal algorithms for generating quantile information in x + y and matrices with sorted columns . In Proceedings of the 13th Annual Conference on Information Science and Systems . The Johns Hopkins Univ., Baltimore, MD, 47--52.]] Frederickson, G. N. and Johnson, D. B. 1979. Optimal algorithms for generating quantile information in x + y and matrices with sorted columns. In Proceedings of the 13th Annual Conference on Information Science and Systems. The Johns Hopkins Univ., Baltimore, MD, 47--52.]]"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(85)90224-5"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 15th International Conference on Data Engineering","author":"Guha S.","unstructured":"Guha , S. , Rastogi , R. , and Shim , K . 1999. Rock: A robust clustering algorithm for categorical attributes . In Proceedings of the 15th International Conference on Data Engineering ( Sydney, Austrialia). IEEE Computer Society, Los Alamitos, CA, 512--521.]] Guha, S., Rastogi, R., and Shim, K. 1999. Rock: A robust clustering algorithm for categorical attributes. In Proceedings of the 15th International Conference on Data Engineering (Sydney, Austrialia). IEEE Computer Society, Los Alamitos, CA, 512--521.]]"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0166-218X(98)00100-0"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/18.suppl_1.S145"},{"key":"e_1_2_1_26_1","unstructured":"Hanneman R. A. and Riddle M. 2005. Introduction to social network methods. http:\/\/faculty.ucr.edu\/~hanneman\/.]]  Hanneman R. A. and Riddle M. 2005. Introduction to social network methods. http:\/\/faculty.ucr.edu\/~hanneman\/.]]"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0020-0190(00)00142-3"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.10.2.180"},{"key":"e_1_2_1_29_1","volume-title":"Networks in Marketing","author":"Iacobucci D.","unstructured":"Iacobucci , D. 1996. Networks in Marketing . Sage Publications , Thousand Oaks, CA .]] Iacobucci, D. 1996. Networks in Marketing. Sage Publications, Thousand Oaks, CA.]]"},{"key":"e_1_2_1_30_1","unstructured":"Jain A. and Dubes R. 1988. Algorithms for clustering data. Prentice-Hall Englewood Cliffs NJ.]]   Jain A. and Dubes R. 1988. Algorithms for clustering data. Prentice-Hall Englewood Cliffs NJ.]]"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/375827.375845"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1137\/0137041"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.781637"},{"key":"e_1_2_1_34_1","volume-title":"Data: An Introduction to Cluster Analysis","author":"Kaufman L.","year":"1990","unstructured":"Kaufman , L. and Rousseeuw , P . 1990 . Finding Groups in Data: An Introduction to Cluster Analysis . Wiley , New York .]] Kaufman, L. and Rousseeuw, P. 1990. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.]]"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102409"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(92)90208-D"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1982.1056489"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 5th Berkeley Symposium on Mathematics, Statistics and Probability. 281--297","author":"MacQueen J.","year":"1967","unstructured":"MacQueen , J. 1967 . Some methods for classification and analysis of multivariate observations . In Proceedings of the 5th Berkeley Symposium on Mathematics, Statistics and Probability. 281--297 .]] MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematics, Statistics and Probability. 281--297.]]"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1137\/0213014"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1137\/0210023"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1281192.1281248"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 20th International Conference on Very Large Data Bases (Santiago de Chile, Chile). Morgan Kaufmann","author":"Ng R. T.","unstructured":"Ng , R. T. and Han , J . 1994. Efficient and effective clustering methods for spatial data mining . In Proceedings of the 20th International Conference on Very Large Data Bases (Santiago de Chile, Chile). Morgan Kaufmann , San Francisco, CA, 144--155.]] Ng, R. T. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of the 20th International Conference on Very Large Data Bases (Santiago de Chile, Chile). Morgan Kaufmann, San Francisco, CA, 144--155.]]"},{"key":"e_1_2_1_43_1","volume-title":"Social Network Analysis: A handbook","author":"Scott J.","unstructured":"Scott , J. 2000. Social Network Analysis: A handbook . Sage Publications, Thousand Oaks , CA .]] Scott, J. 2000. Social Network Analysis: A handbook. Sage Publications, Thousand Oaks, CA.]]"},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Segal E. Wang H. and Koller D. 2003. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics (Suppl. 1) 19 264--272.]]  Segal E. Wang H. and Koller D. 2003. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics (Suppl. 1) 19 264--272.]]","DOI":"10.1093\/bioinformatics\/btg1037"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.868688"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1091\/mbc.9.12.3273"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkj109"},{"key":"e_1_2_1_48_1","volume-title":"KDD Workshop on Text Mining.]]","author":"Steinbach M.","unstructured":"Steinbach , M. , Karypis , G. , and Kumar , V . 2000. A comparison of document clustering techniques . In KDD Workshop on Text Mining.]] Steinbach, M., Karypis, G., and Kumar, V. 2000. A comparison of document clustering techniques. In KDD Workshop on Text Mining.]]"},{"key":"e_1_2_1_49_1","first-page":"801","article-title":"Sur la division des corp materiels en parties. Bulletin L'Acadmie Polonaise des Science C1","author":"Steinhaus H.","year":"1956","unstructured":"Steinhaus , H. 1956 . Sur la division des corp materiels en parties. Bulletin L'Acadmie Polonaise des Science C1 . III , IV , 801 -- 804 .]] Steinhaus, H. 1956. Sur la division des corp materiels en parties. Bulletin L'Acadmie Polonaise des Science C1. III, IV, 801--804.]]","journal-title":"III"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-004-1112-3"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-6377(96)00021-1"},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of 17th International Joint Conference on Artificial Intelligence","author":"Taskar B.","unstructured":"Taskar , B. , Segal , E. , and Koller , D . 2001. Probabilistic classification and clustering in relational data . In Proceedings of 17th International Joint Conference on Artificial Intelligence ( Seattle, WA). Morgan Kaufmann, San Francisco, CA, 870--878.]] Taskar, B., Segal, E., and Koller, D. 2001. Probabilistic classification and clustering in relational data. In Proceedings of 17th International Joint Conference on Artificial Intelligence (Seattle, WA). Morgan Kaufmann, San Francisco, CA, 870--878.]]"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1287\/opre.19.6.1363"},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 8th International Conference on Database Theory","author":"Tung A. K. H.","unstructured":"Tung , A. K. H. , Ng , R. T. , Lakshmanan , L. V. S. , and Han , J . 2001. Constraint-based clustering in large databases . In Proceedings of the 8th International Conference on Database Theory ( London, UK). Springer-Verlag, New York, 405--419.]] Tung, A. K. H., Ng, R. T., Lakshmanan, L. V. S., and Han, J. 2001. Constraint-based clustering in large databases. In Proceedings of the 8th International Conference on Database Theory (London, UK). Springer-Verlag, New York, 405--419.]]"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1186\/1752-0509-1-8"},{"key":"e_1_2_1_56_1","doi-asserted-by":"crossref","unstructured":"Wasserman S. and Faust K. 1994. Social Network Analysis. Cambridge University Press Cambridge UK.]]  Wasserman S. and Faust K. 1994. Social Network Analysis. Cambridge University Press Cambridge UK.]]","DOI":"10.1017\/CBO9780511815478"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1441-3582(04)70094-4"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1376815.1376816","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1376815.1376816","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:57:55Z","timestamp":1750255075000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1376815.1376816"}},"subtitle":["The connected\n            <i>k<\/i>\n            -center problem, algorithms and applications"],"short-title":[],"issued":{"date-parts":[[2008,7]]},"references-count":57,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2008,7]]}},"alternative-id":["10.1145\/1376815.1376816"],"URL":"https:\/\/doi.org\/10.1145\/1376815.1376816","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,7]]},"assertion":[{"value":"2006-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2007-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2008-07-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}