{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T19:53:26Z","timestamp":1775246006499,"version":"3.50.1"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2011,8,1]],"date-time":"2011-08-01T00:00:00Z","timestamp":1312156800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000145","name":"Division of Information and Intelligent Systems","doi-asserted-by":"publisher","award":["IIS-0546280CCF-0836359DMS-0915110"],"award-info":[{"award-number":["IIS-0546280CCF-0836359DMS-0915110"]}],"id":[{"id":"10.13039\/100000145","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000143","name":"Division of Computing and Communication Foundations","doi-asserted-by":"publisher","award":["IIS-0546280CCF-0836359DMS-0915110"],"award-info":[{"award-number":["IIS-0546280CCF-0836359DMS-0915110"]}],"id":[{"id":"10.13039\/100000143","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000121","name":"Division of Mathematical Sciences","doi-asserted-by":"publisher","award":["IIS-0546280CCF-0836359DMS-0915110"],"award-info":[{"award-number":["IIS-0546280CCF-0836359DMS-0915110"]}],"id":[{"id":"10.13039\/100000121","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2011,8]]},"abstract":"<jats:p>Document understanding techniques such as document clustering and multidocument summarization have been receiving much attention recently. Current document clustering methods usually represent the given collection of documents as a document-term matrix and then conduct the clustering process. 
Although many of these clustering methods can group the documents effectively, it is still hard for people to capture the meaning of the documents since there is no satisfactory interpretation for each document cluster. A straightforward solution is to first cluster the documents and then summarize each document cluster using summarization methods. However, most of the current summarization methods are solely based on the sentence-term matrix and ignore the context dependence of the sentences. As a result, the generated summaries lack guidance from the document clusters. In this article, we propose a new language model to simultaneously cluster and summarize documents by making use of both the document-term and sentence-term matrices. By utilizing the mutual influence of document clustering and summarization, our method yields (1) a better document clustering method with more meaningful interpretation and (2) an effective document summarization method with guidance from document clustering. 
Experimental results on various document datasets show the effectiveness of our proposed method and the high interpretability of the generated summaries.<\/jats:p>","DOI":"10.1145\/1993077.1993078","type":"journal-article","created":{"date-parts":[[2011,8,10]],"date-time":"2011-08-10T16:16:22Z","timestamp":1312992982000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":59,"title":["Integrating Document Clustering and Multidocument Summarization"],"prefix":"10.1145","volume":"5","author":[{"given":"Dingding","family":"Wang","sequence":"first","affiliation":[{"name":"Florida International University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shenghuo","family":"Zhu","sequence":"additional","affiliation":[{"name":"NEC Laboratories America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Li","sequence":"additional","affiliation":[{"name":"Florida International University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yun","family":"Chi","sequence":"additional","affiliation":[{"name":"NEC Laboratories America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yihong","family":"Gong","sequence":"additional","affiliation":[{"name":"NEC Laboratories America"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2011,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Blei D. M. Ng A. Y. and Jordan M. I. 2002. Latent Dirichlet allocation. In Advances in Neural Information Processing Systems 14 T. G. Dietterich S. Becker and Z. Ghahramani Eds. MIT Press Cambridge MA 601--608. Blei D. M. Ng A. Y. and Jordan M. I. 2002. Latent Dirichlet allocation. In Advances in Neural Information Processing Systems 14 T. G. Dietterich S. Becker and Z. Ghahramani Eds. 
MIT Press Cambridge MA 601--608.","DOI":"10.7551\/mitpress\/1120.003.0082"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of SIAM International Conference on Data Mining.","author":"Cho H.","unstructured":"Cho , H. , Dhillon , I. , Guan , Y. , and Sra , S . 2004. Minimum sum squared residue co-clustering of gene expression data . In Proceedings of SIAM International Conference on Data Mining. Cho, H., Dhillon, I., Guan, Y., and Sra, S. 2004. Minimum sum squared residue co-clustering of gene expression data. In Proceedings of SIAM International Conference on Data Mining."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.384042"},{"key":"e_1_2_1_4_1","volume-title":"Statistics: The Exploration and Analysis of Data","author":"Devore J.","year":"1977","unstructured":"Devore , J. and Peck , R . 1977 . Statistics: The Exploration and Analysis of Data . Duxbury Press . Devore, J. and Peck, R. 1977. Statistics: The Exploration and Analysis of Data. Duxbury Press."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/502512.502550"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956764"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the IEEE International Conference on Data Mining (ICDM). 107--114","author":"Ding C.","unstructured":"Ding , C. , He , X. , Zha , H. , Gu , M. , and Simon , H . 2001. A min-max cut algorithm for graph partitioning and data clustering . In Proceedings of the IEEE International Conference on Data Mining (ICDM). 107--114 . Ding, C., He, X., Zha, H., Gu, M., and Simon, H. 2001. A min-max cut algorithm for graph partitioning and data clustering. In Proceedings of the IEEE International Conference on Data Mining (ICDM). 107--114."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150420"},{"key":"e_1_2_1_9_1","unstructured":"Duda R. Hart P. and Stork D. 2001. Pattern Classification. John Wiley and Sons Inc. Duda R. Hart P. and Stork D. 2001. 
Pattern Classification . John Wiley and Sons Inc."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2007.01.003"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143881"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of International Conference on Empirical Method on Natural Language Processing (EMNLP).","author":"Erkan G.","unstructured":"Erkan , G. and Radev , D . 2004. Lexpagerank: Prestige in multi-document text summarization . In Proceedings of International Conference on Empirical Method on Natural Language Processing (EMNLP). Erkan, G. and Radev, D. 2004. Lexpagerank: Prestige in multi-document text summarization. In Proceedings of International Conference on Empirical Method on Natural Language Processing (EMNLP)."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/312624.312665"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.383955"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of International Joint Conference on Neural Networks (IJCNN).","author":"He J.","unstructured":"He , J. , Lan , M. , Tan , C. , Sung , S. , and Low , H . 2004. Initialization of cluster refinement algorithms: A review and comparative study . In Proceedings of International Joint Conference on Neural Networks (IJCNN). He, J., Lan, M., Tan, C., Sung, S., and Low, H. 2004. Initialization of cluster refinement algorithms: A review and comparative study. 
In Proceedings of International Joint Conference on Neural Networks (IJCNN)."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/312624.312649"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/974305.974329"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.1048"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(02)00222-9"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the Conference on Neural Information Processing Systems (NIPS).","author":"Lee D. D.","unstructured":"Lee , D. D. and Seung , H. S . 2001. Algorithms for non-negative matrix factorization . In Proceedings of the Conference on Neural Information Processing Systems (NIPS). Lee, D. D. and Seung, H. S. 2001. Algorithms for non-negative matrix factorization. In Proceedings of the Conference on Neural Information Processing Systems (NIPS)."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081894"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.160"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1008992.1009031"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073160"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073445.1073465"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564411"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150439"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2009.107"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/984321.984323"},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Mani I. 2001. Automatic Summarization. John Benjamins Publishing Company. Mani I. 2001. Automatic Summarization . 
John Benjamins Publishing Company.","DOI":"10.1075\/nlp.3"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 2nd International Conference on Human Language Technology Research.","author":"McKeown K. R.","unstructured":"McKeown , K. R. , Barzilay , R. , Evans , D. , Hatzivassiloglou , V. , Klavans , J. L. , Nenkova , A. , Sable , C. , Schiffman , B. , and Sigelman , S . 2002. Tracking and summarizing news on a daily basis with Columbia\u2019s newsblaster . In Proceedings of the 2nd International Conference on Human Language Technology Research. McKeown, K. R., Barzilay, R., Evans, D., Hatzivassiloglou, V., Klavans, J. L., Nenkova, A., Sable, C., Schiffman, B., and Sigelman, S. 2002. Tracking and summarizing news on a daily basis with Columbia\u2019s newsblaster. In Proceedings of the 2nd International Conference on Human Language Technology Research."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of International Conference on Natural Language Processing (IJCNLP).","author":"Mihalcea R.","unstructured":"Mihalcea , R. and Tarau , P . 2005. A language independent algorithm for single and multiple document summarization . In Proceedings of International Conference on Natural Language Processing (IJCNLP). Mihalcea, R. and Tarau, P. 2005. A language independent algorithm for single and multiple document summarization. In Proceedings of International Conference on Natural Language Processing (IJCNLP)."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/1613715.1613812"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-69507-3_66"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2003.10.006"},{"key":"e_1_2_1_36_1","unstructured":"Ricardo B. and Berthier R. 1999. Modern Information Retrieval. ACM Press. Ricardo B. and Berthier R. 1999. Modern Information Retrieval . 
ACM Press."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 2862--2867","author":"Shen D.","unstructured":"Shen , D. , Sun , J.-T. , Li , H. , Yang , Q. , and Chen , Z . 2007. Document summarization using conditional random fields . In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 2862--2867 . Shen, D., Sun, J.-T., Li, H., Yang, Q., and Chen, Z. 2007. Document summarization using conditional random fields. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 2862--2867."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.868688"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1162\/153244303321897735"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of SIAM International Conference on Data Mining (SDM).","author":"Tang J.","unstructured":"Tang , J. , Yao , L. , and Chen , D . 2009. Multi-topic based query-oriented Summarization . In Proceedings of SIAM International Conference on Data Mining (SDM). Tang, J., Yao, L., and Chen, D. 2009. Multi-topic based query-oriented Summarization. In Proceedings of SIAM International Conference on Data Mining (SDM)."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277766"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 1586--1591","author":"Wan X.","unstructured":"Wan , X. and Xiao , J . 2009. Graph-based multi-modality learning for topic-focused multi-document summarization . In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 1586--1591 . Wan, X. and Xiao, J. 2009. Graph-based multi-modality learning for topic-focused multi-document summarization. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 
1586--1591."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390386"},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 2903--2908","author":"Wan X.","unstructured":"Wan , X. , Yang , J. , and Xiao , J . 2007. Manifold-ranking based topic-focused multi-document summarization . In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 2903--2908 . Wan, X., Yang, J., and Xiao, J. 2007. Manifold-ranking based topic-focused multi-document summarization. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 2903--2908."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390387"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458319"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL\u201909)","author":"Wang D.","unstructured":"Wang , D. , Zhu , S. , Li , T. , and Gong , Y . 2009. Multi-document summarization using sentence-based topic models . In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL\u201909) . Wang, D., Zhu, S., Li, T., and Gong, Y. 2009. Multi-document summarization using sentence-based topic models. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL\u201909)."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277760"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390384"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/1008992.1009029"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860485"},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 1776--1782","author":"Yih W.-T.","unstructured":"Yih , W.-T. , Goodman , J. 
, Vanderwende , L. , and Suzuki , H . 2007. Multi-document summarization by maximizing informative content-words . In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 1776--1782 . Yih, W.-T., Goodman, J., Vanderwende, L., and Suzuki, H. 2007. Multi-document summarization by maximizing informative content-words. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 1776--1782."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.290956"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/502585.502591"},{"key":"e_1_2_1_55_1","unstructured":"Zhong S. and Ghosh J. 2003. A unified framework for model-based clustering. J. Mach. Learn. Res. 1001--1037. Zhong S. and Ghosh J. 2003. A unified framework for model-based clustering. J. Mach. Learn. Res. 1001--1037."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.5555\/1086342.1086348"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/43.784130"}],"container-title":["ACM Transactions on Knowledge Discovery from 
Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1993077.1993078","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1993077.1993078","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T11:05:45Z","timestamp":1750244745000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1993077.1993078"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,8]]},"references-count":57,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,8]]}},"alternative-id":["10.1145\/1993077.1993078"],"URL":"https:\/\/doi.org\/10.1145\/1993077.1993078","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,8]]},"assertion":[{"value":"2009-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-08-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}