{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,9]],"date-time":"2025-09-09T22:33:46Z","timestamp":1757457226968},"reference-count":43,"publisher":"IGI Global","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,7,1]]},"abstract":"<p>The Web brings an open-ended set of semantic relations. Discovering the significant types is very challenging. Unsupervised algorithms have been developed to extract relations from a corpus without knowing the relation types in advance, but most rely on tagging arguments of predefined types. One recently reported system is able to jointly extract relations and their argument semantic classes, taking a set of relation instances extracted by an open IE (Information Extraction) algorithm as input. However, it cannot handle polysemy of relation phrases and fails to group many similar (\u201csynonymous\u201d) relation instances because of the sparseness of features. In this paper, the authors present a novel unsupervised algorithm that provides a more general treatment of the polysemy and synonymy problems. The algorithm incorporates various knowledge sources which they will show to be very effective for unsupervised relation extraction. Moreover, it explicitly disambiguates polysemous relation phrases and groups synonymous ones. While maintaining approximately the same precision, the algorithm achieves significant improvement on recall compared to the previous method. It is also very efficient. 
Experiments on a real-world dataset show that it can handle 14.7 million relation instances and extract a very large set of relations from the Web.<\/p>","DOI":"10.4018\/jswis.2012070101","type":"journal-article","created":{"date-parts":[[2013,1,14]],"date-time":"2013-01-14T21:57:39Z","timestamp":1358200659000},"page":"1-23","source":"Crossref","is-referenced-by-count":11,"title":["Towards Large-Scale Unsupervised Relation Extraction from the Web"],"prefix":"10.4018","volume":"8","author":[{"given":"Bonan","family":"Min","sequence":"first","affiliation":[{"name":"New York University, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuming","family":"Shi","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ralph","family":"Grishman","sequence":"additional","affiliation":[{"name":"New York University, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chin-Yew","family":"Lin","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"2432","reference":[{"key":"jswis.2012070101-0","unstructured":"Banko, M., Cafarella, M. J., Soderland, S., Broadhead, M., & Etzioni, O. (2007). Open information extraction from the web. In Proceedings of the International Joint Conference on Artificial Intelligence 2007, Hyderabad, India."},{"key":"jswis.2012070101-1","unstructured":"Banko, M., & Etzioni, O. (2008, June 15-20). The tradeoffs between open and traditional relation extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2008, Columbus, OH."},{"key":"jswis.2012070101-2","unstructured":"Berant, J., Dagan, I., & Goldberger, J. (2011, June 19-24). Global learning of typed entailment rules. 
In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2011, Portland, OR."},{"key":"jswis.2012070101-3","first-page":"993","article-title":"Latent Dirichlet allocation.","volume":"3","author":"D.Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"jswis.2012070101-4","doi-asserted-by":"crossref","unstructured":"Bunescu, R., & Mooney, R. J. (2004, July). Collective information extraction with relational Markov networks. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2004, Barcelona, Spain (pp. 439-446).","DOI":"10.3115\/1218955.1219011"},{"key":"jswis.2012070101-5","unstructured":"Chen, J., Ji, D., Tan, C. L., & Niu, Z. (2005, October 11-13). Unsupervised feature selection for relation extraction. In Proceedings of the International Joint Conference on Natural Language Processing 2005, Jeju Island, Korea."},{"key":"jswis.2012070101-6","doi-asserted-by":"crossref","unstructured":"Etzioni, O., Cafarella, M., Downey, D., Kok, S., Popescu, A., Shaked, T., Soderland, S., Weld, D., & Yates, A. (2004, May 17-22). Web-scale information extraction in KnowItAll (preliminary results). In Proceedings of the International World Wide Web Conference 2004, New York, NY.","DOI":"10.1145\/988672.988687"},{"key":"jswis.2012070101-7","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2005.03.001"},{"key":"jswis.2012070101-8","unstructured":"Fader, A., Soderland, S., & Etzioni, O. (2011, July 27-31). Identifying relations for open information extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2011, Edinburgh, UK."},{"key":"jswis.2012070101-9","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7287.001.0001","author":"C.Fellbaum","year":"1998","journal-title":"WordNet: An electronic lexical database"},{"key":"jswis.2012070101-10","author":"Z. S.Harris","year":"1985","journal-title":"Distributional structure. 
The philosophy of linguistics"},{"key":"jswis.2012070101-11","doi-asserted-by":"crossref","unstructured":"Hasegawa, T., Sekine, S., & Grishman, R. (2004, July 21-26). Discovering relations among named entities from large corpora. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2004, Barcelona, Spain.","DOI":"10.3115\/1218955.1219008"},{"key":"jswis.2012070101-12","doi-asserted-by":"crossref","unstructured":"Hearst, M. A. (1992, August 23-28). Automatic acquisition of hyponyms from large text corpora. In Proceedings of the International Conference on Computational Linguistics 1992, Nantes, France.","DOI":"10.3115\/992133.992154"},{"key":"jswis.2012070101-13","unstructured":"Kok, S., & Domingos, P. (2008, September 15-19). Extracting semantic networks from text via relational clustering. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2008, Antwerp, Belgium."},{"key":"jswis.2012070101-14","unstructured":"Kozareva, Z., Riloff, E., & Hovy, E. (2008, June 15-20). Semantic class learning from the web with hyponym pattern linkage graphs. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2008, Columbus, OH."},{"key":"jswis.2012070101-15","doi-asserted-by":"crossref","unstructured":"Lin, D., & Pantel, P. (2001, August 26-29). DIRT \u2013 discovery of inference rules from text. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2001, San Francisco, CA.","DOI":"10.1145\/502512.502559"},{"key":"jswis.2012070101-16","doi-asserted-by":"crossref","unstructured":"McCallum, A., Nigam, K., & Ungar, L. (2000, August 20-23). Efficient clustering of high-dimensional data sets with application to reference matching. 
In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2000, Boston, MA.","DOI":"10.1145\/347090.347123"},{"key":"jswis.2012070101-17","unstructured":"Min, B., Shi, S., Grishman, R., & Lin, C. Y. (2012, July 12). Ensemble semantics for large-scale unsupervised relation extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2012, Jeju Island, Korea."},{"key":"jswis.2012070101-18","doi-asserted-by":"crossref","unstructured":"Pantel, P., Crestan, E., Borkovsky, A., Popescu, A., & Vyas, V. (2009, August 6-7). Web-scale distributional similarity and entity set expansion. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2009, Singapore.","DOI":"10.3115\/1699571.1699635"},{"key":"jswis.2012070101-19","doi-asserted-by":"crossref","unstructured":"Pantel, P., & Lin, D. (2002, July 23-26). Discovering word senses from text. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2002, Edmonton, Canada.","DOI":"10.1145\/775047.775138"},{"key":"jswis.2012070101-20","unstructured":"Pantel, P., & Ravichandran, D. (2004, May 2-7). Automatically labeling semantic classes. In Proceedings of the North American Chapter of the Association for Computational Linguistics Conference 2004, Boston, MA."},{"key":"jswis.2012070101-21","doi-asserted-by":"crossref","unstructured":"Pasca, M. (2004, October 30-November 1). Acquisition of categorized named entities for web search. In Proceedings of the ACM Conference on Information and Knowledge Management 2004, Maui, HI.","DOI":"10.1145\/1031171.1031194"},{"key":"jswis.2012070101-22","doi-asserted-by":"crossref","unstructured":"Pasca, M. (2007, November 6-10). Weakly-supervised discovery of named entities using web search queries. 
In Proceedings of the ACM Conference on Information and Knowledge Management 2007, Lisbon, Portugal.","DOI":"10.1145\/1321440.1321536"},{"key":"jswis.2012070101-23","doi-asserted-by":"crossref","unstructured":"Pasca, M., & Dienes, P. (2005, June 25-30). Aligning needles in a haystack: Paraphrase acquisition across the web. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2005, Ann Arbor, MI.","DOI":"10.1007\/11562214_11"},{"key":"jswis.2012070101-24","doi-asserted-by":"crossref","unstructured":"Pennacchiotti, M., & Pantel, P. (2009, August 6-7). Entity extraction via ensemble semantics. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2009, Singapore.","DOI":"10.3115\/1699510.1699542"},{"key":"jswis.2012070101-25","doi-asserted-by":"crossref","unstructured":"Rosenfeld, B., & Feldman, R. (2007, November 6-10). Clustering for unsupervised relation identification. In Proceedings of the ACM Conference on Information and Knowledge Management 2007, Lisbon, Portugal.","DOI":"10.1145\/1321440.1321499"},{"key":"jswis.2012070101-26","doi-asserted-by":"crossref","unstructured":"Sarmento, L., Jijkoun, V., de Rijke, M., & Oliveira, E. (2007, November 6-10). More like these: Growing entity classes from seeds. In Proceedings of the ACM Conference on Information and Knowledge Management 2007, Lisbon, Portugal.","DOI":"10.1145\/1321440.1321585"},{"key":"jswis.2012070101-27","unstructured":"Sekine, S. (2005, October 14). Automatic paraphrase discovery based on context and keywords between NE pairs. In Proceedings of the International Workshop on Paraphrasing 2005, Jeju Island, Korea."},{"key":"jswis.2012070101-28","doi-asserted-by":"crossref","unstructured":"Shinyama, Y., & Sekine, S. (2006, June 4-9). Preemptive Information extraction using unrestricted relation discovery. 
In Proceedings of the North American Chapter of the Association for Computational Linguistics Conference 2006, New York, NY.","DOI":"10.3115\/1220835.1220874"},{"key":"jswis.2012070101-29","unstructured":"Snow, R., Jurafsky, D., & Ng, A. Y. (2005, December 5-10). Learning syntactic patterns for automatic hypernym discovery. In Proceedings of Advances in Neural Information Processing Systems 2005 Conference, Vancouver, Canada."},{"key":"jswis.2012070101-30","unstructured":"Soderland, S., & Mandhani, B. (2007, March 26-28). Moving from Textual Relations to Ontologized Relations. In Proceedings of the 2007 AAAI Spring Symposium on Machine Reading, Stanford, CA."},{"key":"jswis.2012070101-31","doi-asserted-by":"crossref","unstructured":"Talukdar, P. P., Reisinger, J., Pasca, M., Ravichandran, D., Bhagat, R., & Pereira, F. (2008, October 25-27). Weakly-supervised acquisition of labeled class instances using graph random walks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2008, Waikiki, HI.","DOI":"10.3115\/1613715.1613787"},{"key":"jswis.2012070101-32","unstructured":"Vickrey, D., Kipersztok, O., & Koller, D. (2010, July 11-16). An active learning approach to finding related terms. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2010. Uppsala, Sweden."},{"key":"jswis.2012070101-33","doi-asserted-by":"crossref","unstructured":"Vyas, V., & Pantel, P. (2009, May 31-June 5). Semiautomatic entity set refinement. In Proceedings of the North American Chapter of the Association for Computational Linguistics Conference 2009, Boulder, CO.","DOI":"10.3115\/1620754.1620796"},{"key":"jswis.2012070101-34","doi-asserted-by":"crossref","unstructured":"Vyas, V., Pantel, P., & Crestan, E. (2009, October 29-November 2). Helping editors choose better seed sets for entity set expansion. 
In Proceedings of the ACM Conference on Information and Knowledge Management 2009, Maui, HI.","DOI":"10.1145\/1645953.1645984"},{"key":"jswis.2012070101-35","doi-asserted-by":"crossref","unstructured":"Wang, R. C., & Cohen, W. W. (2007, October 28-31). Language-independent set expansion of named entities using the web. In Proceedings of IEEE International Conference on Data Mining 2007, Omaha, NE.","DOI":"10.1109\/ICDM.2007.104"},{"key":"jswis.2012070101-36","unstructured":"Wang, R. C., & Cohen, W. W. (2009, August 2-7). Automatic set instance extraction using the web. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2009, Singapore."},{"key":"jswis.2012070101-37","doi-asserted-by":"crossref","unstructured":"Wang, W., Besan\u00e7on, R., & Ferret, O. (2011). Filtering and Clustering Relations for Unsupervised Information Extraction in Open Domain. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, Glasgow, Scotland, UK.","DOI":"10.1145\/2063576.2063780"},{"key":"jswis.2012070101-38","unstructured":"Wu, F., & Weld, D. S. (2010, July 11-16). Open information extraction using Wikipedia. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2010, Uppsala, Sweden."},{"key":"jswis.2012070101-39","doi-asserted-by":"crossref","unstructured":"Wu, H., & Zhou, M. (2003, July 12). Synonymous collocation extraction using translation information. In Proceedings of the ACL Workshop on Multiword Expressions: Integrating Processing 2003, Sapporo, Japan.","DOI":"10.3115\/1075096.1075112"},{"key":"jswis.2012070101-40","unstructured":"Yao, L., Haghighi, A., Riedel, S., & McCallum, A. (2011, July 27-31). Structured relation discovery using generative models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2011, Edinburgh, UK."},{"key":"jswis.2012070101-41","unstructured":"Yates, A., & Etzioni, O. (2007, April 22-27). 
Unsupervised resolution of objects and relations on the web. In Proceedings of the North American Chapter of the Association for Computational Linguistics Conference 2007, Rochester, NY."},{"key":"jswis.2012070101-42","unstructured":"Zhang, H., Zhu, M., Shi, S., & Wen, J. (2009, August 2-7). Employing topic models for pattern-based semantic class discovery. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2009, Singapore."}],"container-title":["International Journal on Semantic Web and Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=74337","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T21:36:24Z","timestamp":1654119384000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jswis.2012070101"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2012,7,1]]},"references-count":43,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2012,7]]}},"URL":"https:\/\/doi.org\/10.4018\/jswis.2012070101","relation":{},"ISSN":["1552-6283","1552-6291"],"issn-type":[{"value":"1552-6283","type":"print"},{"value":"1552-6291","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,7,1]]}}}