{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:24:51Z","timestamp":1759134291337},"reference-count":70,"publisher":"Cambridge University Press (CUP)","issue":"3","license":[{"start":{"date-parts":[[2010,6,15]],"date-time":"2010-06-15T00:00:00Z","timestamp":1276560000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2010,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We explore the use of independent component analysis (ICA) for the automatic extraction of linguistic roles or features of words. The extraction is based on the unsupervised analysis of text corpora. We contrast ICA with singular value decomposition (SVD), widely used in statistical text analysis, in general, and specifically in latent semantic analysis (LSA). However, the representations found using the SVD analysis cannot easily be interpreted by humans. In contrast, ICA applied on word context data gives distinct features which reflect linguistic categories. In this paper, we provide justification for our approach called WordICA, present the WordICA method in detail, compare the obtained results with traditional linguistic categories and with the results achieved using an SVD-based method, and discuss the use of the method in practical natural language engineering solutions such as machine translation systems. As the WordICA method is based on unsupervised learning and thus provides a general means for efficient knowledge acquisition, we foresee that the approach has a clear potential for practical applications.<\/jats:p>","DOI":"10.1017\/s1351324910000057","type":"journal-article","created":{"date-parts":[[2010,6,15]],"date-time":"2010-06-15T13:17:38Z","timestamp":1276607858000},"page":"277-308","source":"Crossref","is-referenced-by-count":12,"title":["WordICA\u2014emergence of linguistic representations for words by independent component analysis"],"prefix":"10.1017","volume":"16","author":[{"given":"TIMO","family":"HONKELA","sequence":"first","affiliation":[]},{"given":"AAPO","family":"HYV\u00c4RINEN","sequence":"additional","affiliation":[]},{"given":"JAAKKO J.","family":"V\u00c4YRYNEN","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2010,6,15]]},"reference":[{"key":"S1351324910000057_ref9","doi-asserted-by":"publisher","DOI":"10.3115\/1626394.1626403"},{"key":"S1351324910000057_ref50","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-0193(1998)6:5\/6<368::AID-HBM7>3.0.CO;2-E"},{"key":"S1351324910000057_ref8","first-page":"59","volume-title":"Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI)","author":"Buntine","year":"2004"},{"key":"S1351324910000057_ref1","first-page":"401","volume-title":"Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000)","author":"Bazzi","year":"2000"},{"key":"S1351324910000057_ref31","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7011.001.0001","volume-title":"Unsupervised Learning","author":"Hinton","year":"1999"},{"key":"S1351324910000057_ref69","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324998001946"},{"key":"S1351324910000057_ref20","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9"},{"key":"S1351324910000057_ref10","first-page":"109","volume-title":"Proceedings of the Second Conference of the North American chapter of the Association for Computational Linguistics (NAACL'01)","author":"Choi","year":"2001"},{"key":"S1351324910000057_ref2","first-page":"198","volume-title":"Poster Proceedings of the 10th International World Wide Web Conference (WWW10)","author":"Bingham","year":"2001"},{"key":"S1351324910000057_ref53","doi-asserted-by":"publisher","DOI":"10.3115\/977035.977046"},{"key":"S1351324910000057_ref14","first-page":"91","volume-title":"Proceedings of the Fourth Conference on Computational Language Learning (CoNLL-2000)","author":"Clark","year":"2000"},{"key":"S1351324910000057_ref47","doi-asserted-by":"publisher","DOI":"10.1017\/S0257543400001061"},{"key":"S1351324910000057_ref4","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324910000057_ref37","doi-asserted-by":"publisher","DOI":"10.1109\/72.761722"},{"key":"S1351324910000057_ref63","first-page":"347","volume-title":"Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL-03)","author":"Ueffing","year":"2003"},{"key":"S1351324910000057_ref12","doi-asserted-by":"publisher","DOI":"10.3115\/974235.974260"},{"key":"S1351324910000057_ref38","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-84882-491-1"},{"key":"S1351324910000057_ref43","doi-asserted-by":"crossref","first-page":"32","DOI":"10.3923\/itj.2005.32.37","article-title":"Improving Arabic information retrieval systems using part of speech tagging","volume":"4","author":"Kanaan","year":"2005","journal-title":"Information Technology Journal"},{"key":"S1351324910000057_ref5","first-page":"689","volume-title":"Proceedings of ICA 2007, the 7th Conference on Independent Component Analysis and Signal Separation","author":"Borschbach","year":"2007"},{"key":"S1351324910000057_ref59","doi-asserted-by":"publisher","DOI":"10.1109\/SUPERC.1992.236684"},{"key":"S1351324910000057_ref21","volume-title":"AAAI Symposium on Cross-Language Text and Speech Retrieval","author":"Dumais","year":"1997"},{"key":"S1351324910000057_ref15","unstructured":"Clark A. 2001. Unsupervised Language Acquisition: Theory and Practice. PhD thesis. Falmer, East Sussex, UK: University of Sussex."},{"key":"S1351324910000057_ref17","doi-asserted-by":"publisher","DOI":"10.1145\/1187415.1187418"},{"key":"S1351324910000057_ref70","first-page":"1293","volume-title":"Proceedings of the 25th Annual Meeting of Cognitive Science Society (CogSci 2003)","author":"Yu","year":"2003"},{"key":"S1351324910000057_ref18","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511803864"},{"key":"S1351324910000057_ref11","volume-title":"The Logical Structure of Linguistic Theory","author":"Chomsky","year":"1975"},{"key":"S1351324910000057_ref34","first-page":"3","volume-title":"Proceedings of ICANN-95, International Conference on Artificial Neural Networks","author":"Honkela","year":"1995"},{"key":"S1351324910000057_ref30","volume-title":"Neural Networks. A Comprehensive Foundation","author":"Haykin","year":"1999"},{"key":"S1351324910000057_ref19","doi-asserted-by":"publisher","DOI":"10.1093\/0199246297.001.0001"},{"key":"S1351324910000057_ref23","doi-asserted-by":"publisher","DOI":"10.1016\/B978-0-444-89488-5.50115-9"},{"key":"S1351324910000057_ref7","doi-asserted-by":"publisher","DOI":"10.3115\/1075527.1075553"},{"key":"S1351324910000057_ref28","first-page":"320","volume-title":"Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL'06)","author":"Haghighi","year":"2006"},{"key":"S1351324910000057_ref24","doi-asserted-by":"publisher","DOI":"10.1080\/01638539809545029"},{"key":"S1351324910000057_ref13","first-page":"22","article-title":"Word association norms, mutual information and lexicography","volume":"16","author":"Church","year":"1990","journal-title":"Computational Linguistics"},{"key":"S1351324910000057_ref25","volume-title":"Brown Corpus Manual: Manual of Information to Accompany a Standard Corpus of Present Day Edited American English","author":"Francis","year":"1964"},{"key":"S1351324910000057_ref26","doi-asserted-by":"publisher","DOI":"10.1145\/32206.32212"},{"key":"S1351324910000057_ref27","first-page":"881","volume-title":"ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics","author":"Haghighi","year":"2006"},{"key":"S1351324910000057_ref57","first-page":"1300","volume-title":"Proceedings of the 30th Annual Conference of the Cognitive Science Society, CogSci'08","author":"Sahlgren","year":"2008"},{"key":"S1351324910000057_ref29","first-page":"148","volume-title":"Proceedings of AKRR'05, International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning","author":"Hansen","year":"2005"},{"key":"S1351324910000057_ref32","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2005.07.002"},{"key":"S1351324910000057_ref62","first-page":"152","volume-title":"Logic, Semantics and Metamathematics","author":"Tarski","year":"1983"},{"key":"S1351324910000057_ref33","first-page":"129","volume-title":"Proceedings of NCPW9, Neural Computation and Psychology Workshop","author":"Honkela","year":"2005"},{"key":"S1351324910000057_ref55","doi-asserted-by":"publisher","DOI":"10.1007\/BF00203171"},{"key":"S1351324910000057_ref36","unstructured":"Hurri J. , G\u00e4vert H. , S\u00e4rel\u00e4 J. , and Hyv\u00e4rinen A. 2002. FastICA software package. Technical report, Laboratory of Computer and Information Science, Helsinki University of Technology, Espoo, Finland."},{"key":"S1351324910000057_ref39","doi-asserted-by":"publisher","DOI":"10.1002\/0471221317"},{"key":"S1351324910000057_ref40","first-page":"296","volume-title":"Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)","author":"Johnson","year":"2007"},{"key":"S1351324910000057_ref41","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.114.1.1"},{"key":"S1351324910000057_ref44","first-page":"868","volume-title":"Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)","author":"Koehn","year":"2007"},{"key":"S1351324910000057_ref45","first-page":"229","volume-title":"Advances in Independent Component Analysis","author":"Kolenda","year":"2000"},{"key":"S1351324910000057_ref46","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.104.2.211"},{"key":"S1351324910000057_ref48","first-page":"603","volume-title":"Proceedings of the 18th Annual Conference of the Cognitive Science Society","author":"Lund","year":"1996"},{"key":"S1351324910000057_ref68","first-page":"219","volume-title":"Proceedings of the IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'09)","author":"Wang","year":"2005"},{"key":"S1351324910000057_ref49","volume-title":"Foundations Of Statistical Natural Language Processing","author":"Manning","year":"1999"},{"key":"S1351324910000057_ref54","first-page":"1","volume-title":"Proceedings of the Joint IAPR International Workshops, SSPR 2004 and SPR 2004","author":"Oja","year":"2004"},{"key":"S1351324910000057_ref51","first-page":"155","article-title":"Tagging English text with a probabilistic model","volume":"20","author":"Merialdo","year":"1994","journal-title":"Computational Linguistics"},{"key":"S1351324910000057_ref16","doi-asserted-by":"publisher","DOI":"10.1016\/0165-1684(94)90029-9"},{"key":"S1351324910000057_ref52","first-page":"187","volume-title":"Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text","author":"Niu","year":"2004"},{"key":"S1351324910000057_ref56","unstructured":"Sahlgren M. 2006. The Word-Space Model: Using Distributional Analysis to Represent Syntagmatic and Paradigmatic Relations Between Words in High-Dimensional Vector Spaces. PhD thesis, Computational Linguistics, Stockholm University."},{"key":"S1351324910000057_ref61","first-page":"1","volume-title":"Proceedings of HLT'05: Conference on Human Language Technology and Empirical Methods in Natural Language Processing","author":"Steinberger","year":"2005"},{"key":"S1351324910000057_ref64","first-page":"135","volume-title":"Proceedings of AKRR'05, International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning","author":"V\u00e4yrynen","year":"2005"},{"key":"S1351324910000057_ref58","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"S1351324910000057_ref65","first-page":"300","volume-title":"Proceedings of NORSIG 2004, the 6th Nordic Signal Processing Symposium","author":"V\u00e4yrynen","year":"2004"},{"key":"S1351324910000057_ref66","first-page":"101","volume-title":"Proceedings of SCAI'06, Scandinavian Conference on Artificial Intelligence","author":"V\u00e4yrynen","year":"2006"},{"key":"S1351324910000057_ref67","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.1074"},{"key":"S1351324910000057_ref60","first-page":"141","volume-title":"Proceedings of the 7th Conference on European ACL","author":"Sch\u00fctze","year":"1995"},{"key":"S1351324910000057_ref3","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1145\/564376.564444","volume-title":"Proceedings of the 25th ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Bingham","year":"2002"},{"key":"S1351324910000057_ref42","doi-asserted-by":"publisher","DOI":"10.1016\/0165-1684(91)90079-X"},{"key":"S1351324910000057_ref35","doi-asserted-by":"publisher","DOI":"10.3765\/bls.v13i0.1834"},{"key":"S1351324910000057_ref6","doi-asserted-by":"publisher","DOI":"10.3115\/974147.974178"},{"key":"S1351324910000057_ref22","first-page":"1","volume-title":"Universals in Linguistic Theory","author":"Fillmore","year":"1968"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324910000057","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,10,29]],"date-time":"2021-10-29T13:52:07Z","timestamp":1635515527000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324910000057\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,6,15]]},"references-count":70,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2010,7]]}},"alternative-id":["S1351324910000057"],"URL":"https:\/\/doi.org\/10.1017\/s1351324910000057","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,6,15]]}}}