{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:25:24Z","timestamp":1750307124665,"version":"3.41.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2011,10,1]],"date-time":"2011-10-01T00:00:00Z","timestamp":1317427200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2011,10]]},"abstract":"<jats:p>Statistical bilingual word alignment has been well studied in the field of machine translation. This article adapts the bilingual word alignment algorithm into a monolingual scenario to extract collocations from monolingual corpus, based on the fact that the words in a collocation tend to co-occur in similar contexts as in bilingual word alignment. First, the monolingual corpus is replicated to generate a parallel corpus, in which each sentence pair consists of two identical sentences. Next, the monolingual word alignment algorithm is employed to align potentially collocated words. Finally, the aligned word pairs are ranked according to the alignment scores and candidates with higher scores are extracted as collocations. We conducted experiments on Chinese and English corpora respectively. Compared to previous approaches that use association measures to extract collocations from co-occurrence word pairs within a given window, our method achieves higher precision and recall. According to human evaluation, our method achieves precisions of 62% on a Chinese corpus and 64% on an English corpus. 
In particular, we can extract collocations with longer spans, achieving a higher precision of 83% on the long-span (&gt; 6 words) Chinese collocations.<\/jats:p>","DOI":"10.1145\/2036264.2036280","type":"journal-article","created":{"date-parts":[[2012,10,12]],"date-time":"2012-10-12T20:56:02Z","timestamp":1350075362000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Two-Word Collocation Extraction Using Monolingual Word Alignment Method"],"prefix":"10.1145","volume":"3","author":[{"given":"Zhanyi","family":"Liu","sequence":"first","affiliation":[{"name":"Harbin Institute of Technology Baidu"}]},{"given":"Haifeng","family":"Wang","sequence":"additional","affiliation":[{"name":"Baidu"}]},{"given":"Hua","family":"Wu","sequence":"additional","affiliation":[{"name":"Baidu"}]},{"given":"Sheng","family":"Li","sequence":"additional","affiliation":[{"name":"Harbin Institute of Technology"}]}],"member":"320","published-online":{"date-parts":[[2011,10]]},"reference":[{"volume-title":"Johns Hopkins University Workshop.","author":"Al-Onaizan Y.","key":"e_1_2_1_1_1"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.3115\/991886.991965"},{"volume-title":"Proceedings of the ACL Workshop on Collocation: Computational Extraction. 54--60","author":"Blaheta D.","key":"e_1_2_1_3_1"},{"volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. 468--477","author":"Boukobza R.","key":"e_1_2_1_4_1"},{"volume-title":"Proceedings of the Workshop on Very Large Corpora: Academic and Industrial Perspectives. 
74--83","year":"1993","author":"Breidt E.","key":"e_1_2_1_5_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/972470.972474"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-009-9097-9"},{"key":"e_1_2_1_8_1","first-page":"34","article-title":"Automatic retrieval of frequent idiomatic and collocational expressions in a large corpus","volume":"4","author":"Choueka Y.","year":"1983","journal-title":"J. Liter. Linguist. Comput."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Church K. Gale W. Hanks P. and Hindle D. 1991. Using statistics in lexical analysis. In Lexical Acquisition: Using On-line Resources to Build a Lexicon U. Zernik Ed. Lawrence Erlbaum 115--164.","DOI":"10.4324\/9781315785387-8"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/972450.972452"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/972450.972454"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220355.1220491"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.3115\/1072228.1072261"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0031619"},{"volume-title":"Memory of J","author":"Halliday M. A. 
K.","key":"e_1_2_1_16_1"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/243199.243212"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324900000048"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.2307\/2529310"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220175.1220301"},{"key":"e_1_2_1_21_1","first-page":"123","article-title":"Similarity based chinese synonym collocation extraction","volume":"10","author":"Li W. Y.","year":"2005","journal-title":"Computat. Linguist. Chinese Lang. Process."},{"volume-title":"Proceedings of the 1st Workshop on Computational Terminology. 57--63","year":"1998","author":"Lin D. K.","key":"e_1_2_1_22_1"},{"volume-title":"A Large Dictionary of Chinese-English Collocations","year":"2004","author":"Lin W. C.","key":"e_1_2_1_23_1"},{"volume-title":"Proceedings of the IEEE International Conference on Natural Language Processing and Knowledge Engineering. 333--338","author":"Lu Q.","key":"e_1_2_1_24_1"},{"key":"e_1_2_1_25_1","unstructured":"Manning C. D. and Sch\u00fctze H. 1999. Foundations of Statistical Natural Language Processing. Bradford Book & MIT Press."},{"key":"e_1_2_1_26_1","unstructured":"Mckeown K. R. and Radev D. R. 2000. Collocations. In A Handbook of Natural Language Processing R. Dale H. Moisl and H. Somers Eds. Marcel Dekker New York 507--523."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1093\/ijl\/13.3.151"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075218.1075274"},{"volume-title":"Proceedings of NAACL Workshop on Wordnet and Other Lexical Resources: Applications, Extensions and Customizations. 
41--46","year":"2001","author":"Pearce D.","key":"e_1_2_1_29_1"},{"volume-title":"Proceedings of the COLING\/ACL Main Conference Poster Sessions. 651--658","author":"Pecina P.","key":"e_1_2_1_30_1"},{"volume-title":"Proceedings of the Workshop on Multi-Word-Expressions in a Multilingual Context. 17--24","author":"Piao S.","key":"e_1_2_1_31_1"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-12320-7_9"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220175.1220295"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/972450.972458"},{"key":"e_1_2_1_36_1","first-page":"29","article-title":"A Preliminary Study on the quantitative analysis on chinese collocations (In Chinese)","volume":"1","author":"Sun M. S.","year":"1997","journal-title":"Chinese Linguist."},{"volume-title":"Proceedings of the Chinese Information Processing International Conference. 230--236","year":"1998","author":"Sun H. L.","key":"e_1_2_1_37_1"},{"volume-title":"Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties. 12--19","author":"Villada M.","key":"e_1_2_1_38_1"},{"key":"e_1_2_1_39_1","first-page":"1","article-title":"Building a chinese collocation bank","volume":"22","author":"Xu R. F.","year":"2009","journal-title":"Int. J. Comput. Process. Lang."},{"volume-title":"Proceedings of the 7th Conference on Computational Lexicography and Corpus Research, F. Kiefer and J. Pajzs Eds. 73--81","year":"2003","author":"Walde S.","key":"e_1_2_1_40_1"},{"volume-title":"Proceedings of the Multiword Expressions Conference From Theory to Applications. 
28--36","author":"Wehrli E.","key":"e_1_2_1_41_1"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220355.1220496"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1075\/ijcl.7.1.03wil"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075096.1075112"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.3115\/981658.981684"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2036264.2036280","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2036264.2036280","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:48:29Z","timestamp":1750240109000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2036264.2036280"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10]]},"references-count":44,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2011,10]]}},"alternative-id":["10.1145\/2036264.2036280"],"URL":"https:\/\/doi.org\/10.1145\/2036264.2036280","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"type":"print","value":"2157-6904"},{"type":"electronic","value":"2157-6912"}],"subject":[],"published":{"date-parts":[[2011,10]]},"assertion":[{"value":"2010-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-10-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}