{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T14:39:56Z","timestamp":1777559996905,"version":"3.51.4"},"reference-count":41,"publisher":"SAGE Publications","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AIC"],"published-print":{"date-parts":[[2022,5,10]]},"abstract":"<jats:p>Recently, the emergence of the digital language division and the availability of cross-lingual benchmarks make researches of cross-lingual texts more popular. However, the performance of existing methods based on mapping relation are not good enough, because sometimes the structures of language spaces are not isomorphic. Besides, polysemy makes the extraction of interaction features hard. For cross-lingual word embedding, a model named Cross-lingual Word Embedding Space Based on Pseudo Corpus (CWE-PC) is proposed to obtain cross-lingual and multilingual word embedding. For cross-lingual sentence pair interaction feature capture, a Cross-language Feature Capture Based on Similarity Matrix (CFC-SM) model is built to extract cross-lingual interaction features. ELMo pretrained model and multiple layer convolution are used to alleviate polysemy and extract interaction features. 
These models are evaluated on multiple language pairs and results show that they outperform the state-of-the-art cross-lingual word embedding methods.<\/jats:p>","DOI":"10.3233\/aic-210085","type":"journal-article","created":{"date-parts":[[2022,4,15]],"date-time":"2022-04-15T10:48:24Z","timestamp":1650019704000},"page":"1-14","source":"Crossref","is-referenced-by-count":3,"title":["A cross-lingual sentence pair interaction feature capture model based on pseudo-corpus and multilingual embedding"],"prefix":"10.1177","volume":"35","author":[{"given":"Gang","family":"Liu","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, China"},{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yichao","family":"Dong","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kai","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhizheng","family":"Yan","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","reference":[{"key":"10.3233\/AIC-210085_ref1","doi-asserted-by":"crossref","unstructured":"M.\u00a0Artetxe, G.\u00a0Labaka and E.\u00a0Agirre, Learning bilingual word embeddings with (almost) no bilingual data, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017, pp.\u00a01017\u20131042.","DOI":"10.18653\/v1\/P17-1042"},{"key":"10.3233\/AIC-210085_ref2","doi-asserted-by":"crossref","unstructured":"M.\u00a0Artetxe, G.\u00a0Labaka and 
E.\u00a0Agirre, A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018, pp.\u00a0789\u2013798.","DOI":"10.18653\/v1\/P18-1073"},{"key":"10.3233\/AIC-210085_ref3","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-4605"},{"key":"10.3233\/AIC-210085_ref4","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1162\/tacl_a_00320","article-title":"Hierarchical mapping for crosslingual word embedding alignment","volume":"8","author":"Azpiazu","year":"2020","journal-title":"Transactions of the Association for Computational Linguistics"},{"issue":"15","key":"10.3233\/AIC-210085_ref5","first-page":"104","article-title":"Linear transformations for cross-lingual semantic textual similarity","volume":"187","author":"Brychcin","year":"2020","journal-title":"Knowledge-Based Systems"},{"issue":"15","key":"10.3233\/AIC-210085_ref6","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1016\/j.artint.2016.07.005","article-title":"Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities","volume":"240","author":"Camacho-Collados","year":"2016","journal-title":"Artificial Intelligence"},{"issue":"1","key":"10.3233\/AIC-210085_ref7","doi-asserted-by":"crossref","first-page":"1887","DOI":"10.1007\/s10462-020-09895-6","article-title":"On the evaluation and combination of state-of-the-art features in Twitter sentiment analysis","volume":"54","author":"Carvalho","year":"2021","journal-title":"Artificial Intelligence Review"},{"key":"10.3233\/AIC-210085_ref8","unstructured":"W.\u00a0Che, Y.\u00a0Liu, Y.\u00a0Wang, B.\u00a0Zheng and T.\u00a0Liu, Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation, in: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2018, 
pp.\u00a055\u201364."},{"issue":"10","key":"10.3233\/AIC-210085_ref9","first-page":"116","article-title":"Adversarial deep averaging networks for cross-lingual sentiment classification","volume":"6","author":"Chen","year":"2018","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"10.3233\/AIC-210085_ref10","unstructured":"A.\u00a0Conneau, G.\u00a0Lample, M.\u00a0Ranzato and L.\u00a0Denoyer, Word translation without parallel data, in: Proceedings of International Conference on Learning Representations, 2017, pp.\u00a0430\u2013439."},{"issue":"48","key":"10.3233\/AIC-210085_ref11","doi-asserted-by":"publisher","first-page":"1888","DOI":"10.1007\/s10489-020-01922-x","article-title":"Cluster-based information retrieval using pattern mining","volume":"51","author":"Djenouri","year":"2021","journal-title":"Applied Intelligence"},{"key":"10.3233\/AIC-210085_ref12","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1027"},{"key":"10.3233\/AIC-210085_ref13","unstructured":"L.\u00a0Duong, H.\u00a0Kanayama, T.\u00a0Ma and S.\u00a0Bird, Learning cross-lingual word embeddings with bilingual corpora, in: Proceedings of the 2019 Conference of the North, 2018, pp.\u00a0156\u2013163."},{"key":"10.3233\/AIC-210085_ref14","doi-asserted-by":"crossref","unstructured":"J.\u00a0Ferrero, F.\u00a0Agnes, L.\u00a0Besacier and D.\u00a0Schwab, Using word embedding for cross-language plagiarism detection, in: Proceedings of Conference of the European Chapter of the Association for Computational Linguistics, 2017, pp.\u00a0146\u2013154.","DOI":"10.18653\/v1\/E17-2066"},{"key":"10.3233\/AIC-210085_ref15","unstructured":"S.\u00a0Gouws, Y.\u00a0Bengio and G.\u00a0Corrado, Bilbowa: Fast bilingual distributed representations without word alignments, in: Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015, pp.\u00a01160\u20131166."},{"key":"10.3233\/AIC-210085_ref16","doi-asserted-by":"crossref","unstructured":"J.\u00a0Grover and P.\u00a0Mitra, 
Bilingual word embeddings with bucketed CNN for parallel sentence extraction, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017, pp.\u00a011\u201316.","DOI":"10.18653\/v1\/P17-3003"},{"issue":"8","key":"10.3233\/AIC-210085_ref17","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1007\/s10590-020-09257-7","article-title":"Cross-lingual embedding for cross-lingual question retrieval in low-resource community question answering","volume":"34","author":"HajiAminShirazi","year":"2020","journal-title":"Machine Translation"},{"key":"10.3233\/AIC-210085_ref18","doi-asserted-by":"crossref","unstructured":"K.\u00a0Hermann and P.\u00a0Blunsom, Multilingual models for compositional distributional semantics, in: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014, pp.\u00a058\u201368.","DOI":"10.3115\/v1\/P14-1006"},{"key":"10.3233\/AIC-210085_ref19","doi-asserted-by":"crossref","unstructured":"B.\u00a0Li, H.\u00a0Zhou, J.\u00a0He, M.\u00a0Wang, Y.\u00a0Yang and L.\u00a0Li, On the sentence embeddings from pre-trained language models, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020, pp.\u00a09119\u20139130.","DOI":"10.18653\/v1\/2020.emnlp-main.733"},{"issue":"5","key":"10.3233\/AIC-210085_ref20","first-page":"157","article-title":"Enrich cross-lingual entity links for online wikis via multi-modal semantic matching","volume":"57","author":"Liu","year":"2020","journal-title":"Information Processing & Management"},{"key":"10.3233\/AIC-210085_ref21","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W15-1521"},{"key":"10.3233\/AIC-210085_ref22","unstructured":"T.\u00a0Mikolov, K.\u00a0Chen, G.\u00a0Corrado and J.\u00a0Dean, Efficient estimation of word representations in vector space, in: Proceedings of the First International Conference on Learning Representations, 2013, 
pp.\u00a0127\u2013139."},{"issue":"3","key":"10.3233\/AIC-210085_ref23","first-page":"71","article-title":"Exploiting similarities among languages for machine translation","volume":"16","author":"Mikolov","year":"2013","journal-title":"Computer Science"},{"issue":"15","key":"10.3233\/AIC-210085_ref24","first-page":"117","article-title":"Document plagiarism detection using a new concept similarity in formal concept analysis","volume":"78","author":"Muangprathub","year":"2021","journal-title":"Journal of Applied Mathematics"},{"key":"10.3233\/AIC-210085_ref25","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-1303"},{"key":"10.3233\/AIC-210085_ref26","doi-asserted-by":"crossref","unstructured":"L.\u00a0Nguyen and D.\u00a0Dien, Vietnamese\u2013English cross-lingual paraphrase identification using Siamese recurrent architectures, in: Proceedings of International Symposium on Communications and Information Technologies, 2019, pp.\u00a070\u201375.","DOI":"10.1109\/ISCIT.2019.8905116"},{"key":"10.3233\/AIC-210085_ref27","doi-asserted-by":"crossref","unstructured":"A.\u00a0Ormazabal, M.\u00a0Artetxe, G.\u00a0Labaka, A.\u00a0Soroa and E.\u00a0Agirre, Analyzing the limitations of cross-lingual word embedding mappings, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2020, pp.\u00a04990\u20134995.","DOI":"10.18653\/v1\/P19-1492"},{"key":"10.3233\/AIC-210085_ref28","doi-asserted-by":"crossref","unstructured":"L.\u00a0Pang, Y.\u00a0Lan, J.\u00a0Guo and J.\u00a0Xu, Text matching as image recognition, in: Thirtieth AAAI Conference on Artificial Intelligence, 2016, pp.\u00a01128\u20131135.","DOI":"10.1609\/aaai.v30i1.10341"},{"key":"10.3233\/AIC-210085_ref29","doi-asserted-by":"crossref","unstructured":"B.\u00a0Patra, J.\u00a0Moniz, S.\u00a0Garg and M.\u00a0Gormley, Bilingual lexicon induction with semi-supervision in non-isometric embedding spaces, in: Proceedings of the 57th Annual Meeting of the Association for Computational 
Linguistics, 2020, pp.\u00a0184\u2013193.","DOI":"10.18653\/v1\/P19-1018"},{"key":"10.3233\/AIC-210085_ref30","doi-asserted-by":"crossref","unstructured":"M.\u00a0Peters, M.\u00a0Neumann, M.\u00a0Iyyer and M.\u00a0Gardner, Deep contextualized word representations, in: Proceedings of NAACL-HLT 2018, 2018, pp.\u00a02227\u20132237.","DOI":"10.18653\/v1\/N18-1202"},{"issue":"18","key":"10.3233\/AIC-210085_ref31","first-page":"174","article-title":"Document classification base on ensemble classifiers support vector machine, multi-layer perceptron and k-nearest neighbors","volume":"2","author":"Rad","year":"2019","journal-title":"Journal of Biochemistry Technology"},{"issue":"5","key":"10.3233\/AIC-210085_ref32","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1007\/s10791-020-09372-2","article-title":"An axiomatic approach to corpus-based cross-language information retrieval","volume":"23","author":"Rahimi","year":"2020","journal-title":"Information Retrieval Journal"},{"key":"10.3233\/AIC-210085_ref33","doi-asserted-by":"crossref","unstructured":"N.\u00a0Reimers and I.\u00a0Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019, pp.\u00a03982\u20133992.","DOI":"10.18653\/v1\/D19-1410"},{"issue":"8","key":"10.3233\/AIC-210085_ref34","doi-asserted-by":"publisher","first-page":"569","DOI":"10.1613\/jair.1.11640","article-title":"A survey of cross-lingual word embedding models","volume":"65","author":"Ruder","year":"2019","journal-title":"Journal of Artificial Intelligence Research"},{"issue":"4","key":"10.3233\/AIC-210085_ref35","first-page":"640","article-title":"Similarity calculation of Chinese Thai cross-language text based on WordNet","volume":"30","author":"Shi","year":"2016","journal-title":"Journal of Chinese Information 
Processing"},{"key":"10.3233\/AIC-210085_ref36","unstructured":"A.\u00a0Sogaard, S.\u00a0Ruder and I.\u00a0Vulic, On the limitations of unsupervised bilingual dictionary induction, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2020, pp.\u00a0778\u2013788."},{"issue":"04","key":"10.3233\/AIC-210085_ref37","first-page":"1082","article-title":"Research progress of language models based on deep learning","volume":"32","author":"Wang","year":"2018","journal-title":"Journal of Software"},{"issue":"3","key":"10.3233\/AIC-210085_ref38","doi-asserted-by":"publisher","first-page":"684","DOI":"10.1587\/transinf.2019EDP7157","article-title":"Neural machine translation with target-attention model","volume":"103","author":"Yang","year":"2020","journal-title":"IEICE Transactions on Information and Systems"},{"key":"10.3233\/AIC-210085_ref39","unstructured":"Z.\u00a0Yin and Y.\u00a0Shen, On the dimensionality of word embedding, in: Advances in Neural Information Processing Systems, 2019, pp.\u00a0648\u2013655."},{"key":"10.3233\/AIC-210085_ref40","doi-asserted-by":"crossref","unstructured":"M.\u00a0Zhang, Y.\u00a0Liu, H.\u00a0Luan and M.\u00a0Sun, Adversarial training for unsupervised bilingual lexicon induction, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017, pp.\u00a01959\u20131970.","DOI":"10.18653\/v1\/P17-1179"},{"key":"10.3233\/AIC-210085_ref41","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-68699-8_12"}],"container-title":["AI 
Communications"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/AIC-210085","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T18:28:00Z","timestamp":1777400880000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/AIC-210085"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,10]]},"references-count":41,"journal-issue":{"issue":"1"},"URL":"https:\/\/doi.org\/10.3233\/aic-210085","relation":{},"ISSN":["1875-8452","0921-7126"],"issn-type":[{"value":"1875-8452","type":"electronic"},{"value":"0921-7126","type":"print"}],"subject":[],"published":{"date-parts":[[2022,5,10]]}}}