{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T16:08:07Z","timestamp":1780675687863,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":59,"publisher":"ACM","funder":[{"name":"National Natural Science Foundation of China","award":["72293575"],"award-info":[{"award-number":["72293575"]}]},{"name":"National Natural Science Foundation of China","award":["62206287"],"award-info":[{"award-number":["62206287"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2026,4,13]]},"DOI":"10.1145\/3774904.3792262","type":"proceedings-article","created":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T21:54:34Z","timestamp":1775771674000},"page":"7068-7079","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Hermes the Polyglot: A Unified Framework to Enhance Expressiveness for Multimodal Interlingual Subtitling"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-7487-7916","authenticated-orcid":false,"given":"Chaoqun","family":"Cui","sequence":"first","affiliation":[{"name":"MAIS, Institute of Automation, Chinese Academy of Sciences, Beijing, China and School of AI, University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4937-8154","authenticated-orcid":false,"given":"Shijing","family":"Wang","sequence":"additional","affiliation":[{"name":"Beijing Jiaotong University, Beijing, China and Hujing Digital Media & Entertainment Group, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-0336-7716","authenticated-orcid":false,"given":"Liangbin","family":"Huang","sequence":"additional","affiliation":[{"name":"Hujing Digital Media & 
Entertainment Group, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-6872-8910","authenticated-orcid":false,"given":"Qingqing","family":"Gu","sequence":"additional","affiliation":[{"name":"Geely AI Lab, Ningbo, Zhejiang, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-5005-2601","authenticated-orcid":false,"given":"Zhaolong","family":"Huang","sequence":"additional","affiliation":[{"name":"Hujing Digital Media & Entertainment Group, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-0635-1358","authenticated-orcid":false,"given":"Xiao","family":"Zeng","sequence":"additional","affiliation":[{"name":"Hujing Digital Media & Entertainment Group, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2323-5091","authenticated-orcid":false,"given":"Wenji","family":"Mao","sequence":"additional","affiliation":[{"name":"MAIS, Institute of Automation, Chinese Academy of Sciences, Beijing, China and School of AI, University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2026,4,12]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5539\/ijel.v6n3p185"},{"key":"e_1_3_2_1_2_1","volume-title":"Golden-Retriever: High-Fidelity Agentic Retrieval Augmented Generation for Industrial Knowledge Base. arXiv preprint arXiv:2408.00798","author":"An Zhiyu","year":"2024","unstructured":"Zhiyu An, Xianzhong Ding, Yen-Chun Fu, Cheng-Chung Chu, Yan Li, and Wan Du. 2024. Golden-Retriever: High-Fidelity Agentic Retrieval Augmented Generation for Industrial Knowledge Base. 
arXiv preprint arXiv:2408.00798 (2024)."},{"key":"e_1_3_2_1_3_1","volume-title":"International conference on learning representations.","author":"Andrychowicz Marcin","year":"2021","unstructured":"Marcin Andrychowicz, Anton Raichuk, Piotr Sta\u0144czyk, Manu Orsini, Sertan Girgin, Rapha\u00ebl Marinier, Leonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, et al., 2021. What matters for on-policy deep actor-critic methods? a large-scale study. In International conference on learning representations."},{"key":"e_1_3_2_1_4_1","volume-title":"International Conference on Artificial Intelligence and Statistics. PMLR, 4447-4455","author":"Azar Mohammad Gheshlaghi","year":"2024","unstructured":"Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Remi Munos, Mark Rowland, Michal Valko, and Daniele Calandriello. 2024. A general theoretical paradigm to understand learning from human preferences. In International Conference on Artificial Intelligence and Statistics. PMLR, 4447-4455."},{"key":"e_1_3_2_1_5_1","first-page":"171","article-title":"Applying Large Language Models in Legal Translation: The State-of-the-Art","volume":"13","author":"Baj\u010di\u0107 Martina","year":"2024","unstructured":"Martina Baj\u010di\u0107 and Dejana Golenko. 2024. Applying Large Language Models in Legal Translation: The State-of-the-Art. International Journal of Language & Law (JLL), Vol. 13 (2024), 171-196.","journal-title":"International Journal of Language & Law (JLL)"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Susan Bassnett. 2013. Translation studies. routledge.","DOI":"10.4324\/9780203488232"},{"key":"e_1_3_2_1_7_1","volume-title":"Forty-first International Conference on Machine Learning.","author":"Chen Dongping","unstructured":"Dongping Chen, Ruoxi Chen, Shilin Zhang, Yaochen Wang, Yinuo Liu, Huichi Zhou, Qihui Zhang, Yao Wan, Pan Zhou, and Lichao Sun. [n.d.]. Mllm-as-a-judge: Assessing multimodal llm-as-a-judge with vision-language benchmark. 
In Forty-first International Conference on Machine Learning."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3626246.3653385"},{"key":"e_1_3_2_1_9_1","first-page":"3245","article-title":"ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency","volume":"2024","author":"Chen Yafeng","year":"2024","unstructured":"Yafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Qian Chen, Shiliang Zhang, and Junjie Li. 2024b. ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency. In Proc. Interspeech 2024. 3245-3249.","journal-title":"Proc. Interspeech"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2025.acl-long.977"},{"key":"e_1_3_2_1_11_1","volume-title":"Exploring speaker-related information in spoken language understanding for better speaker diarization. arXiv preprint arXiv:2305.12927","author":"Cheng Luyao","year":"2023","unstructured":"Luyao Cheng, Siqi Zheng, Zhang Qinglin, Hui Wang, Yafeng Chen, and Qian Chen. 2023. Exploring speaker-related information in spoken language understanding for better speaker diarization. arXiv preprint arXiv:2305.12927 (2023)."},{"key":"e_1_3_2_1_12_1","volume-title":"Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality. See https:\/\/vicuna.lmsys.org (accessed","author":"Chiang Wei-Lin","year":"2023","unstructured":"Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E Gonzalez, et al., 2023. Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality. See https:\/\/vicuna.lmsys.org (accessed 14 April 2023), Vol. 2, 3 (2023), 6."},{"key":"e_1_3_2_1_13_1","volume-title":"Who said that?: Audio-visual speaker diarisation of real-world meetings. arXiv preprint arXiv:1906.10042","author":"Chung Joon Son","year":"2019","unstructured":"Joon Son Chung, Bong-Jin Lee, and Icksang Han. 2019. 
Who said that?: Audio-visual speaker diarisation of real-world meetings. arXiv preprint arXiv:1906.10042 (2019)."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1057\/9780230234581"},{"key":"e_1_3_2_1_15_1","unstructured":"Marta R Costa-juss\u00e0 James Cross Onur \u00c7elebi Maha Elbayad Kenneth Heafield Kevin Heffernan Elahe Kalbassi Janice Lam Daniel Licht Jean Maillard et al. 2022. No language left behind: Scaling human-centered machine translation. arXiv preprint arXiv:2207.04672 (2022)."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2025.acl-long.227"},{"key":"e_1_3_2_1_17_1","volume-title":"Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO. In International Conference on Learning Representations.","author":"Engstrom Logan","year":"2020","unstructured":"Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, and Aleksander Madry. 2020. Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_18_1","volume-title":"From llm to nmt: Advancing low-resource machine translation with claude. arXiv preprint arXiv:2404.13813","author":"Enis Maxim","year":"2024","unstructured":"Maxim Enis and Mark Hopkins. 2024. From llm to nmt: Advancing low-resource machine translation with claude. arXiv preprint arXiv:2404.13813 (2024)."},{"key":"e_1_3_2_1_19_1","volume-title":"Kto: Model alignment as prospect theoretic optimization. arXiv preprint arXiv:2402.01306","author":"Ethayarajh Kawin","year":"2024","unstructured":"Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, and Douwe Kiela. 2024. Kto: Model alignment as prospect theoretic optimization. 
arXiv preprint arXiv:2402.01306 (2024)."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.iwslt-1.31"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.emnlp-main.860"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.wmt-1.100"},{"key":"e_1_3_2_1_23_1","volume-title":"End-to-end neural speaker diarization with permutation-free objectives. arXiv preprint arXiv:1909.05952","author":"Fujita Yusuke","year":"2019","unstructured":"Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe. 2019. End-to-end neural speaker diarization with permutation-free objectives. arXiv preprint arXiv:1909.05952 (2019)."},{"key":"e_1_3_2_1_24_1","unstructured":"Google LLC. 2023. Google Translate. https:\/\/translate.google.com Accessed: 2025-05-15."},{"key":"e_1_3_2_1_25_1","unstructured":"Jiawei Gu Xuhui Jiang Zhichao Shi Hexiang Tan Xuehao Zhai Chengjin Xu Wei Li Yinghan Shen Shengjie Ma Honghao Liu et al. 2024. A survey on llm-as-a-judge. arXiv preprint arXiv:2411.15594 (2024)."},{"key":"e_1_3_2_1_26_1","unstructured":"Kaiyu Huang Fengran Mo Xinyu Zhang Hongliang Li You Li Yuanchi Zhang Weijian Yi Yulong Mao Jinchen Liu Yuzhuang Xu et al. 2024. A survey on large language models with multilingualism: Recent advances and new frontiers. arXiv preprint arXiv:2405.10936 (2024)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00594"},{"key":"e_1_3_2_1_28_1","volume-title":"Belle: Be everyone's large language model engine.","author":"Ji Yunjie","year":"2023","unstructured":"Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Baochang Ma, and Xiangang Li. 2023. Belle: Be everyone's large language model engine."},{"key":"e_1_3_2_1_29_1","volume-title":"Critique of pure reason. 1781. Modern Classical Philosophers","author":"Kant Immanuel","year":"1908","unstructured":"Immanuel Kant. 1908. Critique of pure reason. 1781. 
Modern Classical Philosophers, Cambridge, MA: Houghton Mifflin (1908), 370-456."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.2307\/jj.18254729.16"},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the 24th Annual Conference of the European Association for Machine Translation. 193-203","author":"Kocmi Tom","year":"2023","unstructured":"Tom Kocmi and Christian Federmann. 2023. Large Language Models Are State-of-the-Art Evaluators of Translation Quality. In Proceedings of the 24th Annual Conference of the European Association for Machine Translation. 193-203."},{"key":"e_1_3_2_1_32_1","volume-title":"Advances in Neural Information Processing Systems","volume":"36","author":"Kudugunta Sneha","year":"2024","unstructured":"Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, and Orhan Firat. 2024. Madlad-400: A multilingual and document-level large audited dataset. Advances in Neural Information Processing Systems, Vol. 36 (2024)."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2021.101254"},{"key":"e_1_3_2_1_34_1","volume-title":"Generative Judge for Evaluating Alignment. In The Twelfth International Conference on Learning Representations.","author":"Li Junlong","unstructured":"Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Pengfei Liu, et al., [n.d.]. Generative Judge for Evaluating Alignment. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_3_2_1_35_1","unstructured":"Chen Ling Xujiang Zhao Jiaying Lu Chengyuan Deng Can Zheng Junxiang Wang Tanmoy Chowdhury Yun Li Hejie Cui Xuchao Zhang et al. 2023. Domain specialization as the key to make large language models disruptive: A comprehensive survey. 
arXiv preprint arXiv:2305.18703 (2023)."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2019.2937185"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-012-0602-z"},{"key":"e_1_3_2_1_38_1","volume-title":"Simpo: Simple preference optimization with a reference-free reward. arXiv preprint arXiv:2405.14734","author":"Meng Yu","year":"2024","unstructured":"Yu Meng, Mengzhou Xia, and Danqi Chen. 2024. Simpo: Simple preference optimization with a reference-free reward. arXiv preprint arXiv:2405.14734 (2024)."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.findings-naacl.250"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Long Ouyang Jeffrey Wu Xu Jiang Diogo Almeida Carroll Wainwright Pamela Mishkin Chong Zhang Sandhini Agarwal Katarina Slama Alex Ray et al. 2022. Training language models to follow instructions with human feedback. Advances in neural information processing systems Vol. 35 (2022) 27730-27744.","DOI":"10.52202\/068431-2011"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00730"},{"key":"e_1_3_2_1_42_1","volume-title":"Advances in Neural Information Processing Systems","volume":"36","author":"Rafailov Rafael","year":"2024","unstructured":"Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. 2024. Direct preference optimization: Your language model is secretly a reward model. Advances in Neural Information Processing Systems, Vol. 36 (2024)."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.31235\/osf.io\/j4zh7"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.9744\/katakita.9.2.144-149"},{"key":"e_1_3_2_1_45_1","unstructured":"Susan Sarcevic et al. 1997. New approach to legal translation. 
Kluwer Law International BV."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT.2014.7078610"},{"key":"e_1_3_2_1_47_1","volume-title":"Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300","author":"Shao Zhihong","year":"2024","unstructured":"Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, YK Li, Y Wu, et al., 2024. Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300 (2024)."},{"key":"e_1_3_2_1_48_1","first-page":"3008","article-title":"Learning to summarize with human feedback","volume":"33","author":"Stiennon Nisan","year":"2020","unstructured":"Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F Christiano. 2020. Learning to summarize with human feedback. Advances in Neural Information Processing Systems, Vol. 33 (2020), 3008-3021.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3474085.3475587"},{"key":"e_1_3_2_1_50_1","volume-title":"A tutorial on spectral clustering. Statistics and computing","author":"Luxburg Ulrike Von","year":"2007","unstructured":"Ulrike Von Luxburg. 2007. A tutorial on spectral clustering. Statistics and computing, Vol. 17, 4 (2007), 395-416."},{"key":"e_1_3_2_1_51_1","volume-title":"Maryam Fazel-Zarandi, Jason Weston, and Xian Li.","author":"Wang Tianlu","year":"2024","unstructured":"Tianlu Wang, Ilia Kulikov, Olga Golovneva, Ping Yu, Weizhe Yuan, Jane Dwivedi-Yu, Richard Yuanzhe Pang, Maryam Fazel-Zarandi, Jason Weston, and Xian Li. 2024. Self-taught evaluators. 
arXiv preprint arXiv:2408.02666 (2024)."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v37i11.26613"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3548027"},{"key":"e_1_3_2_1_54_1","unstructured":"An Yang Baosong Yang Beichen Zhang Binyuan Hui Bo Zheng Bowen Yu Chengyuan Li Dayiheng Liu Fei Huang Haoran Wei et al. 2024. Qwen2.5 technical report. arXiv preprint arXiv:2412.15115 (2024)."},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2008.2007344"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-020-00734-4"},{"key":"e_1_3_2_1_57_1","volume-title":"How good are llms for literary translation, really? literary translation evaluation with humans and llms. arXiv preprint arXiv:2410.18697","author":"Zhang Ran","year":"2024","unstructured":"Ran Zhang, Wei Zhao, and Steffen Eger. 2024. How good are llms for literary translation, really? literary translation evaluation with humans and llms. arXiv preprint arXiv:2410.18697 (2024)."},{"key":"e_1_3_2_1_58_1","first-page":"46595","article-title":"Judging llm-as-a-judge with mt-bench and chatbot arena","volume":"36","author":"Zheng Lianmin","year":"2023","unstructured":"Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, et al., 2023a. Judging llm-as-a-judge with mt-bench and chatbot arena. Advances in Neural Information Processing Systems, Vol. 36 (2023), 46595-46623.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_59_1","unstructured":"Rui Zheng Shihan Dou Songyang Gao Yuan Hua Wei Shen Binghai Wang Yan Liu Senjie Jin Qin Liu Yuhao Zhou et al. 2023b. Secrets of rlhf in large language models part i: Ppo. 
arXiv preprint arXiv:2307.04964 (2023)."}],"event":{"name":"WWW '26: The ACM Web Conference 2026","location":"Dubai United Arab Emirates","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"]},"container-title":["Proceedings of the ACM Web Conference 2026"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3774904.3792262","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T15:53:40Z","timestamp":1780674820000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3774904.3792262"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,12]]},"references-count":59,"alternative-id":["10.1145\/3774904.3792262","10.1145\/3774904"],"URL":"https:\/\/doi.org\/10.1145\/3774904.3792262","relation":{},"subject":[],"published":{"date-parts":[[2026,4,12]]},"assertion":[{"value":"2026-04-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}