{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T14:14:21Z","timestamp":1769004861562,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":26,"publisher":"ACM","license":[{"start":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T00:00:00Z","timestamp":1765324800000},"content-version":"vor","delay-in-days":59,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH (National Institutes of Health)","doi-asserted-by":"publisher","award":["P20GM103446"],"award-info":[{"award-number":["P20GM103446"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH (National Institutes of Health)","doi-asserted-by":"publisher","award":["U54GM104941"],"award-info":[{"award-number":["U54GM104941"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF (National Science Foundation)","doi-asserted-by":"publisher","award":["2443639"],"award-info":[{"award-number":["2443639"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,10,12]]},"DOI":"10.1145\/3765612.3767230","type":"proceedings-article","created":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T17:45:59Z","timestamp":1765388759000},"page":"1-6","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Reward Hacking Mitigation using Verifiable Composite Rewards"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-0403-6442","authenticated-orcid":false,"given":"Mirza Farhan","family":"Bin Tarek","sequence":"first","affiliation":[{"name":"University of Delaware, Newark, DE, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8912-3063","authenticated-orcid":false,"given":"Rahmatollah","family":"Beheshti","sequence":"additional","affiliation":[{"name":"University of Delaware, Newark, DE, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,12,10]]},"reference":[{"key":"e_1_3_2_2_1_1","unstructured":"Yuntao Bai Saurav Kadavath Sandipan Kundu Amanda Askell Jackson Kernion Andy Jones Anna Chen Anna Goldie Azalia Mirhoseini Cameron McKinnon et al. 2022. Constitutional ai: Harmlessness from ai feedback. arXiv preprint arXiv:2212.08073 (2022)."},{"key":"e_1_3_2_2_2_1","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020) 1877\u20131901."},{"key":"e_1_3_2_2_3_1","volume-title":"Deep reinforcement learning from human preferences. Advances in neural information processing systems 30","author":"Christiano Paul F","year":"2017","unstructured":"Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep reinforcement learning from human preferences. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_2_4_1","volume-title":"Faulty reward functions in the wild. Internet: https:\/\/blog.openai.com\/faulty-reward-functions","author":"Clark Jack","year":"2016","unstructured":"Jack Clark and Dario Amodei. 2016. Faulty reward functions in the wild. Internet: https:\/\/blog.openai.com\/faulty-reward-functions (2016)."},{"key":"e_1_3_2_2_5_1","volume-title":"Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997 2, 1","author":"Gao Yunfan","year":"2023","unstructured":"Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yixin Dai, Jiawei Sun, Haofen Wang, and Haofen Wang. 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997 2, 1 (2023)."},{"key":"e_1_3_2_2_6_1","unstructured":"Aaron Grattafiori Abhimanyu Dubey Abhinav Jauhri Abhinav Pandey Abhishek Kadian Ahmad Al-Dahle Aiesha Letman Akhil Mathur Alan Schelten Alex Vaughan et al. 2024. The llama 3 herd of models. arXiv preprint arXiv:2407.21783 (2024)."},{"key":"e_1_3_2_2_7_1","volume-title":"What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams. arXiv preprint arXiv:2009.13081","author":"Jin Di","year":"2020","unstructured":"Di Jin, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang, and Peter Szolovits. 2020. What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams. arXiv preprint arXiv:2009.13081 (2020)."},{"key":"e_1_3_2_2_8_1","volume-title":"Specification gaming: the flip side of AI ingenuity. DeepMind Blog 3","author":"Krakovna Victoria","year":"2020","unstructured":"Victoria Krakovna, Jonathan Uesato, Vladimir Mikulik, Matthew Rahtz, Tom Everitt, Ramana Kumar, Zac Kenton, Jan Leike, and Shane Legg. 2020. Specification gaming: the flip side of AI ingenuity. DeepMind Blog 3 (2020)."},{"key":"e_1_3_2_2_9_1","volume-title":"Alisa Liu, Nouha Dziri, Shane Lyu, et al.","author":"Lambert Nathan","year":"2024","unstructured":"Nathan Lambert, Jacob Morrison, Valentina Pyatkin, Shengyi Huang, Hamish Ivison, Faeze Brahman, Lester James V Miranda, Alisa Liu, Nouha Dziri, Shane Lyu, et al. 2024. T\\\" ulu 3: Pushing frontiers in open language model post-training. arXiv preprint arXiv:2411.15124 (2024)."},{"key":"e_1_3_2_2_10_1","volume-title":"Thomas Mesnard, Johan Ferret, Colton Bishop, Ethan Hall, Victor Carbune, and Abhinav Rastogi.","author":"Lee Harrison","year":"2023","unstructured":"Harrison Lee, Samrat Phatale, Hassan Mansoor, Kellie Ren Lu, Thomas Mesnard, Johan Ferret, Colton Bishop, Ethan Hall, Victor Carbune, and Abhinav Rastogi. 2023. Rlaif: Scaling reinforcement learning from human feedback with ai feedback. (2023)."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.emnlp-main.565"},{"key":"e_1_3_2_2_12_1","volume-title":"The Twelfth International Conference on Learning Representations.","author":"Lightman Hunter","year":"2023","unstructured":"Hunter Lightman, Vineet Kosaraju, Yuri Burda, Harrison Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, and Karl Cobbe. 2023. Let's verify step by step. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_3_2_2_13_1","volume-title":"G-eval: NLG evaluation using gpt-4 with better human alignment. arXiv preprint arXiv:2303.16634","author":"Liu Yang","year":"2023","unstructured":"Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023. G-eval: NLG evaluation using gpt-4 with better human alignment. arXiv preprint arXiv:2303.16634 (2023)."},{"key":"e_1_3_2_2_14_1","volume-title":"Compositional questions do not necessitate multi-hop reasoning. arXiv preprint arXiv:1906.02900","author":"Min Sewon","year":"2019","unstructured":"Sewon Min, Eric Wallace, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi, and Luke Zettlemoyer. 2019. Compositional questions do not necessitate multi-hop reasoning. arXiv preprint arXiv:1906.02900 (2019)."},{"key":"e_1_3_2_2_15_1","unstructured":"Long Ouyang Jeffrey Wu Xu Jiang Diogo Almeida Carroll Wainwright Pamela Mishkin Chong Zhang Sandhini Agarwal Katarina Slama Alex Ray et al. 2022. Training language models to follow instructions with human feedback. Advances in neural information processing systems 35 (2022) 27730\u201327744."},{"key":"e_1_3_2_2_16_1","volume-title":"Bias patterns in the application of LLMs for clinical decision support: A comprehensive study. arXiv preprint arXiv:2404.15149","author":"Poulain Raphael","year":"2024","unstructured":"Raphael Poulain, Hamed Fayyaz, and Rahmatollah Beheshti. 2024. Bias patterns in the application of LLMs for clinical decision support: A comprehensive study. arXiv preprint arXiv:2404.15149 (2024)."},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_2_2_18_1","unstructured":"Andrew Sellergren Sahar Kazemzadeh Tiam Jaroensri Atilla Kiraly Madeleine Traverse Timo Kohlberger Shawn Xu Fayaz Jamil C\u00edan Hughes Charles Lau et al. 2025. Medgemma technical report. arXiv preprint arXiv:2507.05201 (2025)."},{"key":"e_1_3_2_2_19_1","unstructured":"Mrinank Sharma Meg Tong Tomasz Korbak David Duvenaud Amanda Askell Samuel R Bowman Newton Cheng Esin Durmus Zac Hatfield-Dodds Scott R Johnston et al. 2023. Towards understanding sycophancy in language models. arXiv preprint arXiv:2310.13548 (2023)."},{"key":"e_1_3_2_2_20_1","volume-title":"A Risk-Aware Reinforcement Learning Reward for Financial Trading. arXiv preprint arXiv:2506.04358","author":"Srivastava Uditansh","year":"2025","unstructured":"Uditansh Srivastava, Shivam Aryan, and Shaurya Singh. 2025. A Risk-Aware Reinforcement Learning Reward for Financial Trading. arXiv preprint arXiv:2506.04358 (2025)."},{"key":"e_1_3_2_2_21_1","volume-title":"Policy gradient methods for reinforcement learning with function approximation. Advances in neural information processing systems 12","author":"Sutton Richard S","year":"1999","unstructured":"Richard S Sutton, David McAllester, Satinder Singh, and Yishay Mansour. 1999. Policy gradient methods for reinforcement learning with function approximation. Advances in neural information processing systems 12 (1999)."},{"key":"e_1_3_2_2_22_1","unstructured":"Qwen Team. 2024. Qwen2.5: A Party of Foundation Models. https:\/\/qwenlm.github.io\/blog\/qwen2.5\/"},{"key":"e_1_3_2_2_23_1","first-page":"74952","article-title":"Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting","volume":"36","author":"Turpin Miles","year":"2023","unstructured":"Miles Turpin, Julian Michael, Ethan Perez, and Samuel Bowman. 2023. Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting. Advances in Neural Information Processing Systems 36 (2023), 74952\u201374965.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_2_24_1","volume-title":"The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track.","author":"Wang Yubo","year":"2024","unstructured":"Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, et al. 2024. Mmlu-pro: A more robust and challenging multi-task language understanding benchmark. In The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track."},{"key":"e_1_3_2_2_25_1","volume-title":"Med-rlvr: Emerging medical reasoning from a 3b base model via reinforcement learning. arXiv preprint arXiv:2502.19655","author":"Zhang Sheng","year":"2025","unstructured":"Sheng Zhang, Qianchu Liu, Guanghui Qin, Tristan Naumann, and Hoifung Poon. 2025. Med-rlvr: Emerging medical reasoning from a 3b base model via reinforcement learning. arXiv preprint arXiv:2502.19655 (2025)."},{"key":"e_1_3_2_2_26_1","volume-title":"Instruction-following evaluation for large language models. arXiv preprint arXiv:2311.07911","author":"Zhou Jeffrey","year":"2023","unstructured":"Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, and Le Hou. 2023. Instruction-following evaluation for large language models. arXiv preprint arXiv:2311.07911 (2023)."}],"event":{"name":"BCB '25: 16th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","location":"Element Philadelphia Downtown Philadelphia PA USA","acronym":"BCB '25","sponsor":["SIGBio ACM Special Interest Group on Bioinformatics"]},"container-title":["Proceedings of the 16th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3765612.3767230","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3765612.3767230","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T17:47:47Z","timestamp":1765388867000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3765612.3767230"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,12]]},"references-count":26,"alternative-id":["10.1145\/3765612.3767230","10.1145\/3765612"],"URL":"https:\/\/doi.org\/10.1145\/3765612.3767230","relation":{},"subject":[],"published":{"date-parts":[[2025,10,12]]},"assertion":[{"value":"2025-12-10","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}