{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T15:58:42Z","timestamp":1777046322448,"version":"3.51.4"},"reference-count":21,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T00:00:00Z","timestamp":1776988800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"College Students\u2019 Innovation and Entrepreneurship Training Program","award":["202610004005"],"award-info":[{"award-number":["202610004005"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>This study examines black-box hyperparameter optimization for financial retrieval-augmented generation (RAG) retrieval under limited budget constraints. Using FinQA as the primary dataset, it compares Grid Search, Random Search, and Bayesian Optimization under a unified search space, evaluation protocol, and multi-seed setting, and further uses FinanceBench for external validation. The results show that Random Search and Bayesian Optimization can approach the Grid reference at substantially lower cost, but the small development-set advantage of Bayesian Optimization does not remain stable on the test set or across repeated runs. A more consistent finding is that high-performing configurations are concentrated in a limited parameter region. Overall, the results suggest that, in budget-constrained financial RAG retrieval tuning, identifying stable high-performing parameter regions may be more useful than relying on increasingly complex optimization methods.<\/jats:p>","DOI":"10.3390\/info17050405","type":"journal-article","created":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T14:28:23Z","timestamp":1777040903000},"page":"405","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Black-Box Hyperparameter Optimization for Financial RAG Retrieval: An Efficiency\u2013Effectiveness Trade-Off Study"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-3179-7611","authenticated-orcid":false,"given":"Yangyang","family":"Jin","sequence":"first","affiliation":[{"name":"School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-0071-0957","authenticated-orcid":false,"given":"Xindi","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-0559-954X","authenticated-orcid":false,"given":"Qianli","family":"Dong","sequence":"additional","affiliation":[{"name":"School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2026,4,24]]},"reference":[{"key":"ref_1","first-page":"9459","article-title":"Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks","volume":"33","author":"Lewis","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_2","unstructured":"Guu, K., Lee, K., Tung, Z., Pasupat, P., and Chang, M.-W. (2020, January 13\u201318). Retrieval Augmented Language Model Pre-Training. Proceedings of the 37th International Conference on Machine Learning, Virtual Event."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Chen, Z., Chen, W., Smiley, C., Shah, S., Borova, I., Langdon, D., Moussa, R., Beane, M., Huang, T.-H., and Routledge, B. (2021). FinQA: A Dataset of Numerical Reasoning over Financial Data. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2021.emnlp-main.300"},{"key":"ref_4","unstructured":"Islam, P., Kannappan, A., Kiela, D., Qian, R., Scherrer, N., and Vidgen, B. (2023). FinanceBench: A New Benchmark for Financial Question Answering. arXiv."},{"key":"ref_5","unstructured":"Strich, J., Isgorur, E.K., Trescher, M., Biemann, C., and Semmann, M. (2026). T2-RAGBench: Text-and-Table Benchmark for Evaluating Retrieval-Augmented Generation. Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Rabat, Morocco, Association for Computational Linguistics."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Reddy, V., Koncel-Kedziorski, R., Lai, V.D., Krumdick, M., Lovering, C., and Tanner, C. (2024). DocFinQA: A Long-Context Financial Reasoning Dataset. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Bangkok, Thailand, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2024.acl-short.42"},{"key":"ref_7","unstructured":"Kim, S., Song, H., Seo, H., and Kim, H. (2025). Optimizing Retrieval Strategies for Financial Question Answering Documents in Retrieval-Augmented Generation Systems. arXiv."},{"key":"ref_8","unstructured":"Lee, J., and Roh, M. (2024). Multi-Reranker: Maximizing Performance of Retrieval-Augmented Generation in the FinanceRAG Challenge. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Choe, J., Kim, J., and Jung, W. (2025). Hierarchical Retrieval with Evidence Curation for Open-Domain Financial Question Answering on Standardized Documents. Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2025.findings-acl.855"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Izacard, G., and Grave, E. (2021). Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume; Online, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2021.eacl-main.74"},{"key":"ref_11","unstructured":"Orbach, M., Eytan, O., Sznajder, B., Gera, A., Boni, O., Kantor, Y., Bloch, G., Levy, O., Abraham, H., and Barzilay, N. (2025). An Analysis of Hyper-Parameter Optimization Methods for Retrieval Augmented Generation. arXiv."},{"key":"ref_12","first-page":"2546","article-title":"Algorithms for Hyper-Parameter Optimization","volume":"24","author":"Bergstra","year":"2011","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_13","first-page":"2951","article-title":"Practical Bayesian Optimization of Machine Learning Algorithms","volume":"25","author":"Snoek","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_14","first-page":"281","article-title":"Random Search for Hyper-Parameter Optimization","volume":"13","author":"Bergstra","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_15","unstructured":"Jimeno Yepes, A., You, Y., Milczek, J., Laverde, S., and Li, R. (2024). Financial Report Chunking for Effective Retrieval Augmented Generation. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., Chen, D., and Yih, W.-t. (2020). Dense Passage Retrieval for Open-Domain Question Answering. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics.","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Robertson, S.E., and Walker, S. (1994). Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. SIGIR \u201994, Springer.","DOI":"10.1007\/978-1-4471-2099-5_24"},{"key":"ref_18","unstructured":"Hsu, H.-L., and Tzeng, J. (2025). DAT: Dynamic Alpha Tuning for Hybrid Retrieval in Retrieval-Augmented Generation. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019). Optuna: A Next-Generation Hyperparameter Optimization Framework. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, ACM.","DOI":"10.1145\/3292500.3330701"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1145\/582415.582418","article-title":"Cumulated Gain-Based Evaluation of IR Techniques","volume":"20","year":"2002","journal-title":"ACM Trans. Inf. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Voorhees, E.M., and Tice, D.M. (2000). The TREC-8 Question Answering Track. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC\u201900), Athens, Greece, European Language Resources Association (ELRA).","DOI":"10.6028\/NIST.SP.500-246.qa-overview"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/5\/405\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T14:39:47Z","timestamp":1777041587000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/5\/405"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,24]]},"references-count":21,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2026,5]]}},"alternative-id":["info17050405"],"URL":"https:\/\/doi.org\/10.3390\/info17050405","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,24]]}}}