{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T20:38:15Z","timestamp":1780346295973,"version":"3.54.1"},"reference-count":75,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,6,17]]},"abstract":"<jats:p>Integrating machine learning (ML) analytics into existing database management systems (DBMSs) not only eliminates the need for costly data transfers to external ML platforms but also ensures compliance with regulatory standards. While some DBMSs have integrated functionalities for training and applying ML models for analytics, these tasks still present challenges, particularly due to limited support for automatic feature engineering (AutoFE), which is crucial for optimizing ML model performance. In this paper, we introduce Adda, an agent-driven in-database feature generation tool designed to automatically create high-quality features for ML analytics directly within the database. Adda interprets ML analytics tasks described in natural language and generates code for feature construction by leveraging the power of large language models (LLMs) integrated with specialized agents. This code is then translated into SQL statements using a predefined set of operators and compiled just-in-time (JIT) into user-defined functions (UDFs). The result is a seamless, fully in-database solution for feature generation, specifically tailored for ML analytics tasks. 
Extensive experiments across 14 public datasets, with five ML tasks per dataset, show that Adda improves the AUC by up to 33.2% and reduces end-to-end latency by up to 100x compared to Madlib.<\/jats:p>","DOI":"10.1145\/3725262","type":"journal-article","created":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T21:23:29Z","timestamp":1750281809000},"page":"1-27","source":"Crossref","is-referenced-by-count":3,"title":["Adda: Towards Efficient in-Database Feature Generation via LLM-based Agents"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-9344-619X","authenticated-orcid":false,"given":"Kuan","family":"Lu","sequence":"first","affiliation":[{"name":"Zhejiang University, HangZhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0344-1464","authenticated-orcid":false,"given":"Zhihui","family":"Yang","sequence":"additional","affiliation":[{"name":"Zhejiang University, The State Key Laboratory of Blockchain and Data Security, HangZhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7903-1496","authenticated-orcid":false,"given":"Sai","family":"Wu","sequence":"additional","affiliation":[{"name":"Zhejiang Key Laboratory of Big Data Intelligent Computing, Zhejiang University, HangZhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2736-451X","authenticated-orcid":false,"given":"Ruichen","family":"Xia","sequence":"additional","affiliation":[{"name":"Zhejiang University, HangZhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-6338-0698","authenticated-orcid":false,"given":"Dongxiang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhejiang University, HangZhou, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7483-0045","authenticated-orcid":false,"given":"Gang","family":"Chen","sequence":"additional","affiliation":[{"name":"Zhejiang University, HangZhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,6,18]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al.","author":"Achiam Josh","year":"2023","unstructured":"Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. Gpt-4 technical report. (2023)."},{"key":"e_1_2_1_2_1","first-page":"27","article-title":"tspDB","volume":"133","author":"Agarwal Anish","year":"2021","unstructured":"Anish Agarwal, Abdullah Alomar, and Devavrat Shah. 2021. tspDB: Time Series Predict DB, Vol. 133. 27--56.","journal-title":"Time Series Predict DB"},{"key":"e_1_2_1_3_1","volume-title":"Mohamed Y. Eltabakh, Mourad Ouzzani, and Nan Tang.","author":"Ahmad Mohammad Shahmeer","year":"2023","unstructured":"Mohammad Shahmeer Ahmad, Zan Ahmad Naeem, Mohamed Y. Eltabakh, Mourad Ouzzani, and Nan Tang. 2023. RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes. abs\/2303.16909 (2023)."},{"key":"e_1_2_1_4_1","unstructured":"Alibaba-NLP. 2024."},{"key":"e_1_2_1_5_1","volume-title":"Using Confidence Bounds for Exploitation-Exploration Trade-offs. 3 (01","author":"Auer Peter","year":"2002","unstructured":"Peter Auer. 2002. Using Confidence Bounds for Exploitation-Exploration Trade-offs. 3 (01 2002), 397--422."},{"key":"e_1_2_1_6_1","first-page":"1798","article-title":"Representation Learning","volume":"35","author":"Bengio Yoshua","year":"2012","unstructured":"Yoshua Bengio, Aaron C. Courville, and Pascal Vincent. 2012. Representation Learning: A Review and New Perspectives. 
35 (2012), 1798--1828.","journal-title":"A Review and New Perspectives."},{"key":"e_1_2_1_7_1","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners."},{"key":"e_1_2_1_8_1","unstructured":"Miguel Castro Barbara Liskov et al. 1999. Practical byzantine fault tolerance Vol. 99. 173--186."},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Xiangning Chen Qingwei Lin Chuan Luo Xudong Li Hongyu Zhang Yong Xu Yingnong Dang Kaixin Sui Xu Zhang Bo Qiao Weiyi Zhang Wei Wu Murali Chintalapati and Dongmei Zhang. 2019. Neural Feature Search: A Neural Architecture for Automated Feature Engineering. 71--80.","DOI":"10.1109\/ICDM.2019.00017"},{"key":"e_1_2_1_10_1","volume-title":"Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, and Ion Stoica.","author":"Chiang Wei-Lin","year":"2024","unstructured":"Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, and Ion Stoica. 2024. Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference. arXiv:2403.04132 https:\/\/arxiv.org\/abs\/2403.04132"},{"key":"e_1_2_1_11_1","unstructured":"DeepSeek-AI. 2024. DeepSeek-V3 Technical Report. 
arXiv:2412.19437 [cs.CL] https:\/\/arxiv.org\/abs\/2412.19437"},{"key":"e_1_2_1_12_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Association for Computational Linguistics, 4171--4186."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.14778\/3192965.3192970"},{"key":"e_1_2_1_14_1","volume-title":"A few useful things to know about machine learning. 55, 10 (Oct","author":"Domingos Pedro","year":"2012","unstructured":"Pedro Domingos. 2012. A few useful things to know about machine learning. 55, 10 (Oct. 2012), 78--87."},{"key":"e_1_2_1_15_1","unstructured":"Yilun Du Shuang Li Antonio Torralba Joshua B. Tenenbaum and Igor Mordatch. 2023. Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv:2305.14325 [cs.CL]"},{"key":"e_1_2_1_16_1","unstructured":"Nick Erickson Jonas Mueller Alexander Shirkov Hang Zhang Pedro Larroy Mu Li and Alexander Smola. 2020. AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data. (2020)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Arash Fard Anh Le George Larionov Waqas Dhillon and Chuck Bear. 2020. Vertica-ML: Distributed Machine Learning in Vertica Database. 755--768.","DOI":"10.1145\/3318464.3386137"},{"key":"e_1_2_1_18_1","volume-title":"Garnett (Eds.)","volume":"28","author":"Feurer Matthias","year":"2015","unstructured":"Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Springenberg, Manuel Blum, and Frank Hutter. 2015. Efficient and Robust Automated Machine Learning, C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett (Eds.), Vol. 28. Curran Associates, Inc."},{"key":"e_1_2_1_19_1","unstructured":"Tianyu Gao Howard Yen Jiatong Yu and Danqi Chen. 2023. 
Enabling Large Language Models to Generate Text with Citations."},{"key":"e_1_2_1_20_1","unstructured":"Aaron Grattafiori and Abhimanyu Dubey. 2024. The Llama 3 Herd of Models. arXiv:2407.21783 https:\/\/arxiv.org\/abs\/2407.21783"},{"key":"e_1_2_1_21_1","unstructured":"Sungwon Han Jinsung Yoon Sercan \u00d6. Arik and Tomas Pfister. 2024. Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning. https:\/\/openreview.net\/forum?id=fRG45xL1WT"},{"key":"e_1_2_1_22_1","volume-title":"Sontag","author":"Hegselmann Stefan","year":"2022","unstructured":"Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David A. Sontag. 2022. TabLLM: Few-shot Classification of Tabular Data with Large Language Models. abs\/2210.10723 (2022). https:\/\/api.semanticscholar.org\/CorpusID:252992811"},{"key":"e_1_2_1_23_1","volume-title":"Eugene Fratkin, Aleksander Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li, and Arun Kumar.","author":"Hellerstein Joseph M.","year":"2012","unstructured":"Joseph M. Hellerstein, Christopher R\u00e9, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleksander Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li, and Arun Kumar. 2012. The MADlib analytics library: or MAD skills, the SQL. 5, 12 (2012), 1700--1711."},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","unstructured":"Noah Hollmann Samuel M\u00fcller and Frank Hutter. 2023. Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering.","DOI":"10.52202\/075280-1938"},{"key":"e_1_2_1_25_1","volume-title":"Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, and Chenglin Wu.","author":"Hong Sirui","year":"2023","unstructured":"Sirui Hong, Xiawu Zheng, Jonathan Chen, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, and Chenglin Wu. 2023. 
MetaGPT: Meta Programming for Multi-Agent Collaborative Framework. abs\/2308.00352 (2023)."},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"F. Horn Robert T. Pack and Michael Rieger. 2019. The autofeat Python Library for Automated Feature Engineering and Selection.","DOI":"10.1007\/978-3-030-43823-4_10"},{"key":"e_1_2_1_27_1","volume-title":"Focus: Querying Large Video Datasets with Low Latency and Low Cost. 269--286.","author":"Hsieh Kevin","year":"2018","unstructured":"Kevin Hsieh, Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B. Gibbons, and Onur Mutlu. 2018. Focus: Querying Large Video Datasets with Low Latency and Low Cost. 269--286."},{"key":"e_1_2_1_28_1","volume-title":"Systems, Challenges.","author":"Hutter Frank","year":"2019","unstructured":"Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren. 2019. Automated Machine Learning: Methods, Systems, Challenges. (2019)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Matthias Jasny Tobias Ziegler Tim Kraska Uwe Roehm and Carsten Binnig. 2020. DB4ML - An In-Memory Database Kernel with Machine Learning Support. 159--173.","DOI":"10.1145\/3318464.3380575"},{"key":"e_1_2_1_30_1","unstructured":"Kaggle. 2012. https:\/\/www.kaggle.com\/."},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Ehsan Kamalloo Nouha Dziri Charles Clarke and Davood Rafiei. 2023. Evaluating Open-Domain Question Answering in the Era of Large Language Models. 5591--5606.","DOI":"10.18653\/v1\/2023.acl-long.307"},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","unstructured":"James Kanter and Kalyan Veeramachaneni. 2015. Deep feature synthesis: Towards automating data science endeavors. 1--10.","DOI":"10.1109\/DSAA.2015.7344858"},{"key":"e_1_2_1_33_1","volume-title":"Eui Chul Richard Shin, and Dawn Xiaodong Song","author":"Katz Gilad","year":"2016","unstructured":"Gilad Katz, Eui Chul Richard Shin, and Dawn Xiaodong Song. 2016. 
ExploreKit: Automatic Feature Generation and Selection. (2016), 979--984."},{"key":"e_1_2_1_34_1","volume-title":"Garnett (Eds.)","volume":"30","author":"Ke Guolin","year":"2017","unstructured":"Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. LightGBM: A Highly Efficient Gradient Boosting Decision Tree, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Vol. 30. Curran Associates, Inc."},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Mahmoud Abo Khamis Hung Q. Ngo XuanLong Nguyen Dan Olteanu and Maximilian Schleich. 2018. AC\/DC: In-Database Learning Thunderstruck. Article 8 10 pages.","DOI":"10.1145\/3209889.3209896"},{"key":"e_1_2_1_36_1","volume-title":"Feature Engineering for Predictive Modeling Using Reinforcement Learning. 32 (09","author":"Khurana Udayan","year":"2017","unstructured":"Udayan Khurana, Horst Samulowitz, and Surya Deepak Turaga. 2017. Feature Engineering for Predictive Modeling Using Reinforcement Learning. 32 (09 2017)."},{"key":"e_1_2_1_37_1","volume-title":"Cognito: Automated Feature Engineering for Supervised Learning. 1304--1307.","author":"Khurana Udayan","year":"2016","unstructured":"Udayan Khurana, Deepak Turaga, Horst Samulowitz, and Srinivasan Parthasarathy. 2016. Cognito: Automated Feature Engineering for Supervised Learning. 1304--1307."},{"key":"e_1_2_1_38_1","volume-title":"A tree-edit-distance algorithm for comparing simple, closed shapes","author":"Klein Philip","unstructured":"Philip Klein, Srikanta Tirthapura, Daniel Sharvit, and Ben Kimia. 2000. A tree-edit-distance algorithm for comparing simple, closed shapes. 
Society for Industrial and Applied Mathematics, USA."},{"key":"e_1_2_1_39_1","volume-title":"Martin Wistuba, Udayan Khurana, Gregory Bramble, Theodoros Salonidis, Dakuo Wang, and Horst Samulowitz.","author":"Lam Hoang Thanh","year":"2021","unstructured":"Hoang Thanh Lam, Beat Buesser, Hong Min, Tran Ngoc Minh, Martin Wistuba, Udayan Khurana, Gregory Bramble, Theodoros Salonidis, Dakuo Wang, and Horst Samulowitz. 2021. Automated Data Science for Relational Data. 2689--2692."},{"key":"e_1_2_1_40_1","unstructured":"Scikit learn Developers. 2017. https:\/\/scikit-learn.org\/stable\/supervised_learning.html."},{"key":"e_1_2_1_41_1","volume-title":"Montana Low et al","author":"Lev Kokotov Silas Marvin","year":"2023","unstructured":"Silas Marvin Lev Kokotov, Montana Low et al. 2023. PostgresML. https:\/\/github.com\/postgresml\/postgresml"},{"key":"e_1_2_1_42_1","unstructured":"Liyao Li Haobo Wang Liangyu Zha Qingyi Huang Sai Wu Gang Chen and Junbo Zhao. 2023. Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering. OpenReview.net. https:\/\/openreview.net\/forum?id=688hNNMigVX"},{"key":"e_1_2_1_43_1","volume-title":"Deep Entity Matching with Pre-Trained Language Models. 14, 1","author":"Li Yuliang","year":"2020","unstructured":"Yuliang Li, Jinfeng Li, Yoshihiko Suhara, AnHai Doan, and Wang-Chiew Tan. 2020. Deep Entity Matching with Pre-Trained Language Models. 14, 1 (2020), 50--60."},{"key":"e_1_2_1_44_1","unstructured":"Zehan Li Xin Zhang Yanzhao Zhang Dingkun Long Pengjun Xie and Meishan Zhang. 2023. Towards General Text Embeddings with Multi-stage Contrastive Learning. arXiv:2308.03281 [cs.CL]"},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Qiuru Lin Sai Wu Junbo Zhao Jian Dai Feifei Li and Gang Chen. 2022. A Comparative Study of in-Database Inference Approaches. 
1794--1807.","DOI":"10.1109\/ICDE53745.2022.00180"},{"key":"e_1_2_1_46_1","volume-title":"SMARTFEAT: Efficient Feature Construction through Feature-Level Foundation Model Interactions. arXiv:2309.07856 [cs.DB]","author":"Lin Yin","year":"2023","unstructured":"Yin Lin, Bolin Ding, H. V. Jagadish, and Jingren Zhou. 2023. SMARTFEAT: Efficient Feature Construction through Feature-Level Foundation Model Interactions. arXiv:2309.07856 [cs.DB]"},{"key":"e_1_2_1_47_1","volume-title":"Yuyao Wang, and Lingming Zhang.","author":"Liu Jiawei","year":"2023","unstructured":"Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2023. Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation."},{"key":"e_1_2_1_48_1","unstructured":"Scott Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. arXiv:1705.07874 [cs.AI] https:\/\/arxiv.org\/abs\/1705.07874"},{"key":"e_1_2_1_49_1","volume-title":"308--320","author":"McCabe T.J.","year":"1976","unstructured":"T.J. McCabe. 1976. A Complexity Measure. SE-2, 4 (1976), 308--320."},{"key":"e_1_2_1_50_1","volume-title":"Can Foundation Models Wrangle Your Data? 16, 4","author":"Narayan Avanika","year":"2022","unstructured":"Avanika Narayan, Ines Chami, Laurel J. Orr, and Christopher R\u00e9. 2022. Can Foundation Models Wrangle Your Data? 16, 4 (2022), 738--746."},{"key":"e_1_2_1_51_1","doi-asserted-by":"crossref","unstructured":"Fatemeh Nargesian Horst Samulowitz Udayan Khurana Elias B. Khalil and Deepak Turaga. 2017. Learning Feature Engineering for Classification. 2529--2535.","DOI":"10.24963\/ijcai.2017\/352"},{"key":"e_1_2_1_52_1","unstructured":"OpenML. 2012."},{"key":"e_1_2_1_53_1","doi-asserted-by":"crossref","unstructured":"Kwanghyun Park Karla Saur Dalitso Banda Rathijit Sen Matteo Interlandi and Konstantinos Karanasos. 2022. End-to-end Optimization of Machine Learning Prediction Queries. 
587--601.","DOI":"10.1145\/3514221.3526141"},{"key":"e_1_2_1_54_1","volume-title":"Vernon","author":"Patel Jignesh M.","year":"1994","unstructured":"Jignesh M. Patel, Michael J. Carey, and Mary K. Vernon. 1994. Accurate modeling of the hybrid hash join algorithm. Association for Computing Machinery, New York, NY, USA, 56--66."},{"key":"e_1_2_1_55_1","unstructured":"Qwen Team. 2024. Qwen2.5: A Party of Foundation Models. https:\/\/qwenlm.github.io\/blog\/qwen2.5\/"},{"key":"e_1_2_1_56_1","volume-title":"Oran Agra et al","author":"Salvatore Sanfilippo Pieter Noordhuis","year":"2023","unstructured":"Pieter Noordhuis Salvatore Sanfilippo, Oran Agra et al. 2023. Redis. https:\/\/github.com\/redis\/redis"},{"key":"e_1_2_1_57_1","volume-title":"LMFAO: An engine for batches of group-by aggregates: layered multiple functional aggregate optimization. 13, 12 (aug","author":"Schleich Maximilian","year":"2020","unstructured":"Maximilian Schleich and Dan Olteanu. 2020. LMFAO: An engine for batches of group-by aggregates: layered multiple functional aggregate optimization. 13, 12 (aug 2020), 2945--2948."},{"key":"e_1_2_1_58_1","unstructured":"Maxim Shcherbakov Adriaan Brebels N.L. Shcherbakova Anton Tyukov T.A. Janovsky and V.A. Kamaev. 2013. A survey of forecast error measures. 24 (01 2013) 171--176."},{"key":"e_1_2_1_59_1","volume-title":"Michael Pradel","author":"Spiess Claudio","year":"2024","unstructured":"Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md. Rafiqul Islam Rabin, Amin Alipour, Susmit Jha, Prem Devanbu, and Toufique Ahmed. 2024. Calibration and Correctness of Language Models for Code. abs\/2402.02047 (2024)."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/16894.16888"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.14778\/3457390.3457391"},{"key":"e_1_2_1_62_1","unstructured":"The Apache Software Foundation. 2023. Apache Benchmark. 
https:\/\/httpd.apache.org\/docs\/2.4\/programs\/ab.html."},{"key":"e_1_2_1_63_1","unstructured":"Irvine University of California. 2012."},{"key":"e_1_2_1_64_1","volume-title":"Representation Learning with Contrastive Predictive Coding. abs\/1807.03748","author":"van den Oord A\u00e4ron","year":"2018","unstructured":"A\u00e4ron van den Oord, Yazhe Li, and Oriol Vinyals. 2018. Representation Learning with Contrastive Predictive Coding. abs\/1807.03748 (2018)."},{"key":"e_1_2_1_65_1","volume-title":"Sharan Narang, Aakanksha Chowdhery, and Denny Zhou.","author":"Wang Xuezhi","year":"2023","unstructured":"Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, and Denny Zhou. 2023. Self-Consistency Improves Chain of Thought Reasoning in Language Models."},{"key":"e_1_2_1_66_1","volume-title":"Fei Xia, Ed Chi, Quoc V Le, and Denny Zhou.","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, brian ichter, Fei Xia, Ed Chi, Quoc V Le, and Denny Zhou. 2022. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, Vol. 35. 24824--24837."},{"key":"e_1_2_1_67_1","unstructured":"Colin White Samuel Dooley Manley Roberts Arka Pal Ben Feuer Siddhartha Jain Ravid Shwartz-Ziv Neel Jain Khalid Saifullah Siddartha Naidu Chinmay Hegde Yann LeCun Tom Goldstein Willie Neiswanger and Micah Goldblum. 2024. LiveBench: A Challenging Contamination-Free LLM Benchmark. arXiv:2406.19314 https:\/\/arxiv.org\/abs\/2406.19314"},{"key":"e_1_2_1_68_1","unstructured":"David Kofoed Wind. 2014. Concepts in predictive machine learning. (2014) 1--129."},{"key":"e_1_2_1_69_1","doi-asserted-by":"crossref","unstructured":"Shitao Xiao Zheng Liu Peitian Zhang Niklas Muennighoff Defu Lian and Jian-Yun Nie. 2024. C-Pack: Packaged Resources To Advance General Chinese Embedding. 
arXiv:2309.07597 [cs.CL]","DOI":"10.1145\/3626772.3657878"},{"key":"e_1_2_1_70_1","volume-title":"Optimizing machine learning inference queries with correlative proxy models. (2022)","author":"Yang Zhihui","year":"2022","unstructured":"Zhihui Yang, Zuozhi Wang, Yicong Huang, Yao Lu, Chen Li, and X. Sean Wang. 2022. Optimizing machine learning inference queries with correlative proxy models. (2022), 2032--2044."},{"key":"e_1_2_1_71_1","unstructured":"Shunyu Yao Jeffrey Zhao Dian Yu Nan Du Izhak Shafran Karthik R. Narasimhan and Yuan Cao. 2023. ReAct: Synergizing Reasoning and Acting in Language Models. OpenReview.net."},{"key":"e_1_2_1_72_1","volume-title":"Simple Fast Algorithms for the Editing Distance between Trees and Related Problems. 18, 6","author":"Zhang Kaizhong","year":"1989","unstructured":"Kaizhong Zhang and Dennis Shasha. 1989. Simple Fast Algorithms for the Editing Distance between Trees and Related Problems. 18, 6 (1989), 1245--1262."},{"key":"e_1_2_1_73_1","unstructured":"Tianping Zhang Zheyu Zhang Zhiyuan Fan Haoyan Luo Fengyuan Liu Qian Liu Wei Cao and Jian Li. 2023. OpenFE: Automated Feature Generation with Expert-level Performance. arXiv:2211.12507 [cs.LG]"},{"key":"e_1_2_1_74_1","unstructured":"Andy Zhou Kai Yan Michal Shlapentokh-Rothman Haohan Wang and Yu-Xiong Wang. 2024. Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models. arXiv:2310.04406 [cs.AI]"},{"key":"e_1_2_1_75_1","volume-title":"DIFER: Differentiable Automated Feature Engineering.","author":"Zhu Guanghui","year":"2022","unstructured":"Guanghui Zhu, Zhuoer Xu, Chunfeng Yuan, and Yihua Huang. 2022. 
DIFER: Differentiable Automated Feature Engineering."}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3725262","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:56:56Z","timestamp":1774983416000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3725262"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,17]]},"references-count":75,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,6,17]]}},"alternative-id":["10.1145\/3725262"],"URL":"https:\/\/doi.org\/10.1145\/3725262","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,17]]}}}