{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T04:58:56Z","timestamp":1781326736433,"version":"3.54.1"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"name":"the Key Research Program of Zhejiang Province","award":["2023C01037"],"award-info":[{"award-number":["2023C01037"]}]},{"DOI":"10.13039\/501100001809","name":"the National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62572422"],"award-info":[{"award-number":["62572422"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,12,4]]},"abstract":"<jats:p>The increasing demand for deep neural inference within database environments has driven the emergence of AI-native DBMSs. However, existing solutions either rely on model-centric designs requiring developers to manually select, configure, and maintain models, resulting in high development overhead, or adopt task-centric AutoML approaches with high computational costs and poor DBMS integration. We present MorphingDB, a task-centric AI-native DBMS that automates model storage, selection, and inference within PostgreSQL. To enable flexible, I\/O-efficient storage of deep learning models, we first introduce specialized schemas and multi-dimensional tensor data types to support BLOB-based all-in-one and decoupled model storage. Then we design a transfer learning framework for model selection in two phases, which builds a transferability subspace via offline embedding of historical tasks and employs online projection through feature-aware mapping for real-time tasks. 
To further optimize inference throughput, we propose pre-embedding with vectoring sharing to eliminate redundant computations and DAG-based batch pipelines with cost-aware scheduling to minimize the inference time. Implemented as a PostgreSQL extension with LibTorch, MorphingDB outperforms AI-native DBMSs (EvaDB, Madlib, GaussML) and AutoML platforms (AutoGluon, AutoKeras, AutoSklearn) across nine public datasets, encompassing series, NLP, and image tasks. Our evaluation demonstrates a robust balance among accuracy, resource consumption, and time cost in model selection and significant gains in throughput and resource efficiency.<\/jats:p>","DOI":"10.1145\/3769844","type":"journal-article","created":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T04:32:13Z","timestamp":1764995533000},"page":"1-26","source":"Crossref","is-referenced-by-count":0,"title":["MorphingDB: A Task-Centric AI-Native DBMS for Model Management and Inference"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7903-1496","authenticated-orcid":false,"given":"Sai","family":"Wu","sequence":"first","affiliation":[{"name":"Zhejiang University. Zhejiang Key Laboratory of Big Data Intelligent Computing, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2736-451X","authenticated-orcid":false,"given":"Ruichen","family":"Xia","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8156-3926","authenticated-orcid":false,"given":"Dingyu","family":"Yang","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University. 
Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8915-4169","authenticated-orcid":false,"given":"Rui","family":"Wang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8454-3030","authenticated-orcid":false,"given":"Huihang","family":"Lai","sequence":"additional","affiliation":[{"name":"Institute of Computing Innovation, Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-7038-3154","authenticated-orcid":false,"given":"Jiarui","family":"Guan","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4732-5913","authenticated-orcid":false,"given":"Jiameng","family":"Bai","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-6338-0698","authenticated-orcid":false,"given":"Dongxiang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8611-0283","authenticated-orcid":false,"given":"Xiu","family":"Tang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2924-6974","authenticated-orcid":false,"given":"Zhongle","family":"Xie","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9200-7896","authenticated-orcid":false,"given":"Peng","family":"Lu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7483-0045","authenticated-orcid":false,"given":"Gang","family":"Chen","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,12,5]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2018. Microsoft Azure SQL Database. https:\/\/azure.microsoft.com\/en-us\/products\/azure-sql\/database\/"},{"key":"e_1_2_1_2_1","unstructured":"2020. Google Cloud AI Platform. https:\/\/cloud.google.com\/ai-platform"},{"key":"e_1_2_1_3_1","unstructured":"2020. MindsDB. https:\/\/mindsdb.com\/"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","unstructured":"2020. Swarm Behaviour. UCI Machine Learning Repository. DOI: https:\/\/doi.org\/10.24432\/C5N02J.","DOI":"10.24432\/C5N02J"},{"key":"e_1_2_1_5_1","unstructured":"2021. Amazon SageMaker. https:\/\/aws.amazon.com\/sagemaker\/"},{"key":"e_1_2_1_6_1","unstructured":"2022. IBM Db2 AI for z\/OS. https:\/\/www.ibm.com\/products\/db2-ai-for-zos"},{"key":"e_1_2_1_7_1","unstructured":"2023. Apache MADlib. https:\/\/madlib.apache.org"},{"key":"e_1_2_1_8_1","unstructured":"2023. EvaDB. https:\/\/evadb.ai\/"},{"key":"e_1_2_1_9_1","unstructured":"2023. Oracle Autonomous Database. https:\/\/www.oracle.com\/database\/technologies\/autonomous-database.html"},{"key":"e_1_2_1_10_1","unstructured":"2023. SAP HANA. https:\/\/www.sap.com\/products\/hana.html"},{"key":"e_1_2_1_11_1","unstructured":"2024. LibTorch. https:\/\/pytorch.org\/cppdocs\/"},{"key":"e_1_2_1_12_1","unstructured":"2024. ONNX-Open Neural Network Exchange. 
https:\/\/onnx.ai\/"},{"key":"e_1_2_1_13_1","unstructured":"2024. PostgreSQL. https:\/\/postgresql.org\/"},{"key":"e_1_2_1_14_1","unstructured":"2025. . https:\/\/chatgpt.com\/"},{"key":"e_1_2_1_15_1","unstructured":"2025. Hugging Face Hub. https:\/\/huggingface.co"},{"key":"e_1_2_1_16_1","unstructured":"2025. Ollama. https:\/\/ollama.com\/"},{"key":"e_1_2_1_17_1","unstructured":"2025. Oracle AI Vector Search User's Guide. https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/23\/vecse\/"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.24432\/C50K61"},{"key":"e_1_2_1_19_1","first-page":"19301","article-title":"Scalable diverse model selection for accessible transfer learning","volume":"34","author":"Bolya Daniel","year":"2021","unstructured":"Daniel Bolya, Rohit Mittapalli, and Judy Hoffman. 2021. Scalable diverse model selection for accessible transfer learning. NIPS 34 (2021), 19301-19312.","journal-title":"NIPS"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3329859.3329878"},{"key":"e_1_2_1_21_1","first-page":"459","article-title":"Tsmixer: Lightweight mlp-mixer model for multivariate time series forecasting","author":"Ekambaram Vijay","year":"2023","unstructured":"Vijay Ekambaram, Arindam Jati, Nam Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. 2023. Tsmixer: Lightweight mlp-mixer model for multivariate time series forecasting. In SIGKDD. 459-469.","journal-title":"SIGKDD."},{"key":"e_1_2_1_22_1","volume-title":"The next generation. arXiv preprint arXiv:2007.04074","author":"Feurer M","year":"2020","unstructured":"M Feurer, K Eggensperger, S Falkner, J T Springenberg, A Klein, and F Hutter. 2020. Auto-sklearn 2.0: The next generation. arXiv preprint arXiv:2007.04074 (2020)."},{"key":"e_1_2_1_23_1","volume-title":"Yolox: Exceeding yolo series in","author":"Ge Zheng","year":"2021","unstructured":"Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun. 2021. Yolox: Exceeding yolo series in 2021. 
arXiv preprint arXiv:2107.08430 (2021)."},{"key":"e_1_2_1_24_1","volume-title":"Relative location of CT slices on axial axis data set. UCI Machine Learning Repository","author":"Graf Franz","year":"2011","unstructured":"Franz Graf, Hans-Peter Kriegel, Matthias Schubert, Sebastian Poelsterl, and Alexander Cavallaro. 2011. Relative location of CT slices on axial axis data set. UCI Machine Learning Repository (2011)."},{"key":"e_1_2_1_25_1","volume-title":"GaussML: An End-to-End In-database Machine Learning System. ICDE","author":"Shifu Li Jiang Wang Lijie Xu","year":"2024","unstructured":"Lijie Xu Shifu Li Jiang Wang Wen Nie Guoliang Li, Ji Sun. 2024. GaussML: An End-to-End In-database Machine Learning System. ICDE (2024)."},{"key":"e_1_2_1_26_1","first-page":"5349","article-title":"Why resnet works? residuals generalize","volume":"31","author":"He Fengxiang","year":"2020","unstructured":"Fengxiang He, Tongliang Liu, and Dacheng Tao. 2020. Why resnet works? residuals generalize. IEEE TNNLS 31, 12 (2020), 5349-5362.","journal-title":"IEEE TNNLS"},{"key":"e_1_2_1_27_1","first-page":"2961","article-title":"Mask R-CNN","author":"He Kaiming","year":"2017","unstructured":"Kaiming He, Georgia Gkioxari, Piotr Doll\u00e1r, and Ross Girshick. 2017. Mask R-CNN. In ICCV. 2961-2969.","journal-title":"ICCV."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2020.106622"},{"key":"e_1_2_1_29_1","first-page":"159","article-title":"DB4ML-an in-memory database kernel with machine learning support","author":"Jasny Matthias","year":"2020","unstructured":"Matthias Jasny, Tobias Ziegler, Tim Kraska, Uwe Roehm, and Carsten Binnig. 2020. DB4ML-an in-memory database kernel with machine learning support. In SIGMOD. 159-173.","journal-title":"SIGMOD."},{"key":"e_1_2_1_30_1","first-page":"1946","article-title":"Auto-keras: An efficient neural architecture search system","author":"Jin Haifeng","year":"2019","unstructured":"Haifeng Jin, Qingquan Song, and Xia Hu. 2019. 
Auto-keras: An efficient neural architecture search system. In SIGKDD. ACM, 1946-1956.","journal-title":"SIGKDD. ACM"},{"key":"e_1_2_1_31_1","unstructured":"Konstantinos Karanasos Matteo Interlandi Doris Xin Fotis Psallidas Rathijit Sen Kwanghyun Park Ivan Popivanov Supun Nakandal Subru Krishnan Markus Weimer et al. 2020. Extending Relational Query Processing with ML Inference. In CIDR."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3470918"},{"key":"e_1_2_1_33_1","unstructured":"Aditya Khosla Nityananda Jayadevaprakash Bangpeng Yao and Li Fei-Fei. 2012. Novel Dataset for Fine-Grained Image Categorization : Stanford Dogs. https:\/\/api.semanticscholar.org\/CorpusID:3181866"},{"key":"e_1_2_1_34_1","first-page":"311","article-title":"Exploration of Approaches for In-Database ML","author":"Kl\u00e4be Steffen","year":"2023","unstructured":"Steffen Kl\u00e4be, Stefan Hagedorn, and Kai-Uwe Sattler. 2023. Exploration of Approaches for In-Database ML. In EDBT. 311-323.","journal-title":"EDBT."},{"key":"e_1_2_1_35_1","unstructured":"Alex Krizhevsky. 2009. Learning multiple layers of features from tiny images. Technical Report."},{"key":"e_1_2_1_36_1","volume-title":"Imagenet classification with deep convolutional neural networks. NIPS 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. NIPS 25 (2012)."},{"key":"e_1_2_1_37_1","volume-title":"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. In ICLR.","author":"Lan Zhenzhong","year":"2020","unstructured":"Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2020. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. 
In ICLR."},{"key":"e_1_2_1_38_1","first-page":"2859","article-title":"AI meets database: AI4DB and DB4AI","author":"Li Guoliang","year":"2021","unstructured":"Guoliang Li, Xuanhe Zhou, and Lei Cao. 2021. AI meets database: AI4DB and DB4AI. In SIGMOD. 2859-2866.","journal-title":"SIGMOD."},{"key":"e_1_2_1_39_1","first-page":"1933","article-title":"Mlog: Towards declarative in-database machine learning","volume":"10","author":"Li Xupeng","year":"2017","unstructured":"Xupeng Li, Bin Cui, Yiru Chen, Wentao Wu, and Ce Zhang. 2017. Mlog: Towards declarative in-database machine learning. PVLDB 10, 12 (2017), 1933-1936.","journal-title":"PVLDB"},{"key":"e_1_2_1_40_1","first-page":"142","article-title":"Learning word vectors for sentiment analysis","author":"Maas Andrew","year":"2011","unstructured":"Andrew Maas, Raymond E Daly, Peter T Pham, Dan Huang, Andrew Y Ng, and Christopher Potts. 2011. Learning word vectors for sentiment analysis. In ACL. 142-150.","journal-title":"ACL."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.23062"},{"key":"e_1_2_1_42_1","first-page":"587","article-title":"End-to-end optimization of machine learning prediction queries","author":"Park Kwanghyun","year":"2022","unstructured":"Kwanghyun Park, Karla Saur, Dalitso Banda, Rathijit Sen, Matteo Interlandi, and Konstantinos Karanasos. 2022. End-to-end optimization of machine learning prediction queries. In SIGMOD. 587-601.","journal-title":"SIGMOD."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.14778\/3659437.3659441"},{"key":"e_1_2_1_45_1","volume-title":"a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv abs\/1910.01108","author":"Sanh Victor","year":"2019","unstructured":"Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. 
ArXiv abs\/1910.01108 (2019)."},{"key":"e_1_2_1_46_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In ICLR."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D13-1170"},{"key":"e_1_2_1_48_1","first-page":"1","article-title":"Vexless: a serverless vector data management system using cloud functions","volume":"2","author":"Su Yongye","year":"2024","unstructured":"Yongye Su, Yinqi Sun, Minjia Zhang, and Jianguo Wang. 2024. Vexless: a serverless vector data management system using cloud functions. SIGMOD 2, 3 (2024), 1-26.","journal-title":"SIGMOD"},{"key":"e_1_2_1_49_1","first-page":"2818","article-title":"Rethinking the inception architecture for computer vision","author":"Szegedy Christian","year":"2016","unstructured":"Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In CVPR. 2818-2826.","journal-title":"CVPR."},{"key":"e_1_2_1_50_1","unstructured":"Z Tang H Fang S Zhou et al. 2024. AutoGluon-Multimodal (AutoMM): Supercharging multimodal AutoML with foundation models. PMLR 256 (2024)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.3233\/FAIA230558"},{"key":"e_1_2_1_52_1","unstructured":"Yi Wang Yang Yang Weiguo Zhu Yi Wu Xu Yan Yongfeng Liu Yu Wang Liang Xie Ziyao Gao Wenjing Zhu et al. 2020. SQLFlow: A Bridge between SQL and Machine Learning. CoRR (2020)."},{"key":"e_1_2_1_53_1","first-page":"641","article-title":"C-pack: Packed resources for general chinese embeddings","author":"Xiao Shitao","year":"2024","unstructured":"Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff, Defu Lian, and Jian-Yun Nie. 2024. C-pack: Packed resources for general chinese embeddings. In SIGIR. 
641-649.","journal-title":"SIGIR."},{"key":"e_1_2_1_54_1","volume-title":"Jeffrey Xu Yu, and Yingfan Liu","author":"Xie Jiadong","year":"2025","unstructured":"Jiadong Xie, Jeffrey Xu Yu, and Yingfan Liu. 2025. Fast Approximate Similarity Join in Vector Databases. SIGMOD (2025)."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.14778\/3641204.3641212"},{"key":"e_1_2_1_56_1","first-page":"1","article-title":"AquaPipe: A Quality-Aware Pipeline for Knowledge Retrieval and Large Language Models","volume":"3","author":"Yu Runjie","year":"2025","unstructured":"Runjie Yu, Weizhou Huang, Shuhan Bai, Jian Zhou, and Fei Wu. 2025. AquaPipe: A Quality-Aware Pipeline for Knowledge Retrieval and Large Language Models. SIGMOD 3, 1 (2025), 1-26.","journal-title":"SIGMOD"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01575"},{"key":"e_1_2_1_58_1","volume-title":"Beng Chin Ooi, and et al","author":"Zhao Zhanhao","year":"2025","unstructured":"Zhanhao Zhao, Shaofeng Cai, Haotian Gao, Hexiang Pan, Siqi Xiang, Naili Xing, Gang Chen, Beng Chin Ooi, and et al. 2025. NeurDB: On the Design and Implementation of an AI-powered Autonomous Database. 
In CIDR."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2020.3004555"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3769844","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T04:42:36Z","timestamp":1781325756000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3769844"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,4]]},"references-count":59,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12,4]]}},"alternative-id":["10.1145\/3769844"],"URL":"https:\/\/doi.org\/10.1145\/3769844","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,4]]}}}