{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T20:02:19Z","timestamp":1775246539838,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T00:00:00Z","timestamp":1752105600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Modern database systems require autonomous CPU scheduling frameworks that dynamically optimize resource allocation across heterogeneous workloads while maintaining strict performance guarantees. We present a novel hierarchical deep reinforcement learning framework augmented with graph neural networks to address CPU scheduling challenges in mixed database environments comprising Online Transaction Processing (OLTP), Online Analytical Processing (OLAP), vector processing, and background maintenance workloads. Our approach introduces three key innovations: first, a symmetric two-tier control architecture where a meta-controller allocates CPU budgets across workload categories using policy gradient methods while specialized sub-controllers optimize process-level resource allocation through continuous action spaces; second, graph neural network-based dependency modeling that captures complex inter-process relationships and communication patterns while preserving inherent symmetries in database architectures; and third, meta-learning integration with curiosity-driven exploration enabling rapid adaptation to previously unseen workload patterns without extensive retraining. The framework incorporates a multi-objective reward function balancing Service Level Objective (SLO) adherence, resource efficiency, symmetric fairness metrics, and system stability. 
Experimental evaluation through high-fidelity digital twin simulation and production deployment demonstrates substantial performance improvements: 43.5% reduction in p99 latency violations for OLTP workloads and 27.6% improvement in overall CPU utilization, with successful scaling to 10,000 concurrent processes maintaining sub-3% scheduling overhead. This work represents a significant advancement toward truly autonomous database resource management, establishing a foundation for next-generation self-optimizing database systems with implications extending to broader orchestration challenges in cloud-native architectures.<\/jats:p>","DOI":"10.3390\/sym17071109","type":"journal-article","created":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T09:38:27Z","timestamp":1752140307000},"page":"1109","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Self-Adapting CPU Scheduling for Mixed Database Workloads via Hierarchical Deep Reinforcement Learning"],"prefix":"10.3390","volume":"17","author":[{"given":"Suchuan","family":"Xing","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA"}]},{"given":"Yihan","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Engineering and Applied Science, The University of Pennsylvania, Philadelphia, PA 19104, USA"}]},{"given":"Wenhe","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA"}]}],"member":"1968","published-online":{"date-parts":[[2025,7,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"200901","DOI":"10.1007\/s11432-024-4125-9","article-title":"NeurDB: An AI-powered autonomous data system","volume":"67","author":"Ooi","year":"2024","journal-title":"Sci. China Inf. 
Sci."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"544","DOI":"10.14778\/3303753.3303760","article-title":"HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines","volume":"12","author":"Chrysogelos","year":"2019","journal-title":"Proc. VLDB Endow."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2491","DOI":"10.14778\/3551793.3551809","article-title":"Orchestrating data placement and query execution in heterogeneous CPU-GPU DBMS","volume":"15","author":"Yogatama","year":"2022","journal-title":"Proc. VLDB Endow."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Mao, H., Alizadeh, M., Menache, I., and Kandula, S. (2016, January 9\u201310). Resource management with deep reinforcement learning. Proceedings of the 15th ACM Workshop on Hot Topics in Networks, Atlanta, GA, USA.","DOI":"10.1145\/3005745.3005750"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Liu, N., Li, Z., Xu, J., Xu, Z., Lin, S., Qiu, Q., Tang, J., and Wang, Y. (2017, January 5\u20138). A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning. Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, GA, USA.","DOI":"10.1109\/ICDCS.2017.123"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Zhao, Z., Verma, G., Rao, C., Swami, A., and Segarra, S. (2021, January 6\u201311). Distributed scheduling using graph neural networks. Proceedings of the ICASSP 2021\u20132021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada.","DOI":"10.1109\/ICASSP39728.2021.9414098"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.14778\/3641204.3641219","article-title":"XGNN: Boosting multi-GPU GNN training via global GNN memory store","volume":"17","author":"Tang","year":"2024","journal-title":"Proc. 
VLDB Endow."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3453160","article-title":"Hierarchical reinforcement learning: A comprehensive survey","volume":"54","author":"Pateria","year":"2021","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_9","first-page":"20064","article-title":"Causality-driven hierarchical structure discovery for reinforcement learning","volume":"35","author":"Hu","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_10","unstructured":"Nagabandi, A., Clavera, I., Liu, S., Fearing, R.S., Abbeel, P., Levine, S., and Finn, C. (2018). Learning to adapt in dynamic, real-world environments through meta-reinforcement learning. arXiv."},{"key":"ref_11","unstructured":"Alet, F., Schneider, M.F., Lozano-Perez, T., and Kaelbling, L.P. (2020). Meta-learning curiosity algorithms. arXiv."},{"key":"ref_12","unstructured":"Jarrett, D., Tallec, C., Altch\u00e9, F., Mesnard, T., Munos, R., and Valko, M. (2022). Curiosity in hindsight: Intrinsic exploration in stochastic environments. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, J., Gao, H., Lv, T., and Lu, Y. (2018, January 15\u201318). Deep reinforcement learning based computation offloading and resource allocation for MEC. Proceedings of the 2018 IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain.","DOI":"10.1109\/WCNC.2018.8377343"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3163","DOI":"10.1109\/TVT.2019.2897134","article-title":"Deep reinforcement learning based resource allocation for V2V communications","volume":"68","author":"Ye","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Feng, Y., and Liu, F. (2022). Resource management in cloud computing using deep reinforcement learning: A survey. 
Proceedings of the China Aeronautical Science and Technology Youth Science Forum, Springer.","DOI":"10.1007\/978-981-19-7652-0_56"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1911","DOI":"10.1109\/TPDS.2021.3132422","article-title":"Adaptive and efficient resource allocation in cloud datacenters using actor-critic deep reinforcement learning","volume":"33","author":"Chen","year":"2021","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"10823","DOI":"10.1007\/s12652-020-02884-1","article-title":"Workflow scheduling based on deep reinforcement learning in the cloud environment","volume":"12","author":"Dong","year":"2021","journal-title":"J. Ambient. Intell. Humaniz. Comput."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, J., Liu, Y., Zhou, K., Li, G., Xiao, Z., Cheng, B., Xing, J., Wang, Y., Cheng, T., and Liu, L. (July, January 30). An end-to-end automatic cloud database tuning system using deep reinforcement learning. Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands.","DOI":"10.1145\/3299869.3300085"},{"key":"ref_19","unstructured":"Manczak, B., Viebahn, J., and van Hoof, H. (2023). Hierarchical reinforcement learning for power network topology control. arXiv."},{"key":"ref_20","first-page":"1409","article-title":"Hierarchical reinforcement learning with advantage-based auxiliary rewards","volume":"32","author":"Li","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_21","first-page":"3307","article-title":"Data-efficient hierarchical reinforcement learning","volume":"31","author":"Nachum","year":"2018","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhao, Z., Verma, G., Swami, A., and Segarra, S. (2022, January 23\u201327). Delay-oriented distributed scheduling using graph neural networks. 
Proceedings of the ICASSP 2022\u20132022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9746926"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wang, Q., Zhang, Y., Wang, H., Chen, C., Zhang, X., and Yu, G. (2022, January 12\u201317). Neutronstar: Distributed GNN training with hybrid dependency management. Proceedings of the 2022 International Conference on Management of Data, Philadelphia, PA, USA.","DOI":"10.1145\/3514221.3526134"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wang, L., Yin, Q., Tian, C., Yang, J., Chen, R., Yu, W., Yao, Z., and Zhou, J. (2021, January 26\u201318). FlexGraph: A flexible and efficient distributed framework for GNN training. Proceedings of the Sixteenth European Conference on Computer Systems, Online Event.","DOI":"10.1145\/3447786.3456229"},{"key":"ref_25","unstructured":"Song, Z., Antsaklis, P.J., and Lin, H. (2025). Graph Neural Network-Based Distributed Optimal Control for Linear Networked Systems: An Online Distributed Training Approach. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4241","DOI":"10.14778\/3685800.3685845","article-title":"Workload placement on heterogeneous CPU-GPU systems","volume":"17","author":"Simitsis","year":"2024","journal-title":"Proc. VLDB Endow."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Lyu, C., Fan, Q., Song, F., Sinha, A., Diao, Y., Chen, W., Ma, L., Feng, Y., Li, Y., and Zeng, K. (2022). Fine-grained modeling and optimization for intelligent resource management in big data processing. arXiv.","DOI":"10.14778\/3551793.3551855"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"El Danaoui, M., Yin, S., Hameurlain, A., and Morvan, F. (2024, January 15\u201317). Leveraging Workload Prediction for Query Optimization in Multi-Tenant Parallel DBMSs. 
Proceedings of the 2024 8th International Conference on Cloud and Big Data Computing, Oxford, UK.","DOI":"10.1145\/3694860.3694866"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"980","DOI":"10.14778\/3641204.3641209","article-title":"Pilotscope: Steering databases with machine learning drivers","volume":"17","author":"Zhu","year":"2024","journal-title":"Proc. VLDB Endow."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"539","DOI":"10.14778\/3632093.3632114","article-title":"An efficient transfer learning based configuration adviser for database tuning","volume":"17","author":"Zhang","year":"2023","journal-title":"Proc. VLDB Endow."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3373","DOI":"10.14778\/3681954.3682007","article-title":"The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions","volume":"17","author":"Zhang","year":"2024","journal-title":"Proc. VLDB Endow."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"3680","DOI":"10.14778\/3681954.3682030","article-title":"Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management Systems","volume":"17","author":"Lim","year":"2024","journal-title":"Proc. VLDB Endow."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Tian, Y., Cahoon, J., Krishnan, S., Agarwal, A., Alotaibi, R., Camacho-Rodr\u00edguez, J., Chundatt, B., Chung, A., and Dutta, N. (2023, January 18\u201323). Towards building autonomous data services on azure. Proceedings of the Companion of the 2023 International Conference on Management of Data, Seattle, WA, USA.","DOI":"10.1145\/3555041.3589674"},{"key":"ref_34","unstructured":"Beck, J., Vuorio, R., Liu, E.Z., Xiong, Z., Zintgraf, L., Finn, C., and Whiteson, S. (2023). A survey of meta-reinforcement learning. 
arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Aubret, A., Matignon, L., and Hassas, S. (2023). An information-theoretic perspective on intrinsic motivation in reinforcement learning: A survey. Entropy, 25.","DOI":"10.3390\/e25020327"},{"key":"ref_36","unstructured":"Yuan, M. (2022). Intrinsically-motivated reinforcement learning: A brief introduction. arXiv."},{"key":"ref_37","unstructured":"Raileanu, R., and Rockt\u00e4schel, T. (2020). Ride: Rewarding impact-driven exploration for procedurally-generated environments. arXiv."},{"key":"ref_38","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv."},{"key":"ref_39","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_40","unstructured":"Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2015). Prioritized experience replay. arXiv."},{"key":"ref_41","unstructured":"Kipf, T.N., and Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., and Dahl, G.E. (2020). Message passing neural networks. Machine Learning Meets Quantum Physics, Springer.","DOI":"10.1007\/978-3-030-40245-7_10"},{"key":"ref_43","unstructured":"Schulman, J., Moritz, P., Levine, S., Jordan, M., and Abbeel, P. (2015). High-dimensional continuous control using generalized advantage estimation. arXiv."},{"key":"ref_44","first-page":"4","article-title":"Completely fair scheduler","volume":"2009","author":"Pabla","year":"2009","journal-title":"Linux J."},{"key":"ref_45","unstructured":"Corporation, O. (2023). 
Managing Resources with Oracle Database Resource Manager. Oracle Database Administrator\u2019s Guide, 23c, Oracle Corporation."},{"key":"ref_46","unstructured":"Agrawal, S., and Goyal, N. (2013, January 16\u201321). Thompson sampling for contextual bandits with linear payoffs. Proceedings of the 30th International Conference on International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"2391","DOI":"10.1016\/j.jksuci.2022.03.016","article-title":"Intelligent multi-agent reinforcement learning model for resources allocation in cloud computing","volume":"34","author":"Belgacem","year":"2022","journal-title":"J. King Saud-Univ.-Comput. Inf. Sci."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"153432","DOI":"10.1109\/ACCESS.2019.2948150","article-title":"SCARL: Attentive reinforcement learning-based scheduling in a multi-resource heterogeneous cluster","volume":"7","author":"Cheong","year":"2019","journal-title":"IEEE Access"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/7\/1109\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:07:40Z","timestamp":1760033260000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/7\/1109"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,10]]},"references-count":48,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["sym17071109"],"URL":"https:\/\/doi.org\/10.3390\/sym17071109","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,10]]}}}