{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T14:12:49Z","timestamp":1762956769534,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2022,1,1]],"date-time":"2022-01-01T00:00:00Z","timestamp":1640995200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61862060"],"award-info":[{"award-number":["61862060"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>With database management systems becoming complex, predicting the execution time of graph queries before they are executed is one of the challenges for query scheduling, workload management, resource allocation, and progress monitoring. Through the comparison of query performance prediction methods, existing research works have solved such problems in traditional SQL queries, but they cannot be directly applied in Cypher queries on the Neo4j database. Additionally, most query performance prediction methods focus on measuring the relationship between correlation coefficients and retrieval performance. Inspired by machine-learning methods and graph query optimization technologies, we used the RBF neural network as a prediction model to train and predict the execution time of Cypher queries. Meanwhile, the corresponding query pattern features, graph data features, and query plan features were fused together and then used to train our prediction models. Furthermore, we also deployed a monitor node and designed a Cypher query benchmark for the database clusters to obtain the query plan information and native data store. 
The experimental results of four benchmarks showed that the average mean relative error of the RBF model reached 16.5% in the Northwind dataset, 12% in the FIFA2021 dataset, and 16.25% in the CORD-19 dataset. This experiment proves the effectiveness of our proposed approach on three real-world datasets.<\/jats:p>","DOI":"10.3390\/sym14010055","type":"journal-article","created":{"date-parts":[[2022,1,9]],"date-time":"2022-01-09T23:35:09Z","timestamp":1641771309000},"page":"55","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach"],"prefix":"10.3390","volume":"14","author":[{"given":"Zhenzhen","family":"He","sequence":"first","affiliation":[{"name":"School of Information Science and Engineering, Xinjiang University, Urumqi 830049, China"}]},{"given":"Jiong","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, Xinjiang University, Urumqi 830049, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5254-1528","authenticated-orcid":false,"given":"Binglei","family":"Guo","sequence":"additional","affiliation":[{"name":"School of Computer Engineering, Hubei University of Arts and Science, Xiangyang 441053, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Drakopoulos, G., Kanavos, A., and Tsakalidis, A.K. (2016, January 23\u201325). Evaluating Twitter Influence Ranking with System Theory. 
Proceedings of the 12th International Conference on Web Information Systems and Technologies (WeBIST), Rome, Italy.","DOI":"10.5220\/0005811701130120"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13040-016-0102-8","article-title":"Representing and querying disease networks using graph databases","volume":"9","author":"Lysenko","year":"2016","journal-title":"BioData Min."},{"key":"ref_3","unstructured":"Guirguis, S., Sharaf, M.A., Chrysanthis, P.K., Labrinidis, A., and Pruhs, K. (April, January 29). Adaptive scheduling of web transactions. Proceedings of the IEEE 25th International Conference on Data Engineering, Shanghai, China."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"489","DOI":"10.3233\/SW-160218","article-title":"Knowledge graph refinement: A survey of approaches and evaluation methods","volume":"8","author":"Paulheim","year":"2017","journal-title":"Semant. Web"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3401027","article-title":"Efficient Authorization of Graph-database Queries in an Attribute-supporting ReBAC Model","volume":"23","author":"Rizvi","year":"2020","journal-title":"ACM Trans. Priv. Secur. (TOPS)"},{"key":"ref_6","first-page":"48","article-title":"A Survey on Graph Queries Processing: Techniques and Methods","volume":"9","author":"Dinari","year":"2017","journal-title":"Int. J. Comput. Netw. Inf. Secur."},{"key":"ref_7","unstructured":"Scabora, L.C., Spadon, G., Oliveira, P.H., Rodrigues, J.F., and Traina, C. (April, January 30). Enhancing recursive graph querying on RDBMS with data clustering approaches. Proceedings of the 35th Annual ACM Symposium on Applied Computing, Brno, Czech Republic."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Hauff, C., and Azzopardi, L. (2009, January 19\u201323). When is query performance prediction effective?. 
Proceedings of the 32nd international ACM SIGIR conference on Research and Development in Information Retrieval\u2014SIGIR, Boston, MA, USA.","DOI":"10.1145\/1571941.1572150"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zendel, O., Shtok, A., and Raiber, F. (2019, January 21\u201325). Information needs, queries, and query performance prediction. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France.","DOI":"10.1145\/3331184.3331253"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Holzschuher, F., and Peinl, R. (2013, January 18\u201322). Performance of graph query languages: Comparison of cypher, gremlin and native access in neo4j. Proceedings of the Joint EDBT\/ICDT 2013 Workshops, Genoa, Italy.","DOI":"10.1145\/2457317.2457351"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Li, J., Ma, X., and Singh, K. (2009, January 26\u201328). Machine learning based online performance prediction for runtime parallelization and task scheduling. Proceedings of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software, Boston, MA, USA.","DOI":"10.1109\/ISPASS.2009.4919641"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Macdonald, C., Tonellotto, N., and Ounis, I. (2012, January 12\u201316). Learning to predict response times for online query scheduling. Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA.","DOI":"10.1145\/2348283.2348367"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.is.2018.04.005","article-title":"Performance prediction and adaptation for database management system workload using Case-Based Reasoning approach","volume":"76","author":"Raza","year":"2018","journal-title":"Inf. 
Syst."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Duggan, J., Cetintemel, U., and Papaemmanouil, O. (2011, January 12\u201316). Performance prediction for concurrent database workloads. Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, Athens, Greece.","DOI":"10.1145\/1989323.1989359"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"198","DOI":"10.7763\/IJCTE.2012.V4.450","article-title":"Self-prediction of performance metrics for the database management system workload","volume":"4","author":"Raza","year":"2012","journal-title":"Int. J. Comput. Theory Eng."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Li, J., K\u00f6nig, A.C., Narasayya, V., and Chaudhuri, S. (2012). Robust estimation of resource consumption for SQL queries using statistical techniques. arXiv.","DOI":"10.14778\/2350229.2350269"},{"key":"ref_17","unstructured":"Duggan, J., Papaemmanouil, O., Cetintemel, U., and Upfal, E. (2014, January 24\u201328). Contender: A Resource Modeling Approach for Concurrent Query Performance Prediction. Proceedings of the Extending Database Technology, Athens, Greece."},{"key":"ref_18","unstructured":"Murugesan, M., Shen, J., and Qi, Y. (2020). Resource Estimation for Queries in Large-Scale Distributed Database System. (10,762,539), U.S. Patent."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1108\/02635571211193617","article-title":"Periodic performance prediction for real-time business process monitoring","volume":"112","author":"Kang","year":"2012","journal-title":"Ind. Manag. Data Syst."},{"key":"ref_20","unstructured":"Zhao, P., and Han, J. (2010, January 13\u201317). On graph query optimization in large networks. Proceedings of the VLDB Endowment, Singapore."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Das, S., Goyal, A., and Chakravarthy, S. (2016, January 6\u20138). 
Plan before you execute: A cost-based query optimizer for attributed graph databases. Proceedings of the International Conference on Big Data Analytics and Knowledge Discovery, Porto, Portugal.","DOI":"10.1007\/978-3-319-43946-4_21"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Namaki, M.H., Sasani, K., and Wu, Y. (2017, January 19). Performance prediction for graph queries. Proceedings of the 2nd International Workshop on Network Data Analytics, Chicago, IL, USA.","DOI":"10.1145\/3068943.3068947"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Sasani, K., Namaki, M.H., and Wu, Y. (2018, January 21\u201324). Multi-metric graph query performance prediction. Proceedings of the International Conference on Database Systems for Advanced Applications, Gold Coast, QLD, Australia.","DOI":"10.1007\/978-3-319-91452-7_19"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1016\/j.is.2005.11.003","article-title":"Query performance prediction","volume":"31","author":"He","year":"2006","journal-title":"Inf. Syst."},{"key":"ref_25","unstructured":"Wu, W., Chi, Y., Zhu, S., Tatemura, J., Hacig\u00fcm\u00fcs, H., and Naughton, J.F. (2013, January 8\u201312). Predicting query execution time: Are optimizer cost models really unusable?. Proceedings of the 2013 IEEE 29th International Conference on Data Engineering (ICDE), Brisbane, QLD, Australia."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"925","DOI":"10.14778\/2536206.2536219","article-title":"Towards predicting query execution time for concurrent and dynamic database workloads","volume":"6","author":"Wu","year":"2013","journal-title":"Proc. VLDB Endow."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hasan, R., and Gandon, F. (2014, January 11\u201314). A Machine Learning Approach to SPARQL Query Performance Prediction. 
Proceedings of the International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) IEEE, Washington, DC, USA.","DOI":"10.1109\/WI-IAT.2014.43"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhang, W.E., Sheng, Q.Z., and Taylor, K. (2016, January 8\u201310). Learning-based SPARQL query performance prediction. Proceedings of the International Conference on Web Information Systems Engineering, Shanghai, China.","DOI":"10.1007\/978-3-319-48740-3_23"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1007\/s11280-017-0498-1","article-title":"Learning-based SPARQL query performance modeling and prediction","volume":"21","author":"Zhang","year":"2018","journal-title":"World Wide Web"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Marcus, R., and Papaemmanouil, O. (2019). Plan-structured deep neural network models for query performance prediction. arXiv.","DOI":"10.14778\/3342263.3342646"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1416","DOI":"10.14778\/3397230.3397238","article-title":"Query performance prediction for concurrent queries using graph embedding","volume":"13","author":"Zhou","year":"2020","journal-title":"Proc. VLDB Endow."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Namaki, M.H., Chowdhury, F.A., Islam, M., Doppa, J., and Wu, Y. (2018). Learning to Speed Up Query Planning in Graph Databases. arXiv.","DOI":"10.1609\/icaps.v27i1.13849"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Izs\u00f3, B., Szatm\u00e1ri, Z., and Bergmann, G. (2013, January 11\u201315). Towards precise metrics for predicting graph query performance. 
Proceedings of the 28th IEEE\/ACM International Conference on Automated Software Engineering (ASE), Silicon Valley, CA, USA.","DOI":"10.1109\/ASE.2013.6693100"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"534","DOI":"10.1016\/j.future.2020.06.006","article-title":"A novel deep learning method for query task execution time prediction in graph database","volume":"112","author":"Chu","year":"2020","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1109\/TSMCB.2003.810911","article-title":"Data dimensionality reduction with application to simplifying RBF network structure and improving classification performance","volume":"33","author":"Fu","year":"2003","journal-title":"IEEE Trans. Syst. Man Cybern. Part B (Cybern.)"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1007\/s11063-009-9095-3","article-title":"Ml-rbf: Rbf neural networks for multi-label learning","volume":"29","author":"Zhang","year":"2009","journal-title":"Neural Process. Lett."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2000","DOI":"10.1109\/TII.2017.2682855","article-title":"Research on traffic flow prediction in the big data environment based on the improved RBF neural network","volume":"13","author":"Chen","year":"2017","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_38","unstructured":"Broomhead, D.S., and Lowe, D. (1988). Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks, Royal Signals and Radar Establishment."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Ganapathi, A., Kuno, H., Dayal, U., Wiener, J.L., Fox, A., Jordan, M., and Patterson, D. (April, January 29). Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning. 
Proceedings of the 2009 IEEE 25th International Conference on Data Engineering, Shanghai, China.","DOI":"10.1109\/ICDE.2009.130"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1002\/widm.8","article-title":"Classification and regression trees","volume":"1","author":"Loh","year":"2011","journal-title":"Wiley Interdiscip. Rev. Data Min. Knowl. Discov."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/BF00153759","article-title":"Instance-based learning algorithms","volume":"6","author":"Aha","year":"1991","journal-title":"Mach. Learn."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1109\/72.870050","article-title":"Improvements to the SMO algorithm for SVM regression","volume":"11","author":"Shevade","year":"2000","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Negi, P., Marcus, R., and Mao, H. (2020, January 20\u201324). Cost-Guided Cardinality Estimation: Focus Where it Matters. 
Proceedings of the 2020 IEEE 36th International Conference on Data Engineering Workshops (ICDEW), Dallas, TX, USA.","DOI":"10.1109\/ICDEW49219.2020.00034"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/1\/55\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:26:38Z","timestamp":1760361998000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/1\/55"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,1]]},"references-count":44,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1]]}},"alternative-id":["sym14010055"],"URL":"https:\/\/doi.org\/10.3390\/sym14010055","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2022,1,1]]}}}