{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,5]],"date-time":"2025-04-05T04:07:48Z","timestamp":1743826068903,"version":"3.40.3"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T00:00:00Z","timestamp":1731024000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T00:00:00Z","timestamp":1731024000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006752","name":"Universidade do Porto","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006752","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Intell Inf Syst"],"published-print":{"date-parts":[[2025,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The availability of suitable datasets and data generators is crucial for developing intelligent systems, especially in helpdesk services. However, the lack of publicly accessible data generators focused on helpdesk operations, where incidents are often closed without detailing the treatment procedures, poses challenges to implementing intelligent systems such as recommender systems. To address this issue, a dataset generator can be employed to simulate helpdesk incidents. This paper introduces SNOOKER (dataSet geNeratOr fOr helpdesK sERvices), a customizable dataset generator designed to create and treat helpdesk tickets, including domain-specific incidents (e.g., cybersecurity) by orchestrating simulated actions and multiple IT teams. SNOOKER\u2019s output is compared against a real anonymized dataset from S21Sec Cyber Solutions by Thales. The datasets are evaluated using Kolmogorov-Smirnov, Kullback-Leibler Divergence, and Hellinger distance tests, with results indicating similar distributions. For example, the first metric returned a low K-S value and a p-value exceeding 5%, while the second and third measures presented 0.003 and 0.03, respectively. Furthermore, experiments with different team configurations revealed that ticket scheduling highly depends on each team\u2019s operators\u2019 numbers and work shifts, increasing with unbalanced shifts and fewer operators.<\/jats:p>","DOI":"10.1007\/s10844-024-00905-5","type":"journal-article","created":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T11:27:08Z","timestamp":1731065228000},"page":"593-615","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["SNOOKER: a dataset generator for helpdesk services"],"prefix":"10.1007","volume":"63","author":[{"given":"Leonardo","family":"Ferreira","sequence":"first","affiliation":[]},{"given":"Daniel Castro","family":"Silva","sequence":"additional","affiliation":[]},{"given":"Mikel","family":"Uriarte-Itzazelaia","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,11,8]]},"reference":[{"doi-asserted-by":"publisher","unstructured":"Assefa, SA., Dervovic, D., Mahfouz, M., et\u00a0al. (2021). Generating synthetic data in finance: opportunities, challenges and pitfalls. In: Proceedings of the First ACM International Conference on AI in Finance. Association for Computing Machinery, New York, USA, ICAIF \u201920, https:\/\/doi.org\/10.1145\/3383455.3422554","key":"905_CR1","DOI":"10.1145\/3383455.3422554"},{"doi-asserted-by":"publisher","unstructured":"Ayala-Rivera, V., Mcdonagh, P., Cerqueus, T., et\u00a0al. (2013). Synthetic data generation using benerator tool. https:\/\/doi.org\/10.48550\/arXiv.1311.3312","key":"905_CR2","DOI":"10.48550\/arXiv.1311.3312"},{"doi-asserted-by":"publisher","unstructured":"Ayala-Rivera, V., Portillo-Dominguez, AO., Murphy, L., et\u00a0al. (2016). COCOA: A synthetic data generator for testing anonymization techniques. In: Domingo-Ferrer J, Peji\u0107-Bach M (eds) Privacy in Statistical Databases. Springer International Publishing, Cham, pp 163\u2013177, https:\/\/doi.org\/10.1007\/978-3-319-45381-1_13","key":"905_CR3","DOI":"10.1007\/978-3-319-45381-1_13"},{"unstructured":"Bhandari, N. (2018). Procedural synthetic data for self-driving cars using 3D graphics. PhD thesis, Massachusetts Institute of Technology, Massachusetts, US","key":"905_CR4"},{"doi-asserted-by":"publisher","unstructured":"Campos, S., Silva, DC. (2022). Aerial fire image synthesis and detection. In: Rocha AP, Steels L, van\u00a0den Herik HJ (eds) Proceedings of the 14th International Conference on Agents and Artificial Intelligence, ICAART, INSTICC, 3-5, vol\u00a02. SCITEPRESS, Setubal, Portugal, pp 273\u2013284, https:\/\/doi.org\/10.5220\/0010829400003116","key":"905_CR5","DOI":"10.5220\/0010829400003116"},{"doi-asserted-by":"publisher","unstructured":"Chen, X., Mishra, N., Rohaninejad, M., et\u00a0al. (2018). PixelSNAIL: An improved autoregressive generative model. In: Proceedings of the 35th International Conference on Machine Learning, pp 864\u2013872, https:\/\/doi.org\/10.48550\/arXiv.1712.09763","key":"905_CR6","DOI":"10.48550\/arXiv.1712.09763"},{"key":"905_CR7","doi-asserted-by":"publisher","first-page":"1181","DOI":"10.3390\/s19051181","volume":"19","author":"J Dahmen","year":"2019","unstructured":"Dahmen, J., & Cook, D. (2019). SynSys: A synthetic data generation system for healthcare applications. Sensors, 19, 1181. https:\/\/doi.org\/10.3390\/s19051181","journal-title":"Sensors"},{"doi-asserted-by":"publisher","unstructured":"Dandekar, A., Zen, RAM., Bressan, S. (2018). A comparative study of synthetic dataset generation techniques. In: Hartmann S, Ma H, Hameurlain A, et\u00a0al (eds) Database and Expert Systems Applications. Springer International Publishing, Cham, pp 387\u2013395, https:\/\/doi.org\/10.1007\/978-3-319-98812-2_35","key":"905_CR8","DOI":"10.1007\/978-3-319-98812-2_35"},{"doi-asserted-by":"publisher","unstructured":"Dankar, FK., Ibrahim, M. (2021). Fake It Till You Make It: Guidelines for effective synthetic data generation. Applied Sciences 11(5). https:\/\/doi.org\/10.3390\/app11052158","key":"905_CR9","DOI":"10.3390\/app11052158"},{"key":"905_CR10","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2022.3144765","author":"FK Dankar","year":"2022","unstructured":"Dankar, F. K., Ibrahim, M. K., & Ismail, L. (2022). A multi-dimensional evaluation of synthetic data generators. IEEE Access. https:\/\/doi.org\/10.1109\/ACCESS.2022.3144765","journal-title":"IEEE Access"},{"doi-asserted-by":"publisher","unstructured":"del Carmen Rodr\u00edguez-Hern\u00e1ndez, M., Ilarri, S., Hermoso, R., et\u00a0al. (2017). DataGenCARS: A generator of synthetic data for the evaluation of context-aware recommendation systems. Pervasive and Mobile Computing 38:516\u2013541. Special Issue IEEE International Conference on Pervasive Computing and Communications (PerCom) 2016 https:\/\/doi.org\/10.1016\/j.pmcj.2016.09.020","key":"905_CR11","DOI":"10.1016\/j.pmcj.2016.09.020"},{"doi-asserted-by":"publisher","unstructured":"Drechsler, J., Reiter, JP. (2008). Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data. In: International conference on privacy in statistical databases, Springer, pp 227\u2013238, https:\/\/doi.org\/10.1007\/978-3-540-87471-3_19","key":"905_CR12","DOI":"10.1007\/978-3-540-87471-3_19"},{"issue":"3","key":"905_CR13","first-page":"105","volume":"1","author":"J Drechsler","year":"2008","unstructured":"Drechsler, J., Bender, S., & R\u00e4ssler, S. (2008). Comparing fully and partially synthetic datasets for statistical disclosure control in the German IAB establishment panel. Trans Data Privacy, 1(3), 105\u2013130.","journal-title":"Trans Data Privacy"},{"unstructured":"European Union Agency for Cybersecurity (ENISA) (2021a) Addressing Skills Shortage and Gap Through Higher Education. Tech. rep., report: https:\/\/www.enisa.europa.eu\/publications\/addressing-skills-shortage-and-gap-through-higher-education","key":"905_CR14"},{"unstructured":"European Union Agency for Cybersecurity (ENISA) (2021b) Threat Landscape 2021. Tech. rep., report: https:\/\/www.enisa.europa.eu\/publications\/enisa-threat-landscape-2021","key":"905_CR15"},{"unstructured":"Garcia\u00a0Torres, D. (2018). Generation of synthetic data with generative adversarial networks. PhD thesis, Royal Institute of Technology, Stockholm, Sweden","key":"905_CR16"},{"key":"905_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12874-020-00977-1","volume":"20","author":"A Goncalves","year":"2020","unstructured":"Goncalves, A., Ray, P., Soper, B., et al. (2020). Generation and evaluation of synthetic patient data. BMC Medical Res Method, 20, 1\u201340. https:\/\/doi.org\/10.1186\/s12874-020-00977-1","journal-title":"BMC Medical Res Method"},{"doi-asserted-by":"publisher","unstructured":"Gonik, J., Le, J., Viswanathan, A., et\u00a0al. (2020). CyberGAN: Generating high-fidelity cybersecurity data with generative adversarial networks. https:\/\/doi.org\/10.2514\/6.2020-4117","key":"905_CR18","DOI":"10.2514\/6.2020-4117"},{"doi-asserted-by":"publisher","unstructured":"Goralski, W. (2017). Chapter 11 - User Datagram Protocol. In: Goralski W (ed) The Illustrated Network (Second Edition), second edition edn. Morgan Kaufmann, Boston, p 289\u2013306, https:\/\/doi.org\/10.1016\/B978-0-12-811027-0.00011-4","key":"905_CR19","DOI":"10.1016\/B978-0-12-811027-0.00011-4"},{"unstructured":"GRIDLEX (2023) Round Robin Ticket Assignment vs. Other Ticket Distribution Methods: Which is Best? Available at https:\/\/gridlex.com\/a\/round-robin-ticket-assignment-vs-other-ticket-distribution-methods-st2582 (accessed at September 13th 2024)","key":"905_CR20"},{"unstructured":"GRIDLEX (2024a) Mastering Ticket Prioritization: How to Effectively Manage Your Helpdesk Queue. Available at https:\/\/gridlex.com\/a\/mastering-ticket-prioritization-st339\/ (accessed at September 13th 2024)","key":"905_CR21"},{"unstructured":"GRIDLEX (2024b) Ticket Escalation Best Practices: When and How to Escalate Helpdesk Issues. Available at https:\/\/gridlex.com\/a\/ticket-escalation-best-practices-st344 (accessed at September 13th 2024)","key":"905_CR22"},{"doi-asserted-by":"publisher","unstructured":"Gulrajani, I., Ahmed, F., Arjovsky, M., et\u00a0al. (2017). Improved training of wasserstein GANs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, USA, NIPS\u201917, p 5769-5779, https:\/\/doi.org\/10.48550\/arXiv.1704.00028","key":"905_CR23","DOI":"10.48550\/arXiv.1704.00028"},{"issue":"1","key":"905_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s44163-021-00016-y","volume":"1","author":"S James","year":"2021","unstructured":"James, S., Harbron, C., Branson, J., et al. (2021). Synthetic data use: exploring use cases to optimise data utility. Discover Art Intell, 1(1), 1\u201313. https:\/\/doi.org\/10.1007\/s44163-021-00016-y","journal-title":"Discover Art Intell"},{"doi-asserted-by":"publisher","unstructured":"Lin, Z., Jain, A., Wang, C., et\u00a0al. (2020). Using GANs for sharing networked time series data: challenges, initial promise, and open questions. In: Proceedings of the ACM Internet Measurement Conference. Association for Computing Machinery, New York, NY, USA, IMC \u201920, p 464-483, https:\/\/doi.org\/10.1145\/3419394.3423643","key":"905_CR25","DOI":"10.1145\/3419394.3423643"},{"doi-asserted-by":"publisher","unstructured":"Mannino, M., Abouzied, A. (2020). Synner: Generating realistic synthetic data. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, New York, NY, USA, SIGMOD \u201920, p 2749-2752, https:\/\/doi.org\/10.1145\/3318464.3384696","key":"905_CR26","DOI":"10.1145\/3318464.3384696"},{"doi-asserted-by":"publisher","unstructured":"Miok, K., Nguyen-Doan, D., Zaharie, D., et\u00a0al. (2019). Generating data using monte carlo dropout. In: 2019 IEEE 15th International Conference on Intelligent Computer Communication and Processing (ICCP), IEEE, pp 509\u2013515, https:\/\/doi.org\/10.1109\/ICCP48234.2019.8959787","key":"905_CR27","DOI":"10.1109\/ICCP48234.2019.8959787"},{"doi-asserted-by":"publisher","unstructured":"Mohamed, N., Al-Jaroodi, J. (2014). Real-time big data analytics: applications and challenges. In: 2014 international conference on high performance computing & simulation (HPCS), IEEE, pp 305\u2013310, https:\/\/doi.org\/10.1109\/HPCSim.2014.6903700","key":"905_CR28","DOI":"10.1109\/HPCSim.2014.6903700"},{"unstructured":"Nowok, B. (2015). Utility of synthetic microdata generated using tree-based methods. UNECE Statistical Data Confidentiality Work Session","key":"905_CR29"},{"issue":"11","key":"905_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v074.i11","volume":"74","author":"B Nowok","year":"2016","unstructured":"Nowok, B., Raab, G. M., & Dibben, C. (2016). Synthpop: Bespoke creation of synthetic data in R. J Stat Softw, 74(11), 1\u201326. https:\/\/doi.org\/10.18637\/jss.v074.i11","journal-title":"J Stat Softw"},{"unstructured":"Orlans, N., Buettner, D., Marques, J. (2004). A survey of synthetic biometrics: capabilities and benefits. In: IC-AI, pp 499\u2013505","key":"905_CR31"},{"key":"905_CR32","doi-asserted-by":"publisher","first-page":"64","DOI":"10.4018\/jaci.2011040105","volume":"3","author":"S O\u2019Shaughnessy","year":"2011","unstructured":"O\u2019Shaughnessy, S., & Gray, G. (2011). Development and evaluation of a dataset generator tool for generating synthetic log files containing computer attack signatures. IJACI, 3, 64\u201376. https:\/\/doi.org\/10.4018\/jaci.2011040105","journal-title":"IJACI"},{"doi-asserted-by":"publisher","unstructured":"Patki, N., Wedge, R., Veeramachaneni, K. (2016). The synthetic data vault. In: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp 399\u2013410, https:\/\/doi.org\/10.1109\/DSAA.2016.49","key":"905_CR33","DOI":"10.1109\/DSAA.2016.49"},{"doi-asserted-by":"publisher","unstructured":"Ping, H., Stoyanovich, J., Howe, B. (2017). DataSynthesizer: Privacy-preserving synthetic datasets. In: Proceedings of the 29th International Conference on Scientific and Statistical Database Management. Association for Computing Machinery, New York, NY, USA, SSDBM \u201917, https:\/\/doi.org\/10.1145\/3085504.3091117","key":"905_CR34","DOI":"10.1145\/3085504.3091117"},{"doi-asserted-by":"publisher","unstructured":"Popi\u0107, S., Pavkovi\u0107, B., Veliki\u0107, I., et\u00a0al. (2019). Data generators: a short survey of techniques and use cases with focus on testing. In: 2019 IEEE 9th International Conference on Consumer Electronics (ICCE-Berlin), pp 189\u2013194, https:\/\/doi.org\/10.1109\/ICCE-Berlin47944.2019.8966202","key":"905_CR35","DOI":"10.1109\/ICCE-Berlin47944.2019.8966202"},{"issue":"1","key":"905_CR36","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1146\/annurev-statistics-040720-031848","volume":"8","author":"TE Raghunathan","year":"2021","unstructured":"Raghunathan, T. E. (2021). Synthetic data. Annual Rev Stat Its Appl, 8(1), 129\u2013140. https:\/\/doi.org\/10.1146\/annurev-statistics-040720-031848","journal-title":"Annual Rev Stat Its Appl"},{"doi-asserted-by":"publisher","unstructured":"Slokom, M. (2018). Comparing recommender systems using synthetic data. In: Proceedings of the 12th ACM Conference on Recommender Systems. Association for Computing Machinery, New York, NY, USA, RecSys \u201918, p 548-552, https:\/\/doi.org\/10.1145\/3240323.3240325","key":"905_CR37","DOI":"10.1145\/3240323.3240325"},{"unstructured":"Surendra, H., & Mohan, H S,. (2017). A review of synthetic data generation methods for privacy preserving data publishing. Int J Sci & Technol Res, 6, 95\u2013101.","key":"905_CR38"},{"unstructured":"Tole, AA. (2013). Big data challenges. Database Systems Journal 4(3):31\u201340. https:\/\/ideas.repec.org\/a\/aes\/dbjour\/v4y2013i3p31-40.html","key":"905_CR39"},{"doi-asserted-by":"publisher","unstructured":"Wan, Z., Zhang, Y., He, H. (2017). Variational autoencoder based synthetic data generation for imbalanced learning. In: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pp 1\u20137, https:\/\/doi.org\/10.1109\/SSCI.2017.8285168","key":"905_CR40","DOI":"10.1109\/SSCI.2017.8285168"},{"doi-asserted-by":"publisher","unstructured":"Wang, Z., Myles, P., Tucker, A. (2019). Generating and evaluating synthetic UK primary care data: preserving data utility & patient privacy. In: 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), pp 126\u2013131, https:\/\/doi.org\/10.1109\/CBMS.2019.00036","key":"905_CR41","DOI":"10.1109\/CBMS.2019.00036"},{"doi-asserted-by":"publisher","unstructured":"Xu, L., Skoularidou, M., Cuesta-Infante, A., et\u00a0al. (2019). Modeling tabular data using conditional GAN, Curran Associates Inc., Red Hook, NY, USA. https:\/\/doi.org\/10.48550\/arXiv.1907.00503","key":"905_CR42","DOI":"10.48550\/arXiv.1907.00503"}],"container-title":["Journal of Intelligent Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10844-024-00905-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10844-024-00905-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10844-024-00905-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,4]],"date-time":"2025-04-04T10:44:47Z","timestamp":1743763487000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10844-024-00905-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,8]]},"references-count":42,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,4]]}},"alternative-id":["905"],"URL":"https:\/\/doi.org\/10.1007\/s10844-024-00905-5","relation":{},"ISSN":["0925-9902","1573-7675"],"issn-type":[{"type":"print","value":"0925-9902"},{"type":"electronic","value":"1573-7675"}],"subject":[],"published":{"date-parts":[[2024,11,8]]},"assertion":[{"value":"22 March 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 October 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 October 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 November 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing Interests"}},{"value":"Not Applicable","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics Approval"}}]}}