{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T21:35:14Z","timestamp":1757540114541,"version":"3.40.3"},"publisher-location":"Cham","reference-count":27,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031278143"},{"type":"electronic","value":"9783031278150"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,26]],"date-time":"2023-03-26T00:00:00Z","timestamp":1679788800000},"content-version":"vor","delay-in-days":84,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>A lot of recent literature on outcome-oriented predictive process monitoring focuses on using models from machine and deep learning. In this literature, it is assumed the outcome labels of the historical cases are all known. However, in some cases, the labelling of cases is incomplete or inaccurate. For instance, you might only observe negative customer feedback, fraudulent cases might remain unnoticed. These cases are typically present in the so-called positive and unlabelled (PU) setting, where your data set consists of a couple of positively labelled examples and examples which do not have a positive label, but might still be examples of a positive outcome. In this work, we show, using a selection of event logs from the literature, the negative impact of mislabelling cases as negative, more specifically when using XGBoost and LSTM neural networks. Furthermore, we show promising results on real-life datasets mitigating this effect, by changing the loss function used by a set of models during training to those of unbiased Positive-Unlabelled (uPU) or non-negative Positive-Unlabelled (nnPU) learning.<\/jats:p>","DOI":"10.1007\/978-3-031-27815-0_19","type":"book-chapter","created":{"date-parts":[[2023,3,25]],"date-time":"2023-03-25T10:03:04Z","timestamp":1679738584000},"page":"255-268","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Outcome-Oriented Predictive Process Monitoring on\u00a0Positive and\u00a0Unlabelled Event Logs"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4644-4881","authenticated-orcid":false,"given":"Jari","family":"Peeperkorn","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3254-2555","authenticated-orcid":false,"given":"Carlos","family":"Ortega V\u00e1zquez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6140-8788","authenticated-orcid":false,"given":"Alexander","family":"Stevens","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0389-0275","authenticated-orcid":false,"given":"Johannes","family":"De Smedt","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8781-3906","authenticated-orcid":false,"given":"Seppe","family":"vanden Broucke","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6151-0504","authenticated-orcid":false,"given":"Jochen","family":"De Weerdt","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,3,26]]},"reference":[{"issue":"4","key":"19_CR1","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1007\/s10994-020-05877-5","volume":"109","author":"J Bekker","year":"2020","unstructured":"Bekker, J., Davis, J.: Learning from positive and unlabeled data: a survey. Mach. Learn. 109(4), 719\u2013760 (2020). https:\/\/doi.org\/10.1007\/s10994-020-05877-5","journal-title":"Mach. Learn."},{"key":"19_CR2","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1007\/978-3-030-46147-8_5","volume-title":"Machine Learning and Knowledge Discovery in Databases","author":"J Bekker","year":"2020","unstructured":"Bekker, J., Robberechts, P., Davis, J.: Beyond the selected completely at random assumption for learning from positive and unlabeled data. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds.) ECML PKDD 2019. LNCS (LNAI), vol. 11907, pp. 71\u201385. Springer, Cham (2020). https:\/\/doi.org\/10.1007\/978-3-030-46147-8_5"},{"key":"19_CR3","doi-asserted-by":"publisher","unstructured":"Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016, pp. 785\u2013794. Association for Computing Machinery, New York (2016). https:\/\/doi.org\/10.1145\/2939672.2939785","DOI":"10.1145\/2939672.2939785"},{"issue":"6","key":"19_CR4","doi-asserted-by":"publisher","first-page":"896","DOI":"10.1109\/TSC.2016.2645153","volume":"12","author":"C Di Francescomarino","year":"2019","unstructured":"Di Francescomarino, C., Dumas, M., Maggi, F.M., Teinemaa, I.: Clustering-based predictive process monitoring. IEEE Trans. Serv. Comput. 12(6), 896\u2013909 (2019). https:\/\/doi.org\/10.1109\/TSC.2016.2645153","journal-title":"IEEE Trans. Serv. Comput."},{"key":"19_CR5","doi-asserted-by":"publisher","unstructured":"van Dongen, B.B.: BPI Challenge 2015 (2015). https:\/\/doi.org\/10.4121\/uuid:31a308ef-c844-48da-948c-305d167a0ec1. https:\/\/data.4tu.nl\/collections\/BPI_Challenge_2015\/5065424\/1","DOI":"10.4121\/uuid:31a308ef-c844-48da-948c-305d167a0ec1"},{"key":"19_CR6","doi-asserted-by":"publisher","unstructured":"van Dongen, B.: Real-life event logs - Hospital log (2011). https:\/\/doi.org\/10.4121\/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54. https:\/\/data.4tu.nl\/articles\/dataset\/Real-life_event_logs_-_Hospital_log\/12716513","DOI":"10.4121\/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54"},{"key":"19_CR7","unstructured":"Du Plessis, M., Niu, G., Sugiyama, M.: Convex formulation for learning from positive and unlabeled data. In: International Conference on Machine Learning, pp. 1386\u20131394. PMLR (2015)"},{"key":"19_CR8","doi-asserted-by":"publisher","unstructured":"Folino, F., Folino, G., Guarascio, M., Pontieri, L.: Semi-supervised discovery of DNN-based outcome predictors from scarcely-labeled process logs. Bus. Inf. Syst. Eng. (2022). https:\/\/doi.org\/10.1007\/s12599-022-00749-9","DOI":"10.1007\/s12599-022-00749-9"},{"issue":"8","key":"19_CR9","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735\u20131780 (1997)","journal-title":"Neural Comput."},{"key":"19_CR10","doi-asserted-by":"publisher","unstructured":"Jaskie, K., Spanias, A.: Positive and unlabeled learning algorithms and applications: a survey. In: 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA), pp. 1\u20138 (2019). https:\/\/doi.org\/10.1109\/IISA.2019.8900698","DOI":"10.1109\/IISA.2019.8900698"},{"key":"19_CR11","unstructured":"Kiryo, R., Niu, G., Du Plessis, M.C., Sugiyama, M.: Positive-unlabeled learning with non-negative risk estimator. In: Advances in Neural Information Processing Systems, vol. 30 (2017)"},{"issue":"3","key":"19_CR12","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/s12599-020-00645-0","volume":"63","author":"W Kratsch","year":"2021","unstructured":"Kratsch, W., Manderscheid, J., R\u00f6glinger, M., Seyfried, J.: Machine learning in business process monitoring: a comparison of deep learning and classical approaches used for outcome prediction. Bus. Inf. Syst. Eng. 63(3), 261\u2013276 (2021)","journal-title":"Bus. Inf. Syst. Eng."},{"key":"19_CR13","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1007\/978-3-319-23063-4_21","volume-title":"Business Process Management","author":"A Leontjeva","year":"2015","unstructured":"Leontjeva, A., Conforti, R., Di Francescomarino, C., Dumas, M., Maggi, F.M.: Complex symbolic sequence encodings for predictive monitoring of business processes. In: Motahari-Nezhad, H.R., Recker, J., Weidlich, M. (eds.) BPM 2015. LNCS, vol. 9253, pp. 297\u2013313. Springer, Cham (2015). https:\/\/doi.org\/10.1007\/978-3-319-23063-4_21"},{"key":"19_CR14","series-title":"Health Informatics","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1007\/978-3-030-53993-1_5","volume-title":"Interactive Process Mining in Healthcare","author":"N Martin","year":"2021","unstructured":"Martin, N.: Data quality in process mining. In: Fernandez-Llatas, C. (ed.) Interactive Process Mining in Healthcare. HI, pp. 53\u201379. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-53993-1_5"},{"key":"19_CR15","doi-asserted-by":"publisher","first-page":"801","DOI":"10.1007\/s10462-021-09960-8","volume":"55","author":"DA Neu","year":"2021","unstructured":"Neu, D.A., Lahann, J., Fettke, P.: A systematic literature review on state-of-the-art deep learning methods for process prediction. Artif. Intell. Rev. 55, 801\u2013827 (2021). https:\/\/doi.org\/10.1007\/s10462-021-09960-8","journal-title":"Artif. Intell. Rev."},{"key":"19_CR16","doi-asserted-by":"crossref","unstructured":"Pasquadibisceglie, V., Appice, A., Castellano, G., Malerba, D.: Using convolutional neural networks for predictive process analytics. In: 2019 International Conference on Process Mining (ICPM), pp. 129\u2013136. IEEE (2019)","DOI":"10.1109\/ICPM.2019.00028"},{"key":"19_CR17","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1007\/978-3-030-85469-0_10","volume-title":"Business Process Management","author":"S Pauwels","year":"2021","unstructured":"Pauwels, S., Calders, T.: Incremental predictive process monitoring: the next activity case. In: Polyvyanyy, A., Wynn, M.T., Van Looy, A., Reichert, M. (eds.) BPM 2021. LNCS, vol. 12875, pp. 123\u2013140. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-85469-0_10"},{"key":"19_CR18","doi-asserted-by":"crossref","unstructured":"Rama-Maneiro, E., Vidal, J., Lama, M.: Deep learning for predictive business process monitoring: review and benchmark. IEEE Trans. Serv. Comput. (2021)","DOI":"10.1109\/TSC.2021.3139807"},{"issue":"5","key":"19_CR19","doi-asserted-by":"publisher","first-page":"1385","DOI":"10.1007\/s10115-022-01666-9","volume":"64","author":"W Rizzi","year":"2022","unstructured":"Rizzi, W., Di Francescomarino, C., Ghidini, C., Maggi, F.M.: How do I update my model? On the resilience of predictive process monitoring models to change. Knowl. Inf. Syst. 64(5), 1385\u20131416 (2022)","journal-title":"Knowl. Inf. Syst."},{"key":"19_CR20","series-title":"Lecture Notes in Business Information Processing","doi-asserted-by":"publisher","first-page":"194","DOI":"10.1007\/978-3-030-98581-3_15","volume-title":"Process Mining Workshops","author":"A Stevens","year":"2022","unstructured":"Stevens, A., De Smedt, J., Peeperkorn, J.: Quantifying explainability in outcome-oriented predictive process monitoring. In: Munoz-Gama, J., Lu, X. (eds.) ICPM 2021. LNBIP, vol. 433, pp. 194\u2013206. Springer, Cham (2022). https:\/\/doi.org\/10.1007\/978-3-030-98581-3_15"},{"key":"19_CR21","doi-asserted-by":"crossref","unstructured":"Su, G., Chen, W., Xu, M.: Positive-unlabeled learning from imbalanced data. In: IJCAI, pp. 2995\u20133001 (2021)","DOI":"10.24963\/ijcai.2021\/412"},{"key":"19_CR22","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"477","DOI":"10.1007\/978-3-319-59536-8_30","volume-title":"Advanced Information Systems Engineering","author":"N Tax","year":"2017","unstructured":"Tax, N., Verenich, I., La Rosa, M., Dumas, M.: Predictive business process monitoring with LSTM neural networks. In: Dubois, E., Pohl, K. (eds.) CAiSE 2017. LNCS, vol. 10253, pp. 477\u2013492. Springer, Cham (2017). https:\/\/doi.org\/10.1007\/978-3-319-59536-8_30"},{"key":"19_CR23","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1007\/978-3-030-58666-9_14","volume-title":"Business Process Management","author":"F Taymouri","year":"2020","unstructured":"Taymouri, F., Rosa, M.L., Erfani, S., Bozorgi, Z.D., Verenich, I.: Predictive business process monitoring via generative adversarial nets: the case of next event prediction. In: Fahland, D., Ghidini, C., Becker, J., Dumas, M. (eds.) BPM 2020. LNCS, vol. 12168, pp. 237\u2013256. Springer, Cham (2020). https:\/\/doi.org\/10.1007\/978-3-030-58666-9_14"},{"key":"19_CR24","doi-asserted-by":"publisher","unstructured":"Teinemaa, I., Dumas, M., Rosa, M.L., Maggi, F.M.: Outcome-oriented predictive process monitoring: review and benchmark. ACM Trans. Knowl. Discov. Data 13(2) (2019). https:\/\/doi.org\/10.1145\/3301300","DOI":"10.1145\/3301300"},{"issue":"4","key":"19_CR25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3331449","volume":"10","author":"I Verenich","year":"2019","unstructured":"Verenich, I., Dumas, M., Rosa, M.L., Maggi, F.M., Teinemaa, I.: Survey and cross-benchmark comparison of remaining time prediction methods in business process monitoring. ACM Trans. Intell. Syst. Technol. (TIST) 10(4), 1\u201334 (2019)","journal-title":"ACM Trans. Intell. Syst. Technol. (TIST)"},{"key":"19_CR26","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1007\/s10115-009-0245-8","volume":"24","author":"H Wang","year":"2010","unstructured":"Wang, H., Wang, S.: Mining incomplete survey data through classification. Knowl. Inf. Syst. 24, 221\u2013233 (2010). https:\/\/doi.org\/10.1007\/s10115-009-0245-8","journal-title":"Knowl. Inf. Syst."},{"key":"19_CR27","doi-asserted-by":"crossref","unstructured":"Wu, M., Pan, S., Du, L., Tsang, I., Zhu, X., Du, B.: Long-short distance aggregation networks for positive unlabeled graph learning. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp. 2157\u20132160 (2019)","DOI":"10.1145\/3357384.3358122"}],"container-title":["Lecture Notes in Business Information Processing","Process Mining Workshops"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-27815-0_19","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,4]],"date-time":"2023-09-04T13:06:54Z","timestamp":1693832814000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-27815-0_19"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031278143","9783031278150"],"references-count":27,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-27815-0_19","relation":{},"ISSN":["1865-1348","1865-1356"],"issn-type":[{"type":"print","value":"1865-1348"},{"type":"electronic","value":"1865-1356"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"26 March 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"ICPM","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Process Mining","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Bozen-Bolzano","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Italy","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2022","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"23 October 2022","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"28 October 2022","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"4","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"icpm2022","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/icpmconference.org\/2022\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Single-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"EasyChair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"89","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"42","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"47% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"2.93","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"2","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"No","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}