{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T14:36:53Z","timestamp":1774449413192,"version":"3.50.1"},"reference-count":51,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2021,6,25]],"date-time":"2021-06-25T00:00:00Z","timestamp":1624579200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,6,25]],"date-time":"2021-06-25T00:00:00Z","timestamp":1624579200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Fraunhofer Institute for Experimental Software Engineering (IESE)"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Software Qual J"],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Nowadays, systems containing components based on machine learning (ML) methods are becoming more widespread. In order to ensure the intended behavior of a software system, there are standards that define necessary qualities of the system and its components (such as ISO\/IEC 25010). Due to the different nature of ML, we have to re-interpret existing qualities for ML systems or add new ones (such as trustworthiness). We have to be very precise about which quality property is relevant for which entity of interest (such as completeness of training data or correctness of trained model), and how to objectively evaluate adherence to quality requirements. In this article, we present how to systematically construct quality models for ML systems based on an industrial use case. This quality model enables practitioners to specify and assess qualities for ML systems objectively. 
In addition to the overall construction process described, the main outcomes include a meta-model for specifying quality models for ML systems, reference elements regarding relevant views, entities, quality properties, and measures for ML systems based on existing research, an example instantiation of a quality model for a concrete industrial use case, and lessons learned from applying the construction process. We found that it is crucial to follow a systematic process in order to come up with measurable quality properties that can be evaluated in practice. In the future, we want to learn how the term quality differs between different types of ML systems and come up with reference quality models for evaluating qualities of ML systems.<\/jats:p>","DOI":"10.1007\/s11219-021-09557-y","type":"journal-article","created":{"date-parts":[[2021,6,25]],"date-time":"2021-06-25T02:02:26Z","timestamp":1624586546000},"page":"307-335","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":58,"title":["Construction of a quality model for machine learning 
systems"],"prefix":"10.1007","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7696-0046","authenticated-orcid":false,"given":"Julien","family":"Siebert","sequence":"first","affiliation":[]},{"given":"Lisa","family":"Joeckel","sequence":"additional","affiliation":[]},{"given":"Jens","family":"Heidrich","sequence":"additional","affiliation":[]},{"given":"Adam","family":"Trendowicz","sequence":"additional","affiliation":[]},{"given":"Koji","family":"Nakamichi","sequence":"additional","affiliation":[]},{"given":"Kyoko","family":"Ohashi","sequence":"additional","affiliation":[]},{"given":"Isao","family":"Namba","sequence":"additional","affiliation":[]},{"given":"Rieko","family":"Yamamoto","sequence":"additional","affiliation":[]},{"given":"Mikio","family":"Aoyama","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,6,25]]},"reference":[{"key":"9557_CR1","doi-asserted-by":"crossref","unstructured":"Amershi, S., Begel, A., Bird, C., DeLine, R., Gall, H., Kamar, E., Nagappan, N., Nushi, B., & Zimmermann, T. (2019). Software Engineering for Machine Learning: A Case Study. In: 2019 IEEE\/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 291\u2013300.","DOI":"10.1109\/ICSE-SEIP.2019.00042"},{"key":"9557_CR2","doi-asserted-by":"crossref","unstructured":"Arpteg, A., Brinne, B., Crnkovic-Friis, L., & Bosch, J. (2018). Software Engineering Challenges of Deep Learning. In: 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 50\u201359. IEEE, [S.l.].","DOI":"10.1109\/SEAA.2018.00018"},{"key":"9557_CR3","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1145\/3144172","volume":"60","author":"S Barocas","year":"2017","unstructured":"Barocas, S., & Boyd, D. (2017). Engaging the ethics of data science in practice. 
Communications of the ACM, 60, 23\u201325.","journal-title":"Communications of the ACM"},{"key":"9557_CR4","doi-asserted-by":"crossref","unstructured":"Belani, H., Vukovic, M.,\u00a0& Car, Z. (2019). Requirements Engineering Challenges in Building AI-Based Complex Systems. In: 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW), pp. 252\u2013255. IEEE.","DOI":"10.1109\/REW.2019.00051"},{"key":"9557_CR5","unstructured":"Bosch, J., Olsson, H. H., Crnkovic, I., Wang X., Munch J., Suominen A., Bosch J., Jud C., & Hyrynsalmi S. (2018). It takes three to tango: Requirement, outcome\/data, and AI driven development. CEUR Workshop Proceedings, 2305."},{"key":"9557_CR6","doi-asserted-by":"publisher","first-page":"172096","DOI":"10.1098\/rsos.172096","volume":"5","author":"M Calder","year":"2018","unstructured":"Calder, M., Craig, C., Culley, D., de Cani, R., Donnelly, C. A., Douglas, R., Edmonds, B., Gascoigne, J., Gilbert, N., Hargrove, C., et al. (2018). Computational modelling for decision-making: where, why, what, who and how. Royal Society Open Science, 5, 172096.","journal-title":"Royal Society open science"},{"key":"9557_CR7","doi-asserted-by":"crossref","unstructured":"de Souza Nascimento, E., Ahmed, I., Oliveira, E., Palheta, M. P., Steinmacher, I., & Conte, T. (2019). Understanding Development Process of Machine Learning Systems: Challenges and Solutions. In: 2019 ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 1\u20136.","DOI":"10.1109\/ESEM.2019.8870157"},{"key":"9557_CR8","doi-asserted-by":"crossref","unstructured":"Edmonds, B. (2002). The Use of Models - Making Mabs Actually Work.","DOI":"10.1007\/3-540-44561-7_2"},{"key":"9557_CR9","doi-asserted-by":"crossref","unstructured":"Edmonds, B., Le Page, C., Bithell, M., Chattoe-Brown, E., Grimm, V., Meyer, R., Monta\u00f1ola-Sales, C., Ormerod, P., Root, H., & Squazzoni, F. (2019). Different Modelling Purposes. 
JASSS 22.","DOI":"10.18564\/jasss.3993"},{"key":"9557_CR10","doi-asserted-by":"publisher","first-page":"e0159161","DOI":"10.1371\/journal.pone.0159161","volume":"11","author":"S Emmons","year":"2016","unstructured":"Emmons, S., Kobourov, S., Gallant, M., & B\u00f6rner, K. (2016). Analysis of network clustering algorithms and cluster quality metrics at scale. PLoS One, 11, e0159161.","journal-title":"PLoS ONE"},{"key":"9557_CR11","first-page":"12","volume":"11","author":"JM Epstein","year":"2008","unstructured":"Epstein, J. M. (2008). Why model? JASSS, 11, 12.","journal-title":"JASSS"},{"key":"9557_CR12","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1016\/j.infsof.2015.02.009","volume":"62","author":"A Goeb","year":"2015","unstructured":"Goeb, A., Heinemann, L., Kl\u00e4s, M., Lampasona, C., Lochmann, K., Mayr, A., Pl\u00f6sch, R., Seidl, A., Streit, J., & Trendowicz, A. (2015). Operationalised product quality models and assessment: The Quamoco approach. Information and Software Technology, 62, 101\u2013123.","journal-title":"Information and Software Technology"},{"key":"9557_CR13","unstructured":"Hamada, K., Ishikawa, F., Masuda, S., Matsuya, M., & Ujita, Y. (2020). Guidelines for Quality Assurance of Machine Learning-based Artificial Intelligence. In: SEKE2020: the 32nd International Conference on Software Engineering & Knowledge Engineering, 335\u2013341."},{"key":"9557_CR14","unstructured":"HLEG, A. (2019). High-Level Expert group on artificial intelligence: Ethics guidelines for trustworthy AI. European Commission."},{"key":"9557_CR15","doi-asserted-by":"crossref","unstructured":"Horkoff, J. (2019). Non-Functional Requirements for Machine Learning: Challenges and New Directions. In: 27th International Requirements Engineering Conference (RE2019), pp. 386\u2013391. 
IEEE Computer Society, Conference Publishing Services, Los Alamitos, California.","DOI":"10.1109\/RE.2019.00050"},{"key":"9557_CR16","doi-asserted-by":"crossref","unstructured":"Hossin, M., & Sulaiman, M. N. (2015). A Review on Evaluation Metrics for Data Classification Evaluations. IJDKP 5, 1\u201311.","DOI":"10.5121\/ijdkp.2015.5201"},{"key":"9557_CR17","doi-asserted-by":"crossref","unstructured":"Hutter, F., Kotthoff, L., & Vanschoren, J. (2018). (eds.): Automated Machine Learning: Methods, Systems, Challenges. Springer.","DOI":"10.1007\/978-3-030-05318-5"},{"key":"9557_CR18","unstructured":"IBM. (nd). Analytic Solutions Unified Method. Implementation with Agile Principles,\u00a0checked on 5\/8\/2019."},{"key":"9557_CR19","doi-asserted-by":"crossref","unstructured":"Ishikawa, F. (2018). Concepts in Quality Assessment for Machine Learning - From Test Data to Arguments. In: Trujillo, J.C.e., Davis, K., Du, X., Li, Z., Ling, T.W., Li, G., Lee, M.L. (eds.) Conceptual modeling. 37th International Conference, ER 2018, Xi'an, China, Proceedings \/ Juan C. Trujillo, Karen C. Davis, Xiaoyong Du, Zhanhuai Li, Tok Wang Ling, Guoliang Li, Mong Li Lee (eds.), pp. 536\u2013544. Springer, Cham, Switzerland.","DOI":"10.1007\/978-3-030-00847-5_39"},{"key":"9557_CR20","doi-asserted-by":"crossref","unstructured":"Ismail, A., Truong, H.-L., & Kastner, W. (2019). Manufacturing process data analysis pipelines: a requirements analysis and survey. Journal of Big Data, 6.","DOI":"10.1186\/s40537-018-0162-3"},{"key":"9557_CR21","unstructured":"ISO\/TS 8000. (2011). Data Quality."},{"key":"9557_CR22","unstructured":"ISO\/IEC 25010. (2011).\u00a0Systems and software engineering \u2014 Systems and software Quality Requirements and Evaluation (SQuaRE) \u2014 System and software quality models."},{"key":"9557_CR23","doi-asserted-by":"crossref","unstructured":"K\u00e4stner, C.,\u00a0& Kang, E. (2020). Teaching Software Engineering for AI-Enabled Systems. 
In: The 42nd International Conference on Software Engineering (ICSE 2020). Software Engineering Education and Training.","DOI":"10.1145\/3377814.3381714"},{"key":"9557_CR24","doi-asserted-by":"crossref","unstructured":"Kaufman, S., Rosset, S., & Perlich, C. (2011). Leakage in data mining. In: Apte, C., Ghosh, J., Smyth, P. (eds.) Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,\u00a0San Diego, Ca, USA, p. 556. ACM, New York.","DOI":"10.1145\/2020408.2020496"},{"key":"9557_CR25","doi-asserted-by":"publisher","first-page":"431","DOI":"10.1007\/978-3-319-99229-7_36","volume-title":"Developments in Language Theory, 11088","author":"M Kl\u00e4s","year":"2018","unstructured":"Kl\u00e4s, M., & Vollmer, A. M. (2018). Uncertainty in machine learning applications: a practice-driven classification of uncertainty. In M. Hoshi & S. Seki (Eds.), Developments in Language Theory, 11088. (pp. 431\u2013438). Springer International Publishing."},{"key":"9557_CR26","unstructured":"Kleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent Trade-Offs in the Fair Determination of Risk Scores. arXiv.org."},{"key":"9557_CR27","doi-asserted-by":"publisher","first-page":"463","DOI":"10.3233\/IDT-190160","volume":"13","author":"F Kumeno","year":"2020","unstructured":"Kumeno, F. (2020). Software engineering challenges for machine learning applications: a literature review. IDT, 13, 463\u2013476.","journal-title":"IDT"},{"key":"9557_CR28","unstructured":"Kurakin, A., Goodfellow, I., & Bengio, S. (2016). Adversarial examples in the physical world."},{"key":"9557_CR29","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/S0269888906000737","volume":"21","author":"LA Kurgan","year":"2006","unstructured":"Kurgan, L. A., & Muslek, P. (2006). A survey of knowledge discovery and data mining process models. 
The Knowledge Engineering Review, 21, 1\u201324.","journal-title":"The Knowledge Engineering Review"},{"key":"9557_CR30","unstructured":"Lorenzoni, G., Alencar, P., Nascimento, N., & Cowan, D. (2021). Machine Learning Model Development from a Software Engineering Perspective: A Systematic Literature Review."},{"key":"9557_CR31","doi-asserted-by":"crossref","unstructured":"Lwakatare, L. E., Raj, A., Crnkovic, I., Bosch, J., & Olsson, H. H. (2020). Large-Scale Machine Learning Systems in Real-World Industrial Settings A Review of Challenges and Solutions. Information and Software Technology, 106368.","DOI":"10.1016\/j.infsof.2020.106368"},{"key":"9557_CR32","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1017\/S0269888910000032","volume":"25","author":"G Mariscal","year":"2010","unstructured":"Mariscal, G., Marb\u00e1n, \u00d3., & Fern\u00e1ndez, C. (2010). A survey of data mining and knowledge discovery process models and methodologies. The Knowledge Engineering Review, 25, 137\u2013166.","journal-title":"The Knowledge Engineering Review"},{"key":"9557_CR33","volume-title":"Machine Intelligence quality characteristics","author":"R Marselis","year":"2018","unstructured":"Marselis, R., & Shaukat, H. (2018). Machine Intelligence quality characteristics. How to measure the quality of Artificial Intelligence and robotics."},{"key":"9557_CR34","unstructured":"Marselis, R., Shaukat, H.,\u00a0& Gansel, T. (2017). Testing of Artificial Intelligence. Sogeti."},{"key":"9557_CR35","doi-asserted-by":"crossref","unstructured":"Martinez-Plumed, F., Contreras-Ochando, L., Ferri, C., Hernandez Orallo, J., Kull, M., Lachiche, N., Ramirez Quintana, M. J., & Flach, P. A. (2020). CRISP-DM Twenty Years Later: From Data Mining Processes to Data Science Trajectories. IEEE Transaction on Knowledge and Data Engineering, 1.","DOI":"10.1109\/TKDE.2019.2962680"},{"key":"9557_CR36","unstructured":"Marz, N., & Warren, J. (2015). Big data. 
Principles and best practices of scalable real-time data systems \/ Nathan Marz, James Warren. Manning, Shelter Island."},{"key":"9557_CR37","unstructured":"Microsoft. (2019). Team Data Science Process Documentation. Available online at\u00a0https:\/\/docs.microsoft.com\/en-us\/azure\/machine-learning\/team-data-science-process\/,\u00a0updated on 11\/15\/2018, checked on 11\/16\/2018."},{"key":"9557_CR38","doi-asserted-by":"crossref","unstructured":"Nakajima, S. (2018). [Invited] Quality Assurance of Machine Learning Software. In: 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE). 9\u201312\u00a0pp. 601\u2013604. IEEE, Piscataway, NJ.","DOI":"10.1109\/GCCE.2018.8574766"},{"key":"9557_CR39","doi-asserted-by":"crossref","unstructured":"Nakamichi, K., Ohashi, K., Namba, I., Yamamoto, R., Aoyama, M., Joeckel, L., Siebert, J., & Heidrich, J. (2020). Requirements-Driven Method to Determine Quality Characteristics and Measurements for Machine Learning Software and Its Evaluation. In: 28th IEEE International Requirements Engineering Conference (RE).","DOI":"10.1109\/RE48521.2020.00036"},{"key":"9557_CR40","series-title":"ICSSP 2019: 25 May 2019, Montr\u00e9al, Canada : proceedings","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1109\/ICSSP.2019.00025","volume-title":"2019 IEEE\/ACM International Conference on Software and System Processes","author":"P Nistala","year":"2019","unstructured":"Nistala, P., Nori, K. V., & Reddy, R. (2019). Software Quality Models: A Systematic Mapping Study. ICSSP 2019: 25 May 2019, Montr\u00e9al, Canada\u202f: proceedings2019 IEEE\/ACM International Conference on Software and System Processes. (pp. 125\u2013134). IEEE."},{"key":"9557_CR41","doi-asserted-by":"crossref","unstructured":"Poth, A., Meyer, B., Schlicht, P., & Riel, A. (2020). Quality Assurance for Machine Learning \u2013 an approach to function and system safeguarding. 
In: 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS), pp. 22\u201329. IEEE.","DOI":"10.1109\/QRS51102.2020.00016"},{"key":"9557_CR42","unstructured":"Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., Young, M., Crespo, J.-F., & Dennison, D. (2015). Hidden Technical Debt in Machine Learning Systems. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, pp. 2503\u20132511."},{"key":"9557_CR43","first-page":"14","volume":"5","author":"C Shearer","year":"2000","unstructured":"Shearer, C. (2000). The CRISP-DM Model: The New Blueprint for Data Mining. Journal of Data Warehousing, 5, 14\u201322.","journal-title":"Journal of Data Warehousing"},{"key":"9557_CR44","doi-asserted-by":"crossref","unstructured":"Siebert, J., Joeckel, L., Heidrich, J., Nakamichi, K., Ohashi, K., Namba, I., Yamamoto, R.,\u00a0& Aoyama, M. (2020). Towards Guidelines for Assessing Qualities of Machine Learning Systems. In: Shepperd, M., Brito e Abreu, F., Rodrigues da Silva, A., P\u00e9rez-Castillo, R. (eds.) Quality of Information and Communications Technology, 1266, pp. 17\u201331. Springer International Publishing, Cham.","DOI":"10.1007\/978-3-030-58793-2_2"},{"key":"9557_CR45","unstructured":"SPEC, D. 92001\u201301: K\u00fcnstliche Intelligenz - Life Cycle Prozesse und Qualit\u00e4tsanforderungen. Teil 1: Qualit\u00e4ts-Meta-Modell. Beuth Verlag GmbH, Berlin."},{"key":"9557_CR46","doi-asserted-by":"crossref","unstructured":"Vogelsang, A., & Borg, M. (2019). Requirements Engineering for Machine Learning: Perspectives from Data Scientists. In: 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW), pp. 
245\u2013251.","DOI":"10.1109\/REW.2019.00050"},{"key":"9557_CR47","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1016\/j.infsof.2015.02.009","volume":"62","author":"S Wagner","year":"2015","unstructured":"Wagner, S., Goeb, A., Heinemann, L., Kl\u00e4s, M., Lampasona, C., Lochmann, K., Mayr, A., Pl\u00f6sch, R., Seidl, A., Streit, J., et al. (2015). Operationalised product quality models and assessment: The Quamoco approach. Information and Software Technology, 62, 101\u2013123.","journal-title":"Information and Software Technology"},{"key":"9557_CR48","doi-asserted-by":"crossref","unstructured":"Wan, Z., Xia, X., Lo, D., & Murphy, G. C. (2019). How does Machine Learning Change Software Development Practices? IEEE Transactions on Software Engineering, 1.","DOI":"10.1109\/TSE.2019.2937083"},{"key":"9557_CR49","doi-asserted-by":"crossref","unstructured":"Zhang, D., & Tsai, J. (2002). Machine learning and software engineering. In: Staff, I.C.S. (ed.) 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), pp. 22\u201329. IEEE Computer Society Press, Los Alamitos.","DOI":"10.1109\/TAI.2002.1180784"},{"key":"9557_CR50","unstructured":"Zhang, J. M., Harman, M., Ma, L., & Liu, Y. (2020) Machine Learning Testing: Survey, Landscapes and Horizons. IEEE Transactions on Software Engineering, 1."},{"key":"9557_CR51","unstructured":"Zhang, X., Yang, Y., Feng, Y., & Chen, Z. (2019). 
Software Engineering Practice in the Development of Deep Learning Applications."}],"container-title":["Software Quality Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11219-021-09557-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11219-021-09557-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11219-021-09557-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T08:17:16Z","timestamp":1656317836000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11219-021-09557-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,25]]},"references-count":51,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["9557"],"URL":"https:\/\/doi.org\/10.1007\/s11219-021-09557-y","relation":{},"ISSN":["0963-9314","1573-1367"],"issn-type":[{"value":"0963-9314","type":"print"},{"value":"1573-1367","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,25]]},"assertion":[{"value":"14 April 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 June 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}