{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:53:22Z","timestamp":1760151202986,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T00:00:00Z","timestamp":1645401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>In the context of higher education, the wide availability of data gathered by universities for administrative purposes or for recording the evolution of students\u2019 learning processes makes novel data mining techniques particularly useful to tackle critical issues. In Italy, current academic regulations allow students to customize the chronological sequence of courses they have to attend to obtain the final degree. This leads to a variety of sequences of exams, with an average time taken to obtain the degree that may significantly differ from the time established by law. 
In this contribution, we propose a mixture hidden Markov model to classify students into groups that are homogenous in terms of university paths, with the aim of detecting bottlenecks in the academic career and improving students\u2019 performance.<\/jats:p>","DOI":"10.3390\/data7020025","type":"journal-article","created":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T20:24:21Z","timestamp":1645475061000},"page":"25","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["A Mixture Hidden Markov Model to Mine Students\u2019 University Curricula"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8097-3870","authenticated-orcid":false,"given":"Silvia","family":"Bacci","sequence":"first","affiliation":[{"name":"Department of Statistics, Computer Science, Applications \u201cG. Parenti\u201d, University of Florence, Viale Morgagni 59, 50134 Firenze, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5816-2964","authenticated-orcid":false,"given":"Bruno","family":"Bertaccini","sequence":"additional","affiliation":[{"name":"Department of Statistics, Computer Science, Applications \u201cG. Parenti\u201d, University of Florence, Viale Morgagni 59, 50134 Firenze, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,2,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1007\/s10639-017-9616-z","article-title":"Educational data mining applications and tasks: A survey of the last 10 years","volume":"23","author":"Bakhshinategh","year":"2018","journal-title":"Educ. Inf. 
Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1109\/TSMCC.2010.2053532","article-title":"Educational data mining: A review of the state of the art","volume":"40","author":"Romero","year":"2010","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1432","DOI":"10.1016\/j.eswa.2013.08.042","article-title":"Educational data mining: A survey and a data mining-based analysis of recent works","volume":"41","year":"2014","journal-title":"Expert Syst. Appl."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1111\/bjet.12595","article-title":"Big data and data science: A critical review of issues for educational research","volume":"50","author":"Daniel","year":"2019","journal-title":"Br. J. Educ. Technol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"e1355","DOI":"10.1002\/widm.1355","article-title":"Educational data mining and learning analytics: An updated survey","volume":"10","author":"Romero","year":"2020","journal-title":"WIREs Data Min. Knowl. Discov."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"5508","DOI":"10.1016\/j.eswa.2015.02.052","article-title":"Data mining models for student careers","volume":"42","author":"Campagni","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"732","DOI":"10.1080\/00273171.2017.1361803","article-title":"Evaluation of student performance through a multidimensional finite mixture IRT model","volume":"52","author":"Bacci","year":"2017","journal-title":"Multivar. Behav. Res."},{"key":"ref_8","first-page":"343","article-title":"The Influence of First Year Behaviour in the Progressions of University Students","volume":"Volume 865","author":"Escudeiro","year":"2018","journal-title":"Computers Supported Education. CSEDU 2017. 
Communications in Computer and Information Science"},{"key":"ref_9","first-page":"1","article-title":"Early Detection of Students at Risk. Predicting Student Dropouts Using Administrative Student Data from German Universities and Machine Learning Methods","volume":"11","author":"Berens","year":"2019","journal-title":"J. Educ. Data Min."},{"key":"ref_10","first-page":"18","article-title":"Using a Latent Class Forest to Identify At-Risk Students in Higher Education","volume":"11","author":"Pelaez","year":"2019","journal-title":"J. Educ. Data Min."},{"key":"ref_11","first-page":"797","article-title":"Measuring students\u2019 academic performance through educational data mining","volume":"10","author":"Wong","year":"2020","journal-title":"Int. J. Inf. Educ. Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0049124100029001001","article-title":"Sequence Analysis and Optimal Matching Methods in Sociology","volume":"29","author":"Abbott","year":"2000","journal-title":"Sociol. Methods Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.1467-9531.2010.01227.x","article-title":"Multichannel sequence analysis applied to social science data","volume":"40","author":"Gauthier","year":"2010","journal-title":"Sociol. Methodol."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Skrondal, A., and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling. Multilevel, Longitudinal and Structural Equation Models, Chapman and Hall\/CRC.","DOI":"10.1201\/9780203489437"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Bartholomew, D.J., Knott, M., and Moustaki, I. (2011). Latent Variable Models and Factor Analysis: A Unified Approach, Wiley.","DOI":"10.1002\/9781119970583"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"McLachlan, G., and Peel, D. (2000). 
Finite Mixture Models, Wiley.","DOI":"10.1002\/0471721182"},{"key":"ref_17","unstructured":"Hancock, G.R., and Samuelsen, K.M. (2008). Advances in Latent Variable Mixture Models, Information Age Publishing."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1111\/j.0006-341X.1999.00463.x","article-title":"Finite mixture modelling with mixture outcomes using the EM algorithm","volume":"55","author":"Shedden","year":"1999","journal-title":"Biometrics"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kaplan, D. (2004). Latent variable analysis: Growth mixture modelling and related techniques for longitudinal data. Handbook of Quantitative Methodology for the Social Sciences, Sage.","DOI":"10.4135\/9781412986311"},{"key":"ref_20","unstructured":"Hancock, G.R., and Samuelsen, K.M. (2008). Longitudinal modeling of population heterogeneity: Methodological challenges to the analysis of empirically derived criminal trajectory profiles. Advances in Latent Variable Mixture Models, Information Age Publishing, Inc."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zucchini, W., and MacDonald, I.L. (2009). Hidden Markov Models for Time Series: An Introduction Using R, Springer.","DOI":"10.1201\/9781420010893"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Bartolucci, F., Farcomeni, A., and Pennoni, F. (2012). Latent Markov Models for Longitudinal Data, Chapman & Hall\/CRC Press.","DOI":"10.1201\/b13246"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s11257-010-9093-1","article-title":"Empirically evaluating the application of reinforcement learning to the induction of effective and adaptive pedagogical strategies","volume":"21","author":"Chi","year":"2011","journal-title":"User Model. User-Adapt. 
Interact."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1007\/s11257-010-9087-z","article-title":"Activity sequence modelling and dynamic clustering for personalized e-learning","volume":"21","author":"Paramythis","year":"2011","journal-title":"User Model. User-Adapt. Interact."},{"key":"ref_25","unstructured":"Pe\u00f1a-Ayala, A. (2012). An intelligent system for modeling and supporting academic educational processes. Intelligent and Adaptive Educational-Learning Systems: Achievements and Trends, Smart Innovation, Systems and Technologies, Springer."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1198\/016214506000001086","article-title":"Mixed hidden Markov models: An extension of the hidden Markov model to the longitudinal data setting","volume":"102","author":"Altman","year":"2007","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1111\/j.1751-5823.2011.00160.x","article-title":"Mixed Hidden Markov Models for Longitudinal Data: An Overview","volume":"79","author":"Maruotti","year":"2011","journal-title":"Int. Stat. Rev."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"213","DOI":"10.2307\/271087","article-title":"Mixed Markov Latent Class Models","volume":"20","author":"Langeheine","year":"1990","journal-title":"Sociol. Methodol."},{"key":"ref_29","unstructured":"Menard, S. (2008). Latent Class Models in Longitudinal Research. Handbook of Longitudinal Research: Design, Measurement, and Analysis, Elsevier."},{"key":"ref_30","unstructured":"Lazarsfeld, P.F., and Henry, N.W. (1968). 
Latent Structure Analysis, Houghton Mifflin."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1093\/biomet\/61.2.215","article-title":"Exploratory latent structure analysis using both identifiable and unidentifiable models","volume":"61","author":"Goodman","year":"1974","journal-title":"Biometrika"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1375","DOI":"10.1080\/01621459.1997.10473658","article-title":"Latent Variable Regression for Multiple Discrete Outcomes","volume":"92","author":"Miglioretti","year":"1997","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"5236","DOI":"10.1016\/j.csda.2006.08.020","article-title":"Mixture analysis of multivariate categorical data with covariates and missing entries","volume":"51","author":"Formann","year":"2007","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_34","unstructured":"Abbruzzo, A., Brentari, E., Chiodi, M., and Piacentino, D. (2018). Finding the best paths in university curricula of graduates to improve academic guidance services. Book of Short Papers SIS 2018, Pearson."},{"key":"ref_35","unstructured":"Petrov, B.N., and Csaki, F. (1973). Information theory and an extension of the maximum likelihood principle. Second International Symposium of Information Theory, Akademiai Kiado."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1214\/aos\/1176344136","article-title":"Estimating the dimension of a model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Ann. Stat."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm (with discussion)","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. Ser. 
B"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v088.i03","article-title":"Mixture hidden Markov models for sequence data: The seqHMM package in R","volume":"88","author":"Helske","year":"2019","journal-title":"J. Stat. Softw."},{"key":"ref_39","unstructured":"Helske, J., and Helske, S. (2021). Mixture Hidden Markov Models for Social Sequence Data and Other Multivariate, Multichannel Categorical Time Series, University of Jyv\u00e4skyl\u00e4. Available online: https:\/\/cran.r-project.org\/package=seqHMM."},{"key":"ref_40","unstructured":"Helske, S. (2021, February 18). The Main Algorithms Used in the seqHMM Package. Available online: https:\/\/cran.r-project.org\/package=seqHMM."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1007\/s11634-013-0154-2","article-title":"A comparison of some criteria for states selection in the latent Markov model for longitudinal data","volume":"8","author":"Bacci","year":"2014","journal-title":"Adv. Data Anal. Classif."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/7\/2\/25\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:23:50Z","timestamp":1760135030000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/7\/2\/25"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,21]]},"references-count":41,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2022,2]]}},"alternative-id":["data7020025"],"URL":"https:\/\/doi.org\/10.3390\/data7020025","relation":{},"ISSN":["2306-5729"],"issn-type":[{"type":"electronic","value":"2306-5729"}],"subject":[],"published":{"date-parts":[[2022,2,21]]}}}