{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T14:04:01Z","timestamp":1780754641263,"version":"3.54.1"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T00:00:00Z","timestamp":1695340800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T00:00:00Z","timestamp":1695340800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100018702","name":"HORIZON EUROPE Non-nuclear direct actions of the Joint Research Centre","doi-asserted-by":"publisher","award":["35332"],"award-info":[{"award-number":["35332"]}],"id":[{"id":"10.13039\/100018702","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Adv Data Anal Classif"],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The exploration and analysis of large high-dimensional data sets calls for well-thought techniques to extract the salient information from the data, such as co-clustering. Latent block models cast co-clustering in a probabilistic framework that extends finite mixture models to the two-way setting. Real-world data sets often contain anomalies which could be of interest <jats:italic>per se<\/jats:italic> and may make the results provided by standard, non-robust procedures unreliable. Also estimation of latent block models can be heavily affected by contaminated data. We propose an algorithm to compute robust estimates for latent block models. 
Experiments on both simulated and real data show that our method is able to resist high levels of contamination and can provide additional insight into the data by highlighting possible anomalies.<\/jats:p>","DOI":"10.1007\/s11634-023-00549-3","type":"journal-article","created":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T16:02:27Z","timestamp":1695398547000},"page":"121-161","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Co-clustering contaminated data: a robust model-based approach"],"prefix":"10.1007","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8617-1609","authenticated-orcid":false,"given":"Edoardo","family":"Fibbi","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Domenico","family":"Perrotta","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Francesca","family":"Torti","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Stefan","family":"Van Aelst","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Tim","family":"Verdonck","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2023,9,22]]},"reference":[{"key":"549_CR1","doi-asserted-by":"crossref","unstructured":"Ailem M, Role F, Nadif M (2015) Co-clustering document-term matrices by direct maximization of graph modularity. In: Proceedings of the 24th ACM international on conference on information and knowledge management, pp 1807\u20131810, New York, NY, USA, 2015. 
Association for Computing Machinery","DOI":"10.1145\/2806416.2806639"},{"issue":"7","key":"549_CR2","doi-asserted-by":"publisher","first-page":"1563","DOI":"10.1109\/TKDE.2017.2681669","volume":"29","author":"M Ailem","year":"2017","unstructured":"Ailem M, Role F, Nadif M (2017) Sparse poisson latent block model for document clustering. IEEE Trans Knowl Data Eng 29(7):1563\u20131576","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"7","key":"549_CR3","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1109\/34.865189","volume":"22","author":"C Biernacki","year":"2000","unstructured":"Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719\u2013725","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"549_CR4","unstructured":"Biernacki C, Jacques J, Keribin C (2022) A survey on model-based co-clustering: high dimension and estimation challenges. https:\/\/hal.archives-ouvertes.fr\/hal-03769727"},{"issue":"3","key":"549_CR5","first-page":"27","volume":"156","author":"V Brault","year":"2015","unstructured":"Brault V, Lomet A (2015) Methods for co-clustering: a review. Journal de la Societe Fran\u00e7aise de Statistique 156(3):27\u201351","journal-title":"Journal de la Societe Fran\u00e7aise de Statistique"},{"issue":"3","key":"549_CR6","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1214\/ss\/1009213726","volume":"16","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci 16(3):199\u2013231","journal-title":"Stat Sci"},{"key":"549_CR7","volume-title":"Introduction \u00e0 l\u2019analyse des donn\u00e9es","author":"F Caillez","year":"1976","unstructured":"Caillez F, Pages JP (1976) Introduction \u00e0 l\u2019analyse des donn\u00e9es. 
Smash, Paris"},{"key":"549_CR8","unstructured":"Celeux G, Govaert G (1991) A classification EM algorithm for clustering and two stochastic versions. Research Report RR-1364, INRIA"},{"issue":"15","key":"549_CR9","doi-asserted-by":"publisher","first-page":"2997","DOI":"10.1080\/00949655.2017.1351564","volume":"87","author":"A Cerasa","year":"2017","unstructured":"Cerasa A, Cerioli A (2017) Outlier-free merging of homogeneous groups of pre-classified observations under contamination. J Stat Comput Simul 87(15):2997\u20133020","journal-title":"J Stat Comput Simul"},{"key":"549_CR10","first-page":"93","volume":"8","author":"Y Cheng","year":"2000","unstructured":"Cheng Y, Church G (2000) Biclustering of expression data. Proc Int Conf Intell Syst Mol Biol 8:93\u2013103","journal-title":"Proc Int Conf Intell Syst Mol Biol"},{"issue":"142","key":"549_CR11","first-page":"1","volume":"18","author":"P Coretto","year":"2017","unstructured":"Coretto P, Hennig C (2017) Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering. J Mach Learn Res 18(142):1\u201339","journal-title":"J Mach Learn Res"},{"issue":"2","key":"549_CR12","doi-asserted-by":"publisher","first-page":"553","DOI":"10.1214\/aos\/1031833664","volume":"25","author":"JA Cuesta-Albertos","year":"1997","unstructured":"Cuesta-Albertos JA, Gordaliza A, Matr\u00e1n C (1997) Trimmed $$k$$-means: an attempt to robustify quantizers. Ann Stat 25(2):553\u2013576","journal-title":"Ann Stat"},{"issue":"1","key":"549_CR13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","volume":"39","author":"AP Dempster","year":"1977","unstructured":"Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. 
J R Stat Soc Ser B (Methodological) 39(1):1\u201338","journal-title":"J R Stat Soc Ser B (Methodological)"},{"key":"549_CR14","doi-asserted-by":"crossref","unstructured":"Dhillon I (2001) Co-clustering documents and words using bipartite spectral graph partitioning. In: Proceedings of the Seventh ACM SIGKDD international conference on knowledge discovery and data mining, pp 269\u2013274, New York, NY, USA (2001). Association for Computing Machinery","DOI":"10.1145\/502512.502550"},{"key":"549_CR15","doi-asserted-by":"crossref","unstructured":"Dhillon I, Mallela S, Modha D (2003) Information-theoretic co-clustering. In: Proceedings of the Ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp 89\u201398, New York, NY, USA. Association for Computing Machinery","DOI":"10.1145\/956750.956764"},{"key":"549_CR16","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1007\/s00357-009-9026-z","volume":"26","author":"A Farcomeni","year":"2009","unstructured":"Farcomeni A (2009) Robust double clustering: a method based on alternating concentration steps. J Classification 26:77\u2013101","journal-title":"J Classification"},{"key":"549_CR17","volume-title":"Robust methods for data reduction","author":"A Farcomeni","year":"2015","unstructured":"Farcomeni A, Greco L (2015) Robust methods for data reduction. CRC Press, London"},{"key":"549_CR18","first-page":"225","volume-title":"Fuzzy double clustering: a robust proposal","author":"M Ferraro","year":"2015","unstructured":"Ferraro M, Vichi M (2015) Fuzzy double clustering: a robust proposal. Springer, Cham, pp 225\u2013232"},{"issue":"1","key":"549_CR19","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1214\/009053604000000940","volume":"33","author":"MT Gallegos","year":"2005","unstructured":"Gallegos MT, Ritter G (2005) A robust method for cluster analysis. Ann Stat 33(1):347\u2013380. 
https:\/\/doi.org\/10.1214\/009053604000000940","journal-title":"Ann Stat"},{"issue":"3","key":"549_CR20","doi-asserted-by":"publisher","first-page":"1324","DOI":"10.1214\/07-AOS515","volume":"36","author":"LA Garc\u00eda-Escudero","year":"2008","unstructured":"Garc\u00eda-Escudero LA, Gordaliza A, Matr\u00e1n C, Mayo-Iscar A (2008) A general trimming approach to robust cluster analysis. Ann Stat 36(3):1324\u20131345","journal-title":"Ann Stat"},{"issue":"4","key":"549_CR21","doi-asserted-by":"publisher","first-page":"585","DOI":"10.1007\/s11222-010-9194-z","volume":"21","author":"LA Garc\u00eda-Escudero","year":"2011","unstructured":"Garc\u00eda-Escudero LA, Gordaliza A, Matr\u00e1n C, Mayo-Iscar A (2011) Exploring the number of groups in robust model-based clustering. Stat Comput 21(4):585\u2013599. https:\/\/doi.org\/10.1007\/s11222-010-9194-z","journal-title":"Stat Comput"},{"issue":"2","key":"549_CR22","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1016\/S0031-3203(02)00074-2","volume":"36","author":"G Govaert","year":"2003","unstructured":"Govaert G, Nadif M (2003) Clustering with block mixture models. Pattern Recogn 36(2):463\u2013473","journal-title":"Pattern Recogn"},{"issue":"4","key":"549_CR23","doi-asserted-by":"publisher","first-page":"643","DOI":"10.1109\/TPAMI.2005.69","volume":"27","author":"G Govaert","year":"2005","unstructured":"Govaert G, Nadif M (2005) An em algorithm for the block mixture model. IEEE Trans Pattern Anal Mach Intell 27(4):643\u2013647","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"6","key":"549_CR24","doi-asserted-by":"publisher","first-page":"3233","DOI":"10.1016\/j.csda.2007.09.007","volume":"52","author":"G Govaert","year":"2008","unstructured":"Govaert G, Nadif M (2008) Block clustering with Bernoulli mixture models: Comparison of different approaches. 
Comput Stat Data Anal 52(6):3233\u20133245","journal-title":"Comput Stat Data Anal"},{"key":"549_CR25","volume-title":"Co-Clustering: models, algorithms and applications","author":"G Govaert","year":"2014","unstructured":"Govaert G, Nadif M (2014) Co-Clustering: models, algorithms and applications. ISTE Ltd, London"},{"key":"549_CR26","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1007\/s11634-016-0274-6","volume":"12","author":"G Govaert","year":"2016","unstructured":"Govaert G, Nadif M (2016) Mutual information, phi-squared and model-based co-clustering for contingency tables. Adv Data Anal Classifi 12:455\u2013488","journal-title":"Adv Data Anal Classifi"},{"issue":"337","key":"549_CR27","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1080\/01621459.1972.10481214","volume":"67","author":"J Hartigan","year":"1972","unstructured":"Hartigan J (1972) Direct clustering of a data matrix. J Am Stat Assoc 67(337):123\u2013129","journal-title":"J Am Stat Assoc"},{"issue":"6","key":"549_CR28","doi-asserted-by":"publisher","first-page":"1154","DOI":"10.1016\/j.jmva.2007.07.002","volume":"99","author":"C Hennig","year":"2008","unstructured":"Hennig C (2008) Dissolution point and isolation robustness: robustness criteria for general cluster analysis methods. J Multivariate Anal 99(6):1154\u20131176","journal-title":"J Multivariate Anal"},{"issue":"7","key":"549_CR29","first-page":"321","volume":"4","author":"CAR Hoare","year":"1961","unstructured":"Hoare CAR (1961) Algorithm 65: Find. Commun ACM 4(7):321\u2013322","journal-title":"Commun ACM"},{"issue":"6","key":"549_CR30","doi-asserted-by":"publisher","first-page":"1201","DOI":"10.1007\/s11222-014-9472-2","volume":"25","author":"C Keribin","year":"2015","unstructured":"Keribin C, Brault V, Celeux G, Govaert G (2015) Estimation and selection for the latent block model on categorical data. 
Stat Comput 25(6):1201\u20131216","journal-title":"Stat Comput"},{"issue":"2","key":"549_CR31","doi-asserted-by":"publisher","first-page":"446","DOI":"10.1007\/s10618-018-0597-3","volume":"33","author":"C Laclau","year":"2019","unstructured":"Laclau C, Brault V (2019) Noise-free latent block model for high dimensional data. Data Mini Knowl Discov 33(2):446\u2013473","journal-title":"Data Mini Knowl Discov"},{"key":"549_CR32","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1109\/TCBB.2004.2","volume":"1","author":"S Madeira","year":"2004","unstructured":"Madeira S, Oliveira A (2004) Biclustering algorithms for biological data analysis: a survey. IEEE\/ACM Trans Comput Biol Bioinform 1:24\u201345","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"549_CR33","doi-asserted-by":"publisher","DOI":"10.1002\/0470010940","volume-title":"Robust statistics: theory and methods","author":"R Maronna","year":"2006","unstructured":"Maronna R, Martin D, Yohai V (2006) Robust statistics: theory and methods. Wiley, London"},{"issue":"2","key":"549_CR34","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1137\/1026034","volume":"26","author":"R Redner","year":"1984","unstructured":"Redner R, Walker H (1984) Mixture densities, maximum likelihood and the em algorithm. SIAM Rev 26(2):195\u2013239. https:\/\/doi.org\/10.1137\/1026034","journal-title":"SIAM Rev"},{"key":"549_CR35","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/j.chemolab.2012.03.017","volume":"116","author":"M Riani","year":"2012","unstructured":"Riani M, Perrotta D, Torti F (2012) FSDA: a MATLAB toolbox for robust analysis and interactive data exploration. Chemom Intell Lab Syst 116:17\u201332. 
https:\/\/doi.org\/10.1016\/j.chemolab.2012.03.017","journal-title":"Chemom Intell Lab Syst"},{"key":"549_CR36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v067.c01","volume":"67","author":"M Riani","year":"2015","unstructured":"Riani M, Perrotta D, Cerioli A (2015) The forward search for very large datasets. J Stat Softw 67:1","journal-title":"J Stat Softw"},{"issue":"5","key":"549_CR37","doi-asserted-by":"publisher","first-page":"1381","DOI":"10.1111\/rssc.12580","volume":"71","author":"M Riani","year":"2022","unstructured":"Riani M, Atkinson A, Torti F, Corbellini A (2022) Robust correspondence analysis. J R Stat Soc Ser C (Appl Stat) 71(5):1381\u20131401","journal-title":"J R Stat Soc Ser C (Appl Stat)"},{"key":"549_CR38","doi-asserted-by":"publisher","first-page":"158","DOI":"10.1007\/s00357-020-09379-w","volume":"38","author":"V Robert","year":"2021","unstructured":"Robert V, Vasseur Y, Brault V (2021) Comparing high-dimensional partitions with the Co-clustering Adjusted Rand Index. J Classif 38:158\u2013186","journal-title":"J Classif"},{"issue":"3","key":"549_CR39","doi-asserted-by":"publisher","first-page":"466","DOI":"10.1007\/s11749-012-0312-4","volume":"22","author":"C Ruwet","year":"2013","unstructured":"Ruwet C, Garc\u00eda-Escudero LA, Gordaliza A, Mayo-Iscar A (2013) On the breakdown behavior of the TCLUST clustering procedure. TEST Offic J Spanish Soc Stat Oper Res 22(3):466\u2013487. https:\/\/doi.org\/10.1007\/s11749-012-0312-4","journal-title":"TEST Offic J Spanish Soc Stat Oper Res"},{"key":"549_CR40","first-page":"514","volume":"7","author":"M Selosse","year":"2020","unstructured":"Selosse M, Jacques J, Biernacki C (2020) Textual data summarization using the self-organized co-clustering model. Pattern Recogn 7:514","journal-title":"Pattern Recogn"},{"key":"549_CR41","doi-asserted-by":"crossref","unstructured":"Shan H, Banerjee A (2008) Bayesian co-clustering. 
In: 2008 Eighth IEEE international conference on data mining, pp 530\u2013539","DOI":"10.1109\/ICDM.2008.91"},{"issue":"3","key":"549_CR42","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1007\/s10260-021-00569-3","volume":"30","author":"F Torti","year":"2021","unstructured":"Torti F, Riani M, Morelli G (2021) Semiautomatic robust regression clustering of international trade data. Stat Methods Appl 30(3):863\u2013894","journal-title":"Stat Methods Appl"},{"issue":"2","key":"549_CR43","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1016\/j.chemolab.2004.06.003","volume":"75","author":"S Verboven","year":"2005","unstructured":"Verboven S, Hubert M (2005) Libra: a matlab library for robust analysis. Chemom Intell Lab Syst 75(2):127\u2013136","journal-title":"Chemom Intell Lab Syst"},{"key":"549_CR44","first-page":"43","volume-title":"Advances in classification and data analysis","author":"M Vichi","year":"2001","unstructured":"Vichi M (2001) Double k-means clustering for simultaneous classification of objects and variables. In: Borra S, Rocci R, Vichi M, Schader M (eds) Advances in classification and data analysis. 
Springer, Berlin, pp 43\u201352"}],"container-title":["Advances in Data Analysis and Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-023-00549-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11634-023-00549-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-023-00549-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,28]],"date-time":"2024-10-28T19:44:11Z","timestamp":1730144651000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11634-023-00549-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,22]]},"references-count":44,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["549"],"URL":"https:\/\/doi.org\/10.1007\/s11634-023-00549-3","relation":{},"ISSN":["1862-5347","1862-5355"],"issn-type":[{"value":"1862-5347","type":"print"},{"value":"1862-5355","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,22]]},"assertion":[{"value":"13 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 May 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 June 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 September 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors 
have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}