{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T11:18:19Z","timestamp":1773832699029,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2021,8,16]],"date-time":"2021-08-16T00:00:00Z","timestamp":1629072000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Eiffel Scholarship Program of Excellence of Campus France","award":["P744468L"],"award-info":[{"award-number":["P744468L"]}]},{"name":"Project Hubert Curien-Carlos J. Finlay","award":["41814TM"],"award-info":[{"award-number":["41814TM"]}]},{"name":"Fondo Nacional de Desarrollo Cient\u00edfico y Tecnol\u00f3gico [CONICYT FONDECYT\/INACH\/POSTDOCTORADO","award":["3170107"],"award-info":[{"award-number":["3170107"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,12,22]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Classical Molecular Dynamics (MD) is a standard computational approach to model time-dependent processes at the atomic level. The inherent sparsity of increasingly huge generated trajectories demands clustering algorithms to reduce other post-simulation analysis complexity. The Quality Threshold (QT) variant is an appealing one from the vast number of available clustering methods. It guarantees that all members of a particular cluster will maintain a collective similarity established by a user-defined threshold. Unfortunately, its high computational cost for processing big data limits its application in the molecular simulation field.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In this work, we propose a methodological parallel between QT clustering and another well-known algorithm in the field of Graph Theory, the Maximum Clique Problem. Molecular trajectories are represented as graphs whose nodes designate conformations, while unweighted edges indicate mutual similarity between nodes. The use of a binary-encoded RMSD matrix coupled to the exploitation of bitwise operations to extract clusters significantly contributes to reaching a very affordable algorithm compared to the few implementations of QT for MD available in the literature. Our alternative provides results in good agreement with the exact one while strictly preserving the collective similarity of clusters.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The source code and documentation of BitQT are free and publicly available on GitHub (https:\/\/github.com\/LQCT\/BitQT.git) and ReadTheDocs (https:\/\/bitqt.readthedocs.io\/en\/latest\/), respectively.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab595","type":"journal-article","created":{"date-parts":[[2021,8,14]],"date-time":"2021-08-14T03:55:42Z","timestamp":1628913342000},"page":"73-79","source":"Crossref","is-referenced-by-count":9,"title":["BitQT: a graph-based approach to the quality threshold clustering of molecular dynamics"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3852-4902","authenticated-orcid":false,"given":"Roy","family":"Gonz\u00e1lez-Alem\u00e1n","sequence":"first","affiliation":[{"name":"Departamento de Qu\u00edmica-F\u00edsica, Laboratorio de Qu\u00edmica Computacional y Te\u00f3rica (LQCT), Facultad de Qu\u00edmica, Universidad de La Habana , La Habana 10400, Cuba"},{"name":"Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Universit\u00e9 Paris Saclay , Gif-sur-Yvette F-91198, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6454-4320","authenticated-orcid":false,"given":"Daniel","family":"Platero-Rochart","sequence":"additional","affiliation":[{"name":"Departamento de Qu\u00edmica-F\u00edsica, Laboratorio de Qu\u00edmica Computacional y Te\u00f3rica (LQCT), Facultad de Qu\u00edmica, Universidad de La Habana , La Habana 10400, Cuba"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2646-1644","authenticated-orcid":false,"given":"David","family":"Hern\u00e1ndez-Castillo","sequence":"additional","affiliation":[{"name":"Institute of Theoretical Chemistry, University of Vienna , Vienna 1090, Austria"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9231-7552","authenticated-orcid":false,"given":"Erix W","family":"Hern\u00e1ndez-Rodr\u00edguez","sequence":"additional","affiliation":[{"name":"Laboratorio de Bioinform\u00e1tica y Qu\u00edmica Computacional, Escuela de Qu\u00edmica y Farmacia, Facultad de Medicina, Universidad Cat\u00f3lica del Maule , Talca 3460000, Chile"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0182-1444","authenticated-orcid":false,"given":"Julio","family":"Caballero","sequence":"additional","affiliation":[{"name":"Departamento de Bioinform\u00e1tica, Facultad de Ingenier\u00eda, Centro de Bioinform\u00e1tica, Simulaci\u00f3n y Modelado (CBSM), Universidad de Talca , Talca 3460000, Chile"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5641-1525","authenticated-orcid":false,"given":"Fabrice","family":"Leclerc","sequence":"additional","affiliation":[{"name":"Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Universit\u00e9 Paris Saclay , Gif-sur-Yvette F-91198, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4128-1203","authenticated-orcid":false,"given":"Luis","family":"Montero-Cabrera","sequence":"additional","affiliation":[{"name":"Departamento de Qu\u00edmica-F\u00edsica, Laboratorio de Qu\u00edmica Computacional y Te\u00f3rica (LQCT), Facultad de Qu\u00edmica, Universidad de La Habana , La Habana 10400, Cuba"}]}],"member":"286","published-online":{"date-parts":[[2021,8,16]]},"reference":[{"key":"2023020108395046300_btab595-B1","first-page":"1","author":"Abraham","year":"2015"},{"key":"2023020108395046300_btab595-B2","first-page":"1068","author":"Danalis","year":"2012"},{"key":"2023020108395046300_btab595-B3","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1002\/(SICI)1521-3773(19990115)38:1\/2<236::AID-ANIE236>3.0.CO;2-M","article-title":"Peptide folding: when simulation meets experiment","volume":"38","author":"Daura","year":"1999","journal-title":"Angew. Chemie Int. Ed"},{"key":"2023020108395046300_btab595-B4","first-page":"1","author":"Dutta","year":"2011"},{"key":"2023020108395046300_btab595-B5","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1021\/acs.jcim.9b00828","article-title":"BitClust: fast geometrical clustering of long molecular dynamics simulations","volume":"60","author":"Gonz\u00e1lez-Alem\u00e1n","year":"2020","journal-title":"J. Chem. Inf. Model"},{"key":"2023020108395046300_btab595-B6","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1021\/acs.jcim.9b00558","article-title":"Quality threshold clustering of molecular dynamics: a word of caution","volume":"60","author":"Gonz\u00e1lez-Alem\u00e1n","year":"2020","journal-title":"J. Chem. Inf. Model"},{"key":"2023020108395046300_btab595-B7","doi-asserted-by":"crossref","first-page":"5458","DOI":"10.1021\/jp301442n","article-title":"Conformational landscape of N-glycosylated peptides detecting autoantibodies in multiple sclerosis, revealed by Hamiltonian replica exchange","volume":"116","author":"Guardiani","year":"2012","journal-title":"J. Phys. Chem. B"},{"key":"2023020108395046300_btab595-B8","doi-asserted-by":"crossref","first-page":"1106","DOI":"10.1101\/gr.9.11.1106","article-title":"Exploring expression data identification and analysis of coexpressed genes","volume":"9","author":"Heyer","year":"1999","journal-title":"Genome Res"},{"key":"2023020108395046300_btab595-B9","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. Classif"},{"key":"2023020108395046300_btab595-B10","doi-asserted-by":"crossref","first-page":"1528","DOI":"10.1016\/j.bpj.2015.08.015","article-title":"MDTraj: a modern open library for the analysis of molecular dynamics trajectories","volume":"109","author":"McGibbon","year":"2015","journal-title":"Biophys. J"},{"key":"2023020108395046300_btab595-B11","doi-asserted-by":"crossref","first-page":"6130","DOI":"10.1021\/acs.jctc.6b00757","article-title":"Uncovering large-scale conformational change in molecular dynamics without prior knowledge","volume":"12","author":"Melvin","year":"2016","journal-title":"J. Chem. Theory Comput"},{"key":"2023020108395046300_btab595-B12","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1007\/s13361-011-0097-9","article-title":"Production of reliable MALDI spectra with quality threshold clustering of replicates","volume":"22","author":"Olson","year":"2011","journal-title":"J. Am. Soc. Mass Spectrom"},{"key":"2023020108395046300_btab595-B13","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1063\/1674-0068\/31\/cjcp1806147","article-title":"Clustering algorithms to analyze molecular dynamics simulation trajectories for complex chemical and biological systems","volume":"31","author":"Peng","year":"2018","journal-title":"Chin. J. Chem. Phys"},{"key":"2023020108395046300_btab595-B14","doi-asserted-by":"crossref","first-page":"1848","DOI":"10.1002\/(SICI)1096-987X(19971130)18:15<1848::AID-JCC2>3.0.CO;2-O","article-title":"ORAC: a Molecular dynamics program to simulate complex molecular systems with realistic electrostatic interactions","volume":"18","author":"Procacci","year":"1997","journal-title":"J. Comput. Chem"},{"key":"2023020108395046300_btab595-B15","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective criteria for the evaluation of clustering methods","volume":"66","author":"Rand","year":"1971","journal-title":"J. Am. Stat. Assoc"},{"key":"2023020108395046300_btab595-B16","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1515\/jib-2016-300","article-title":"Clustering of biological datasets in the era of big data","volume":"13","author":"R\u00f6ttger","year":"2016","journal-title":"J. Integr. Bioinf"},{"key":"2023020108395046300_btab595-B17","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1007\/s10489-015-0646-1","article-title":"A novel clique formulation for the visual feature matching problem","volume":"43","author":"San Segundo","year":"2015","journal-title":"Appl. Intell"},{"key":"2023020108395046300_btab595-B18","first-page":"352","article-title":"A new implicit branching strategy for exact maximum clique","volume":"1","author":"San Segundo","year":"2010","journal-title":"Proc. Int. Conf. Tools Artif. Intell. ICTAI"},{"key":"2023020108395046300_btab595-B19","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/j.cor.2013.10.018","article-title":"Relaxed approximate coloring in exact maximum clique search","volume":"44","author":"San Segundo","year":"2014","journal-title":"Comput. Oper. Res"},{"key":"2023020108395046300_btab595-B20","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1007\/s11590-011-0431-y","article-title":"An improved bit parallel exact maximum clique algorithm","volume":"7","author":"San Segundo","year":"2013","journal-title":"Optim. Lett"},{"key":"2023020108395046300_btab595-B21","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.cor.2015.07.013","article-title":"A new exact maximum clique algorithm for large and massive sparse graphs","volume":"66","author":"San Segundo","year":"2016","journal-title":"Comput. Oper. Res"},{"key":"2023020108395046300_btab595-B22","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1007\/s11590-016-1019-3","article-title":"A parallel maximum clique algorithm for large and massive sparse graphs","volume":"11","author":"San Segundo","year":"2017","journal-title":"Optim. Lett"},{"key":"2023020108395046300_btab595-B23","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1080\/10556788.2017.1281924","article-title":"An enhanced bitstring encoding for exact maximum clique search in sparse graphs","volume":"32","author":"San Segundo","year":"2017","journal-title":"Optim. Methods Softw"},{"key":"2023020108395046300_btab595-B24","doi-asserted-by":"crossref","first-page":"2625","DOI":"10.1093\/bioinformatics\/btm378","article-title":"Wordom: a program for efficient analysis of molecular dynamics simulations","volume":"23","author":"Seeber","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020108395046300_btab595-B25","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1007\/978-1-4939-2978-8_15","article-title":"Studying the early stages of protein aggregation using replica exchange molecular dynamics simulations","volume":"1345","author":"Shea","year":"2016","journal-title":"Methods Mol. Biol"},{"key":"2023020108395046300_btab595-B26","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1037\/1082-989X.9.3.386","article-title":"Properties of the Hubert-Arabie adjusted Rand index","volume":"9","author":"Steinley","year":"2004","journal-title":"Psychol. Methods"},{"key":"2023020108395046300_btab595-B27","doi-asserted-by":"crossref","first-page":"506","DOI":"10.1107\/S0108767302011637","article-title":"A revised proof of the metric properties of optimally superimposed vector sets","volume":"58","author":"Steipe","year":"2002","journal-title":"Acta Crystallogr. Sect. A Found. Crystallogr"},{"key":"2023020108395046300_btab595-B28","first-page":"346","author":"Tang","year":"2010"},{"key":"2023020108395046300_btab595-B29","doi-asserted-by":"crossref","first-page":"2178","DOI":"10.1021\/acs.jcim.8b00512","article-title":"TTClust: a versatile molecular simulation trajectory clustering program with graphical summaries","volume":"58","author":"Tubiana","year":"2018","journal-title":"J. Chem. Inf. Model"},{"key":"2023020108395046300_btab595-B30","first-page":"6579","article-title":"Clustering: science or art?","volume":"27","author":"von Luxburg","year":"2012","journal-title":"JMLR Work. Conf. Proc"},{"key":"2023020108395046300_btab595-B31","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1016\/j.ejor.2014.09.064","article-title":"A review on algorithms for maximum clique problems","volume":"242","author":"Wu","year":"2015","journal-title":"Eur. J. Oper. Res"},{"key":"2023020108395046300_btab595-B32","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1007\/s10489-011-0310-3","article-title":"An insect classification analysis based on shape features using quality threshold ARTMAP and moment invariant","volume":"37","author":"Yaakob","year":"2012","journal-title":"Appl. Intell"},{"key":"2023020108395046300_btab595-B33","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1007\/s00521-009-0293-8","article-title":"A novel Euclidean quality threshold ARTMAP network and its application to pattern classification","volume":"19","author":"Yaakob","year":"2010","journal-title":"Neural Comput. Appl"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab595\/40351839\/btab595.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/1\/73\/49007438\/btab595.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/1\/73\/49007438\/btab595.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T19:56:39Z","timestamp":1675281399000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/1\/73\/6353027"}},"subtitle":[],"editor":[{"given":"Lenore","family":"Cowen","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,8,16]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12,22]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab595","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,1,1]]},"published":{"date-parts":[[2021,8,16]]}}}