{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,12]],"date-time":"2026-02-12T13:17:33Z","timestamp":1770902253552,"version":"3.50.1"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2025,5,29]],"date-time":"2025-05-29T00:00:00Z","timestamp":1748476800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001711","name":"Swiss National Science Foundation","doi-asserted-by":"publisher","award":["189363"],"award-info":[{"award-number":["189363"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>Despite the rapid growth of machine learning in biomolecular applications, information about protein dynamics is underutilized. Here, we introduce Nearl, an automated pipeline designed to extract dynamic features from large ensembles of molecular dynamics trajectories. Nearl aims to identify intrinsic patterns of molecular motion and to provide informative features for predictive modeling tasks. We implement two classes of dynamic features, termed marching observers and property-density flow, to capture local atomic motions while maintaining a view of the global configuration. Complemented by standard voxelization techniques, Nearl transforms substructures of proteins into three-dimensional (3D) grids, suitable for contemporary 3D convolutional neural networks (3D-CNNs). The pipeline leverages graphics processing unit (GPU) acceleration, adheres to the FAIR principles for research software, and prioritizes flexibility and user-friendliness, allowing customization of input formats and feature extraction.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The source code of Nearl is hosted at https:\/\/github.com\/miemiemmmm\/Nearl and archived at https:\/\/doi.org\/10.5281\/zenodo.15320286. The documentation is hosted on ReadTheDocs at https:\/\/nearl.readthedocs.io\/en\/latest\/. All pre-built models are implemented in PyTorch and available on GitHub.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf321","type":"journal-article","created":{"date-parts":[[2025,5,29]],"date-time":"2025-05-29T11:02:37Z","timestamp":1748516557000},"source":"Crossref","is-referenced-by-count":2,"title":["Nearl: extracting dynamic features from molecular dynamics trajectories for machine learning tasks"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0801-8064","authenticated-orcid":false,"given":"Yang","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Biochemistry, University of Zurich , Zurich, 8057,","place":["Switzerland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5422-5278","authenticated-orcid":false,"given":"Andreas","family":"Vitalis","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, University of Zurich , Zurich, 8057,","place":["Switzerland"]}]}],"member":"286","published-online":{"date-parts":[[2025,5,29]]},"reference":[{"key":"2025070713413928200_btaf321-B1","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/s40537-021-00419-9","article-title":"A survey on data-efficient algorithms in big data era","volume":"8","author":"Adadi","year":"2021","journal-title":"J Big Data"},{"key":"2025070713413928200_btaf321-B2","doi-asserted-by":"publisher","first-page":"5502","DOI":"10.1039\/d3sm00567d","article-title":"nanoNET: machine learning platform for predicting nanoparticles distribution in a polymer matrix","volume":"19","author":"Ayush","year":"2023","journal-title":"Soft Matter"},{"key":"2025070713413928200_btaf321-B3","doi-asserted-by":"publisher","first-page":"104105","DOI":"10.1063\/1.5063556","article-title":"On the removal of initial state bias from simulation data","volume":"150","author":"Bacci","year":"2019","journal-title":"J Chem Phys"},{"key":"2025070713413928200_btaf321-B4","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2025070713413928200_btaf321-B5","doi-asserted-by":"publisher","first-page":"D393","DOI":"10.1093\/nar\/gkad991","article-title":"A new paradigm for molecular dynamics databases: the COVID-19 database, the legacy of a titanic community effort","volume":"52","author":"Beltr\u00e1n","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2025070713413928200_btaf321-B6","doi-asserted-by":"publisher","first-page":"5481","DOI":"10.1021\/acs.jctc.5b00618","article-title":"Weighted distance functions improve analysis of high-dimensional data: application to molecular dynamics simulations","volume":"11","author":"Bl\u00f6chliger","year":"2015","journal-title":"J Chem Theory Comput"},{"key":"2025070713413928200_btaf321-B7","doi-asserted-by":"publisher","first-page":"6383","DOI":"10.1021\/acs.jctc.0c00604","article-title":"Sapphire-based clustering","volume":"16","author":"Cocina","year":"2020","journal-title":"J Chem Theory Comput"},{"key":"2025070713413928200_btaf321-B8","doi-asserted-by":"publisher","first-page":"5983","DOI":"10.21105\/joss.05983","article-title":"DeepRank2: mining 3D protein structures with geometric deep learning","volume":"9","author":"Crocioni","year":"2024","journal-title":"JOSS"},{"key":"2025070713413928200_btaf321-B9","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1126\/science.add2187","article-title":"Robust deep learning\u2013based protein sequence design using ProteinMPNN","volume":"378","author":"Dauparas","year":"2022","journal-title":"Science"},{"key":"2025070713413928200_btaf321-B10","doi-asserted-by":"publisher","author":"De Cao","year":"2018","DOI":"10.48550\/ARXIV.1805.11973"},{"key":"2025070713413928200_btaf321-B11","doi-asserted-by":"publisher","first-page":"74111","DOI":"10.1063\/5.0016009","article-title":"Machine learning Frenkel Hamiltonian parameters to accelerate simulations of exciton dynamics","volume":"153","author":"Farahvash","year":"2020","journal-title":"J Chem Phys"},{"key":"2025070713413928200_btaf321-B12","doi-asserted-by":"publisher","first-page":"3059","DOI":"10.1021\/acs.jcim.3c01906","article-title":"An unsupervised machine learning approach for the automatic construction of local chemical descriptors","volume":"64","author":"Gallegos","year":"2024","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B13","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","article-title":"Automatic chemical design using a data-driven continuous representation of molecules","volume":"4","author":"G\u00f3mez-Bombarelli","year":"2018","journal-title":"ACS Cent Sci"},{"key":"2025070713413928200_btaf321-B14","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1186\/s13321-023-00745-5","article-title":"3DDPDs: describing protein dynamics for proteochemometric bioactivity prediction. A case for (mutant) G protein-coupled receptors","volume":"15","author":"Gorostiola Gonz\u00e1lez","year":"2023","journal-title":"J Cheminform"},{"key":"2025070713413928200_btaf321-B15","doi-asserted-by":"publisher","first-page":"109633","DOI":"10.1016\/j.jcp.2020.109633","article-title":"Data-driven molecular modeling with the generalized Langevin equation","volume":"418","author":"Grogan","year":"2020","journal-title":"J Comput Phys"},{"key":"2025070713413928200_btaf321-B16","doi-asserted-by":"publisher","author":"Guo","year":"2025","DOI":"10.1101\/2025.02.04.636233"},{"key":"2025070713413928200_btaf321-B17","doi-asserted-by":"publisher","first-page":"2791","DOI":"10.1021\/acs.jcim.0c00075","article-title":"RosENet: improving binding affinity prediction by leveraging molecular mechanics energies with an ensemble of 3D convolutional neural networks","volume":"60","author":"Hassan-Harrirou","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B18","doi-asserted-by":"publisher","first-page":"1129","DOI":"10.1016\/j.neuron.2018.08.011","article-title":"Molecular dynamics simulation for all","volume":"99","author":"Hollingsworth","year":"2018","journal-title":"Neuron"},{"key":"2025070713413928200_btaf321-B19","doi-asserted-by":"publisher","first-page":"2386","DOI":"10.1021\/jacs.7b12191","article-title":"Markov state models: from an art to a science","volume":"140","author":"Husic","year":"2018","journal-title":"J Am Chem Soc"},{"key":"2025070713413928200_btaf321-B20","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1021\/acs.jcim.7b00650","article-title":"Kdeep: protein-ligand absolute binding affinity prediction via 3D-convolutional neural networks","volume":"58","author":"Jim\u00e9nez","year":"2018","journal-title":"Journal of Chemical Information and Modeling"},{"key":"2025070713413928200_btaf321-B21","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2025070713413928200_btaf321-B22","doi-asserted-by":"publisher","first-page":"646","DOI":"10.1038\/nsb0902-646","article-title":"Molecular dynamics simulations of biomolecules","volume":"9","author":"Karplus","year":"2002","journal-title":"Nat Struct Biol"},{"key":"2025070713413928200_btaf321-B23","doi-asserted-by":"publisher","first-page":"5516","DOI":"10.1021\/acs.jctc.3c00372","article-title":"Uncertainties in Markov state models of small proteins","volume":"19","author":"Kozlowski","year":"2023","journal-title":"J Chem Theory Comput"},{"key":"2025070713413928200_btaf321-B24","doi-asserted-by":"publisher","first-page":"bbae105","DOI":"10.1093\/bib\/bbae105","article-title":"GPCR-IPL score: multilevel featurization of GPCR\u2013ligand interaction patterns and prediction of ligand functions from selectivity to biased activation","volume":"25","author":"Kumar","year":"2024","journal-title":"Brief Bioinform"},{"key":"2025070713413928200_btaf321-B25","doi-asserted-by":"publisher","first-page":"244112","DOI":"10.1063\/5.0031979","article-title":"Modeling non-Markovian data using Markov state and Langevin models","volume":"153","author":"Lickert","year":"2020","journal-title":"J Chem Phys"},{"key":"2025070713413928200_btaf321-B26","doi-asserted-by":"publisher","first-page":"163","DOI":"10.1145\/37402.37422","article-title":"Marching cubes: a high resolution 3D surface construction algorithm","volume":"21","author":"Lorensen","year":"1987","journal-title":"SIGGRAPH Comput Graph"},{"key":"2025070713413928200_btaf321-B27","doi-asserted-by":"publisher","first-page":"5188","DOI":"10.1021\/acs.jcim.0c00558","article-title":"An ABSINTH-based protocol for predicting binding affinities between proteins and small molecules","volume":"60","author":"Marchand","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B28","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1038\/s41467-017-02388-1","article-title":"VAMPnets for deep learning of molecular kinetics","volume":"9","author":"Mardt","year":"2018","journal-title":"Nat Commun"},{"key":"2025070713413928200_btaf321-B29","doi-asserted-by":"publisher","author":"Maturana","year":"2015","DOI":"10.1109\/iros.2015.7353481"},{"key":"2025070713413928200_btaf321-B30","doi-asserted-by":"publisher","first-page":"585","DOI":"10.1038\/267585a0","article-title":"Dynamics of folded proteins","volume":"267","author":"McCammon","year":"1977","journal-title":"Nature"},{"key":"2025070713413928200_btaf321-B31","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1186\/s13321-021-00522-2","article-title":"GNINA 1.0: molecular docking with deep learning","volume":"13","author":"McNutt","year":"2021","journal-title":"J Cheminform"},{"key":"2025070713413928200_btaf321-B32","doi-asserted-by":"publisher","first-page":"1399","DOI":"10.1016\/j.str.2010.07.013","article-title":"MoDEL (Molecular Dynamics Extended Library): a database of atomistic molecular dynamics trajectories","volume":"18","author":"Meyer","year":"2010","journal-title":"Structure"},{"key":"2025070713413928200_btaf321-B33","doi-asserted-by":"publisher","first-page":"134114","DOI":"10.1063\/5.0061874","article-title":"Time-dependent principal component analysis: a unified approach to high-dimensional data reduction using adiabatic dynamics","volume":"155","author":"Morishita","year":"2021","journal-title":"J Chem Phys"},{"key":"2025070713413928200_btaf321-B34","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1186\/1758-2946-3-33","article-title":"Open Babel: an open chemical toolbox","volume":"3","author":"O'Boyle","year":"2011","journal-title":"Journal of Cheminformatics"},{"key":"2025070713413928200_btaf321-B35","author":"Paszke","year":"2019"},{"key":"2025070713413928200_btaf321-B36","doi-asserted-by":"publisher","first-page":"W591","DOI":"10.1093\/nar\/gkaa367","article-title":"Atomic charge calculator II: web-based tool for the calculation of partial atomic charges","volume":"48","author":"Ra\u010dek","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2025070713413928200_btaf321-B37","doi-asserted-by":"publisher","first-page":"7068","DOI":"10.1038\/s41467-021-27396-0","article-title":"DeepRank: a deep learning framework for data mining 3D protein-protein interfaces","volume":"12","author":"Renaud","year":"2021","journal-title":"Nat Commun"},{"key":"2025070713413928200_btaf321-B38","doi-asserted-by":"publisher","first-page":"726","DOI":"10.1021\/acs.jcim.6b00778","article-title":"Molecular dynamics fingerprints (MDFP): machine learning from MD data to predict free-energy differences","volume":"57","author":"Riniker","year":"2017","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B39","doi-asserted-by":"publisher","first-page":"777","DOI":"10.1038\/s41592-020-0884-y","article-title":"GPCRmd uncovers the dynamics of the 3D-GPCRome","volume":"17","author":"Rodr\u00edguez-Espigares","year":"2020","journal-title":"Nat Methods"},{"key":"2025070713413928200_btaf321-B40","doi-asserted-by":"publisher","first-page":"367","DOI":"10.1038\/s43588-024-00627-2","article-title":"MISATO: machine learning dataset of protein\u2013ligand complexes for structure-based drug discovery","volume":"4","author":"Siebenmorgen","year":"2024","journal-title":"Nat Comput Sci"},{"key":"2025070713413928200_btaf321-B41","doi-asserted-by":"publisher","first-page":"2356","DOI":"10.1021\/acs.jcim.9b00554","article-title":"DeeplyTough: learning structural comparison of protein binding sites","volume":"60","author":"Simonovsky","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B42","doi-asserted-by":"publisher","first-page":"3666","DOI":"10.1093\/bioinformatics\/bty374","article-title":"Development and evaluation of a deep learning model for protein-ligand binding affinity prediction","volume":"34","author":"Stepniewska-Dziubinska","year":"2018","journal-title":"Bioinformatics"},{"key":"2025070713413928200_btaf321-B43","doi-asserted-by":"publisher","first-page":"895","DOI":"10.1021\/acs.jcim.8b00545","article-title":"Comparative assessment of scoring functions: the CASF-2016 update","volume":"59","author":"Su","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B44","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1021\/acs.jcim.9b01145","article-title":"Libmolgrid: graphics processing unit accelerated molecular gridding for deep learning applications","volume":"60","author":"Sunseri","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2025070713413928200_btaf321-B45","first-page":"15642","author":"Townshend","year":"2019"},{"key":"2025070713413928200_btaf321-B46","author":"Townshend"},{"key":"2025070713413928200_btaf321-B47","doi-asserted-by":"publisher","first-page":"5115","DOI":"10.1038\/s41467-020-18959-8","article-title":"Learning molecular dynamics with simple language model built upon long short-term memory neural network","volume":"11","author":"Tsai","year":"2020","journal-title":"Nat Commun"},{"key":"2025070713413928200_btaf321-B48","doi-asserted-by":"publisher","first-page":"D384","DOI":"10.1093\/nar\/gkad1084","article-title":"ATLAS: protein flexibility description from atomistic molecular dynamics simulations","volume":"52","author":"Vander Meersche","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2025070713413928200_btaf321-B49","doi-asserted-by":"publisher","first-page":"2977","DOI":"10.1021\/jm030580l","article-title":"The PDBbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures","volume":"47","author":"Wang","year":"2004","journal-title":"J Med Chem"},{"key":"2025070713413928200_btaf321-B50","doi-asserted-by":"publisher","first-page":"15101","DOI":"10.1063\/5.0149207","article-title":"Optimized reaction coordinates for analysis of enhanced sampling","volume":"159","author":"Widmer","year":"2023","journal-title":"J Chem Phys"},{"key":"2025070713413928200_btaf321-B51","doi-asserted-by":"publisher","first-page":"101147","DOI":"10.1016\/j.patter.2024.101147","article-title":"Benchmarking the robustness of the correct identification of flexible 3D objects using common machine learning models","volume":"6","author":"Zhang","year":"2025","journal-title":"Patterns (N Y)"},{"key":"2025070713413928200_btaf321-B52","doi-asserted-by":"publisher","first-page":"100014","DOI":"10.1016\/j.crmeth.2021.100014","article-title":"Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations","volume":"1","author":"Zheng","year":"2021","journal-title":"Cell Rep Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf321\/63397357\/btaf321.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/7\/btaf321\/63397357\/btaf321.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/7\/btaf321\/63397357\/btaf321.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,7]],"date-time":"2025-07-07T17:41:43Z","timestamp":1751910103000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf321\/8152691"}},"subtitle":[],"editor":[{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,5,29]]},"references-count":52,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf321","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,5,29]]},"article-number":"btaf321"}}