{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,28]],"date-time":"2025-03-28T04:40:56Z","timestamp":1743136856645,"version":"3.40.3"},"publisher-location":"Cham","reference-count":49,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031723803"},{"type":"electronic","value":"9783031723810"}],"license":[{"start":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T00:00:00Z","timestamp":1726790400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T00:00:00Z","timestamp":1726790400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Computer-aided drug discovery gradually builds on previous work and requires reusable code to advance research. Currently, research code is mainly used to provide further insights into the original research whilst code reuse has a lower priority. Modularity, the segmentation of code for independent modules, promotes good coding practices and code reuse. The registry pattern has been proposed as a way to call functionalities dynamically, but it is currently overlooked as a shortcut to promote code reuse. In this work, we expand the registry pattern to better suit computer-aided drug discovery and achieve a unified, reusable, and interchangeable interface with optional meta information. Our reformulated pattern is particularly suitable for collaborative research with standardized frameworks where multiple internal and external modules are used interchangeably and coding is more focused on fast iteration over low-debt technical code, such as in machine learning-based research for drug discovery. In a workflow, we exemplify the usage of the design patterns. Additionally, we provide two case studies where we 1) showcase the effectiveness of registration in a larger collaborative research group, and 2) overview the potential of registration in currently available open-source tools. Finally, we empirically evaluate the registry pattern through previous implementations and indicate where additional functionality can improve its use.<\/jats:p>","DOI":"10.1007\/978-3-031-72381-0_9","type":"book-chapter","created":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T13:10:16Z","timestamp":1726751416000},"page":"98-115","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Registries in\u00a0Machine Learning-Based Drug Discovery: A Shortcut to\u00a0Code Reuse"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6406-6234","authenticated-orcid":false,"given":"Peter B. R.","family":"Hartog","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5598-0286","authenticated-orcid":false,"given":"Emma","family":"Svensson","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7271-0824","authenticated-orcid":false,"given":"Lewis","family":"Mervin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7624-7363","authenticated-orcid":false,"given":"Samuel","family":"Genheden","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4970-6461","authenticated-orcid":false,"given":"Ola","family":"Engkvist","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6855-0012","authenticated-orcid":false,"given":"Igor V.","family":"Tetko","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,9,20]]},"reference":[{"key":"9_CR1","unstructured":"Abadi, M., et\u00a0al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265\u2013283. USENIX Association (2016)"},{"issue":"18","key":"9_CR2","doi-asserted-by":"crossref","first-page":"10256","DOI":"10.3390\/su131810256","volume":"13","author":"SH Almadi","year":"2021","unstructured":"Almadi, S.H., Hooshyar, D., Ahmad, R.B.: Bad smells of gang of four design patterns: a decade systematic literature review. Sustainability 13(18), 10256 (2021)","journal-title":"Sustainability"},{"issue":"26","key":"9_CR3","first-page":"353","volume":"533","author":"M Baker","year":"2016","unstructured":"Baker, M.: Reproducibility crisis. Nature 533(26), 353\u201366 (2016)","journal-title":"Nature"},{"issue":"7","key":"9_CR4","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1287\/mnsc.1060.0546","volume":"52","author":"CY Baldwin","year":"2006","unstructured":"Baldwin, C.Y., Clark, K.B.: The architecture of participation: does code architecture mitigate free riding in the open source development model? Manage. Sci. 52(7), 1116\u20131127 (2006)","journal-title":"Manage. Sci."},{"key":"9_CR5","doi-asserted-by":"crossref","first-page":"69","DOI":"10.3389\/fninf.2017.00069","volume":"11","author":"FC Benureau","year":"2018","unstructured":"Benureau, F.C., Rougier, N.P.: Re-run, repeat, reproduce, reuse, replicate: transforming code into scientific contributions. Front. Neuroinform. 11, 69 (2018)","journal-title":"Front. Neuroinform."},{"key":"9_CR6","doi-asserted-by":"crossref","DOI":"10.7717\/peerj.13933","volume":"10","author":"L Cadwallader","year":"2022","unstructured":"Cadwallader, L., Hrynaszkiewicz, I.: A survey of researchers\u2019 code sharing and code reuse practices, and assessment of interactive notebook prototypes. PeerJ 10, e13933 (2022)","journal-title":"PeerJ"},{"issue":"6","key":"9_CR7","doi-asserted-by":"crossref","first-page":"1241","DOI":"10.1016\/j.drudis.2018.01.039","volume":"23","author":"H Chen","year":"2018","unstructured":"Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., Blaschke, T.: The rise of deep learning in drug discovery. Drug Discov. Today 23(6), 1241\u20131250 (2018)","journal-title":"Drug Discov. Today"},{"key":"9_CR8","unstructured":"Chollet, F., et\u00a0al.: Keras (2015). https:\/\/keras.io"},{"issue":"56","key":"9_CR9","first-page":"1","volume":"23","author":"VGT da Costa","year":"2022","unstructured":"da Costa, V.G.T., Fini, E., Nabi, M., Sebe, N., Ricci, E.: solo-learn: a library of self-supervised methods for visual representation learning. J. Mach. Learn. Res. 23(56), 1\u20136 (2022)","journal-title":"J. Mach. Learn. Res."},{"issue":"5","key":"9_CR10","doi-asserted-by":"crossref","DOI":"10.1002\/cpz1.113","volume":"1","author":"C Dallago","year":"2021","unstructured":"Dallago, C., et al.: Learned embeddings from deep learning to visualize and predict protein sets. Curr. Protoc. 1(5), e113 (2021)","journal-title":"Curr. Protoc."},{"issue":"3","key":"9_CR11","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1007\/s10462-021-10058-4","volume":"55","author":"S Dara","year":"2022","unstructured":"Dara, S., Dhamercherla, S., Jadav, S.S., Babu, C., Ahsan, M.J.: Machine learning in drug discovery: a review. Artif. Intell. Rev. 55(3), 1947\u20131999 (2022)","journal-title":"Artif. Intell. Rev."},{"key":"9_CR12","unstructured":"Fowler, M., Rice, D., Foemmel, M., Hieatt, E., Mee, R., Stafford, R.: Patterns of Enterprise Application Architecture. Addison-Wesley Professional (2002)"},{"key":"9_CR13","volume-title":"Design Patterns: Elements of Reusable Object-Oriented Software","author":"E Gamma","year":"1995","unstructured":"Gamma, E., Helm, R., Johnson, R., Vlissides, J.M.: Design Patterns: Elements of Reusable Object-Oriented Software. Pearson Deutschland GmbH, Munich (1995)"},{"key":"9_CR14","unstructured":"Goyal, S.: More data science, less engineering: a Netflix original. In: 2020 USENIX Conference on Operational Machine Learning (2020)"},{"issue":"2","key":"9_CR15","doi-asserted-by":"publisher","first-page":"108","DOI":"10.3390\/info11020108","volume":"11","author":"J Howard","year":"2020","unstructured":"Howard, J., Gugger, S.: FastAI: a layered API for deep learning. Information 11(2), 108 (2020)","journal-title":"Information"},{"key":"9_CR16","unstructured":"Huang, K., et\u00a0al.: Therapeutics data commons: machine learning datasets and tasks for drug discovery and development. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1) (2021)"},{"key":"9_CR17","volume-title":"The Pragmatic Programmer","author":"A Hunt","year":"1999","unstructured":"Hunt, A., Thomas, D.: The Pragmatic Programmer. Addison-Wesley, Boston, United States (1999)"},{"key":"9_CR18","doi-asserted-by":"crossref","unstructured":"Hussain, S., Keung, J., Khan, A.A.: The effect of gang-of-four design patterns usage on design quality attributes. In: 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 263\u2013273. IEEE (2017)","DOI":"10.1109\/QRS.2017.37"},{"key":"9_CR19","doi-asserted-by":"crossref","unstructured":"Hussain, S., Keung, J., Khan, A.A., Bennin, K.E.: Correlation Between the Frequent Use of Gang-of-four Design Patterns and Structural Complexity. In: 2017 24th Asia-Pacific Software Engineering Conference (APSEC), pp. 189\u2013198. IEEE (2017)","DOI":"10.1109\/APSEC.2017.25"},{"key":"9_CR20","doi-asserted-by":"crossref","unstructured":"Jaspan, C., et\u00a0al.: Advantages and disadvantages of a monolithic repository: a case study at Google. In: Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, pp. 225\u2013234 (2018)","DOI":"10.1145\/3183519.3183550"},{"key":"9_CR21","doi-asserted-by":"publisher","unstructured":"Landrum, G.: RDKit: Open-Source Cheminformatics (2006). https:\/\/doi.org\/10.5281\/zenodo.6961488, http:\/\/www.rdkit.org","DOI":"10.5281\/zenodo.6961488"},{"key":"9_CR22","unstructured":"Lhoest, Q., et\u00a0al.: Datasets: a community library for natural language processing. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 175\u2013184. Association for Computational Linguistics (2021)"},{"issue":"6","key":"9_CR23","doi-asserted-by":"crossref","first-page":"1811","DOI":"10.1145\/197320.197383","volume":"16","author":"BH Liskov","year":"1994","unstructured":"Liskov, B.H., Wing, J.M.: A behavioral notion of subtyping. ACM Trans. Program. Lang. Syst. 16(6), 1811\u20131841 (1994)","journal-title":"ACM Trans. Program. Lang. Syst."},{"issue":"1","key":"9_CR24","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1038\/s41524-023-01028-1","volume":"9","author":"M Manica","year":"2023","unstructured":"Manica, M., et al.: Accelerating material design with the generative toolkit for scientific discovery. NPJ Comput. Mater. 9(1), 69 (2023)","journal-title":"NPJ Comput. Mater."},{"key":"9_CR25","unstructured":"Martin, R.C.: The dependency inversion principle. C++ Report 8(6), 61\u201366 (1996)"},{"issue":"34","key":"9_CR26","first-page":"597","volume":"1","author":"RC Martin","year":"2000","unstructured":"Martin, R.C.: Design principles and design patterns. Object Mentor 1(34), 597 (2000)","journal-title":"Object Mentor"},{"key":"9_CR27","doi-asserted-by":"publisher","unstructured":"Mary, H., et\u00a0al.: Datamol: molecular manipulation made easy (2022). https:\/\/doi.org\/10.5281\/zenodo.6856321, https:\/\/datamol.io\/","DOI":"10.5281\/zenodo.6856321"},{"issue":"4","key":"9_CR28","doi-asserted-by":"crossref","first-page":"1955","DOI":"10.1021\/acs.jcim.9b01053","volume":"60","author":"AJ Minnich","year":"2020","unstructured":"Minnich, A.J., et al.: AMPL: a data-driven modeling pipeline for drug discovery. J. Chem. Inf. Model. 60(4), 1955\u20131968 (2020)","journal-title":"J. Chem. Inf. Model."},{"key":"9_CR29","unstructured":"Gnu general public license, version 3. https:\/\/opensource.org\/licenses\/MIT. Accessed 17 January 2022"},{"key":"9_CR30","doi-asserted-by":"crossref","unstructured":"Mo, R., Cai, Y., Kazman, R., Xiao, L., Feng, Q.: Decoupling level: a new metric for architectural maintenance complexity. In: 2016 IEEE\/ACM 38th International Conference on Software Engineering (ICSE), pp. 499\u2013510. IEEE (2016)","DOI":"10.1145\/2884781.2884825"},{"issue":"1","key":"9_CR31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1758-2946-3-1","volume":"3","author":"NM O\u2019Boyle","year":"2011","unstructured":"O\u2019Boyle, N.M., Banck, M., James, C.A., Morley, C., Vandermeersch, T., Hutchison, G.R.: Open Babel: an open chemical toolbox. J. Cheminf. 3(1), 1\u201314 (2011)","journal-title":"J. Cheminf."},{"key":"9_CR32","unstructured":"Paszke, A., et\u00a0al.: PyTorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol.\u00a032 (2019)"},{"key":"9_CR33","volume-title":"Software Engineering: A Practitioner\u2019s Approach","author":"RS Pressman","year":"2005","unstructured":"Pressman, R.S.: Software Engineering: A Practitioner\u2019s Approach. Palgrave Macmillan, Gurgaon, India (2005)"},{"key":"9_CR34","volume-title":"Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More","author":"B Ramsundar","year":"2019","unstructured":"Ramsundar, B., Eastman, P., Walters, P., Pande, V.: Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More. O\u2019Reilly Media, Sebastopol (2019)"},{"key":"9_CR35","doi-asserted-by":"crossref","unstructured":"Sarjoughian, H.S.: Model composability. In: Proceedings of the 2006 Winter Simulation Conference, pp. 149\u2013158. IEEE (2006)","DOI":"10.1109\/WSC.2006.323047"},{"issue":"2","key":"9_CR36","first-page":"43","volume":"3","author":"C Sieb","year":"2007","unstructured":"Sieb, C., Meinl, T., Berthold, M.R.: Parallel and distributed data pipelining with KNIME. Mediterr. J. Comput. Netw. 3(2), 43\u201351 (2007)","journal-title":"Mediterr. J. Comput. Netw."},{"issue":"1","key":"9_CR37","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-3-28","volume":"3","author":"JC St\u00e5lring","year":"2011","unstructured":"St\u00e5lring, J.C., Carlsson, L.A., Almeida, P., Boyer, S.: AZOrange - high performance open source machine learning for QSAR modeling in a graphical programming environment. J. Cheminf. 3(1), 1\u201310 (2011)","journal-title":"J. Cheminf."},{"issue":"2","key":"9_CR38","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck, C., Han, Y., Kuhn, S., Horlacher, O., Luttmann, E., Willighagen, E.: The Chemistry Development Kit (CDK): an open-source java library for chemo-and bioinformatics. J. Chem. Inf. Comput. Sci. 43(2), 493\u2013500 (2003)","journal-title":"J. Chem. Inf. Comput. Sci."},{"issue":"5","key":"9_CR39","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1145\/503271.503224","volume":"26","author":"KJ Sullivan","year":"2001","unstructured":"Sullivan, K.J., Griswold, W.G., Cai, Y., Hallen, B.: The structure and value of modularity in software design. ACM SIGSOFT Softw. Eng. Notes 26(5), 99\u2013108 (2001)","journal-title":"ACM SIGSOFT Softw. Eng. Notes"},{"key":"9_CR40","doi-asserted-by":"crossref","unstructured":"Sushko, I., et\u00a0al.: Online Chemical Modeling Environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J. Comput. Aided Mol. Des. 25(6), 533\u2013554 (2011)","DOI":"10.1007\/s10822-011-9440-2"},{"key":"9_CR41","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1016\/B978-0-12-809633-8.20157-X","volume-title":"Encyclopedia of Bioinformatics and Computational Biology","author":"V Tomar","year":"2019","unstructured":"Tomar, V., Mazumder, M., Chandra, R., Yang, J., Sakharkar, M.K.: Small molecule drug design. In: Ranganathan, S., Gribskov, M., Nakai, K., Sch\u00f6nbach, C. (eds.) Encyclopedia of Bioinformatics and Computational Biology, pp. 741\u2013760. Academic Press, Oxford (2019)"},{"issue":"6","key":"9_CR42","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1038\/s41573-019-0024-5","volume":"18","author":"J Vamathevan","year":"2019","unstructured":"Vamathevan, J., et al.: Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18(6), 463\u2013477 (2019)","journal-title":"Nat. Rev. Drug Discov."},{"key":"9_CR43","doi-asserted-by":"publisher","unstructured":"William, F.: PyTorch Lightning (2019). https:\/\/doi.org\/10.5281\/zenodo.3828935, https:\/\/www.pytorchlightning.ai","DOI":"10.5281\/zenodo.3828935"},{"issue":"1","key":"9_CR44","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13321-015-0078-2","volume":"7","author":"M W\u00f3jcikowski","year":"2015","unstructured":"W\u00f3jcikowski, M., Zielenkiewicz, P., Siedlecki, P.: Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field. J. Cheminf. 7(1), 1\u20136 (2015)","journal-title":"J. Cheminf."},{"key":"9_CR45","unstructured":"Wolf, T., et\u00a0al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38\u201345. Association for Computational Linguistics (2020)"},{"issue":"4","key":"9_CR46","doi-asserted-by":"crossref","first-page":"344","DOI":"10.3390\/e21040344","volume":"21","author":"Y Xiang","year":"2019","unstructured":"Xiang, Y., Pan, W., Jiang, H., Zhu, Y., Li, H.: Measuring software modularity based on software networks. Entropy 21(4), 344 (2019)","journal-title":"Entropy"},{"issue":"4","key":"9_CR47","first-page":"39","volume":"41","author":"M Zaharia","year":"2018","unstructured":"Zaharia, M., et al.: Accelerating the machine learning lifecycle with MLflow. IEEE Data Eng. Bull. 41(4), 39\u201345 (2018)","journal-title":"IEEE Data Eng. Bull."},{"issue":"5","key":"9_CR48","doi-asserted-by":"crossref","first-page":"1213","DOI":"10.1109\/TSE.2011.79","volume":"38","author":"C Zhang","year":"2011","unstructured":"Zhang, C., Budgen, D.: What do we know about the effectiveness of software design patterns? IEEE Trans. Softw. Eng. 38(5), 1213\u20131231 (2011)","journal-title":"IEEE Trans. Softw. Eng."},{"key":"9_CR49","unstructured":"Zhu, Z., et\u00a0al.: TorchDrug: A powerful and flexible machine learning platform for drug discovery. arXiv preprint arXiv:2202.08320 (2022)"}],"container-title":["Lecture Notes in Computer Science","AI in Drug Discovery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-72381-0_9","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,28]],"date-time":"2024-11-28T11:44:33Z","timestamp":1732794273000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-72381-0_9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,20]]},"ISBN":["9783031723803","9783031723810"],"references-count":49,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-72381-0_9","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2024,9,20]]},"assertion":[{"value":"20 September 2024","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"The authors have no competing interests to declare that are relevant to the content of this article.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Disclosure of Interests"}},{"value":"AIDD","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Workshop on AI in Drug Discovery","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Lugano","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Switzerland","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2024","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"19 September 2024","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"19 September 2024","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"1","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"aidd2024","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/e-nns.org\/wp-content\/uploads\/2024\/ICANN2024-AIDrugDis-CfP_final.pdf","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}