{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:30:03Z","timestamp":1772166603967,"version":"3.50.1"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T00:00:00Z","timestamp":1613952000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T00:00:00Z","timestamp":1613952000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000865","name":"Bill and Melinda Gates Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000865","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100013060","name":"European Molecular Biology Laboratory","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100013060","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Malaria is a disease affecting hundreds of millions of people across the world, mainly in developing countries and especially in sub-Saharan Africa. It is the cause of hundreds of thousands of deaths each year and there is an ever-present need to identify and develop effective new therapies to tackle the disease and overcome increasing drug resistance. Here, we extend a previous study in which a number of partners collaborated to develop a consensus in silico model that can be used to identify novel molecules that may have antimalarial properties. 
The performance of machine learning methods generally improves with the number of data points available for training. One practical challenge in building large training sets is that the data are often proprietary and cannot be straightforwardly integrated. Here, this was addressed by sharing QSAR models, each built on a private data set. We describe the development of an open-source software platform for creating such models, a comprehensive evaluation of methods to create a single consensus model and a web platform called MAIP available at\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/www.ebi.ac.uk\/chembl\/maip\/\">https:\/\/www.ebi.ac.uk\/chembl\/maip\/<\/jats:ext-link>\n                    . MAIP is freely available for the wider community to make large-scale predictions of potential malaria inhibiting compounds. This project also highlights some of the practical challenges in reproducing published computational methods and the opportunities that open-source software can offer to the community.\n                  <\/jats:p>","DOI":"10.1186\/s13321-021-00487-2","type":"journal-article","created":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T06:04:47Z","timestamp":1613973887000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["MAIP: a web service for predicting blood\u2010stage malaria inhibitors"],"prefix":"10.1186","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3562-1328","authenticated-orcid":false,"given":"Nicolas","family":"Bosc","sequence":"first","affiliation":[]},{"given":"Eloy","family":"Felix","sequence":"additional","affiliation":[]},{"given":"Ricardo","family":"Arcila","sequence":"additional","affiliation":[]},{"given":"David","family":"Mendez","sequence":"additional","affiliation":[]},{"given":"Martin 
R.","family":"Saunders","sequence":"additional","affiliation":[]},{"given":"Darren V. S.","family":"Green","sequence":"additional","affiliation":[]},{"given":"Jason","family":"Ochoada","sequence":"additional","affiliation":[]},{"given":"Anang A.","family":"Shelat","sequence":"additional","affiliation":[]},{"given":"Eric J.","family":"Martin","sequence":"additional","affiliation":[]},{"given":"Preeti","family":"Iyer","sequence":"additional","affiliation":[]},{"given":"Ola","family":"Engkvist","sequence":"additional","affiliation":[]},{"given":"Andreas","family":"Verras","sequence":"additional","affiliation":[]},{"given":"James","family":"Duffy","sequence":"additional","affiliation":[]},{"given":"Jeremy","family":"Burrows","sequence":"additional","affiliation":[]},{"given":"J. Mark F.","family":"Gardner","sequence":"additional","affiliation":[]},{"given":"Andrew R.","family":"Leach","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,2,22]]},"reference":[{"key":"487_CR1","unstructured":"WHO (2019) World malaria report 2019"},{"key":"487_CR2","doi-asserted-by":"publisher","first-page":"917","DOI":"10.1038\/nm.4381","volume":"23","author":"B Blasco","year":"2017","unstructured":"Blasco B, Leroy D, Fidock DA (2017) Antimalarial drug resistance: linking Plasmodium falciparum parasite biology to the clinic. Nat Med 23:917\u2013928. https:\/\/doi.org\/10.1038\/nm.4381","journal-title":"Nat Med"},{"key":"487_CR3","doi-asserted-by":"publisher","first-page":"e84555","DOI":"10.1371\/journal.pone.0084555","volume":"9","author":"K Bruxvoort","year":"2014","unstructured":"Bruxvoort K, Goodman C, Kachur SP, Schellenberg D (2014) How patients take malaria treatment: A systematic review of the literature on adherence to antimalarial drugs. PLoS ONE 9:e84555. 
https:\/\/doi.org\/10.1371\/journal.pone.0084555","journal-title":"PLoS ONE"},{"key":"487_CR4","doi-asserted-by":"publisher","first-page":"e1000221","DOI":"10.1371\/journal.pmed.1000221","volume":"7","author":"S Dellicour","year":"2010","unstructured":"Dellicour S, Tatem AJ, Guerra CA et al (2010) Quantifying the Number of Pregnancies at Risk of Malaria in 2007: A Demographic Study. PLoS Medicine 7:e1000221. https:\/\/doi.org\/10.1371\/journal.pmed.1000221","journal-title":"PLoS Medicine"},{"key":"487_CR5","doi-asserted-by":"publisher","unstructured":"Plouffe D, Brinker A, McNamara C et al (2008) In silico activity profiling reveals the mechanism of action of antimalarials discovered in a high-throughput screen. Proceedings of the National Academy of Sciences 105:9059\u20139064. https:\/\/doi.org\/10.1073\/pnas.0802982105","DOI":"10.1073\/pnas.0802982105"},{"key":"487_CR6","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1038\/nature09107","volume":"465","author":"F-J Gamo","year":"2010","unstructured":"Gamo F-J, Sanz LM, Vidal J et al (2010) Thousands of chemical starting points for antimalarial lead identification. Nature 465:305\u2013310. https:\/\/doi.org\/10.1038\/nature09107","journal-title":"Nature"},{"key":"487_CR7","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1038\/nature09099","volume":"465","author":"WA Guiguemde","year":"2010","unstructured":"Guiguemde WA, Shelat AA, Bouck D et al (2010) Chemical genetics of Plasmodium falciparum. Nature 465:311\u2013315. https:\/\/doi.org\/10.1038\/nature09099","journal-title":"Nature"},{"key":"487_CR8","doi-asserted-by":"publisher","first-page":"17050","DOI":"10.1038\/nrdp.2017.50","volume":"3","author":"MA Phillips","year":"2017","unstructured":"Phillips MA, Burrows JN, Manyando C et al (2017) Nature reviews disease primers. Malaria 3:17050. 
https:\/\/doi.org\/10.1038\/nrdp.2017.50","journal-title":"Malaria"},{"key":"487_CR9","doi-asserted-by":"publisher","unstructured":"LaMonte GM, Rocamora F, Marapana DS et al (2020) Pan-active imidazolopiperazine antimalarials target the Plasmodium falciparum intracellular secretory pathway. Nat Commun 11:. https:\/\/doi.org\/10.1038\/s41467-020-15440-4","DOI":"10.1038\/s41467-020-15440-4"},{"key":"487_CR10","doi-asserted-by":"publisher","first-page":"948","DOI":"10.1038\/nrd4128","volume":"12","author":"JG Cumming","year":"2013","unstructured":"Cumming JG, Davis AM, Muresan S et al (2013) Chemical predictive modelling to improve compound quality. Nat Rev Drug Discovery 12:948\u2013962. https:\/\/doi.org\/10.1038\/nrd4128","journal-title":"Nat Rev Drug Discovery"},{"key":"487_CR11","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1038\/s41573-019-0024-5","volume":"18","author":"J Vamathevan","year":"2019","unstructured":"Vamathevan J, Clark D, Czodrowski P et al (2019) Applications of machine learning in drug discovery and development. Nat Rev Drug Discovery 18:463\u2013477. https:\/\/doi.org\/10.1038\/s41573-019-0024-5","journal-title":"Nat Rev Drug Discovery"},{"key":"487_CR12","doi-asserted-by":"publisher","first-page":"4977","DOI":"10.1021\/jm4004285","volume":"57","author":"A Cherkasov","year":"2014","unstructured":"Cherkasov A, Muratov EN, Fourches D et al (2014) QSAR modeling: where have you been? Where are you going to? J Med Chem 57:4977\u20135010. https:\/\/doi.org\/10.1021\/jm4004285","journal-title":"J Med Chem"},{"key":"487_CR13","doi-asserted-by":"publisher","first-page":"D930","DOI":"10.1093\/nar\/gky1075","volume":"47","author":"D Mendez","year":"2019","unstructured":"Mendez D, Gaulton A, Bento AP et al (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 47:D930\u2013D940. 
https:\/\/doi.org\/10.1093\/nar\/gky1075","journal-title":"Nucleic Acids Res"},{"key":"487_CR14","doi-asserted-by":"publisher","first-page":"445","DOI":"10.1021\/acs.jcim.6b00572","volume":"57","author":"A Verras","year":"2017","unstructured":"Verras A, Waller CL, Gedeck P et al (2017) Shared consensus machine learning models for predicting blood stage malaria inhibition. J Chem Inf Model 57:445\u2013453. https:\/\/doi.org\/10.1021\/acs.jcim.6b00572","journal-title":"J Chem Inf Model"},{"key":"487_CR15","doi-asserted-by":"publisher","first-page":"673","DOI":"10.1021\/acs.jcim.7b00523","volume":"58","author":"M Patel","year":"2018","unstructured":"Patel M, Chilton ML, Sartini A et al (2018) Assessment and reproducibility of quantitative structure\u2013activity relationship models by the nonexpert. J Chem Inf Model 58:673\u2013682. https:\/\/doi.org\/10.1021\/acs.jcim.7b00523","journal-title":"J Chem Inf Model"},{"key":"487_CR16","unstructured":"Haibe-Kains B, Adam GA, Hosny A et al (2020) The importance of transparency and reproducibility in artificial intelligence research. arXiv 2003.00898"},{"key":"487_CR17","doi-asserted-by":"publisher","first-page":"D1102","DOI":"10.1093\/nar\/gky1033","volume":"47","author":"S Kim","year":"2019","unstructured":"Kim S, Chen J, Cheng T et al (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res 47:D1102\u2013D1109. https:\/\/doi.org\/10.1093\/nar\/gky1033","journal-title":"Nucleic Acids Res"},{"key":"487_CR18","volume-title":"2017.2.0.1361","author":"BIOVIA Dassault Syst\u00e8mes","year":"2016","unstructured":"Dassault Syst\u00e8mes BIOVIA, Pipeline, Pilot (2016) 2017.2.0.1361. Dassault Syst\u00e8mes, San Diego"},{"key":"487_CR19","unstructured":"RDKit: Open-Source Cheminformatics. 
http:\/\/www.rdkit.org"},{"key":"487_CR20","doi-asserted-by":"publisher","first-page":"868","DOI":"10.1021\/ci990307l","volume":"39","author":"SA Wildman","year":"1999","unstructured":"Wildman SA, Crippen GM (1999) Prediction of physicochemical parameters by atomic contributions. J Chem Inf Comput Sci 39:868\u2013873. https:\/\/doi.org\/10.1021\/ci990307l","journal-title":"J Chem Inf Comput Sci"},{"key":"487_CR21","doi-asserted-by":"publisher","first-page":"1124","DOI":"10.1021\/ci060003g","volume":"46","author":"M Nidhi, Glick","year":"2006","unstructured":"Nidhi, Glick M, Davies JW, Jenkins JL (2006) Prediction of biological targets for compounds using multiple-category Bayesian Models trained on chemogenomics databases. J Chem Inf Model 46:1124\u20131133. https:\/\/doi.org\/10.1021\/ci060003g","journal-title":"J Chem Inf Model"},{"key":"487_CR22","doi-asserted-by":"publisher","first-page":"4463","DOI":"10.1021\/jm0303195","volume":"47","author":"X Xia","year":"2004","unstructured":"Xia X, Maliski EG, Gallant P, Rogers D (2004) Classification of kinase inhibitors using a Bayesian Model. J Med Chem 47:4463\u20134470. https:\/\/doi.org\/10.1021\/jm0303195","journal-title":"J Med Chem"},{"key":"487_CR23","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"487_CR24","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1021\/ci600426e","volume":"47","author":"J-F Truchon","year":"2007","unstructured":"Truchon J-F, Bayly CI (2007) Evaluating virtual screening methods: good and bad metrics for the \u201cearly recognition\u201d problem. J Chem Inf Model 47:488\u2013508. 
https:\/\/doi.org\/10.1021\/ci600426e","journal-title":"J Chem Inf Model"},{"key":"487_CR25","first-page":"2579","volume":"9","author":"L van der Maaten","year":"2008","unstructured":"van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579\u20132605","journal-title":"J Mach Learn Res"},{"key":"487_CR26","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50:742\u2013754. https:\/\/doi.org\/10.1021\/ci100050t","journal-title":"J Chem Inf Model"},{"key":"487_CR27","doi-asserted-by":"publisher","first-page":"1315","DOI":"10.1016\/j.jmgm.2008.01.002","volume":"26","author":"S Weaver","year":"2008","unstructured":"Weaver S, Gleeson MP (2008) The importance of the domain of applicability in QSAR modeling. J Mol Graph Model 26:1315\u20131326. https:\/\/doi.org\/10.1016\/j.jmgm.2008.01.002","journal-title":"J Mol Graph Model"},{"key":"487_CR28","doi-asserted-by":"publisher","first-page":"4791","DOI":"10.3390\/molecules17054791","volume":"17","author":"F Sahigara","year":"2012","unstructured":"Sahigara F, Mansouri K, Ballabio D et al (2012) Comparison of different approaches to define the applicability domain of QSAR models. Molecules 17:4791\u20134810. https:\/\/doi.org\/10.3390\/molecules17054791","journal-title":"Molecules"},{"key":"487_CR29","doi-asserted-by":"publisher","first-page":"814","DOI":"10.1021\/ci300004n","volume":"52","author":"RP Sheridan","year":"2012","unstructured":"Sheridan RP (2012) Three useful dimensions for domain applicability in QSAR models using random forest. J Chem Inf Model 52:814\u2013823. 
https:\/\/doi.org\/10.1021\/ci300004n","journal-title":"J Chem Inf Model"},{"key":"487_CR30","doi-asserted-by":"publisher","first-page":"1596","DOI":"10.1021\/ci5001168","volume":"54","author":"U Norinder","year":"2014","unstructured":"Norinder U, Carlsson L, Boyer S, Eklund M (2014) Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination. J Chem Inf Model 54:1596\u20131603. https:\/\/doi.org\/10.1021\/ci5001168","journal-title":"J Chem Inf Model"},{"key":"487_CR31","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/s13321-018-0325-4","volume":"11","author":"N Bosc","year":"2019","unstructured":"Bosc N, Atkinson F, Felix E et al (2019) Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery. J Cheminform 11:4. https:\/\/doi.org\/10.1186\/s13321-018-0325-4","journal-title":"J Cheminform"},{"key":"487_CR32","doi-asserted-by":"crossref","unstructured":"Cort\u00e9s-Ciriano I, Bender A (2019) Concepts and applications of conformal prediction in computational drug discovery. arXiv:190803569 [cs, q-bio]","DOI":"10.1039\/9781788016841-00063"},{"key":"487_CR33","doi-asserted-by":"publisher","first-page":"1221","DOI":"10.1021\/acs.jcim.8b00640","volume":"59","author":"APA Janssen","year":"2019","unstructured":"Janssen APA, Grimm SH, Wijdeven RHM et al (2019) Drug discovery maps, a machine learning model that visualizes and predicts Kinome\u2013inhibitor interaction landscapes. J Chem Inf Model 59:1221\u20131229. https:\/\/doi.org\/10.1021\/acs.jcim.8b00640","journal-title":"J Chem Inf Model"},{"key":"487_CR34","doi-asserted-by":"publisher","first-page":"5151","DOI":"10.1039\/C8RA10182E","volume":"9","author":"DS Karlov","year":"2019","unstructured":"Karlov DS, Sosnin S, Tetko IV, Fedorov MV (2019) Chemical space exploration guided by deep neural networks. RSC Adv 9:5151\u20135157. 
https:\/\/doi.org\/10.1039\/C8RA10182E","journal-title":"RSC Adv"},{"key":"487_CR35","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1007\/s10822-014-9819-y","volume":"29","author":"E Martin","year":"2015","unstructured":"Martin E, Cao E (2015) Euclidean chemical spaces from molecular fingerprints: Hamming distance and Hempel\u2019s ravens. J Comput Aided Mol Des 29:387\u2013395. https:\/\/doi.org\/10.1007\/s10822-014-9819-y","journal-title":"J Comput Aided Mol Des"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00487-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-021-00487-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00487-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,4]],"date-time":"2021-05-04T12:16:27Z","timestamp":1620130587000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-021-00487-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,22]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["487"],"URL":"https:\/\/doi.org\/10.1186\/s13321-021-00487-2","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-41814\/v1","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-41814\/v2","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,22]]},"assertion":[{"value":"13 July 
2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 April 2021","order":4,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"In the original publication, the Funding section statement was incomplete and the statement \u201cOpen Access funding enabled and organized by Projekt DEAL\u201d was missing. The article has been updated and the Funding section has been corrected.","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare that they have no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"13"}}