{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T15:27:14Z","timestamp":1771342034201,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2023,5,23]],"date-time":"2023-05-23T00:00:00Z","timestamp":1684800000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1238187"],"award-info":[{"award-number":["1238187"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Developing new crop varieties with superior performance is highly important to ensure robust and sustainable global food security. The speed of variety development is limited by long field cycles and advanced generation selections in plant breeding programs. While methods to predict yield from genotype or phenotype data have been proposed, improved performance and integrated models are needed.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose a machine learning model that leverages both genotype and phenotype measurements by fusing genetic variants with multiple data sources collected by unmanned aerial systems. We use a deep multiple instance learning framework with an attention mechanism that sheds light on the importance given to each input during prediction, enhancing interpretability. Our model reaches 0.754\u2009\u00b1\u20090.024 Pearson correlation coefficient when predicting yield in similar environmental conditions; a 34.8% improvement over the genotype-only linear baseline (0.559\u2009\u00b1\u20090.050). We further predict yield on new lines in an unseen environment using only genotypes, obtaining a prediction accuracy of 0.386\u2009\u00b1\u20090.010, a 13.5% improvement over the linear baseline. Our multi-modal deep learning architecture efficiently accounts for plant health and environment, distilling the genetic contribution and providing excellent predictions. Yield prediction algorithms leveraging phenotypic observations during training therefore promise to improve breeding programs, ultimately speeding up delivery of improved varieties.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Available at https:\/\/github.com\/BorgwardtLab\/PheGeMIL (code) and https:\/\/doi.org\/doi:10.5061\/dryad.kprr4xh5p (data).<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad336","type":"journal-article","created":{"date-parts":[[2023,5,24]],"date-time":"2023-05-24T00:03:15Z","timestamp":1684886595000},"source":"Crossref","is-referenced-by-count":29,"title":["Multi-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6554-2167","authenticated-orcid":false,"given":"Matteo","family":"Togninalli","sequence":"first","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , Basel, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Visium , Lausanne, Switzerland"}]},{"given":"Xu","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Plant Pathology, Kansas State University , Manhattan, KS, United States"},{"name":"Department of Agricultural and Biological Engineering, IFAS Gulf Coast Research and Education Center, University of Florida , Wimauma, FL, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4358-7932","authenticated-orcid":false,"given":"Tim","family":"Kucera","sequence":"additional","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , Basel, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Department of Machine Learning and Systems Biology, Max Planck Institute of Biochemistry , Martinsried, Germany"}]},{"given":"Sandesh","family":"Shrestha","sequence":"additional","affiliation":[{"name":"Department of Plant Pathology, Kansas State University , Manhattan, KS, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6922-0173","authenticated-orcid":false,"given":"Philomin","family":"Juliana","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8582-8899","authenticated-orcid":false,"given":"Suchismita","family":"Mondal","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3000-9084","authenticated-orcid":false,"given":"Francisco","family":"Pinto","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"}]},{"given":"Velu","family":"Govindan","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0506-4700","authenticated-orcid":false,"given":"Leonardo","family":"Crespo-Herrera","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"}]},{"given":"Julio","family":"Huerta-Espino","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"},{"name":"Campo Experimental Valle de Mexico-INIFAP , Texcoco, Estado de Mexico, Mexico"}]},{"given":"Ravi P","family":"Singh","sequence":"additional","affiliation":[{"name":"Global Wheat Program, International Maize and Wheat Improvement Center , Texcoco, Estado de Mexico, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7221-2393","authenticated-orcid":false,"given":"Karsten","family":"Borgwardt","sequence":"additional","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , Basel, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Department of Machine Learning and Systems Biology, Max Planck Institute of Biochemistry , Martinsried, Germany"}]},{"given":"Jesse","family":"Poland","sequence":"additional","affiliation":[{"name":"Department of Plant Pathology, Kansas State University , Manhattan, KS, United States"},{"name":"Center for Desert Agriculture, King Abdullah University of Science and Technology , Thuwal, Saudi Arabia"}]}],"member":"286","published-online":{"date-parts":[[2023,5,23]]},"reference":[{"key":"2023060914134854800_btad336-B1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12711-020-00531-z","article-title":"Deep learning versus parametric and ensemble methods for genomic prediction of complex phenotypes","volume":"52","author":"Abdollahi-Arpanahi","year":"2020","journal-title":"Genet Sel Evol"},{"key":"2023060914134854800_btad336-B2","doi-asserted-by":"crossref","first-page":"872","DOI":"10.1111\/tpj.14659","article-title":"Imputation of 3 million SNPs in the Arabidopsis regional mapping population","volume":"102","author":"Arouisse","year":"2020","journal-title":"Plant J"},{"key":"2023060914134854800_btad336-B3","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/j.tig.2020.03.005","article-title":"Opening the black box: interpretable machine learning for geneticists","volume":"36","author":"Azodi","year":"2020","journal-title":"Trends Genet"},{"key":"2023060914134854800_btad336-B4","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1038\/nature11867","article-title":"Finding the sources of missing heritability in a yeast cross","volume":"494","author":"Bloom","year":"2013","journal-title":"Nature"},{"key":"2023060914134854800_btad336-B5","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1109\/TMI.2020.3021387","article-title":"Pathomic fusion: An integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis","volume":"41","author":"Chen","year":"2022","journal-title":"IEEE Trans Med Imaging"},{"key":"2023060914134854800_btad336-B6","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/j.isprsjprs.2014.02.013","article-title":"Unmanned aerial systems for photogrammetry and remote sensing: a review","volume":"92","author":"Colomina","year":"2014","journal-title":"ISPRS J Photogramm Remote Sens"},{"key":"2023060914134854800_btad336-B7","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1016\/j.tplants.2017.08.011","article-title":"Genomic selection in plant breeding: methods, models, and perspectives","volume":"22","author":"Crossa","year":"2017","journal-title":"Trends Plant Sci"},{"key":"2023060914134854800_btad336-B8","doi-asserted-by":"crossref","first-page":"289","DOI":"10.3390\/rs9030289","article-title":"Monitoring of wheat growth status and mapping of wheat yield\u2019s within-field spatial variations using color images acquired from UAV-camera system","volume":"9","author":"Du","year":"2017","journal-title":"Remote Sens"},{"key":"2023060914134854800_btad336-B9","volume-title":"PyTorch Lightning","author":"Falcon","year":"2019"},{"key":"2023060914134854800_btad336-B10","doi-asserted-by":"crossref","DOI":"10.18356\/63e608ce-en","volume-title":"The State of Food Security and Nutrition in the World 2019","author":"Food and Agriculture Organization of the United Nations","year":"2019"},{"key":"2023060914134854800_btad336-B11","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1186\/s13007-018-0338-z","article-title":"Remote estimation of rapeseed yield with unmanned aerial vehicle (UAV) imaging and spectral mixture analysis","volume":"14","author":"Gong","year":"2018","journal-title":"Plant Methods"},{"key":"2023060914134854800_btad336-B12","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1186\/s13007-016-0134-6","article-title":"Application of unmanned aerial systems for high throughput phenotyping of large wheat breeding nurseries","volume":"12","author":"Haghighattalab","year":"2016","journal-title":"Plant Methods"},{"key":"2023060914134854800_btad336-B13","first-page":"770","author":"He","year":"2016"},{"key":"2023060914134854800_btad336-B14","doi-asserted-by":"crossref","first-page":"097095","DOI":"10.1117\/1.JRS.9.097095","article-title":"Potential of ensemble tree methods for early-season prediction of winter wheat yield from short time series of remotely sensed normalized difference vegetation index and in situ meteorological data","volume":"9","author":"Heremans","year":"2015","journal-title":"J Appl Remote Sens"},{"key":"2023060914134854800_btad336-B15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13007-020-00620-6","article-title":"Yield prediction by machine learning from UAS-based multi-sensor data fusion in soybean","volume":"16","author":"Herrero-Huerta","year":"2020","journal-title":"Plant Methods"},{"key":"2023060914134854800_btad336-B16","first-page":"4353","author":"Horn","year":"2020"},{"key":"2023060914134854800_btad336-B17","doi-asserted-by":"crossref","first-page":"e20034","DOI":"10.1002\/tpg2.20034","article-title":"Genome-based prediction of multiple wheat quality traits in multiple years","volume":"13","author":"Ibba","year":"2020","journal-title":"Plant Genome"},{"key":"2023060914134854800_btad336-B18","first-page":"2127","volume-title":"Proceedings of the 35th International Conference on Machine Learning","author":"Ilse","year":"2018"},{"key":"2023060914134854800_btad336-B19","doi-asserted-by":"crossref","first-page":"1251788","DOI":"10.1126\/science.1251788","article-title":"A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome","volume":"345","author":"International Wheat Genome Sequencing Consortium and Others","year":"2014","journal-title":"Science"},{"key":"2023060914134854800_btad336-B20","doi-asserted-by":"crossref","first-page":"1750","DOI":"10.3389\/fpls.2019.01750","article-title":"A CNN-RNN framework for crop yield prediction","volume":"10","author":"Khaki","year":"2019","journal-title":"Front Plant Sci"},{"key":"2023060914134854800_btad336-B21","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1534\/g3.118.200856","article-title":"Hyperspectral reflectance-derived relationship matrices for genomic prediction of grain yield in wheat","volume":"9","author":"Krause","year":"2019","journal-title":"G3 (Bethesda)"},{"key":"2023060914134854800_btad336-B22","author":"Loshchilov","year":"2017"},{"key":"2023060914134854800_btad336-B23","doi-asserted-by":"crossref","first-page":"111599","DOI":"10.1016\/j.rse.2019.111599","article-title":"Soybean yield prediction from UAV using multimodal data fusion and deep learning","volume":"237","author":"Maimaitijiang","year":"2020","journal-title":"Remote Sens Environ"},{"key":"2023060914134854800_btad336-B24","doi-asserted-by":"crossref","first-page":"973","DOI":"10.3390\/rs8120973","article-title":"Analysis of vegetation indices to determine nitrogen application and yield prediction in maize (Zea mays L.) from a standard UAV service","volume":"8","author":"Maresma","year":"2016","journal-title":"Remote Sens"},{"key":"2023060914134854800_btad336-B25","doi-asserted-by":"crossref","first-page":"952","DOI":"10.1038\/s41588-019-0414-y","article-title":"Genomic prediction of maize yield across European environmental conditions","volume":"51","author":"Millet","year":"2019","journal-title":"Nat Genet"},{"key":"2023060914134854800_btad336-B26","doi-asserted-by":"crossref","first-page":"4000","DOI":"10.3390\/rs12234000","article-title":"Crop yield prediction using multitemporal UAV data and spatio-temporal deep learning models","volume":"12","author":"Nevavuori","year":"2020","journal-title":"Remote Sens"},{"key":"2023060914134854800_btad336-B27","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.compag.2015.11.018","article-title":"Wheat yield prediction using machine learning and advanced sensing techniques","volume":"121","author":"Pantazi","year":"2016","journal-title":"Comput Electron Agric"},{"key":"2023060914134854800_btad336-B28","doi-asserted-by":"crossref","first-page":"2125","DOI":"10.1098\/rstb.2005.1751","article-title":"Climate change, global food supply and risk of hunger","volume":"360","author":"Parry","year":"2005","journal-title":"Philos Trans R Soc Lond B Biol Sci"},{"key":"2023060914134854800_btad336-B29","volume-title":"Advances in Neural Information Processing Systems","author":"Paszke","year":"2019"},{"key":"2023060914134854800_btad336-B30","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2023060914134854800_btad336-B31","doi-asserted-by":"crossref","first-page":"553","DOI":"10.3390\/genes10070553","article-title":"A guide on deep learning for complex trait genomic prediction","volume":"10","author":"P\u00e9rez-Enciso","year":"2019","journal-title":"Genes"},{"key":"2023060914134854800_btad336-B32","first-page":"103","article-title":"Genomic selection in wheat breeding using genotyping-by-sequencing","volume":"5","author":"Poland","year":"2012","journal-title":"Plant Genome"},{"key":"2023060914134854800_btad336-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-021-02416-w","article-title":"MegaLMM: mega-scale linear mixed models for genomic predictions with thousands of traits","volume":"22","author":"Runcie","year":"2021","journal-title":"Genome Biol"},{"key":"2023060914134854800_btad336-B34","doi-asserted-by":"crossref","first-page":"e1249","DOI":"10.1002\/widm.1249","article-title":"Ensemble learning: a survey","volume":"8","author":"Sagi","year":"2018","journal-title":"Wiley Interdiscip Rev Data Min Knowl Discov"},{"key":"2023060914134854800_btad336-B35","doi-asserted-by":"crossref","first-page":"394","DOI":"10.3389\/fpls.2019.00394","article-title":"High-throughput phenotyping enabled genetic dissection of crop lodging in wheat","volume":"10","author":"Singh","year":"2019","journal-title":"Front Plant Sci"},{"key":"2023060914134854800_btad336-B36","first-page":"1","author":"Stas","year":"2016"},{"key":"2023060914134854800_btad336-B37","first-page":"e190005","article-title":"Genetic gains in wheat breeding and its role in feeding the world","volume":"1","author":"Tadesse","year":"2019","journal-title":"Crop Breed Genet Genom"},{"key":"2023060914134854800_btad336-B38","doi-asserted-by":"crossref","first-page":"bbaa177","DOI":"10.1093\/bib\/bbaa177","article-title":"Interpretation of deep learning in genomics and epigenomics","volume":"22","author":"Talukder","year":"2021","journal-title":"Brief Bioinformatics"},{"key":"2023060914134854800_btad336-B39","doi-asserted-by":"crossref","first-page":"818","DOI":"10.1126\/science.1183700","article-title":"Breeding technologies to increase crop production in a changing world","volume":"327","author":"Tester","year":"2010","journal-title":"Science"},{"key":"2023060914134854800_btad336-B40","doi-asserted-by":"crossref","first-page":"105709","DOI":"10.1016\/j.compag.2020.105709","article-title":"Crop yield prediction using machine learning: a systematic literature review","volume":"177","author":"van Klompenburg","year":"2020","journal-title":"Comput Electron Agric"},{"key":"2023060914134854800_btad336-B41","first-page":"5998","author":"Vaswani","year":"2017"},{"key":"2023060914134854800_btad336-B42","author":"Veli\u010dkovi\u0107","year":"2018"},{"key":"2023060914134854800_btad336-B43","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1016\/j.fcr.2014.05.001","article-title":"Predicting grain yield and protein content in wheat by fusing multi-sensor and multi-temporal remote-sensing images","volume":"164","author":"Wang","year":"2014","journal-title":"Field Crops Res"},{"key":"2023060914134854800_btad336-B44","first-page":"1616","article-title":"Improved accuracy of high-throughput phenotyping from unmanned aerial systems by extracting traits directly from orthorectified images","volume":"11","author":"Wang","year":"2020","journal-title":"Front Plant Sci"},{"key":"2023060914134854800_btad336-B45","doi-asserted-by":"crossref","first-page":"5192","DOI":"10.1080\/01431161.2015.1040135","article-title":"Comparison of two inversion methods for leaf area index using HJ-1 satellite data in a temperate meadow steppe","volume":"36","author":"Wu","year":"2015","journal-title":"Int J Remote Sens"},{"key":"2023060914134854800_btad336-B46","first-page":"2048","author":"Xu","year":"2015"},{"key":"2023060914134854800_btad336-B47","author":"You","year":"2017"},{"key":"2023060914134854800_btad336-B48","first-page":"3391","author":"Zaheer","year":"2017"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad336\/50425556\/btad336.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/6\/btad336\/50529231\/btad336.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/6\/btad336\/50529231\/btad336.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T16:01:36Z","timestamp":1686326496000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad336\/7176366"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,5,23]]},"references-count":48,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad336","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,6,1]]},"published":{"date-parts":[[2023,5,23]]},"article-number":"btad336"}}