{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T12:44:25Z","timestamp":1762260265423,"version":"build-2065373602"},"reference-count":54,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T00:00:00Z","timestamp":1762214400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100008968","name":"National Research Council Canada","doi-asserted-by":"publisher","award":["AiP-006"],"award-info":[{"award-number":["AiP-006"]}],"id":[{"id":"10.13039\/100008968","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Elderly populations often face significant challenges when it comes to dietary intake tracking, often exacerbated by health complications. Unfortunately, conventional diet assessment techniques such as food frequency questionnaires, food diaries, and 24 h recall are subject to substantial bias. Recent advancements in machine learning and computer vision show promise of automated nutrition tracking methods of food, but require a large, high-quality dataset in order to accurately identify the nutrients from the food on the plate. However, manual creation of large-scale datasets with such diversity is time-consuming and hard to scale. On the other hand, synthesized 3D food models enable view augmentation to generate countless photorealistic 2D renderings from any viewpoint, reducing imbalance across camera angles. In this paper, we present a process to collect a large image dataset of food scenes that span diverse viewpoints and highlight its usage in dietary intake estimation. We first collect quality 3D objects of food items (NV-3D) that are used to generate photorealistic synthetic 2D food images (NV-Synth) and then manually collect a validation 2D food image dataset (NV-Real). We benchmark various intake estimation approaches on these datasets and present NutritionVerse3D2D, a collection of datasets that contain 3D objects and 2D images, along with models that estimate intake from the 2D food images. We release all the datasets along with the developed models to accelerate machine learning research on dietary sensing.<\/jats:p>","DOI":"10.3390\/data10110180","type":"journal-article","created":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T11:11:16Z","timestamp":1762254676000},"page":"180","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["NutritionVerse3D2D: Large 3D Object and 2D Image Food Dataset for Dietary Intake Estimation"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7023-8784","authenticated-orcid":false,"given":"Chi-en Amy","family":"Tai","sequence":"first","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Matthew","family":"Keller","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Saeejith","family":"Nair","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuhao","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2856-9835","authenticated-orcid":false,"given":"Yifan","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Olivia","family":"Markham","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Krish","family":"Parmar","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3236-5234","authenticated-orcid":false,"given":"Pengcheng","family":"Xi","sequence":"additional","affiliation":[{"name":"National Research Council Canada, Ottawa, ON K1A 0R6, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Wong","sequence":"additional","affiliation":[{"name":"Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,4]]},"reference":[{"unstructured":"Davis, M.R. (2022, October 11). Despite Pandemic, Percentage of Older Adults Who Want to Age in Place Stays Steady. Available online: https:\/\/www.overleaf.com\/project\/6902d3b535892b7c7552d53d.","key":"ref_1"},{"key":"ref_2","first-page":"207","article-title":"Assessment and management of nutrition in older people and its importance to health","volume":"5","author":"Ahmed","year":"2010","journal-title":"Clin. Interv. Aging"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"M68","DOI":"10.1093\/gerona\/59.1.M68","article-title":"Nutritional risk predicts quality of life in elderly community-living Canadians","volume":"59","author":"Keller","year":"2004","journal-title":"J. Gerontol. Ser. A"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1734","DOI":"10.1111\/j.1532-5415.2010.03016.x","article-title":"Frequency of Malnutrition in Older Adults: A Multinational Perspective Using the Mini Nutritional Assessment","volume":"58","author":"Kaiser","year":"2010","journal-title":"J. Am. Geriatr. Soc."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1134","DOI":"10.1016\/j.jand.2012.04.016","article-title":"The Automated Self-Administered 24-Hour Dietary Recall (ASA24): A Resource for Researchers, Clinicians, and Educators from the National Cancer Institute","volume":"112","author":"Subar","year":"2012","journal-title":"J. Acad. Nutr. Diet."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1093\/aje\/kwg091","article-title":"Structure of dietary measurement error: Results of the OPEN biomarker study","volume":"158","author":"Kipnis","year":"2003","journal-title":"Am. J. Epidemiol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1093\/aje\/kwu116","article-title":"Pooled results from 5 validation studies of dietary self-report instruments using recovery biomarkers for energy and protein intake","volume":"180","author":"Freedman","year":"2014","journal-title":"Am. J. Epidemiol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1093\/aje\/kwu325","article-title":"Pooled results from 5 validation studies of dietary self-report instruments using recovery biomarkers for potassium and sodium intake","volume":"181","author":"Freedman","year":"2015","journal-title":"Am. J. Epidemiol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"e147","DOI":"10.2196\/jmir.5056","article-title":"A Mobile Phone App Intervention Targeting Fruit and Vegetable Consumption: The Efficacy of Textual and Auditory Tailored Health Information Tested in a Randomized Controlled Trial","volume":"18","author":"Elbert","year":"2016","journal-title":"J. Med. Internet Res."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1177\/1932296815582222","article-title":"\u201cSnap-n-Eat\u201d: Food Recognition and Nutrition Estimation on a Smartphone","volume":"9","author":"Zhang","year":"2015","journal-title":"J. Diabetes Sci. Technol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1139","DOI":"10.1016\/S0002-8223(03)00974-X","article-title":"Comparison of digital photography to weighed and visual estimation of portion sizes","volume":"103","author":"Williamson","year":"2003","journal-title":"J. Am. Diet. Assoc."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1007\/s12062-019-09259-1","article-title":"Aspects Influencing Food Intake and Approaches towards Personalising Nutrition in the Elderly","volume":"13","author":"Rusu","year":"2020","journal-title":"J. Popul. Ageing"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1109\/JBHI.2016.2636441","article-title":"Food Recognition: A New Dataset, Experiments, and Results","volume":"21","author":"Ciocca","year":"2017","journal-title":"IEEE J. Biomed. Health Inform."},{"doi-asserted-by":"crossref","unstructured":"Ando, Y., Ege, T., Cho, J., and Yanai, K. (2019, January 21). DepthCalorieCam: A Mobile Application for Volume-Based FoodCalorie Estimation Using Depth Cameras. Proceedings of the 5th International Workshop on Multimedia Assisted Dietary Management, MADiMa \u201919, Nice, France.","key":"ref_14","DOI":"10.1145\/3347448.3357172"},{"unstructured":"Tai, C.e.A., Keller, M., Nair, S., Chen, Y., Wu, Y., Markham, O., Parmar, K., Xi, P., Keller, H., and Kirkpatrick, S. (November, January 29). NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches. Proceedings of the 8th International Workshop on Multimedia Assisted Dietary Management, Ottawa, ON, Canada.","key":"ref_15"},{"doi-asserted-by":"crossref","unstructured":"Beijbom, O., Joshi, N., Morris, D., Saponas, S., and Khullar, S. (2015, January 5\u20139). Menu-Match: Restaurant-Specific Food Logging from Images. Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","key":"ref_16","DOI":"10.1109\/WACV.2015.117"},{"doi-asserted-by":"crossref","unstructured":"Myers, A., Johnston, N., Rathod, V., Korattikara, A., Gorban, A., Silberman, N., Guadarrama, S., Papandreou, G., Huang, J., and Murphy, K. (2015, January 7\u201313). Im2Calories: Towards an Automated Mobile Vision Food Diary. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","key":"ref_17","DOI":"10.1109\/ICCV.2015.146"},{"unstructured":"Liang, Y., and Li, J. (2017). Computer vision-based food calorie estimation: Dataset, method, and experiment. arXiv.","key":"ref_18"},{"doi-asserted-by":"crossref","unstructured":"Thames, Q., Karpur, A., Norris, W., Xia, F., Panait, L., Weyand, T., and Sim, J. (2021, January 20\u201325). Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","key":"ref_19","DOI":"10.1109\/CVPR46437.2021.00879"},{"doi-asserted-by":"crossref","unstructured":"Wu, X., Fu, X., Liu, Y., Lim, E.P., Hoi, S.C., and Sun, Q. (2021, January 20\u201324). A Large-Scale Benchmark for Food Image Segmentation. Proceedings of the ACM International Conference on Multimedia, Chengdu, China.","key":"ref_20","DOI":"10.1145\/3474085.3475201"},{"unstructured":"Kaur, P., Sikka, K., Wang, W., Belongie, S.J., and Divakaran, A. (2019). FoodX-251: A Dataset for Fine-grained Food Classification. arXiv.","key":"ref_21"},{"doi-asserted-by":"crossref","unstructured":"Matsuda, Y., Hoashi, H., and Yanai, K. (2012, January 9\u201313). Recognition of multiple-food images by detecting candidate regions. Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, Melbourne, VIC, Australia.","key":"ref_22","DOI":"10.1109\/ICME.2012.157"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"9932","DOI":"10.1109\/TPAMI.2023.3237871","article-title":"Large Scale Visual Food Recognition","volume":"45","author":"Min","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"unstructured":"Chen, X., Zhu, Y., Zhou, H., Diao, L., and Wang, D. (2017). Chinesefoodnet: A large-scale image dataset for chinese food recognition. arXiv.","key":"ref_24"},{"doi-asserted-by":"crossref","unstructured":"Fleet, D., Pajdla, T., Schiele, B., and Tuytelaars, T. (2014, January 6\u201312). Food-101\u2014Mining Discriminative Components with Random Forests. Proceedings of the European Conference on Computer Vision\u2014ECCV 2014, Zurich, Switzerland.","key":"ref_25","DOI":"10.1007\/978-3-319-10602-1"},{"doi-asserted-by":"crossref","unstructured":"Karabay, A., Varol, H.A., and Chan, M.Y. (2025). Improved food image recognition by leveraging deep learning and data-driven methods with an application to Central Asian Food Scene. Sci. Rep., 15.","key":"ref_26","DOI":"10.1038\/s41598-025-95770-9"},{"doi-asserted-by":"crossref","unstructured":"Karabay, A., Bolatov, A., Varol, H.A., and Chan, M.Y. (2023). A central asian food dataset for personalized dietary interventions. Nutrients, 15.","key":"ref_27","DOI":"10.3390\/nu15071728"},{"doi-asserted-by":"crossref","unstructured":"Min, W., Liu, L., Wang, Z., Luo, Z., Wei, X., Wei, X., and Jiang, S. (2020, January 12\u201316). Isia food-500: A dataset for large-scale food recognition via stacked global-local attention network. Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA.","key":"ref_28","DOI":"10.1145\/3394171.3414031"},{"doi-asserted-by":"crossref","unstructured":"Romero-Tapiador, S., Tolosana, R., Lacruz-Pleguezuelos, B., Marcos-Zambrano, L.J., Baz\u00e1n, G.X., Espinosa-Salinas, I., Fierrez, J., Ortega-Garcia, J., de Santa Pau, E.C., and Morales, A. (2025, January 11\u201315). Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition. Proceedings of the Computer Vision and Pattern Recognition Conference, Nashville, TN, USA.","key":"ref_29","DOI":"10.1109\/CVPRW67362.2025.00047"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1109\/TPAMI.2019.2927476","article-title":"Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images","volume":"43","author":"Marin","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"doi-asserted-by":"crossref","unstructured":"Salvador, A., Hynes, N., Aytar, Y., Marin, J., Ofli, F., Weber, I., and Torralba, A. (2017, January 21\u201326). Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","key":"ref_31","DOI":"10.1109\/CVPR.2017.327"},{"doi-asserted-by":"crossref","unstructured":"Chen, J., and Ngo, C.W. (2016, January 15\u201319). Deep-Based Ingredient Recognition for Cooking Recipe Retrieval. Proceedings of the 24th ACM International Conference on Multimedia, MM \u201916, Amsterdam, The Netherlands.","key":"ref_32","DOI":"10.1145\/2964284.2964315"},{"unstructured":"Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., and Su, H. (2015). ShapeNet: An Information-Rich 3D Model Repository. arXiv.","key":"ref_33"},{"doi-asserted-by":"crossref","unstructured":"Wu, T., Zhang, J., Fu, X., Wang, Y., Ren, J., Pan, L., Wu, W., Yang, L., Wang, J., and Qian, C. (2023, January 18\u201322). Omniobject3d: Large-vocabulary 3D object dataset for realistic perception, reconstruction and generation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","key":"ref_34","DOI":"10.1109\/CVPR52729.2023.00084"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1109\/TIM.2014.2303533","article-title":"Measuring Calorie and Nutrition From Food Image","volume":"63","author":"Pouladzadeh","year":"2014","journal-title":"IEEE Trans. Instrum. Meas."},{"doi-asserted-by":"crossref","unstructured":"Bola\u00f1os, M., and Radeva, P. (2016, January 4\u20138). Simultaneous food localization and recognition. Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico.","key":"ref_36","DOI":"10.1109\/ICPR.2016.7900117"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1109\/RBME.2023.3283149","article-title":"A review of image-based food recognition and volume estimation artificial intelligence systems","volume":"17","author":"Konstantakopoulos","year":"2023","journal-title":"IEEE Rev. Biomed. Eng."},{"doi-asserted-by":"crossref","unstructured":"Mezgec, S., and Seljak, B. (2017). NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment. Nutrients, 9.","key":"ref_38","DOI":"10.3390\/nu9070657"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1016\/j.amepre.2012.11.006","article-title":"An Ethical Framework for Automated, Wearable Cameras in Health Behavior Research","volume":"44","author":"Kelly","year":"2013","journal-title":"Am. J. Prev. Med."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1007\/s12124-014-9289-8","article-title":"Too Much Information: Visual Research Ethics in the Age of Wearable Cameras","volume":"49","author":"Mok","year":"2015","journal-title":"Integr. Psychol. Behav. Sci."},{"unstructured":"Apple (2022, October 25). Apple. Available online: https:\/\/www.apple.com\/ca\/iphone\/.","key":"ref_41"},{"unstructured":"Polycam (2022, October 25). Polycam\u2014LiDAR & 3D Scanner for iPhone & Android. Available online: https:\/\/poly.cam\/.","key":"ref_42"},{"unstructured":"Chambers, J., Hullette, T., and Gharge, P. (2022, October 25). The Best 3D Scanner Apps of 2022 (iPhone & Android). Available online: https:\/\/all3dp.com\/2\/best-3d-scanner-app-iphone-android-photogrammetry\/.","key":"ref_43"},{"unstructured":"Government of Canada (2023, February 25). Canadian Nutrient File (CNF)\u2014Search by Food, Available online: https:\/\/food-nutrition.canada.ca\/cnf-fce\/.","key":"ref_44"},{"key":"ref_45","first-page":"22","article-title":"An overview of 3D data content, file formats and viewers","volume":"1205","author":"McHenry","year":"2008","journal-title":"Natl. Cent. Supercomput. Appl."},{"unstructured":"Tai, C.e.A., Keller, M., Kerrigan, M., Chen, Y., Nair, S., Xi, P., and Wong, A. (2023, January 18\u201322). NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake Estimation. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada. Women in Computer Vision (WiCV).","key":"ref_46"},{"unstructured":"NVIDIA (2023, July 21). NVIDIA Isaac Sim. Available online: https:\/\/developer.nvidia.com\/isaac-sim.","key":"ref_47"},{"unstructured":"(2023, September 29). Roboflow, Version 1.0; Computer Vision: 2022. Available online: https:\/\/research.roboflow.com\/citations.","key":"ref_48"},{"unstructured":"Government of Canada (2023, September 29). Percent Daily Value, Available online: https:\/\/www.canada.ca\/en\/health-canada\/services\/understanding-food-labels\/percent-daily-value.html.","key":"ref_49"},{"unstructured":"Osilla, E.V., Safadi, A.O., and Sharma, S. (2018). Calories, StatPearls Publishing.","key":"ref_50"},{"doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., and Vanhoucke, V. (2016). Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. arXiv.","key":"ref_51","DOI":"10.1609\/aaai.v31i1.11231"},{"unstructured":"Wu, B., Xu, C., Dai, X., Wan, A., Zhang, P., Tomizuka, M., Keutzer, K., and Vajda, P. (2020). Visual Transformers: Token-based Image Representation and Processing for Computer Vision. arXiv.","key":"ref_52"},{"doi-asserted-by":"crossref","unstructured":"He, K., Chen, X., Xie, S., Li, Y., Doll\u00e1r, P., and Girshick, R. (2021). Masked Autoencoders Are Scalable Vision Learners. arXiv.","key":"ref_53","DOI":"10.1109\/CVPR52688.2022.01553"},{"doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.F. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","key":"ref_54","DOI":"10.1109\/CVPR.2009.5206848"}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/11\/180\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T12:18:24Z","timestamp":1762258704000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/11\/180"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,4]]},"references-count":54,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["data10110180"],"URL":"https:\/\/doi.org\/10.3390\/data10110180","relation":{},"ISSN":["2306-5729"],"issn-type":[{"type":"electronic","value":"2306-5729"}],"subject":[],"published":{"date-parts":[[2025,11,4]]}}}