{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T16:44:37Z","timestamp":1777135477669,"version":"3.51.4"},"reference-count":44,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2020,5,11]],"date-time":"2020-05-11T00:00:00Z","timestamp":1589155200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Social determining factors such as the adverse influence of globalization, supermarket growth, fast unplanned urbanization, sedentary lifestyle, economy, and social position slowly develop behavioral risk factors in humans. Behavioral risk factors such as unhealthy habits, improper diet, and physical inactivity lead to physiological risks, and \u201cobesity\/overweight\u201d is one of the consequences. \u201cObesity and overweight\u201d are one of the major lifestyle diseases that leads to other health conditions, such as cardiovascular diseases (CVDs), chronic obstructive pulmonary disease (COPD), cancer, diabetes type II, hypertension, and depression. It is not restricted within the age and socio-economic background of human beings. The \u201cWorld Health Organization\u201d (WHO) has anticipated that 30% of global death will be caused by lifestyle diseases by 2030 and it can be prevented with the appropriate identification of associated risk factors and behavioral intervention plans. Health behavior change should be given priority to avoid life-threatening damages. The primary purpose of this study is not to present a risk prediction model but to provide a review of various machine learning (ML) methods and their execution using available sample health data in a public repository related to lifestyle diseases, such as obesity, CVDs, and diabetes type II. In this study, we targeted people, both male and female, in the age group of &gt;20 and &lt;60, excluding pregnancy and genetic factors. This paper qualifies as a tutorial article on how to use different ML methods to identify potential risk factors of obesity\/overweight. Although institutions such as \u201cCenter for Disease Control and Prevention (CDC)\u201d and \u201cNational Institute for Clinical Excellence (NICE)\u201d guidelines work to understand the cause and consequences of overweight\/obesity, we aimed to utilize the potential of data science to assess the correlated risk factors of obesity\/overweight after analyzing the existing datasets available in \u201cKaggle\u201d and \u201cUniversity of California, Irvine (UCI) database\u201d, and to check how the potential risk factors are changing with the change in body-energy imbalance with data-visualization techniques and regression analysis. Analyzing existing obesity\/overweight related data using machine learning algorithms did not produce any brand-new risk factors, but it helped us to understand: (a) how are identified risk factors related to weight change and how do we visualize it? (b) what will be the nature of the data (potential monitorable risk factors) to be collected over time to develop our intended eCoach system for the promotion of a healthy lifestyle targeting \u201cobesity and overweight\u201d as a study case in the future? (c) why have we used the existing \u201cKaggle\u201d and \u201cUCI\u201d datasets for our preliminary study? (d) which classification and regression models are performing better with a corresponding limited volume of the dataset following performance metrics?<\/jats:p>","DOI":"10.3390\/s20092734","type":"journal-article","created":{"date-parts":[[2020,5,11]],"date-time":"2020-05-11T12:26:30Z","timestamp":1589199990000},"page":"2734","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":181,"title":["Identification of Risk Factors Associated with Obesity and Overweight\u2014A Machine Learning Overview"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0407-7702","authenticated-orcid":false,"given":"Ayan","family":"Chatterjee","sequence":"first","affiliation":[{"name":"Department of Information and Communication Technology, Centre for e-Health, University of Agder, 4604 Kristiansand, Norway"}]},{"given":"Martin W.","family":"Gerdes","sequence":"additional","affiliation":[{"name":"Department of Information and Communication Technology, Centre for e-Health, University of Agder, 4604 Kristiansand, Norway"}]},{"given":"Santiago G.","family":"Martinez","sequence":"additional","affiliation":[{"name":"Department of Health and Nursing Science, Centre for e-Health, University of Agder, 4604 Kristiansand, Norway"}]}],"member":"1968","published-online":{"date-parts":[[2020,5,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1681","DOI":"10.1001\/jama.2013.3075","article-title":"Overweight, obesity, and all-cause mortality","volume":"309","author":"Willett","year":"2013","journal-title":"JAMA"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"GBD 2015 Obesity Collaborators (2017). Health effects of overweight and obesity in 195 countries over 25 years. N. Engl. J. Med., 377, 13\u201327.","DOI":"10.1056\/NEJMoa1614362"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2440","DOI":"10.1056\/NEJMsa1909301","article-title":"Projected US State-Level Prevalence of Adult Obesity and Severe Obesity","volume":"381","author":"Ward","year":"2019","journal-title":"N. Engl. J. Med."},{"key":"ref_4","unstructured":"(2020, March 18). WHO Page. Available online: https:\/\/www.who.int\/news-room\/fact-sheets\/detail\/obesity-and-overweight; https:\/\/www.who.int\/nmh\/publications\/ncd_report_chapter1.pdf."},{"key":"ref_5","unstructured":"(2020, March 18). CDC Page, Available online: https:\/\/www.cdc.gov\/obesity\/adult\/index.html."},{"key":"ref_6","unstructured":"(2020, March 18). NICE Page. Available online: https:\/\/www.nice.org.uk\/guidance\/cg189."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3407306","DOI":"10.1155\/2018\/3407306","article-title":"The impact of obesity on the cardiovascular system","volume":"2018","author":"Csige","year":"2018","journal-title":"J. Diabetes Res."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1097\/NT.0000000000000092","article-title":"Body Mass Index: Obesity, BMI, and Health: A Critical Review","volume":"50","author":"Nuttall","year":"2015","journal-title":"Nutr. Today"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1412","DOI":"10.1001\/jamainternmed.2015.2405","article-title":"Prevalence of overweight and obesity in the United States, 2007\u20132012","volume":"175","author":"Yang","year":"2015","journal-title":"JAMA Intern. Med."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Gerdes, M., Martinez, S., and Tjondronegoro, D. (2017, January 23\u201326). Conceptualization of a personalized ecoach for wellness promotion. Proceedings of the 11th EAI International Conference on Pervasive Computing Technologies for Healthcare, Barcelona, Spain.","DOI":"10.1145\/3154862.3154930"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Chatterjee, A., Gerdes, M.W., and Martinez, S. (2019, January 21\u201323). eHealth Initiatives for The Promotion of Healthy Lifestyle and Allied Implementation Difficulties. Proceedings of the 2019 IEEE International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), Barcelona, Spain.","DOI":"10.1109\/WiMOB.2019.8923324"},{"key":"ref_12","unstructured":"(2020, March 18). Kaggle Data Page. Available online: https:\/\/www.kaggle.com\/data."},{"key":"ref_13","unstructured":"Dua, D., and Graff, C. (2019). UCI Machine Learning Repository, University of California, School of Information and Computer Science."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G., and The PRISMA Group (2009). Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med., 6.","DOI":"10.1371\/journal.pmed.1000097"},{"key":"ref_15","unstructured":"(2020, March 18). PRISMA Page. Available online: www.prisma-statement.org."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Woodward, M. (2013). Epidemiology: Study Design and Data Analysis, CRC Press.","DOI":"10.1201\/b16343"},{"key":"ref_17","unstructured":"(2020, March 18). Epidemiology Page. Available online: https:\/\/www.bmj.com\/about-bmj\/resources-readers\/publications\/epidemiology-uninitiated\/1-what-epidemiology."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Grabner, M. (2012). BMI trends, socioeconomic status, and the choice of dataset. Obesity Facts, Karger Publishers.","DOI":"10.1159\/000337018"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Singh, B., and Tawfik, H. (2019, January 5\u20137). A Machine Learning Approach for Predicting Weight Gain Risks in Young Adults. Proceedings of the 10th IEEE International Conference on Dependable Systems, Services and Technologies (DESSERT), Leeds, UK.","DOI":"10.1109\/DESSERT.2019.8770016"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"624","DOI":"10.3389\/fendo.2019.00624","article-title":"Use of Non-invasive Parameters and Machine-Learning Algorithms for Predicting Future Risk of Type 2 Diabetes: A Retrospective Cohort Study of Health Data From Kuwait","volume":"10","author":"Farran","year":"2019","journal-title":"Front. Endocrinol."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Padmanabhan, M., Yuan, P., Chada, G., and van Nguyen, H. (2019). Physician-friendly machine learning: A case study with cardiovascular disease risk prediction. J. Clin. Med., 8.","DOI":"10.3390\/jcm8071050"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Selya, A.S., and Anshutz, D. (2018). Machine Learning for the Classification of Obesity from Dietary and Physical Activity Patterns. Advanced Data Analytics in Health, Springer.","DOI":"10.1007\/978-3-319-77911-9_5"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Jindal, K., Baliyan, N., and Rana, P.S. (2018). Obesity Prediction Using Ensemble Machine Learning Approaches. Recent Findings in Intelligent Computing Techniques, Springer.","DOI":"10.1007\/978-981-10-8636-6_37"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zheng, Z., and Ruggiero, K. (2017, January 13\u201316). Using machine learning to predict obesity in high school students. Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, USA.","DOI":"10.1109\/BIBM.2017.8217988"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Dunstan, J., Aguirre, M., Bast\u00edas, M., Nau, C., Glass, T.A., and Tobar, F. (2019). Predicting nationwide obesity from food sales using machine learning. Health Inform. J., Available online: https:\/\/journals.sagepub.com\/doi\/full\/10.1177\/1460458219845959.","DOI":"10.1177\/1460458219845959"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"668","DOI":"10.1111\/obr.12667","article-title":"A review of machine learning in obesity","volume":"19","author":"DeGregory","year":"2018","journal-title":"Obes. Rev."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"637635","DOI":"10.1155\/2014\/637635","article-title":"Predicting increased blood pressure using machine learning","volume":"2014","author":"Golino","year":"2014","journal-title":"J. Obes."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1038\/s41430-018-0337-1","article-title":"A machine learning approach relating 3D body scans to body composition in humans","volume":"73","author":"Pleuss","year":"2019","journal-title":"Eur. J. Clin. Nutr."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"e181535","DOI":"10.1001\/jamanetworkopen.2018.1535","article-title":"Use of deep learning to examine the association of the built environment with prevalence of neighborhood adult obesity","volume":"1","author":"Maharana","year":"2018","journal-title":"JAMA Netw. Open"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Pouladzadeh, P., Kuhad, P., Peddi, S.V.B., Yassine, A., and Shirmohammadi, S. (2016, January 23\u201326). Food calorie measurement using deep learning neural network. Proceedings of the 2016 IEEE International Instrumentation and Measurement Technology Conference, Taipei, Taiwan.","DOI":"10.1109\/I2MTC.2016.7520547"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Schapire, R.E., and Freund, Y. (2013). Boosting: Foundations and algorithms. Kybernetes, Emerald Insight.","DOI":"10.7551\/mitpress\/8291.001.0001"},{"key":"ref_32","unstructured":"Brandt, S. (1976). Statistical and Computational Methods in Data Analysis, North-Holland Publishing Company. No. 04; QA273, B73 1976."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"179","DOI":"10.2188\/jea.JE20140212","article-title":"Epidemiology, population health, and health impact assessment","volume":"25","author":"Gulis","year":"2015","journal-title":"J. Epidemiol."},{"key":"ref_34","unstructured":"(2020, March 18). Physio Net Page. Available online: https:\/\/physionet.org\/about\/database\/."},{"key":"ref_35","unstructured":"(2020, March 18). BMI data GitHub page. Available online: https:\/\/github.com\/chriswmann\/datasets\/blob\/master\/500_Person_Gender_Height_Weight_Index.csv."},{"key":"ref_36","unstructured":"(2020, March 18). Insurance dataset page. Available online: http:\/\/www.sci.csueastbay.edu\/~esuess\/stat6620\/#week-6."},{"key":"ref_37","unstructured":"(2020, March 18). Eating-Health-Module-Dataset Description, Available online: https:\/\/www.bls.gov\/tus\/ehmintcodebk1416.pdf."},{"key":"ref_38","unstructured":"(2020, March 18). Python Page. Available online: https:\/\/docs.python.org\/."},{"key":"ref_39","unstructured":"(2020, March 18). Sklearn Page. Available online: https:\/\/scikit-learn.org\/stable\/supervised_learning.html."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1097\/EDE.0b013e3181c30fb2","article-title":"Assessing the performance of prediction models: a framework for some traditional and novel measures","volume":"21","author":"Steyerberg","year":"2010","journal-title":"Epidemiology"},{"key":"ref_41","unstructured":"(2020, March 18). Sklearn Probability Calibration Page. Available online: https:\/\/scikit-learn.org\/stable\/modules\/calibration.html."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"791","DOI":"10.1162\/NECO_a_00089","article-title":"Machine-learning-based coadaptive calibration for brain-computer interfaces","volume":"23","author":"Vidaurre","year":"2011","journal-title":"Neural Comput."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"291","DOI":"10.5194\/amt-11-291-2018","article-title":"A machine learning calibration model using random forests to improve sensor performance for lower-cost air quality monitoring","volume":"11","author":"Zimmerman","year":"2018","journal-title":"Atmos. Meas. Tech."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Bella, A., Ferri, C., Hern\u00e1ndez-Orallo, J., and Ram\u00edrez-Quintana, M.J. (2010). Calibration of machine learning models. Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques, IGI Global.","DOI":"10.4018\/978-1-60566-766-9.ch006"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/9\/2734\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:27:43Z","timestamp":1760174863000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/9\/2734"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,11]]},"references-count":44,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2020,5]]}},"alternative-id":["s20092734"],"URL":"https:\/\/doi.org\/10.3390\/s20092734","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,11]]}}}