Estimation of Obesity Levels through the Proposed Predictive Approach Based on Physical Activity and Nutritional Habits


Creative Commons License

GÖZÜKARA BAĞ H. G., YAĞIN F. H., GÖRMEZ Y., González P. P., ÇOLAK C., Gülü M., ...

Diagnostics, vol. 13, no. 18, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 13, Issue: 18
  • Publication Date: 2023
  • DOI: 10.3390/diagnostics13182949
  • Journal Name: Diagnostics
  • Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, EMBASE, INSPEC, Directory of Open Access Journals
  • Keywords: classification, machine learning, nutritional habits, obesity, physical activity
  • Affiliated with Sivas Cumhuriyet Üniversitesi: Yes

Abstract

Obesity is the excessive accumulation of adipose tissue in the body that leads to health risks. The study aimed to classify obesity levels using a tree-based machine-learning approach considering physical activity and nutritional habits. The study used an observational design, drawing on a public dataset collected through a web-based survey that assessed eating habits and physical activity levels. The data included gender, age, height, weight, family history of being overweight, dietary patterns, physical activity frequency, and other variables. Data preprocessing involved addressing class imbalance using Synthetic Minority Over-sampling TEchnique-Nominal Continuous (SMOTE-NC) and feature selection using Recursive Feature Elimination (RFE). Three classification algorithms (logistic regression (LR), random forest (RF), and Extreme Gradient Boosting (XGBoost)) were used for obesity level prediction, and Bayesian optimization was employed for hyperparameter tuning. Model performance was evaluated using accuracy, recall, precision, F1-score, area under the curve (AUC), and the precision–recall curve. The LR model showed the best performance across most metrics, followed by RF and XGBoost. Feature selection improved the performance of the LR and RF models, while its effect on XGBoost was mixed. The study contributes to the understanding of obesity classification using machine-learning techniques based on physical activity and nutritional habits. The LR model demonstrated the most robust performance, and feature selection was shown to enhance model efficiency. The findings underscore the importance of considering both physical activity and nutritional habits in addressing the obesity epidemic.
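
The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score) can be illustrated with a minimal sketch for a multi-class obesity-level problem. This is not the authors' code: the class labels, the toy predictions, and the function name `macro_metrics` are hypothetical, and macro averaging is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging (an assumption here) computes each metric per class
    and then takes the unweighted mean over all classes.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels; the real dataset's class names are not given here.
y_true = ["normal", "normal", "overweight", "overweight", "obese", "obese"]
y_pred = ["normal", "overweight", "overweight", "overweight", "obese", "normal"]

scores = macro_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Macro averaging weights every obesity level equally regardless of how many samples it has, which is one common choice when classes are imbalanced; a weighted or micro average would give different numbers on the same predictions.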