Presenter(s)
Logan Stewart
Abstract
Each year, the Centers for Disease Control and Prevention (CDC) distributes the Behavioral Risk Factor Surveillance System (BRFSS) telephone survey to collect nationwide data on US residents' health. The survey gathers data on hundreds of health-related variables from hundreds of thousands of participants. It is a powerful tool for health promotion and has influenced and supported health-related state legislation. This study uses BRFSS data to predict an individual's probability of having diabetes based on dietary, demographic, health, and economic factors. Diabetes is a group of diseases that cause the body to use insulin ineffectively, and one in four people with diabetes do not know they have it. The study aims to identify which factors are most prevalent among people with diabetes, in order to raise awareness and encourage those at higher risk to seek treatment that could prevent further complications. To achieve this, the study uses XGBoost, a machine learning model built on gradient-boosted decision trees. The results provide high-accuracy predictions and identify the factors that most influence the model's decisions.
College
College of Science & Engineering
Department
Mathematics & Statistics
Campus
Winona
First Advisor/Mentor
Silas Bergen
Location
Kryzsko Great River Ballroom, Winona, Minnesota; United States
Start Date
4-23-2026 2:00 PM
End Date
4-23-2026 3:00 PM
Presentation Type
Poster Session
Format of Presentation or Performance
In-Person
Session
2b=2pm-3pm
Poster Number
62
Predicting the Probability of Having Diabetes Using Machine Learning

Comments
Stewart, Logan C