Presenter(s)

Logan Stewart

Abstract

Each year the Centers for Disease Control and Prevention (CDC) conducts the Behavioral Risk Factor Surveillance System (BRFSS) telephone survey to collect nationwide data on US residents' health. The survey gathers data on hundreds of health-related variables from hundreds of thousands of participants. It is a powerful tool for health promotion and has influenced and supported health-related state legislation. This study uses BRFSS data to predict an individual's probability of having diabetes from dietary, demographic, health, and economic factors. Diabetes is a group of diseases in which the body does not make enough insulin or cannot use it effectively, and one in four people with diabetes do not know they have it. This study aims to identify which factors are most prevalent among people with diabetes, in order to spread awareness and encourage those at higher risk to seek treatment that could prevent further complications. To do this, the study uses XGBoost, a machine learning method based on gradient-boosted decision trees. The resulting model predicts diabetes with high accuracy and identifies the factors most influential in its predictions.

College

College of Science & Engineering

Department

Mathematics & Statistics

Campus

Winona

First Advisor/Mentor

Silas Bergen

Location

Kryzsko Great River Ballroom, Winona, Minnesota; United States

Start Date

4-23-2026 2:00 PM

End Date

4-23-2026 3:00 PM

Presentation Type

Poster Session

Format of Presentation or Performance

In-Person

Session

2b=2pm-3pm

Poster Number

62

Comments

Stewart, Logan C

Predicting the Probability of Having Diabetes Using Machine Learning
