Advances and Applications in Statistics
Volume 60, Issue 2, Pages 105 - 135
(February 2020) http://dx.doi.org/10.17654/AS060020105
USING TREE-BASED METHODS TO PREDICT STUDENT RETENTION WITH AN EMPHASIS ON THE EDUCATION, WEALTH, AND DENSITY OF THE HOMETOWN NEIGHBORHOOD
Sima Sharghi, Kevin E. Stoll, Kimberlyn K. Brooks and Andrew Alt
Abstract: Retention affects student success and university reputation, and impels universities to invest in creative ways to increase it. Bowling Green State University (BGSU) uses extensive research and modeling to better understand retention and promote student success. Here we expand upon the use of statistical learning, particularly tree-based methods including random forest and gradient boosted trees, to model BGSU retention. In addition to traditional pre- and post-enrollment predictors, we obtain wealth, education, and density information about a student’s hometown neighborhood from the 2017 5-year summary of the American Community Survey. When analyzed along with the traditional predictors, we find that median household income, proportion of bachelor’s degree holders, population density, and distance from hometown neighborhood to BGSU are important in predicting student retention. Moreover, we present estimates of the functional relationships between these predictors and mean retention, which suggest the existence of social determinants of student retention at BGSU. Further, we estimate the distribution of a new cohort’s retention rate by predicting each student’s probability of being retained and applying Monte Carlo simulation.
Keywords and phrases: retention modeling, social trends, wealth, education, random forest, decision trees, boosted trees, partial plots, American Community Survey.
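The final step described in the abstract, turning per-student retention probabilities into a distribution for the cohort retention rate, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `simulate_retention_rate`, the probability values, the number of simulations, and the assumption that students retain independently of one another are all hypothetical choices made for illustration. In practice the probabilities would come from a fitted model such as a random forest or boosted trees.

```python
import random
import statistics


def simulate_retention_rate(probs, n_sims=2000, seed=0):
    """Simulate cohort retention rates from per-student probabilities.

    Each simulation draws one Bernoulli outcome per student (assumed
    independent) and records the fraction of students retained.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sims):
        retained = sum(1 for p in probs if rng.random() < p)
        rates.append(retained / len(probs))
    return rates


# Hypothetical predicted probabilities for a tiny cohort of 8 students.
probs = [0.90, 0.75, 0.60, 0.85, 0.70, 0.95, 0.50, 0.80]

rates = simulate_retention_rate(probs)
mean_rate = statistics.mean(rates)

# Empirical 95% interval taken from the sorted simulated rates.
sorted_rates = sorted(rates)
lower = sorted_rates[int(0.025 * len(sorted_rates))]
upper = sorted_rates[int(0.975 * len(sorted_rates)) - 1]

print(f"mean simulated retention rate: {mean_rate:.3f}")
print(f"approximate 95% interval: [{lower:.3f}, {upper:.3f}]")
```

The simulated mean should sit close to the average of the input probabilities, while the spread of `rates` gives a rough sense of how much a cohort's realized retention rate could vary around that value.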