SPLINE REGRESSION FOR ZERO-INFLATED MODELS
We propose a regression model for count data for cases in which the classical generalized linear model approach is too rigid because of a high proportion of zero counts and a non-linear influence of continuous covariates. Zero inflation accounts for the excess zeros, with separate link functions for the zero and the nonzero components. Non-linearity in covariates is captured by spline functions based on B-splines. Our algorithm relies on maximum-likelihood estimation and allows for adaptive box-constrained knots, thus improving the goodness of the spline fit and allowing for the detection of sensitivity changepoints. The AIC is shown to serve well for model selection, in particular if non-linearities are weak, such that the BIC tends to favor overly simplistic models. We fit the introduced models to real data on children's dental health, linking caries counts with the body mass index (BMI) and other socio-economic factors. This reveals a puzzling non-monotonic influence of BMI on caries counts, which is yet to be explained by clinical experts.
B-splines, count data, DMFS index, nonlinear regression, overdispersion, zero inflation.
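
To make the model structure concrete, the following minimal sketch evaluates the log-likelihood of a zero-inflated count model whose two linear predictors are both B-spline expansions of a single covariate. It is an illustration under stated assumptions, not the implementation used in this work: the count component is assumed to be Poisson with a log link, the zero component is assumed to use a logit link, the knots are held fixed (no adaptive knot placement), and all function names (`bspline_basis`, `zip_loglik`, `aic`, `bic`) are hypothetical.

```python
import math

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    # Degree 0: indicator functions of the half-open knot intervals.
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    # Close the last non-empty interval on the right so x == knots[-1] is covered.
    if x == knots[-1]:
        for i in range(len(knots) - 2, -1, -1):
            if knots[i] < knots[i + 1]:
                b[i] = 1.0
                break
    # Raise the degree step by step; zero-width intervals contribute nothing.
    for d in range(1, degree + 1):
        nb = []
        for i in range(len(knots) - d - 1):
            left = right = 0.0
            if knots[i + d] > knots[i]:
                left = (x - knots[i]) / (knots[i + d] - knots[i]) * b[i]
            if knots[i + d + 1] > knots[i + 1]:
                right = ((knots[i + d + 1] - x)
                         / (knots[i + d + 1] - knots[i + 1]) * b[i + 1])
            nb.append(left + right)
        b = nb
    return b  # length: len(knots) - degree - 1

def zip_loglik(y, x, beta, gamma, knots, degree=3):
    """Log-likelihood of a zero-inflated Poisson model (assumed form).

    beta:  spline coefficients of the count component (log link, assumed)
    gamma: spline coefficients of the zero component (logit link, assumed)
    """
    total = 0.0
    for yi, xi in zip(y, x):
        basis = bspline_basis(xi, knots, degree)
        eta_count = sum(b * c for b, c in zip(basis, beta))
        eta_zero = sum(b * c for b, c in zip(basis, gamma))
        mu = math.exp(eta_count)                 # Poisson mean
        pi = 1.0 / (1.0 + math.exp(-eta_zero))   # probability of a structural zero
        if yi == 0:
            # A zero is either structural or a Poisson zero.
            total += math.log(pi + (1.0 - pi) * math.exp(-mu))
        else:
            total += (math.log(1.0 - pi) - mu + yi * math.log(mu)
                      - math.lgamma(yi + 1))
    return total

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k log n - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Cubic basis with clamped boundary knots and one interior knot (illustrative values).
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
beta = [math.log(2.0)] * 5   # constant Poisson mean of 2
gamma = [0.0] * 5            # constant zero-inflation probability of 0.5
ll = zip_loglik([0, 1, 3], [0.1, 0.5, 1.0], beta, gamma, knots)
print(ll, aic(ll, 10), bic(ll, 10, 3))
```

In an actual fit, this log-likelihood would be maximized over `beta` and `gamma`, and, in the adaptive setting described above, over the interior knot positions within their box constraints. The `aic` and `bic` helpers show how the two criteria differ only in the penalty on the parameter count, which is why BIC penalizes additional spline coefficients more heavily once the sample size exceeds about eight observations.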