Abstract: Missing data are common and difficult to handle in most research, especially in longitudinal studies, which involve continuous, repeated measures over time, and may lead to biased and inefficient inferences if treated incorrectly. The Bayesian method has gained considerable attention in the literature as a natural and efficient way to deal with the missing data problem. This study examines recent advances in, and implementations of, Bayesian methods for the two forms of missing data: ignorable and non-ignorable. Based on the available literature, missing data mechanisms and a Bayesian framework for dealing with missing data are first presented, followed by missing data models under ignorable and non-ignorable missingness. The fundamental components of Bayesian inference are then discussed, including prior construction and posterior computation in the R programming language. The study concludes that, when missing data are ignorable, the Bayesian method is efficient relative to the classical regression method at larger sample sizes. Under non-ignorable missingness, however, the missing values should be modeled first, before the Bayesian method is applied. Finally, a variety of open problems that warrant further investigation are summarized.
Keywords and phrases: ignorable and non-ignorable missing data, longitudinal studies, Bayesian method, classical regression method.
Received: June 9, 2022; Accepted: July 20, 2022; Published: August 1, 2022
How to cite this article: Azman A. Nads and Daisy Lou L. Polestico, Bayesian imputation for missing data, Advances and Applications in Statistics 79 (2022), 83-104. http://dx.doi.org/10.17654/0972361722061
This Open Access article is licensed under the Creative Commons Attribution 4.0 International License.