IMPACT OF MISSING RATE AND METHOD OF IMPUTATION OF MISSING DATA ON THE SIZE ESTIMATION OF HIDDEN GROUPS USING NETWORK SCALE-UP
Network scale-up is a standard tool for estimating the size of hidden groups. Our aim was to assess the impact of missing data and of imputation methods on its results. We recruited 997 Iranians from the general population and calculated the prevalence of misuse of ten drugs. Then 10%, 30% and 50% of the data were deleted, 200 times each. Group sizes were estimated by complete-case (CC) analysis and after imputation by median replacement (MED), linear regression, negative binomial (NB) regression, and expectation-maximization (EM). For positive relative biases (RB), values > 10% were defined as severe relative bias (SRB+). For negative RBs, SRB– was defined as values < –10%. At 10% and 30% missing rates, the differences between the contributions of MED and EM to SRBs were 35% (41% versus 6%) and 10% (25% versus 10%). For MED, the majority of SRBs were SRB–, whereas for linear and NB regression the majority were SRB+. At a 10% missing rate, all methods were more likely than EM to produce SRB. As the missing rate increased, the superiority of EM over the other methods diminished. MED and linear regression were the poorest imputation methods. At a 10% missing rate, EM partially reduced bias. At moderate missing rates, however, no method performed satisfactorily.
AIDS, hidden group, missing data, network scale-up, size estimation.
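The deletion-and-imputation procedure and the SRB thresholds summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that data are deleted completely at random, uses the mean response as a stand-in for the scale-up estimate, and compares only CC and MED. The function names, the toy negative binomial data, and the label "acceptable" for biases within ±10% are hypothetical.

```python
from collections import Counter

import numpy as np


def relative_bias(estimate, reference):
    """Relative bias (in percent) of an estimate against the full-data value."""
    return 100.0 * (estimate - reference) / reference


def classify_srb(rb, threshold=10.0):
    """Label a relative bias using the +/-10% cut-offs given in the abstract."""
    if rb > threshold:
        return "SRB+"
    if rb < -threshold:
        return "SRB-"
    return "acceptable"  # hypothetical label for |RB| <= 10%


def simulate(responses, missing_rate, n_reps=200, seed=0):
    """Delete a fraction of responses n_reps times and tally SRB labels.

    Assumptions (not taken from the paper): deletion is completely at
    random, and the mean response stands in for the scale-up estimate.
    """
    rng = np.random.default_rng(seed)
    responses = np.asarray(responses, dtype=float)
    reference = responses.mean()
    n_missing = int(round(missing_rate * responses.size))
    counts = {"CC": Counter(), "MED": Counter()}
    for _ in range(n_reps):
        drop = rng.choice(responses.size, size=n_missing, replace=False)
        observed = np.delete(responses, drop)
        # Complete case: estimate from the observed responses only.
        cc_rb = relative_bias(observed.mean(), reference)
        # Median replacement: fill deleted cells with the observed median.
        imputed = responses.copy()
        imputed[drop] = np.median(observed)
        med_rb = relative_bias(imputed.mean(), reference)
        counts["CC"][classify_srb(cc_rb)] += 1
        counts["MED"][classify_srb(med_rb)] += 1
    return counts


if __name__ == "__main__":
    # Toy data: 997 hypothetical respondents with skewed count responses.
    toy = np.random.default_rng(1).negative_binomial(1, 0.5, size=997)
    for rate in (0.10, 0.30, 0.50):
        print(rate, simulate(toy, rate))
```

Regression-based and EM imputation are omitted to keep the sketch short; each would replace the median step inside the loop.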