Advances and Applications in Statistics
Volume 64, Issue 1, Pages 87 - 102
(September 2020) http://dx.doi.org/10.17654/AS064010087
IMPUTATION FOR CONSECUTIVE MISSING VALUES IN NON-STATIONARY TIME SERIES DATA
Chantha Wongoutong
Abstract: Missing data have a significant effect on forecasting from time series data. Since many applications require complete data, missing values must be imputed before further data processing is possible. Several methods to account for missing data have been proposed, but the appropriate imputation method depends on the type of time series and the pattern of the missing data. Simple methods such as mean or moving average (MA) imputation do not perform well in complex situations such as a non-stationary time series in which both trend and seasonality exist. This study focuses on handling consecutive missing values in non-stationary time series with both trend and seasonality by deseasonalizing the data and then applying interpolation (DES-I) or Kalman (DES-K) imputation, using na_seadec in the imputeTS R package. Five real datasets were used to evaluate the performance of the imputation methods under three scenarios in which sequences of artificially missing data were created in the time series at missing rates of 10%, 20% and 50%. The performances of traditional imputation methods (interpolation, Kalman, MA, last observation carried forward, mean, and linear trend at point) were compared with those of DES-I and DES-K. In terms of RMSE and MAPE, the two methods (DES-I and DES-K) were far superior to the six traditional imputation methods, with improvements on the order of 60-80%. Hence, deseasonalizing is a necessary step before imputing missing values in time series data exhibiting both trend and seasonality.
Keywords and phrases: imputation method, consecutive missing values, non-stationary time series.
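
The deseasonalize-then-interpolate idea behind DES-I can be illustrated with a minimal Python sketch. This is not the na_seadec implementation from imputeTS: the function names, the additive seasonal model, and the straight-line trend fit used to estimate the seasonal indices are all assumptions made for illustration only. The sketch removes an estimated seasonal component, linearly interpolates the gap in the deseasonalized series, and then adds the seasonal component back.

```python
def linear_interpolate(values):
    """Fill None gaps linearly between the nearest observed neighbours.
    Leading/trailing gaps are filled with the nearest observed value."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def fit_line(points):
    """Ordinary least-squares line through (t, y) pairs: returns (slope, intercept)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def seasonal_indices(values, period):
    """Assumed additive model: average the detrended observed values
    at each position within the seasonal cycle."""
    observed = [(t, y) for t, y in enumerate(values) if y is not None]
    slope, intercept = fit_line(observed)
    sums = [0.0] * period
    counts = [0] * period
    for t, y in observed:
        sums[t % period] += y - (slope * t + intercept)
        counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def deseasonalized_interpolation(values, period):
    """Deseasonalize, interpolate the gaps, then restore seasonality.
    Observed values are returned unchanged."""
    season = seasonal_indices(values, period)
    deseason = [None if y is None else y - season[t % period]
                for t, y in enumerate(values)]
    filled = linear_interpolate(deseason)
    return [y if y is not None else filled[t] + season[t % period]
            for t, y in enumerate(values)]


if __name__ == "__main__":
    # Synthetic series (hypothetical data): linear trend plus a period-4 pattern,
    # with four consecutive values removed.
    pattern = [5.0, -5.0, -5.0, 5.0]
    series = [2.0 * t + pattern[t % 4] for t in range(24)]
    gappy = [None if 8 <= t <= 11 else y for t, y in enumerate(series)]
    print(deseasonalized_interpolation(gappy, period=4)[8:12])
    print(linear_interpolate(gappy)[8:12])
```

Plain linear interpolation draws a straight line across the gap and so loses the seasonal swing inside it, whereas the deseasonalized variant interpolates only the smoother trend-like part. With imputeTS in R, the methods named in the abstract are selected through the algorithm argument of na_seadec.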