In September 2020, the research paper “A spatiotemporal approach for traffic data imputation with complicated missing patterns” was accepted by Transportation Research Part C: Emerging Technologies.
• A hybrid spatiotemporal approach for imputing missing traffic data.
• Tackles multiple missing patterns: random, clustered non-completely missing, and clustered completely missing.
• Extensive numerical tests on real-world datasets.
• Applicable even when the missing rate approaches 90%.
With the advent of intelligent transportation systems (ITS), spatiotemporal traffic data has gained growing importance in real-time monitoring, prediction, and control of traffic. However, in practical implementations, data collection devices often malfunction because of various unpredictable disruptions, resulting in the so-called “missing value problem.” In realistic cases, the disruptions to the data collection devices are often associated with key events (e.g., power cuts and natural disasters). Combined with other disruptions, these events can produce complicated missing patterns in which randomly missing and completely missing data occur together. To perform the imputation task under such complicated missing patterns, we propose a hybrid spatiotemporal method that models the time series properties with the “prophet” model and captures the spatial residual information with an iterative random forest model. The method first applies the temporal part to fill in the missing values and then applies the spatial part to estimate the residual component of the missing values. The results of the two components are combined into the final imputations. Using the PeMS freeway dataset (PeMS, 2019) and an urban road dataset under extensive artificially designed scenarios, including randomly missing, clustered non-completely missing, and completely missing patterns, we compare our proposed approach with existing techniques such as K-Nearest Neighbor (KNN), Seasonal-Trend decomposition using Loess (STL), Bayesian tensor decomposition, and Denoising AutoEncoder (DAE). The test results indicate that the hybrid method achieves the best imputation quality for most missing patterns, particularly for completely missing and hybrid missing patterns. Furthermore, the hybrid model still performs well at missing rates as high as 0.9, which supports the robustness of the model in extreme situations.
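The two-stage idea described above (a temporal fit first, then a spatial estimate of the residual, with the two summed) can be illustrated with a minimal sketch. This is not the paper's implementation: to stay dependency-free, a per-phase seasonal mean stands in for the “prophet” model, and a simple average of neighboring sensors' residuals stands in for the iterative random forest. The function names `temporal_fit` and `hybrid_impute`, the `period` parameter, and the use of `None` to mark missing readings are all assumptions made for illustration.

```python
from statistics import mean


def temporal_fit(series, period):
    """Stand-in temporal model (NOT prophet): the mean of the observed
    values at each phase of a fixed seasonal period."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("each sensor needs at least one observed value")
    overall = mean(observed)
    profile = []
    for phase in range(period):
        at_phase = [v for v in series[phase::period] if v is not None]
        # Fall back to the overall mean if this phase was never observed.
        profile.append(mean(at_phase) if at_phase else overall)
    return [profile[t % period] for t in range(len(series))]


def hybrid_impute(data, period):
    """Impute missing readings as temporal fit + spatial residual.

    data: list of equal-length per-sensor series; None marks a missing value.
    Stand-in spatial model (NOT an iterative random forest): the mean
    residual of the other sensors at the same time step.
    """
    fits = [temporal_fit(series, period) for series in data]
    # Residuals of the temporal fit, defined only where data was observed.
    residuals = [
        [None if v is None else v - f for v, f in zip(series, fit)]
        for series, fit in zip(data, fits)
    ]
    imputed = []
    for i, series in enumerate(data):
        row = []
        for t, value in enumerate(series):
            if value is not None:
                row.append(value)  # observed values are kept unchanged
                continue
            neighbors = [
                residuals[j][t]
                for j in range(len(data))
                if j != i and residuals[j][t] is not None
            ]
            # With no observed neighbor, use the temporal fit alone.
            spatial = mean(neighbors) if neighbors else 0.0
            row.append(fits[i][t] + spatial)
        imputed.append(row)
    return imputed
```

In this sketch the spatial stage only corrects the temporal estimate: if every sensor is missing at a given time step, the imputation falls back to the temporal fit alone, which loosely mirrors why a temporal component matters when whole clusters of readings are lost.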