Walk-Forward Optimization — Cross-Validation Technique for Time-Series Data
The basic theory and implementation of walk-forward optimization as a cross-validation technique for time-series data
After reading this short article, you will understand the basic theory and implementation of walk-forward optimization for time-series modelling. It also answers a common question: why should data scientists apply walk-forward optimization to their time-series data?
Furthermore, in the last section, we will compare walk-forward optimization with other cross-validation techniques commonly used for cross-sectional data, such as k-fold. Can walk-forward optimization make a significant impact on model performance?
Keep reading and enjoy the trip!
Before going deeper into walk-forward optimization, let’s talk about time-series data. Why does it differ from cross-sectional data? Basically, time-series data is a type of data in which the observations are collected sequentially over time. For instance, Covid-19 data runs from the start of the outbreak in January 2020 until now. Each day, or maybe each hour, the data is collected and stored in a database. One consequence is autocorrelation: an observation tends to be correlated with the observations that precede it.
Autocorrelation is often found in time-series data. However, not all time-series data has high autocorrelation.
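To make the idea of autocorrelation concrete, here is a minimal sketch in plain Python. The helper `autocorrelation` and the three toy series are hypothetical illustrations, not data from this article; the function uses the standard sample autocorrelation estimate at a given lag.

```python
import random


def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag (hypothetical helper)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Covariance between each observation and the one `lag` steps later.
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Toy series (assumptions for illustration only).
trend = [float(t) for t in range(50)]                        # steadily rising
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(50)]  # flips sign each step
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]            # independent draws

print("trend:", round(autocorrelation(trend), 3))
print("alternating:", round(autocorrelation(alternating), 3))
print("noise:", round(autocorrelation(noise), 3))
```

The trending series gives a lag-1 value close to 1, the alternating series a value close to -1, and the independent noise a value near 0, which is the case where a time series carries little autocorrelation.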
Let’s move on to the main topic: walk-forward optimization (WFO). According to Carta et al. (2021), walk-forward optimization is a popular technique used by analysts to make decisions in stock trading. Like other cross-validation techniques, WFO trains and evaluates the model repeatedly on different parts of the data so that it stays robust to future trends. The difference lies in how WFO splits the data into groups: it preserves the order of the observations in time.
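The order-preserving split can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function `walk_forward_splits` and its parameters `train_size`, `test_size`, and `anchored` are hypothetical names. The sketch assumes the usual reading of the two windowing styles, where an anchored split keeps the start of the training window fixed and a non-anchored split slides a fixed-size training window forward.

```python
def walk_forward_splits(n_samples, train_size, test_size, anchored=False):
    """Return a list of (train_indices, test_indices) pairs in time order.

    Hypothetical helper: every training index comes before every test index,
    so the model never sees data from the future of its test window.
    """
    if min(n_samples, train_size, test_size) <= 0:
        raise ValueError("all sizes must be positive")
    splits = []
    test_start = train_size
    while test_start + test_size <= n_samples:
        # Anchored: training always starts at 0 and grows.
        # Non-anchored: training window has a fixed length and slides forward.
        train_start = 0 if anchored else test_start - train_size
        train_idx = list(range(train_start, test_start))
        test_idx = list(range(test_start, test_start + test_size))
        splits.append((train_idx, test_idx))
        test_start += test_size
    return splits


rolling = walk_forward_splits(10, train_size=4, test_size=2)
expanding = walk_forward_splits(10, train_size=4, test_size=2, anchored=True)

for train_idx, test_idx in rolling:
    print("rolling  ", train_idx, "->", test_idx)
for train_idx, test_idx in expanding:
    print("expanding", train_idx, "->", test_idx)
```

With 10 observations, a training window of 4, and a test window of 2, both variants produce three splits. Unlike k-fold, no split ever trains on observations that come after its test window.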
WFO commonly comes in two variants: anchored and non-anchored.
- Anchored WFO — in this scenario, the training set…