Holidays & Special Dates
Guide to using holiday calendar variables and special dates to improve forecast accuracy in time series.
What Are Holiday Variables and Special Dates?
Special dates, such as holidays, promotions, or significant events, often cause notable deviations from normal patterns in your time series. By incorporating these special dates into your forecasting model, you can better capture these expected variations and improve prediction accuracy.
How to Add Holiday Variables and Special Dates
Step 1: Import Packages
Import the required libraries and initialize the Nixtla client.
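A minimal sketch of this step, assuming the `nixtla` Python package is installed and that the API key lives in an environment variable named `NIXTLA_API_KEY` (an assumption; any secure source for the key works). The helper name `make_client` is hypothetical, and the import sits inside the function only so that the snippet loads even where `nixtla` is missing.

```python
import os


def make_client(api_key=None):
    """Return a NixtlaClient, reading the key from NIXTLA_API_KEY if none is given."""
    key = api_key or os.environ.get("NIXTLA_API_KEY")
    if not key:
        raise ValueError("Provide api_key or set the NIXTLA_API_KEY environment variable.")

    # Imported lazily so this snippet can be loaded without nixtla installed.
    from nixtla import NixtlaClient

    return NixtlaClient(api_key=key)
```

With a key in place, `nixtla_client = make_client()` gives the client used in the remaining steps.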
Step 2: Load Data
We use a Google Trends dataset on “chocolate” with monthly frequency:
| | month | chocolate |
---|---|---|
0 | 2004-01-31 | 35 |
1 | 2004-02-29 | 45 |
2 | 2004-03-31 | 28 |
3 | 2004-04-30 | 30 |
4 | 2004-05-31 | 29 |
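The sketch below shows the expected shape of the data, using only the five rows printed above as an inline sample (the location of the full file is not given here, so none is assumed). It parses the `month` column as timestamps and checks that every date is a month end.

```python
import io

import pandas as pd

# First five rows of the table above, used as a stand-in for the full dataset.
SAMPLE_CSV = """month,chocolate
2004-01-31,35
2004-02-29,45
2004-03-31,28
2004-04-30,30
2004-05-31,29
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["month"])

# Month-end timestamps keep the history aligned with the future dates built later.
if not df["month"].dt.is_month_end.all():
    raise ValueError("Expected month-end dates in the 'month' column.")

print(df.head())
```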
Step 3: Create a Future Dataframe
When adding exogenous variables (like holidays) to time series forecasting, we need a future DataFrame because:
- Historical data already exists: Our training data contains past values of both the target variable and exogenous features
- Future exogenous features are known: Unlike the target variable, we can determine future values of exogenous features (like holidays) in advance
For example, we know that Christmas will occur on December 25th next year, so we can include this information in our future DataFrame to help the model understand seasonal patterns during the forecast period.
Start by creating a future DataFrame with 14 month-end dates, running from May 2024 through June 2025.
| | month |
---|---|
9 | 2025-02-28 00:00:00 |
10 | 2025-03-31 00:00:00 |
11 | 2025-04-30 00:00:00 |
12 | 2025-05-31 00:00:00 |
13 | 2025-06-30 00:00:00 |
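One way to build these dates with pandas is sketched below. It assumes the last observed month is 2024-04-30, as in the historical table shown later in this guide; the variable names are illustrative.

```python
import pandas as pd

# Last observed month in the history (assumed to be 2024-04-30).
last_observed = pd.Timestamp("2024-04-30")

# Passing the MonthEnd offset object avoids the "M" vs "ME" alias
# difference between pandas versions.
future_dates = pd.date_range(
    start=last_observed + pd.offsets.MonthEnd(1),
    periods=14,
    freq=pd.offsets.MonthEnd(),
)
future_df = pd.DataFrame({"month": future_dates})

print(future_df.tail())
```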
Step 4: Forecast with Holidays and Special Dates
TimeGPT automatically generates standard date-based features (like month, day of week, etc.) during forecasting. For more specialized temporal patterns, you can manually add holiday indicators to both your historical and future datasets.
Create a Function to Add Date Features
To make it easier to add date features to a DataFrame, we’ll create the add_date_features_to_DataFrame function that takes:
- A pandas DataFrame
- A date extractor function, which can be CountryHolidays or SpecialDates
- A time column name
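The function body is not reproduced here, so what follows is a sketch rather than the original implementation. It assumes the extractor is a callable that takes a daily `pandas.DatetimeIndex` and returns a DataFrame of 0/1 columns indexed by those dates. Because holidays rarely fall on a month-end timestamp, the sketch evaluates the extractor on every day and flags a month if any of its days is flagged.

```python
import pandas as pd


def add_date_features_to_DataFrame(df, date_extractor, time_col="month"):
    """Append monthly 0/1 columns produced by a daily date extractor."""
    out = df.copy()
    out[time_col] = pd.to_datetime(out[time_col])
    months = out[time_col].dt.to_period("M")

    # Evaluate the extractor on every calendar day covered by the data.
    days = pd.date_range(
        months.min().start_time,
        months.max().end_time.normalize(),
        freq="D",
    )
    daily = date_extractor(days)

    # A month is flagged if any of its days is flagged.
    monthly = daily.groupby(daily.index.to_period("M")).max()

    # Line the monthly flags up with the rows of the input frame.
    features = monthly.reindex(pd.PeriodIndex(months))
    features.index = out.index
    return pd.concat([out, features], axis=1)
```

Taking the maximum per month is a design choice: it answers "does this month contain the date?", which matches the binary indicators shown in the tables of this guide.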
Add Holiday Features
To add holiday features, we’ll use the CountryHolidays class to compute US holidays and merge them into the future DataFrame.
| | month | US_New Year’s Day | US_Memorial Day | US_Juneteenth National Independence Day | US_Independence Day | US_Labor Day | US_Veterans Day | US_Thanksgiving Day | US_Christmas Day | US_Martin Luther King Jr. Day | US_Washington’s Birthday | US_Columbus Day |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2024-05-31 00:00:00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 2024-06-30 00:00:00 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 2024-07-31 00:00:00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 2024-08-31 00:00:00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 2024-09-30 00:00:00 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
This DataFrame now includes columns for each identified US holiday as binary indicators.
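As a sketch of the extractor itself, assuming `nixtla` and its optional holidays dependency are installed: `CountryHolidays` is constructed with a list of country codes and then called on a `DatetimeIndex`. It appears to match exact calendar dates, which is why monthly data are usually expanded to daily dates before the call. The wrapper name `us_holiday_flags` is hypothetical.

```python
def us_holiday_flags(dates):
    """Return one 0/1 column per US holiday for the given DatetimeIndex."""
    # Imported lazily so this snippet can be loaded without nixtla installed.
    from nixtla.date_features import CountryHolidays

    extractor = CountryHolidays(countries=["US"])
    return extractor(dates)
```

Calling it on a daily range, for example `us_holiday_flags(pd.date_range("2024-07-01", "2024-07-31"))`, should yield columns prefixed with `US_`, like those in the table above.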
Next, add holiday indicators to the historical DataFrame.
| | month | chocolate | US_New Year’s Day | US_New Year’s Day (observed) | US_Memorial Day | US_Independence Day | US_Independence Day (observed) | US_Labor Day | US_Veterans Day | US_Thanksgiving Day | US_Christmas Day | US_Christmas Day (observed) | US_Martin Luther King Jr. Day | US_Washington’s Birthday | US_Columbus Day | US_Veterans Day (observed) | US_Juneteenth National Independence Day | US_Juneteenth National Independence Day (observed) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
239 | 2023-12-31 00:00:00 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
240 | 2024-01-31 00:00:00 | 64 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
241 | 2024-02-29 00:00:00 | 66 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
242 | 2024-03-31 00:00:00 | 59 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
243 | 2024-04-30 00:00:00 | 51 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Now, your historical DataFrame also contains holiday flags for each month.
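The two tables above do not list the same columns: the historical DataFrame has "(observed)" variants that are absent from the future DataFrame, because no such dates occur in the forecast window. Whether the forecast call tolerates this mismatch is not stated here, so the sketch below aligns the columns explicitly. The helper name `align_future_features` is hypothetical, and filling the missing columns with zeros assumes that a missing column means the holiday does not occur in the horizon.

```python
import pandas as pd


def align_future_features(
    hist_df: pd.DataFrame,
    future_df: pd.DataFrame,
    time_col: str = "month",
    target_col: str = "chocolate",
) -> pd.DataFrame:
    """Give future_df exactly the feature columns of hist_df, in the same order."""
    feature_cols = [c for c in hist_df.columns if c not in (time_col, target_col)]
    # Features missing from the horizon become all-zero columns;
    # columns that exist only in future_df are dropped.
    return future_df.reindex(columns=[time_col, *feature_cols], fill_value=0)
```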
Finally, forecast with the holiday features.
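A sketch of the forecast call, assuming `client` is an initialized `NixtlaClient`, the historical frame holds the target plus the holiday columns, and the future frame holds the same time column plus the holiday columns. The wrapper name `forecast_with_features` is hypothetical. Passing `feature_contributions=True` asks the client to keep SHAP values for later inspection.

```python
import pandas as pd


def forecast_with_features(
    client,
    hist_df: pd.DataFrame,
    future_df: pd.DataFrame,
    time_col: str = "month",
    target_col: str = "chocolate",
):
    """Forecast one step per row of future_df, using its columns as exogenous features."""
    return client.forecast(
        df=hist_df,
        h=len(future_df),  # horizon equals the number of future dates
        time_col=time_col,
        target_col=target_col,
        X_df=future_df,
        feature_contributions=True,  # keep SHAP values for later inspection
    )
```

Tying the horizon to `len(future_df)` keeps the two from drifting apart when the number of future dates changes.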
Plot the forecast with holiday effects.
We can then plot the contribution of each holiday to see which ones matter most when forecasting interest in chocolate. We will use the SHAP library to plot these contributions.
For more details on how to use the shap library, see our tutorial on model interpretability.
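Before plotting, the same ranking can be computed directly with pandas. The sketch below assumes the per-step contributions are available as a DataFrame (for example from `nixtla_client.feature_contributions` after a forecast with `feature_contributions=True`) and that its non-feature columns are named `month`, `TimeGPT`, and `base_value`; those names are assumptions, so adjust them to your output. The function name is hypothetical.

```python
import pandas as pd


def rank_feature_contributions(
    contrib_df: pd.DataFrame,
    non_feature_cols=("month", "TimeGPT", "base_value"),
) -> pd.Series:
    """Return the mean absolute contribution per feature, largest first."""
    feature_cols = [c for c in contrib_df.columns if c not in non_feature_cols]
    return contrib_df[feature_cols].abs().mean().sort_values(ascending=False)
```

The mean absolute SHAP value per feature is the statistic a SHAP bar plot displays, so this ranking should agree with the order of the bars.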
The SHAP values indicate that Christmas, Independence Day, and Labor Day have the strongest influence on the chocolate interest forecast. These holidays carry the largest feature contributions, which suggests that interest shifts noticeably in the months that contain them. This is plausible, since major US holidays are associated with gift-giving, celebrations, and seasonal consumption. Because the data are monthly, each flag marks the whole month containing the holiday, so a contribution can also reflect that month's broader seasonal pattern.
Add Special Dates
Beyond country holidays, you can create custom special dates with SpecialDates. These can represent unique one-time events or recurring patterns on specific dates of your choice.
Assume we already have a future DataFrame with monthly dates. We’ll create Valentine’s Day and Halloween as custom special dates and add them to the future DataFrame.
| | month | Valentine_season | Halloween_season |
---|---|---|---|
0 | 2024-05-31 00:00:00 | 0 | 0 |
1 | 2024-06-30 00:00:00 | 0 | 0 |
2 | 2024-07-31 00:00:00 | 0 | 0 |
3 | 2024-08-31 00:00:00 | 0 | 0 |
4 | 2024-09-30 00:00:00 | 0 | 0 |
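A sketch of how such special dates might be defined, assuming `SpecialDates` accepts a dictionary that maps each feature name to a list of date strings. The column names are taken from the table above. The "_season" suffix suggests the original may flag a range of days, whereas this sketch flags only February 14 and October 31 of each year, which is an assumption. Both function names are hypothetical.

```python
def build_special_dates(first_year, last_year):
    """Map each feature name to its calendar dates for every year in the range."""
    years = range(first_year, last_year + 1)
    return {
        "Valentine_season": [f"{year}-02-14" for year in years],
        "Halloween_season": [f"{year}-10-31" for year in years],
    }


def make_special_dates_extractor(first_year=2004, last_year=2025):
    """Build a SpecialDates extractor covering the history and the forecast horizon."""
    # Imported lazily so this snippet can be loaded without nixtla installed.
    from nixtla.date_features import SpecialDates

    return SpecialDates(special_dates=build_special_dates(first_year, last_year))
```

The default years span the first observation (2004) through the end of the forecast horizon (2025), so the same extractor can serve both the historical and the future DataFrame.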
We will also add custom special dates to the historical DataFrame.
| | month | chocolate | Valentine_season | Halloween_season |
---|---|---|---|---|
239 | 2023-12-31 00:00:00 | 90 | 0 | 0 |
240 | 2024-01-31 00:00:00 | 64 | 0 | 0 |
241 | 2024-02-29 00:00:00 | 66 | 1 | 0 |
242 | 2024-03-31 00:00:00 | 59 | 0 | 0 |
243 | 2024-04-30 00:00:00 | 51 | 0 | 0 |
Now, forecast with the special date features.
Plot the forecast with special date effects.
Examine the feature importance of the special dates.
The SHAP values indicate that Valentine’s Day has the strongest positive impact on the forecast of interest in chocolate. This aligns with consumer behavior, as chocolate is a popular gift during Valentine’s Day celebrations.
Congratulations! You have successfully integrated holiday and special date features into your time series forecasts. Use these steps as a starting point for further experimentation with advanced date features.