Scale Time Series Forecasting: Replace Manual Parameter Testing with AutoARIMA
How much time does your team waste on parameter tuning? Manual ARIMA modeling requires weeks of statistical testing and expertise, and it doesn't scale beyond a single series.
AutoARIMA eliminates manual parameter testing through automated optimization. It handles parameter selection, seasonality detection, and uncertainty quantification automatically. Complete 50-series forecasting in minutes, not weeks.
Introduction to StatsForecast
StatsForecast provides fast classical statistical models with automatic parameter selection. The library offers:
- Automatic model selection: AutoARIMA, AutoETS, and AutoTheta with intelligent parameter optimization
- Scalable processing: Handle millions of time series with distributed computing and optimized algorithms
- Unified interface: Consistent API across different forecasting methods
- Built-in validation: Time series cross-validation for reliable model evaluation
- Prediction intervals: Confidence intervals for uncertainty quantification
Install StatsForecast with a single command:
pip install statsforecast
For this tutorial, install additional dependencies:
pip install pandas numpy matplotlib
The source code for this tutorial is available on GitHub.
Setup - Retail Sales Scenario
Import the necessary libraries and create our retail forecasting scenario:
import pandas as pd
import numpy as np
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, AutoETS, SeasonalNaive
import matplotlib.pyplot as plt
We will create a controlled retail dataset with realistic business patterns:
- 5 product categories: electronics, clothing, home & garden, sports, books
- 4-year timespan: 2020-2023 monthly sales data (240 observations)
- Realistic patterns: trend growth, seasonal cycles, and category-specific scaling factors
View the code to create the dataset in the source code.
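The repo's generator is not reproduced here, but a dataset with the shape described above can be sketched as follows. The trend, seasonal, and scaling coefficients below are illustrative assumptions, not the repo's actual values:

```python
import numpy as np
import pandas as pd

# Illustrative synthetic retail dataset (coefficients are assumptions, not the repo's values)
rng = np.random.default_rng(42)
categories = ["electronics", "clothing", "home_garden", "sports", "books"]
dates = pd.date_range("2020-01-01", "2023-12-01", freq="MS")  # 48 monthly stamps

frames = []
for i, cat in enumerate(categories):
    t = np.arange(len(dates))
    trend = 50_000 + 300 * t                       # steady growth
    seasonal = 8_000 * np.sin(2 * np.pi * t / 12)  # annual cycle
    noise = rng.normal(0, 2_000, len(dates))
    scale = 1.0 + 0.1 * i                          # category-specific scaling factor
    frames.append(pd.DataFrame({"unique_id": cat, "ds": dates, "y": scale * (trend + seasonal) + noise}))

retail_df = pd.concat(frames, ignore_index=True)
print(retail_df.shape)  # (240, 3)
```

The long format with `unique_id`, `ds`, and `y` columns is what StatsForecast expects as input.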
Here is a sample of the dataset:
| unique_id | ds | y |
|---|---|---|
| electronics | 2020-01-01 | 62980.284918 |
| electronics | 2020-02-01 | 65936.371640 |
| electronics | 2020-03-01 | 75810.350968 |
| electronics | 2020-04-01 | 83436.051479 |
| electronics | 2020-05-01 | 72051.214384 |
AutoARIMA for Automatic Model Selection
Traditional ARIMA modeling requires extensive manual parameter testing. For example, the nested loops below test every ARIMA parameter combination:
- Six nested loops generate all possible (p,d,q)(P,D,Q) combinations
- Each iteration instantiates a new ARIMA object
- AIC comparison tracks the lowest information criterion score
The exhaustive parameter search fits 144 models per series (3×2×3×2×2×2 combinations). This time-intensive process doesn't scale to multiple series.
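The 144-model count can be sanity-checked by enumerating the grid directly (a quick sketch using the same search limits as the manual code):

```python
from itertools import product

# Enumerate the (p, d, q)(P, D, Q) grid that the nested loops walk:
# p and q up to 2, d in {0, 1}, seasonal terms in {0, 1}
p_vals, d_vals, q_vals = range(3), range(2), range(3)
P_vals, D_vals, Q_vals = range(2), range(2), range(2)

combos = list(product(p_vals, d_vals, q_vals, P_vals, D_vals, Q_vals))
print(len(combos))  # 144
```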
from statsmodels.tsa.arima.model import ARIMA

def manual_arima_approach(data, max_p=2, max_q=2, max_d=1, max_P=1, max_Q=1):
    """Conceptual overview of manual ARIMA selection (3x2x3x2x2x2 = 144 fits)."""
    best_aic = float("inf")
    best_params = None
    # Test all parameter combinations manually with nested loops
    for p in range(max_p + 1):
        for d in range(max_d + 1):
            for q in range(max_q + 1):
                for P in range(max_P + 1):
                    for D in range(2):
                        for Q in range(max_Q + 1):
                            model = ARIMA(
                                data,
                                order=(p, d, q),
                                seasonal_order=(P, D, Q, 12),
                            )
                            aic = model.fit().aic
                            if aic < best_aic:
                                best_aic = aic
                                best_params = (p, d, q, P, D, Q)
    return best_params, best_aic
Manual ARIMA parameter selection for electronics category:
Testing parameter combinations manually...
==================================================
ARIMA(0,0,0)(0,0,0)[12]: AIC = 1045.73
ARIMA(0,0,0)(0,0,1)[12]: AIC = 1034.88
ARIMA(0,0,0)(0,1,0)[12]: AIC = 777.79
ARIMA(0,0,0)(0,1,1)[12]: AIC = 787.59
ARIMA(0,0,0)(1,0,0)[12]: AIC = 1037.25
... (tested 144 combinations)
Best: ARIMA(0, 1, 0)x(0, 1, 1)[12], AIC = 740.28
The manual approach shows rapid AIC improvements from 1045.73 to 777.79 in early iterations, but requires testing all 144 combinations to ensure the optimal model isn't missed.
AutoARIMA transforms weeks of manual parameter tuning into seconds of automatic optimization. A single fit() call replaces the nested loops with an optimized stepwise search.
# AutoARIMA with constrained search space for fair comparison with manual approach
sf_fair = StatsForecast(
    models=[
        AutoARIMA(
            season_length=12,  # Annual seasonality
            max_p=2, max_q=2,  # Match manual search limits
            max_P=1, max_Q=1,  # Match manual seasonal limits
            seasonal=True      # Enable seasonal components
        )
    ],
    freq='MS'
)
Let's compare the performance of the manual approach against AutoARIMA with the same constrained search space.
import time

electronics_series = retail_df[retail_df["unique_id"] == "electronics"]

# Time AutoARIMA; manual_execution_time was captured the same way around manual_arima_approach
start_time = time.time()
sf_fair.fit(electronics_series)
fair_execution_time = time.time() - start_time

print(f"Manual approach (statsmodels): {manual_execution_time:.2f}s for 144 combinations")
print(f"AutoARIMA (statsforecast): {fair_execution_time:.2f}s for 144 combinations")
print(f"Algorithm efficiency gain: {manual_execution_time/fair_execution_time:.1f}x faster")
Manual approach (statsmodels): 7.02s for 144 combinations
AutoARIMA (statsforecast): 0.36s for 144 combinations
Algorithm efficiency gain: 19.5x faster
AutoARIMA achieves a 19.5x speedup over the manual approach on the same search space.
Multiple Model Comparison and Ensemble
Compare AutoARIMA with other automatic methods to validate performance:
# Configure multiple automatic models for comparison
sf_comparison = StatsForecast(
    models=[
        AutoARIMA(season_length=12, stepwise=True),
        AutoETS(season_length=12),       # Exponential smoothing
        SeasonalNaive(season_length=12)  # Simple seasonal baseline
    ],
    freq='MS'
)
Split the data into training and testing sets:
# Split data for training and testing (use 80% for training)
split_date = retail_df["ds"].quantile(0.8)
train_data = retail_df[retail_df["ds"] <= split_date]
test_data = retail_df[retail_df["ds"] > split_date]
print(f"Training period: {train_data['ds'].min()} to {train_data['ds'].max()}")
print(f"Test period: {test_data['ds'].min()} to {test_data['ds'].max()}")
Fit all models on the training data:
# Fit all models simultaneously
sf_comparison.fit(train_data)
print("All models fitted for comparison")
Training period: 2020-01-01 00:00:00 to 2023-03-01 00:00:00
Test period: 2023-04-01 00:00:00 to 2023-12-01 00:00:00
All models fitted for comparison
Generate forecasts from all models:
# Generate forecasts for comparison
forecasts = sf_comparison.predict(h=7) # 7-month ahead forecasts
print(f"Forecast shape: {forecasts.shape}")
forecasts.head()
Forecast shape: (35, 5)
| unique_id | ds | AutoARIMA | AutoETS | SeasonalNaive |
|---|---|---|---|---|
| books | 2023-04-01 | 60523.191642 | 55534.897267 | 54350.389275 |
| books | 2023-05-01 | 61275.648592 | 55534.897267 | 61674.121108 |
| books | 2023-06-01 | 59179.501052 | 55534.897267 | 44922.018634 |
| books | 2023-07-01 | 55284.722061 | 55534.897267 | 50806.165985 |
| books | 2023-08-01 | 48753.993447 | 55534.897267 | 39706.558281 |
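A simple equal-weight ensemble can be built from this output by averaging the model columns row-wise. The sketch below uses a small two-row stand-in for the `predict(h=7)` frame above:

```python
import pandas as pd

# Equal-weight ensemble: average the model columns of a forecasts frame
# (toy rows stand in for the sf_comparison.predict(h=7) output)
forecasts = pd.DataFrame({
    "unique_id": ["books", "books"],
    "ds": pd.to_datetime(["2023-04-01", "2023-05-01"]),
    "AutoARIMA": [60523.19, 61275.65],
    "AutoETS": [55534.90, 55534.90],
    "SeasonalNaive": [54350.39, 61674.12],
})

model_cols = ["AutoARIMA", "AutoETS", "SeasonalNaive"]
forecasts["Ensemble"] = forecasts[model_cols].mean(axis=1)
print(forecasts[["ds", "Ensemble"]])
```

Equal weighting is the simplest choice; weights derived from cross-validation error are a common refinement.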
Prediction Intervals and Uncertainty Quantification
Prediction intervals quantify forecast uncertainty by providing upper and lower bounds around point predictions. They answer critical business questions: "What's the worst-case scenario for inventory planning?" and "How confident should we be in these sales projections?"
Unlike point forecasts that give single values, prediction intervals capture the inherent uncertainty in future outcomes. A 95% prediction interval means that if you repeated the forecasting process many times, about 95% of those intervals would contain the actual future value.
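That coverage claim can be checked with a quick simulation (a sketch assuming normally distributed forecast errors, which is the assumption behind ARIMA's analytic intervals):

```python
import numpy as np

# For normally distributed outcomes, the interval mean +/- 1.96*sigma
# should contain the realized value about 95% of the time
rng = np.random.default_rng(0)
mean, sigma = 100.0, 2.0
actuals = rng.normal(loc=mean, scale=sigma, size=100_000)

lo, hi = mean - 1.96 * sigma, mean + 1.96 * sigma
coverage = np.mean((actuals >= lo) & (actuals <= hi))
print(f"{coverage:.3f}")  # close to 0.95
```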
AutoARIMA generates prediction intervals automatically through its predict() method. To request them, specify confidence levels with the level parameter (e.g., level=[80, 95]).
First, create and configure the AutoARIMA model for generating prediction intervals:
# Create AutoARIMA model for prediction intervals
sf_auto = StatsForecast(
    models=[AutoARIMA(season_length=12, stepwise=True)],
    freq='MS'
)
Next, fit the model on training data and generate forecasts with multiple confidence levels:
# Fit on training data and generate forecasts with multiple confidence levels
sf_auto.fit(train_data)
forecasts_with_intervals = sf_auto.predict(
    h=12,                   # 12-month forecast horizon
    level=[50, 80, 90, 95]  # Multiple confidence levels
)
print(f"Forecast columns: {forecasts_with_intervals.columns.tolist()}")
Forecast columns: ['unique_id', 'ds', 'AutoARIMA', 'AutoARIMA-lo-95', 'AutoARIMA-lo-90', 'AutoARIMA-lo-80', 'AutoARIMA-lo-50', 'AutoARIMA-hi-50', 'AutoARIMA-hi-80', 'AutoARIMA-hi-90', 'AutoARIMA-hi-95']
Finally, examine the forecast results with confidence intervals for the electronics category:
sample_forecast = forecasts_with_intervals[
    forecasts_with_intervals["unique_id"] == "electronics"
].head()
sample_forecast[["ds", "AutoARIMA", "AutoARIMA-lo-95", "AutoARIMA-hi-95"]]
| ds | AutoARIMA | AutoARIMA-lo-95 | AutoARIMA-hi-95 |
|---|---|---|---|
| 2023-04-01 | 99350.422989 | 87049.266213 | 111651.579766 |
| 2023-05-01 | 94747.079991 | 82445.923214 | 107048.236767 |
| 2023-06-01 | 97033.851200 | 84732.694424 | 109335.007977 |
| 2023-07-01 | 86302.421665 | 74001.264889 | 98603.578442 |
| 2023-08-01 | 83997.249910 | 71696.093133 | 96298.406686 |
Visualize predictions with uncertainty bands:
# Visualize forecasts with confidence intervals
def plot_forecasts_with_intervals(category_name):
    # Get the last two years of history plus the forecast for this category
    historical = train_data[train_data["unique_id"] == category_name].tail(24)
    forecast = forecasts_with_intervals[forecasts_with_intervals["unique_id"] == category_name]

    # Create plot
    plt.figure(figsize=(12, 6))

    # Plot history, point forecast, and uncertainty bands
    plt.plot(historical["ds"], historical["y"], label="Historical Sales")
    plt.plot(forecast["ds"], forecast["AutoARIMA"], label="AutoARIMA Forecast")
    plt.fill_between(forecast["ds"], forecast["AutoARIMA-lo-95"], forecast["AutoARIMA-hi-95"],
                     alpha=0.2, label="95% Confidence")
    plt.fill_between(forecast["ds"], forecast["AutoARIMA-lo-80"], forecast["AutoARIMA-hi-80"],
                     alpha=0.3, label="80% Confidence")

    # Labels and legend
    plt.title(f"{category_name.title()} Sales Forecast with Confidence Intervals")
    plt.ylabel("Sales ($)")
    plt.xlabel("Date")
    plt.legend()
    plt.show()

# Plot forecasts for electronics category
plot_forecasts_with_intervals("electronics")
The electronics forecast captures seasonal patterns with confidence intervals that widen over time, reflecting increasing uncertainty in longer-term predictions. AutoARIMA successfully identifies the cyclical sales patterns while providing realistic uncertainty bounds for inventory planning decisions.
Cross-Validation for Model Evaluation
Time series cross-validation tests model performance by training on historical data and validating on future periods. This approach respects temporal order by using multiple validation windows that simulate how models perform as new data arrives.
StatsForecast integrates this through the cross_validation() method, which automatically handles the temporal splitting required for reliable time series validation.
# Perform time series cross-validation
cv_results = sf_auto.cross_validation(
    df=train_data,
    h=6,          # 6-month forecast horizon
    step_size=3,  # Move validation window by 3 months
    n_windows=4   # Use 4 validation windows
)
print(f"Cross-validation results shape: {cv_results.shape}")
cv_results.head()
Cross-validation results shape: (120, 5)
| unique_id | ds | cutoff | y | AutoARIMA |
|---|---|---|---|---|
| books | 2022-01-01 | 2021-12-01 | 43018.516004 | 43226.769208 |
| books | 2022-02-01 | 2021-12-01 | 48841.347642 | 42457.344848 |
| books | 2022-03-01 | 2021-12-01 | 50980.426686 | 41966.543477 |
| books | 2022-04-01 | 2021-12-01 | 54350.389275 | 41653.470486 |
| books | 2022-05-01 | 2021-12-01 | 61674.121108 | 41453.767096 |
Calculate MAPE (Mean Absolute Percentage Error) for each product category to measure forecasting accuracy:
# Calculate mean absolute percentage error (MAPE) for each category
def calculate_mape(actual, predicted):
    return np.mean(np.abs((actual - predicted) / actual)) * 100

cv_performance = cv_results.groupby("unique_id").apply(
    lambda x: calculate_mape(x["y"], x["AutoARIMA"]),
    include_groups=False
).reset_index(name="MAPE")
print("AutoARIMA Cross-Validation Performance (MAPE):")
print(cv_performance.sort_values("MAPE"))
AutoARIMA Cross-Validation Performance (MAPE):
unique_id MAPE
4 sports 9.252036
2 electronics 11.865445
3 home_garden 12.132121
1 clothing 13.172707
0 books 14.270602
The MAPE values show consistent performance across categories, with errors under 15% demonstrating reliable forecasting capability.
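Per-category averages can hide window-to-window variation; grouping the same cross-validation frame by cutoff surfaces it. The sketch below uses toy values standing in for cv_results:

```python
import pandas as pd

# MAPE per validation window: group by cutoff instead of unique_id
# (toy values stand in for the cv_results frame)
cv = pd.DataFrame({
    "unique_id": ["books"] * 4,
    "cutoff": pd.to_datetime(["2021-12-01", "2021-12-01", "2022-03-01", "2022-03-01"]),
    "y": [100.0, 200.0, 100.0, 200.0],
    "AutoARIMA": [110.0, 190.0, 120.0, 160.0],
})

per_window = (
    cv.assign(ape=lambda d: (d["y"] - d["AutoARIMA"]).abs() / d["y"] * 100)
      .groupby("cutoff")["ape"]
      .mean()
)
print(per_window)  # 7.5% for the first window, 20.0% for the second
```

A model whose error grows sharply in later windows may be drifting as the series evolves, which a single pooled MAPE would not reveal.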
Complete Retail Forecasting Workflow
Let's put it all together in a complete retail forecasting pipeline, including:
- Split data for validation
- Configure AutoARIMA with essential parameters
- Generate forecasts with confidence intervals
- Validate performance on holdout data
# Split data for validation
split_date = retail_df["ds"].quantile(0.85) # Use 85% for training
train_data = retail_df[retail_df["ds"] <= split_date]
test_data = retail_df[retail_df["ds"] > split_date]
print(f"Training data: {len(train_data)} observations")
print(f"Training period: {train_data['ds'].min()} to {train_data['ds'].max()}")
Training data: 205 observations
Training period: 2020-01-01 00:00:00 to 2023-05-01 00:00:00
Configure the AutoARIMA model:
# Configure AutoARIMA with essential parameters
sf_production = StatsForecast(
    models=[
        AutoARIMA(
            season_length=12,  # Annual seasonality
            stepwise=True,     # Efficient search
            seasonal=True,     # Enable seasonal detection
        )
    ],
    freq='MS'
)
# Fit model on training data
sf_production.fit(train_data)
Generate forecasts with confidence intervals:
# Generate forecasts with confidence intervals
forecasts = sf_production.predict(
    h=12,           # 12-month forecast horizon
    level=[80, 95]  # Confidence levels
)
print(f"Final forecasts shape: {forecasts.shape}")
Final forecasts shape: (60, 7)
Validate model performance on holdout data:
# Generate predictions for validation period
validation_horizon = len(test_data["ds"].unique())
validation_forecasts = sf_production.predict(h=validation_horizon)

# Calculate validation metrics (assumes test rows and forecast rows share the same ordering)
validation_mape = calculate_mape(
    test_data["y"].values,
    validation_forecasts["AutoARIMA"].values
)
print(f"Validation MAPE: {validation_mape:.2f}%")
Validation MAPE: 22.42%
The 22.42% MAPE indicates moderate predictive accuracy on the holdout period across product categories.
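The validation snippet above relies on the actuals and forecasts arriving in the same row order. Merging on (unique_id, ds) before scoring removes that assumption; a sketch with toy frames:

```python
import numpy as np
import pandas as pd

# Align actuals and forecasts by key, not by position
# (toy frames for illustration; note the forecasts arrive out of order)
actuals = pd.DataFrame({
    "unique_id": ["books", "books"],
    "ds": pd.to_datetime(["2023-06-01", "2023-07-01"]),
    "y": [50000.0, 52000.0],
})
preds = pd.DataFrame({
    "unique_id": ["books", "books"],
    "ds": pd.to_datetime(["2023-07-01", "2023-06-01"]),
    "AutoARIMA": [54000.0, 48000.0],
})

merged = actuals.merge(preds, on=["unique_id", "ds"])
mape = np.mean(np.abs((merged["y"] - merged["AutoARIMA"]) / merged["y"])) * 100
print(f"{mape:.2f}%")
```

The inner join also silently drops any (series, date) pairs missing from either side, which is usually the safe behavior for scoring.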
Conclusion
AutoARIMA transforms weeks of manual parameter testing into a single, fast function call, delivering reliable forecasts in minutes instead of weeks.
This time savings redirects analyst expertise toward business insights and strategic planning. Teams can focus on interpreting forecasts and making data-driven decisions rather than tuning models by hand.