
Overview

Ray is an open-source unified compute framework that helps scale Python workloads for distributed computing. This guide demonstrates how to distribute TimeGPT forecasting jobs on top of Ray. Ray is ideal for machine learning pipelines with complex task dependencies and datasets with 10+ million observations. Its unified framework excels at orchestrating distributed ML workflows, making it perfect for integrating TimeGPT into broader AI applications.

Why Use Ray for Time Series Forecasting?

Ray offers unique advantages for ML-focused time series forecasting:
  • ML pipeline integration: Seamlessly integrate TimeGPT into complex ML workflows with Ray Tune and Ray Serve
  • Task parallelism: Handle complex task dependencies beyond data parallelism
  • Python-native: Pure Python with minimal boilerplate code
  • Flexible architecture: Scale from laptop to cluster with the same code
  • Actor model: Stateful computations for advanced forecasting scenarios
Choose Ray when you’re building ML pipelines, need complex task orchestration, or want to integrate TimeGPT with other ML frameworks such as PyTorch or TensorFlow.

What you’ll learn:
  • Install Fugue with Ray support for distributed computing
  • Initialize Ray clusters for distributed forecasting
  • Run TimeGPT forecasting and cross-validation on Ray

Prerequisites

Before proceeding, make sure you have an API key from Nixtla. When executing on a distributed Ray cluster, ensure the nixtla library is installed on all workers.
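One way to meet the worker requirement, sketched here under the assumption that your cluster nodes do not already ship the package, is Ray's runtime_env mechanism, which installs pip packages on each worker at startup:

```python
# Hedged sketch: declare the packages every Ray worker needs.
# The exact package list is an assumption about your cluster.
runtime_env = {"pip": ["nixtla", "fugue[ray]"]}

# On a real cluster you would pass this mapping to ray.init, e.g.:
#   import ray
#   ray.init(address="auto", runtime_env=runtime_env)
print(runtime_env["pip"])
```

Installing per job via runtime_env trades a little startup time for reproducibility; alternatively, bake nixtla into the worker image to avoid the per-job install.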

How to Use TimeGPT with Ray


Step 1: Install Fugue and Ray

Fugue provides an easy-to-use interface for distributed computation across frameworks like Ray. Install Fugue with Ray support:
pip install "fugue[ray]"

Step 2: Load Your Data

Load your dataset into a pandas DataFrame. This tutorial uses hourly electricity prices from various markets:
import pandas as pd

df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv',
    parse_dates=['ds'],
)
df.head()
Example pandas DataFrame:
  unique_id                  ds      y
0        BE 2016-10-22 00:00:00  70.00
1        BE 2016-10-22 01:00:00  37.10
2        BE 2016-10-22 02:00:00  37.10
3        BE 2016-10-22 03:00:00  44.75
4        BE 2016-10-22 04:00:00  37.10
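TimeGPT expects this long format: a unique_id series label, a ds timestamp, and a y target. A small, hedged sanity check (the sample values simply mirror the rows above) can catch schema problems before the data is distributed:

```python
import pandas as pd

# Rebuild the first rows of the tutorial's frame to illustrate the schema.
sample = pd.DataFrame({
    "unique_id": ["BE", "BE", "BE"],    # series identifier
    "ds": pd.to_datetime([
        "2016-10-22 00:00:00",
        "2016-10-22 01:00:00",
        "2016-10-22 02:00:00",
    ]),                                 # timestamp column
    "y": [70.00, 37.10, 37.10],         # target variable
})

# Verify the columns TimeGPT relies on are present and typed sensibly.
required = {"unique_id", "ds", "y"}
missing = required - set(sample.columns)
assert not missing, f"missing columns: {missing}"
assert pd.api.types.is_datetime64_any_dtype(sample["ds"])
```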

Step 3: Initialize Ray

Create a Ray cluster locally by initializing a head node. You can scale this to multiple machines in a real cluster environment.
import ray
from ray.cluster_utils import Cluster

ray_cluster = Cluster(
    initialize_head=True,
    head_node_args={"num_cpus": 2}
)

ray.init(address=ray_cluster.address, ignore_reinit_error=True)

# Convert your DataFrame to Ray format:
ray_df = ray.data.from_pandas(df)
ray_df

Step 4: Use TimeGPT on Ray

To use TimeGPT with Ray, provide a Ray Dataset to Nixtla’s client methods instead of a pandas DataFrame. The API remains the same as local usage. Instantiate the NixtlaClient class to interact with Nixtla’s API:
from nixtla import NixtlaClient

nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)
You can use any method from the NixtlaClient, such as forecast or cross_validation.
Forecast example:
fcst_df = nixtla_client.forecast(ray_df, h=12)
fcst_df.to_pandas().tail()
Cross-validation example (the window settings here are illustrative):
cv_df = nixtla_client.cross_validation(ray_df, h=12, n_windows=5, step_size=2)
cv_df.to_pandas().tail()
The public API supports the models timegpt-1 (the default) and timegpt-1-long-horizon. For long-horizon forecasting, see the long-horizon model tutorial.
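To make the cross_validation behaviour concrete, here is a hedged, pandas-only toy of the rolling-window scheme it is based on. The h, n_windows, and step_size values are illustrative, and this is not Nixtla's implementation:

```python
import pandas as pd

# Toy series: 48 hourly points for one market.
ds = pd.date_range("2016-10-22", periods=48, freq="h")
series = pd.DataFrame({"ds": ds, "y": range(48)})

h, n_windows, step_size = 12, 3, 12
windows = []
for i in range(n_windows):
    # The cutoff moves forward by step_size for each successive window.
    cutoff = len(series) - h - (n_windows - 1 - i) * step_size
    train = series.iloc[:cutoff]            # everything before the cutoff
    valid = series.iloc[cutoff:cutoff + h]  # the next h points are held out
    windows.append((train, valid))

print([(len(t), len(v)) for t, v in windows])
# → [(12, 12), (24, 12), (36, 12)]
```

Each window trains on everything before a moving cutoff and holds out the next h points, so forecast errors are measured on several disjoint-in-time validation slices.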

Step 5: Shutdown Ray

Always shut down Ray after you finish your tasks to free up resources:
ray.shutdown()

Working with Exogenous Variables

TimeGPT with Ray also supports exogenous variables: simply substitute Ray Datasets for pandas DataFrames, and the API remains identical. Refer to the Exogenous Variables tutorial for details.