Dask
Run TimeGPT in a distributed manner using Dask for scalable forecasting.
Dask is an open-source parallel computing library for Python. This guide explains how to use TimeGPT from Nixtla with Dask for distributed forecasting tasks.
Highlights
• Simplify distributed computing with Fugue.
• Run TimeGPT at scale on a Dask cluster.
• Seamlessly convert pandas DataFrames to Dask.
Outline
Step 1: Installation
Install Fugue and Dask
Fugue provides an easy-to-use interface for distributed computing over frameworks like Dask.
You can install `fugue` together with its Dask extra using pip: `pip install "fugue[dask]"`.
If running on a distributed Dask cluster, ensure the `nixtla` library is installed on all worker nodes.
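As a quick sanity check of the installation step, the sketch below reports which of the packages used in this guide cannot be found on the current machine. The helper name `missing_packages` and the `REQUIRED` list are hypothetical, introduced only for illustration.

```python
from importlib.util import find_spec

# Packages this guide relies on; adjust the list to your own environment.
REQUIRED = ["fugue", "dask", "nixtla"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```

On a cluster, the same function could be executed on every worker, for example through `Client.run` from `dask.distributed`, to confirm that `nixtla` is present on each node.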
Step 2: Load Your Data
You can start by loading data into a pandas DataFrame. In this example, we use hourly electricity prices from multiple markets:
| | unique_id | ds | y |
|---|---|---|---|
| 0 | BE | 2016-10-22 00:00:00 | 70.00 |
| 1 | BE | 2016-10-22 01:00:00 | 37.10 |
| 2 | BE | 2016-10-22 02:00:00 | 37.10 |
| 3 | BE | 2016-10-22 03:00:00 | 44.75 |
| 4 | BE | 2016-10-22 04:00:00 | 37.10 |
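To make the expected input layout concrete, here is a minimal sketch that rebuilds the five rows shown above as a pandas DataFrame. It is a hand-typed stand-in for the full dataset, whose location is not given in this section; in practice you would load your own file, for example with `pd.read_csv`.

```python
import pandas as pd

# First rows of the example above, typed in by hand for illustration only.
df = pd.DataFrame(
    {
        "unique_id": ["BE"] * 5,  # series identifier (one market)
        "ds": pd.to_datetime(
            [
                "2016-10-22 00:00:00",
                "2016-10-22 01:00:00",
                "2016-10-22 02:00:00",
                "2016-10-22 03:00:00",
                "2016-10-22 04:00:00",
            ]
        ),  # hourly timestamps
        "y": [70.00, 37.10, 37.10, 44.75, 37.10],  # target values
    }
)

print(df.dtypes)
```

The data is in long format: one row per series and timestamp, with the series identifier in `unique_id`, the timestamp in `ds`, and the target in `y`.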
Step 3: Import Dask
Convert the pandas DataFrame into a Dask DataFrame for parallel processing.
When converting to a Dask DataFrame, you can specify the number of partitions based on your data size or system resources.
Step 4: Use TimeGPT on Dask
To use TimeGPT with Dask, provide a Dask DataFrame to Nixtla’s client methods instead of a pandas DataFrame.
Important Concept: NixtlaClient
Instantiate the `NixtlaClient` class to interact with Nixtla’s API.
Using an Azure AI endpoint
To use Azure AI, set the `base_url` parameter to your Azure AI endpoint when instantiating the client.
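A possible way to assemble the client arguments is sketched below. The helper `client_kwargs` is hypothetical, and the placeholder strings must be replaced with your own key and endpoint. The sketch assumes the key may also be supplied through the `NIXTLA_API_KEY` environment variable.

```python
import os


def client_kwargs(api_key=None, base_url=None):
    """Collect keyword arguments for NixtlaClient.

    Falls back to the NIXTLA_API_KEY environment variable when no key is given.
    """
    key = api_key or os.environ.get("NIXTLA_API_KEY")
    if not key:
        raise ValueError("Provide api_key or set NIXTLA_API_KEY.")
    kwargs = {"api_key": key}
    if base_url is not None:
        # Only needed for a non-default endpoint such as Azure AI.
        kwargs["base_url"] = base_url
    return kwargs


# Usage (needs the nixtla package and a valid key, so it is left commented out):
#   from nixtla import NixtlaClient
#   nixtla_client = NixtlaClient(**client_kwargs(api_key="<your API key>"))
#   azure_client = NixtlaClient(
#       **client_kwargs(api_key="<your API key>", base_url="<your Azure AI endpoint>")
#   )
```

Keeping the key out of the source file and reading it from the environment avoids committing credentials by accident.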
You can use any method from the `NixtlaClient`, such as `forecast` or `cross_validation`.
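The two calls mentioned above might look like the following sketch. The wrapper names `run_forecast` and `run_cross_validation` are hypothetical, and the default values for the horizon, `n_windows`, and `step_size` are illustrative assumptions, not settings prescribed by this guide.

```python
def run_forecast(client, dask_df, horizon=12):
    """Forecast `horizon` steps ahead for every series in the Dask DataFrame."""
    return client.forecast(df=dask_df, h=horizon)


def run_cross_validation(client, dask_df, horizon=12, n_windows=5, step_size=2):
    """Evaluate forecasts over several rolling windows of the history."""
    return client.cross_validation(
        df=dask_df,
        h=horizon,
        n_windows=n_windows,
        step_size=step_size,
    )


# Usage (needs a NixtlaClient instance and a Dask DataFrame from the earlier steps):
#   fcst_df = run_forecast(nixtla_client, dask_df)
#   cv_df = run_cross_validation(nixtla_client, dask_df)
```

When the input is a Dask DataFrame, the result is expected to be distributed as well, so something like `fcst_df.compute().head()` would be needed to bring the first rows back into pandas for inspection.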
| | unique_id | ds | TimeGPT |
|---|---|---|---|
| 0 | BE | 2016-12-31 00:00:00 | 45.190453 |
| 1 | BE | 2016-12-31 01:00:00 | 43.244446 |
| 2 | BE | 2016-12-31 02:00:00 | 41.958389 |
| 3 | BE | 2016-12-31 03:00:00 | 39.796486 |
| 4 | BE | 2016-12-31 04:00:00 | 39.204533 |
| | unique_id | ds | cutoff | TimeGPT |
|---|---|---|---|---|
| 0 | BE | 2016-12-30 04:00:00 | 2016-12-30 03:00:00 | 39.375439 |
| 1 | BE | 2016-12-30 05:00:00 | 2016-12-30 03:00:00 | 40.039215 |
| 2 | BE | 2016-12-30 06:00:00 | 2016-12-30 03:00:00 | 43.455849 |
| 3 | BE | 2016-12-30 07:00:00 | 2016-12-30 03:00:00 | 47.716408 |
| 4 | BE | 2016-12-30 08:00:00 | 2016-12-30 03:00:00 | 50.316650 |
Azure AI Models
When using an Azure AI endpoint, set the `model` parameter to `"azureai"` in the method call.
For the public API, two models are available:
• `timegpt-1` (default)
• `timegpt-1-long-horizon`
See the Long Horizon Forecasting Tutorial for details on `timegpt-1-long-horizon`.
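Selecting a model could be sketched as follows. The wrapper `forecast_with_model` and its validation step are hypothetical conveniences; the accepted names are simply the ones listed in this section, and the service may offer others.

```python
# Model names taken from the text above.
PUBLIC_MODELS = ("timegpt-1", "timegpt-1-long-horizon")
AZURE_MODEL = "azureai"


def forecast_with_model(client, dask_df, horizon, model="timegpt-1"):
    """Forecast with an explicitly chosen model name."""
    allowed = PUBLIC_MODELS + (AZURE_MODEL,)
    if model not in allowed:
        raise ValueError(f"Unknown model {model!r}; expected one of {allowed}.")
    return client.forecast(df=dask_df, h=horizon, model=model)


# Usage (needs a NixtlaClient instance and a Dask DataFrame from the earlier steps):
#   fcst_df = forecast_with_model(
#       nixtla_client, dask_df, horizon=36, model="timegpt-1-long-horizon"
#   )
```

Checking the name locally gives an immediate error for a typo instead of a failed remote request.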
TimeGPT with Dask also supports exogenous variables. Refer to the Exogenous Variables Tutorial for details. Substitute pandas DataFrames with Dask DataFrames as needed.