TimeGPT accepts pandas and polars dataframes in long format. The minimum required columns are:
Required Columns
- unique_id: String or numerical value to label each series.
- ds (timestamp): String or datetime in YYYY-MM-DD or YYYY-MM-DD HH:MM:SS format.
- y (numeric): Numerical target variable to forecast.
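As a minimal sketch of the long format described above, the following builds a two-series pandas DataFrame with the three required columns and checks their types. The series names, dates, and values are invented for illustration.

```python
import pandas as pd

# Two illustrative series stacked in long format: one row per (series, timestamp).
# The series names and values are made up for this example.
df = pd.DataFrame(
    {
        "unique_id": ["series1"] * 3 + ["series2"] * 3,
        "ds": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
        "y": [10.0, 12.5, 11.0, 200.0, 198.0, 205.5],
    }
)

# Check that the minimum required columns are present and typed as expected.
required = {"unique_id", "ds", "y"}
missing = required - set(df.columns)
assert not missing, f"missing required columns: {missing}"
assert pd.api.types.is_datetime64_any_dtype(df["ds"])
assert pd.api.types.is_numeric_dtype(df["y"])
```

Stacking series vertically like this, rather than using one column per series, is what "long format" means here.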
Optional Index
If a DataFrame lacks the ds column but uses a DatetimeIndex, that is also supported.
TimeGPT also supports distributed dataframe libraries such as Dask, Spark, and Ray.
You can include additional exogenous features in the same DataFrame. See the Exogenous Variables tutorial for details.
Example DataFrame
Below is a sample of a valid input DataFrame for TimeGPT (with columns named timestamp and value instead of ds and y):
Sample Data Preview
unique_id | timestamp | value |
---|---|---|
series1 | 1949-01-01 | 112 |
series1 | 1949-02-01 | 118 |
series1 | 1949-03-01 | 132 |
series1 | 1949-04-01 | 129 |
series1 | 1949-05-01 | 121 |
- unique_id identifies the series.
- timestamp corresponds to ds.
- value corresponds to y.
Matching Columns to TimeGPT
You can choose how to align your DataFrame columns with TimeGPT’s expected structure. One option is to rename timestamp to ds and value to y, after which your DataFrame has the explicitly required columns.
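A minimal sketch of the rename step with pandas, using the five rows from the sample preview. Building the DataFrame inline is only for illustration; in practice the data would come from your own source.

```python
import pandas as pd

# First rows of the sample data shown in the preview.
df = pd.DataFrame(
    {
        "unique_id": ["series1"] * 5,
        "timestamp": ["1949-01-01", "1949-02-01", "1949-03-01", "1949-04-01", "1949-05-01"],
        "value": [112, 118, 132, 129, 121],
    }
)

# Rename the custom column names to the ones TimeGPT expects.
df = df.rename(columns={"timestamp": "ds", "value": "y"})

# Optional: parse the date strings into datetimes.
df["ds"] = pd.to_datetime(df["ds"])

print(df.head())
```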
Example Forecast
When you run the forecast method, TimeGPT returns output like the following:
unique_id | timestamp | TimeGPT |
---|---|---|
series1 | 1961-01-01 | 437.83792 |
series1 | 1961-02-01 | 426.06270 |
series1 | 1961-03-01 | 463.11655 |
series1 | 1961-04-01 | 478.24450 |
series1 | 1961-05-01 | 505.64648 |
Forecast Output Preview
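The output above keeps the timestamp column name, which is consistent with leaving the DataFrame unrenamed and telling TimeGPT which columns to use. The sketch below assumes the `nixtla` Python SDK, whose `NixtlaClient.forecast` method accepts `time_col` and `target_col` arguments. The helper name `forecast_with_custom_columns` and the default horizon of 12 are illustrative choices, not taken from this page.

```python
import pandas as pd


def forecast_with_custom_columns(client, df: pd.DataFrame, h: int = 12) -> pd.DataFrame:
    """Forecast without renaming columns (hypothetical helper).

    `client` is assumed to be a `nixtla.NixtlaClient` instance, e.g.
        from nixtla import NixtlaClient
        client = NixtlaClient(api_key="YOUR_API_KEY")
    """
    return client.forecast(
        df=df,
        h=h,                   # forecast horizon: number of future periods
        freq="MS",             # month-start frequency, matching the sample data
        time_col="timestamp",  # column that plays the role of ds
        target_col="value",    # column that plays the role of y
    )
```

Passing the client in as an argument keeps the helper easy to exercise without network access or an API key.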
TimeGPT attempts to automatically infer your data’s frequency (freq). You can override this by specifying the freq parameter (e.g., freq='MS').
Important Considerations
Warning: Data passed to TimeGPT must not contain missing values or time gaps.
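One way to check for both problems before calling TimeGPT is sketched below with plain pandas. It assumes the ds/y column names and a known frequency alias; `find_gaps` is a hypothetical helper written for this example, not part of any TimeGPT API.

```python
import pandas as pd


def find_gaps(df: pd.DataFrame, freq: str) -> pd.DataFrame:
    """Return the (unique_id, ds) pairs absent from a complete date range."""
    missing = []
    for uid, group in df.groupby("unique_id"):
        # Every timestamp the series should contain between its first and last date.
        expected = pd.date_range(group["ds"].min(), group["ds"].max(), freq=freq)
        absent = expected.difference(pd.DatetimeIndex(group["ds"]))
        missing.extend((uid, ts) for ts in absent)
    return pd.DataFrame(missing, columns=["unique_id", "ds"])


df = pd.DataFrame(
    {
        "unique_id": ["series1"] * 4,
        # 1949-03-01 is deliberately left out to create a time gap.
        "ds": pd.to_datetime(["1949-01-01", "1949-02-01", "1949-04-01", "1949-05-01"]),
        # One target value is deliberately missing.
        "y": [112.0, 118.0, None, 121.0],
    }
)

n_missing_values = int(df["y"].isna().sum())  # rows with a missing target
gaps = find_gaps(df, freq="MS")               # timestamps absent from each series
print(n_missing_values, len(gaps))
```

How to repair the data (interpolation, forward fill, dropping a series) depends on the use case and is left out of this sketch.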
Minimum Data Requirements (Azure AI)
These are the minimum data sizes required for each frequency when using Azure AI:
Frequency | Minimum Size |
---|---|
Hourly and subhourly (e.g., “H”) | 1008 |
Daily (“D”) | 300 |
Weekly (e.g., “W-MON”) | 64 |
Monthly and others | 48 |
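The table can be turned into a quick pre-flight check, sketched below. The mapping from pandas-style frequency aliases to table rows is a simplifying assumption made for this example (for instance, business-day "B" falls under "others"), and `min_size_for` and `series_too_short` are hypothetical helper names.

```python
import pandas as pd

# Aliases treated as hourly or subhourly in this sketch (an assumption).
SUBDAILY_ALIASES = {"H", "h", "T", "min", "S", "s"}


def min_size_for(freq: str) -> int:
    """Map a frequency alias to the minimum size from the table above."""
    base = freq.lstrip("0123456789")  # "15min" -> "min"
    if base in SUBDAILY_ALIASES:
        return 1008
    if base == "D":
        return 300
    if base.startswith("W"):          # "W", "W-MON", ...
        return 64
    return 48                         # monthly and others


def series_too_short(df: pd.DataFrame, freq: str) -> pd.Series:
    """Return the length of every series below the minimum for `freq`."""
    sizes = df.groupby("unique_id").size()
    return sizes[sizes < min_size_for(freq)]


df = pd.DataFrame(
    {
        "unique_id": ["series1"] * 5,
        "ds": pd.date_range("1949-01-01", periods=5, freq="MS"),
        "y": [112, 118, 132, 129, 121],
    }
)

# Five monthly observations fall short of the 48 required.
short = series_too_short(df, freq="MS")
print(short)
```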
1. Forecast horizon (h): Number of future periods you want to predict.
2. Number of validation windows (n_windows): How many times to test the model’s performance.
3. Gaps (step_size): Periodic offset between validation windows during cross-validation.