Official code repository for the paper "Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain". Check out our paper for more details. Accompanying datasets can be found here.
Install the required packages.
Torch experiments:
```shell
pip install -r requirements/requirements-pytorch.txt
```

statsforecast experiments:

```shell
pip install -r requirements/requirements-stats.txt
```

Easily load and access the dataset from Hugging Face Hub:
```python
from datasets import load_dataset

ds = load_dataset(
    "Salesforce/cloudops_tsf",
    "azure_vm_traces_2017",  # "borg_cluster_data_2011", "alibaba_cluster_trace_2018"
    split=None,  # "train_test", "pretrain"
)
```

We use Hydra for config management.
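Each dataset config yields per-series records; inspect `ds.features` for the exact schema. The sketch below uses a toy record with illustrative field names (`item_id`, `target`, in the GluonTS-style layout such datasets commonly use) to show how the final horizon is held out for evaluation:

```python
# Toy record illustrating the per-series layout (field names are
# illustrative -- inspect ``ds.features`` for the real schema).
record = {"item_id": "vm-0", "target": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}

prediction_length = 2  # forecast horizon held out for evaluation

context = record["target"][:-prediction_length]  # visible history
horizon = record["target"][-prediction_length:]  # ground truth to score against
```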
Run the hyperparameter tuning script:
```shell
python -m benchmark.benchmark_exp model_name=MODEL_NAME dataset_name=DATASET
```

- where `MODEL_NAME` is one of: `TemporalFusionTransformer`, `Autoformer`, `FEDformer`, `NSTransformer`, `PatchTST`, `LinearFamily`, `DeepTime`, `TimeGrad`, or `DeepVAR`.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
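Hydra resolves `key=value` overrides like the ones above against the YAML config tree (a `++` prefix forces add-or-override). A simplified, dependency-free sketch of that behaviour, for illustration only — not Hydra's actual implementation:

```python
def apply_overrides(cfg: dict, overrides: list) -> dict:
    """Apply Hydra-style ``a.b=c`` dotted overrides to a nested dict (simplified)."""
    for item in overrides:
        path, _, value = item.partition("=")
        keys = path.lstrip("+").split(".")  # strip Hydra's ``+``/``++`` prefixes
        node = cfg
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return cfg

cfg = {"model_name": "PatchTST", "dataset_name": "azure_vm_traces_2017"}
apply_overrides(cfg, ["model_name=DeepVAR", "++data.dataset_name=borg_cluster_data_2011"])
# cfg now maps model_name to "DeepVAR" and holds a nested data.dataset_name entry
```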
After hyperparameter tuning, run the test script:
```shell
python -m benchmark.benchmark_exp model_name=MODEL_NAME dataset_name=DATASET test=true
```

- where `MODEL_NAME` is one of: `TemporalFusionTransformer`, `Autoformer`, `FEDformer`, `NSTransformer`, `PatchTST`, `LinearFamily`, `DeepTime`, `TimeGrad`, or `DeepVAR`.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- Training logs and checkpoints will be saved in `outputs/benchmark_exp`.
Run the statsforecast script:

```shell
python -m benchmark.stats_exp DATASET --models MODEL_1 MODEL_2
```

- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- `MODEL_1 MODEL_2` is a list of models you want to run, from `naive`, `auto_arima`, `auto_ets`, `auto_theta`, `multivariate_naive`, or `var`.
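For reference, the `naive` baseline simply carries the last observed value forward over the forecast horizon. A minimal sketch of that idea (not the statsforecast implementation itself):

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value across the forecast horizon."""
    if not history:
        raise ValueError("history must be non-empty")
    return [history[-1]] * horizon

naive_forecast([0.3, 0.7, 0.5], horizon=3)  # -> [0.5, 0.5, 0.5]
```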
Run the pre-training script:
```shell
python -m pretraining.pretrain_exp backbone=BACKBONE size=SIZE ++data.dataset_name=DATASET
```

- where the options for `BACKBONE` and `SIZE` can be found in `conf/backbone` and `conf/size` respectively.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- See `conf/pretrain.yaml` for more details on the options.
- Training logs and checkpoints will be saved in `outputs/pretrain_exp`.
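For orientation, a size config under `conf/size` pins the backbone capacity. The fragment below is a hypothetical illustration only — the field names are invented, so consult the actual files in `conf/size` for the real options:

```yaml
# Hypothetical conf/size/base.yaml -- field names invented for illustration
d_model: 768
num_layers: 12
num_heads: 12
```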
Run the forecast script:
```shell
python -m pretraining.forecast_exp backbone=BACKBONE forecast=FORECAST size=SIZE ++data.dataset_name=DATASET
```

- where the options for `BACKBONE`, `FORECAST`, and `SIZE` can be found in `conf/backbone`, `conf/forecast`, and `conf/size` respectively.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- See `conf/forecast.yaml` for more details on the options.
- Training logs and checkpoints will be saved in `outputs/forecast_exp`.
If you find the paper or the source code useful to your projects, please cite the following bibtex:
```bibtex
@article{woo2023pushing,
  title={Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain},
  author={Woo, Gerald and Liu, Chenghao and Kumar, Akshat and Sahoo, Doyen},
  journal={arXiv preprint arXiv:2310.05063},
  year={2023}
}
```