Official code repository for the paper "Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain". Check out our paper for more details. Accompanying datasets can be found here.
Install the required packages.
Torch experiments:
```shell
pip install -r requirements/requirements-pytorch.txt
```

statsforecast experiments:

```shell
pip install -r requirements/requirements-stats.txt
```

Easily load and access the dataset from Hugging Face Hub:
```python
from datasets import load_dataset

ds = load_dataset(
    "Salesforce/cloudops_tsf",
    "azure_vm_traces_2017",  # "borg_cluster_data_2011", "alibaba_cluster_trace_2018"
    split=None,  # "train_test", "pretrain"
)
```

We use Hydra for config management.
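Each dataset config yields per-series records; inspect `ds.features` for the exact schema. The sketch below uses a toy record with illustrative field names (`item_id`, `target`, in the GluonTS-style layout such datasets commonly use) to show how the final horizon is held out for evaluation:

```python
# Toy record illustrating the per-series layout (field names are
# illustrative -- inspect ``ds.features`` for the real schema).
record = {"item_id": "vm-0", "target": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}

prediction_length = 2  # forecast horizon held out for evaluation

context = record["target"][:-prediction_length]  # visible history
horizon = record["target"][-prediction_length:]  # ground truth to score against
```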
Run the hyperparameter tuning script:
```shell
python -m benchmark.benchmark_exp model_name=MODEL_NAME dataset_name=DATASET
```

- where `MODEL_NAME` is one of: `TemporalFusionTransformer`, `Autoformer`, `FEDformer`, `NSTransformer`, `PatchTST`, `LinearFamily`, `DeepTime`, `TimeGrad`, or `DeepVAR`.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
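Hydra resolves `key=value` overrides like the ones above against the YAML config tree (a `++` prefix forces add-or-override). A simplified, dependency-free sketch of that behaviour, for illustration only — not Hydra's actual implementation:

```python
def apply_overrides(cfg: dict, overrides: list) -> dict:
    """Apply Hydra-style ``a.b=c`` dotted overrides to a nested dict (simplified)."""
    for item in overrides:
        path, _, value = item.partition("=")
        keys = path.lstrip("+").split(".")  # strip Hydra's ``+``/``++`` prefixes
        node = cfg
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return cfg

cfg = {"model_name": "PatchTST", "dataset_name": "azure_vm_traces_2017"}
apply_overrides(cfg, ["model_name=DeepVAR", "++data.dataset_name=borg_cluster_data_2011"])
# cfg now maps model_name to "DeepVAR" and holds a nested data.dataset_name entry
```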
After hyperparameter tuning, run the test script:
```shell
python -m benchmark.benchmark_exp model_name=MODEL_NAME dataset_name=DATASET test=true
```

- where `MODEL_NAME` is one of: `TemporalFusionTransformer`, `Autoformer`, `FEDformer`, `NSTransformer`, `PatchTST`, `LinearFamily`, `DeepTime`, `TimeGrad`, or `DeepVAR`.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- Training logs and checkpoints will be saved in `outputs/benchmark_exp`.
Run the statsforecast script:

```shell
python -m benchmark.stats_exp DATASET --models MODEL_1 MODEL_2
```

- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- `MODEL_1 MODEL_2` is a list of models you want to run, from `naive`, `auto_arima`, `auto_ets`, `auto_theta`, `multivariate_naive`, or `var`.
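For reference, the `naive` baseline simply carries the last observed value forward over the forecast horizon. A minimal sketch of that idea (not the statsforecast implementation itself):

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value across the forecast horizon."""
    if not history:
        raise ValueError("history must be non-empty")
    return [history[-1]] * horizon

naive_forecast([0.3, 0.7, 0.5], horizon=3)  # -> [0.5, 0.5, 0.5]
```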
Run the pre-training script:
```shell
python -m pretraining.pretrain_exp backbone=BACKBONE size=SIZE ++data.dataset_name=DATASET
```

- where the options for `BACKBONE` and `SIZE` can be found in `conf/backbone` and `conf/size` respectively.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- See `conf/pretrain.yaml` for more details on the options.
- Training logs and checkpoints will be saved in `outputs/pretrain_exp`.
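For orientation, a size config under `conf/size` pins the backbone capacity. The fragment below is a hypothetical illustration only — the field names are invented, so consult the actual files in `conf/size` for the real options:

```yaml
# Hypothetical conf/size/base.yaml -- field names invented for illustration
d_model: 768
num_layers: 12
num_heads: 12
```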
Run the forecast script:
```shell
python -m pretraining.forecast_exp backbone=BACKBONE forecast=FORECAST size=SIZE ++data.dataset_name=DATASET
```

- where the options for `BACKBONE`, `FORECAST`, and `SIZE` can be found in `conf/backbone`, `conf/forecast`, and `conf/size` respectively.
- `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`.
- See `conf/forecast.yaml` for more details on the options.
- Training logs and checkpoints will be saved in `outputs/forecast_exp`.
If you find the paper or the source code useful to your projects, please cite the following bibtex:
```bibtex
@article{woo2023pushing,
  title={Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain},
  author={Woo, Gerald and Liu, Chenghao and Kumar, Akshat and Sahoo, Doyen},
  journal={arXiv preprint arXiv:2310.05063},
  year={2023}
}
```