[BUG] DeepSpeed errors when running BLOOM #67

@jataylo

Describe the bug
I am unable to run the BLOOM model with DeepSpeed when using top-of-tree (TOT) upstream PyTorch.

The first slew of errors observed is resolved by @rraminen's workaround in the transformer_inference branch.

This occurs with both ROCm 5.4.2 and 5.5.

Log snippet:
deepspeed_error.txt

/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/include/ATen/ATen.h:4:2: error: #error C++17 or later compatible compiler is required to use ATen.
    4 | #error C++17 or later compatible compiler is required to use ATen.
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/include/ATen/core/ivalue_inl.h: In lambda function:
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:30: error: ‘is_convertible_v’ is not a member of ‘std’; did you mean ‘is_convertible’?
1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
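
Both errors point at the extension being compiled in a pre-C++17 mode: ATen's header check rejects it, and `std::is_convertible_v` only exists from C++17 onward. As a rough sketch of one possible fix, the helper below rewrites a list of compiler flags so the standard is at least C++17. The function name `ensure_cxx17` is hypothetical, and where the op builder actually assembles its flags is an assumption not confirmed by this log.

```python
import re

# Matches flags such as -std=c++14 or -std=gnu++17.
_STD_FLAG = re.compile(r"^-std=(c|gnu)\+\+(\w+)$")

# Standard levels older than C++17 (including GCC's draft spellings).
_PRE_CXX17 = {"98", "03", "0x", "11", "1y", "14"}


def ensure_cxx17(flags):
    """Return a copy of `flags` whose C++ standard is at least C++17.

    Hypothetical helper: it only rewrites a flag list and knows nothing
    about how DeepSpeed or PyTorch build their extensions.
    """
    result = []
    seen_std = False
    for flag in flags:
        match = _STD_FLAG.match(flag)
        if match is None:
            result.append(flag)
            continue
        seen_std = True
        dialect, level = match.groups()
        if level in _PRE_CXX17:
            # Raise the standard but keep the c/gnu dialect choice.
            result.append(f"-std={dialect}++17")
        else:
            result.append(flag)
    if not seen_std:
        # No explicit standard given, so request C++17 explicitly.
        result.append("-std=c++17")
    return result


if __name__ == "__main__":
    print(ensure_cxx17(["-O3", "-std=c++14", "-g"]))
```

A helper like this would be applied to whatever argument list the build passes to the host compiler; newer standards such as `-std=c++20` are left untouched.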

To Reproduce
Docker image: rocm/pytorch-private:BLOOM_DeepSpeed_tranformer_inference_enabled_tot_issue

Steps to reproduce the behavior:

  1. Build upstream PyTorch and the transformer_inference ROCm DeepSpeed branch
  2. git clone https://github.com/huggingface/transformers-bloom-inference
  3. deepspeed --num_gpus 1 transformers-bloom-inference/bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloom-560m

ds_report output
DeepSpeed general environment info:
torch install path ............... ['/opt/conda/lib/python3.8/site-packages/torch']
torch version .................... 2.1.0a0+gitfde024b
deepspeed install path ........... ['/opt/conda/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.9.3+44c0bbfe, 44c0bbf, transformer_inference
torch cuda version ............... None
torch hip version ................ 5.5.30201-c1741e9b
nvcc version ..................... None
deepspeed wheel compiled w. ...... torch 2.0, hip 5.5
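
The report above lists torch 2.1.0a0 as installed while the DeepSpeed wheel was compiled against torch 2.0. Whether that difference is related to the build errors is not established here, but it can be checked mechanically. The sketch below parses `ds_report`-style text and compares the major.minor versions; the function names are hypothetical and the parsing assumes the dotted-leader layout shown above.

```python
import re

REPORT = """\
torch version .................... 2.1.0a0+gitfde024b
deepspeed wheel compiled w. ...... torch 2.0, hip 5.5
"""


def parse_report(text):
    """Split 'key ..... value' lines into a dict (layout is an assumption)."""
    fields = {}
    for line in text.splitlines():
        parts = re.split(r"\s\.{3,}\s", line, maxsplit=1)
        if len(parts) == 2:
            fields[parts[0].strip()] = parts[1].strip()
    return fields


def major_minor(version_text):
    """Return the first 'X.Y' found in the text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"no version found in {version_text!r}")
    return int(match.group(1)), int(match.group(2))


def torch_mismatch(text):
    """True if the installed torch differs from the one the wheel was built with."""
    fields = parse_report(text)
    installed = major_minor(fields["torch version"])
    built_with = major_minor(fields["deepspeed wheel compiled w."])
    return installed != built_with


if __name__ == "__main__":
    print("torch version mismatch:", torch_mismatch(REPORT))
```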

Labels: bug