Must the sp_size be equal to the total_gpus in UlyssesSPAttentionHF? #7671
NiuMa-1234 asked this question in Q&A · Unanswered
Replies: 0 comments
I found that `sequence_parallel_size` in the provided Ulysses example (`test_ulysses_sp_hf.py`) is equal to `world_size` (the total number of GPUs). If `sequence_parallel_size` is less than `world_size`, training fails with an error during the backward pass.

The error is likely caused by the following: when executing the backward of `torch._AllGather`, `grad_output` only has `sp_world_size` items, but `torch.distributed.get_rank()` (the global rank) is used to have each GPU pick its own gradient from `grad_output`, which raises an index mismatch error. So is it required that `sequence_parallel_size` be equal to `world_size`?
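To make the suspected mismatch concrete, here is a minimal, non-distributed sketch (the function names and the grouping scheme are illustrative assumptions, not DeepSpeed's actual internals): indexing a list of `sp_world_size` gradient shards with the global rank fails once `world_size > sp_world_size`, whereas indexing with the rank within the sequence-parallel group works.

```python
# Sketch of the suspected bug: a per-rank gradient-shard list of length
# sp_world_size is indexed with the GLOBAL rank instead of the rank
# within the sequence-parallel (SP) group.
world_size = 8      # total GPUs
sp_world_size = 4   # sequence_parallel_size < world_size

def pick_grad_shard_buggy(global_rank, grad_output):
    # grad_output holds one shard per rank in the SP group only.
    assert len(grad_output) == sp_world_size
    # Buggy: the global rank can be >= sp_world_size -> IndexError.
    return grad_output[global_rank]

def pick_grad_shard_fixed(global_rank, grad_output):
    # Fixed: use the rank *within* the SP group. In real code this would
    # be torch.distributed.get_rank(group=sp_group); here we assume
    # contiguous SP groups, so the group-local rank is a simple modulo.
    group_rank = global_rank % sp_world_size
    return grad_output[group_rank]

grad_output = [f"shard{i}" for i in range(sp_world_size)]

# Global rank 6 sits in the second SP group; its group-local rank is 2.
print(pick_grad_shard_fixed(6, grad_output))

try:
    pick_grad_shard_buggy(6, grad_output)
except IndexError as e:
    print("buggy version fails:", e)
```

When `sp_world_size == world_size`, the two functions coincide, which would explain why the shipped example (where the two are equal) runs fine.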