
[CI Failure]: mi325_1: Multi-Modal Models Test (Standard) #29520

@AndreasKaratzas

Description


Name of failing test

pip install git+https://github.com/TIGER-AI-Lab/Mantis.git && pip freeze | grep -E 'torch' && pytest -v -s models/multimodal -m core_model --ignore models/multimodal/generation/test_whisper.py --ignore models/multimodal/processing && cd .. && VLLM_WORKER_MULTIPROC_METHOD=spawn pytest -v -s tests/models/multimodal/generation/test_whisper.py -m core_model

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

Failing Tests Summary:

test_single_image_models in test_common.py (6 failures)

  • Tests: qwen2_5_vl (test_case60-62), qwen3_vl (test_case65-67)
  • Failure: Engine process crash during multimodal generation
  • Configuration: Single image input with various size factors and dtypes
  • Likely cause: Memory or compute instability in Qwen vision models during image encoding, possibly related to EVS (Efficient Video Sampling) or vision tower processing differences between vLLM and HF implementations.

test_multi_image_models in test_common.py (1 failure)

  • Tests: qwen3_vl (test_case66)
  • Failure: Engine core process died unexpectedly
  • Configuration: Multiple images with batched processing
  • Likely cause: As with the single-image failures, memory pressure or a processing error in the vision encoder when handling multiple images, potentially exacerbated by batching.
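
To narrow a local reproduction to the two affected model families, the failing cases listed above could be selected with a pytest keyword expression. The following is a minimal sketch, not the project's own tooling: `build_rerun_command` is a hypothetical helper, and the `test_path` default is an assumption, since this report only names `test_common.py` without its directory. It also assumes the model key appears in each parametrized test ID.

```python
import shlex

# Model keys taken from the failure summary in this report.
FAILING_MODELS = ["qwen2_5_vl", "qwen3_vl"]


def build_rerun_command(models, test_path="models/multimodal/generation/test_common.py"):
    """Build a pytest argv that selects only tests whose ID mentions one of `models`.

    `test_path` is an assumption: the report names test_common.py but not its directory.
    """
    # pytest's -k option accepts a boolean expression over substrings of test IDs.
    keyword_expr = " or ".join(models)
    return ["pytest", "-v", "-s", test_path, "-m", "core_model", "-k", keyword_expr]


if __name__ == "__main__":
    # Print a shell-quoted command that can be pasted into a terminal.
    print(shlex.join(build_rerun_command(FAILING_MODELS)))
```

The keyword match is by substring, so it may also select passing cases for the same models; that is usually acceptable for a first reproduction attempt.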

Note: The error "Engine core proc EngineCore_DP0 died unexpectedly" indicates that the vLLM engine process crashed, not that a test assertion failed. This suggests numerical instability or resource exhaustion specific to Qwen VL models on this configuration.
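
When triaging several builds, it can help to separate engine crashes from ordinary assertion failures automatically. The sketch below is illustrative only: `classify_failure` is a hypothetical helper, the crash pattern is derived from the single error string quoted in this report, and the loose match on the process name after `EngineCore_` is an assumption about how other ranks might be labeled.

```python
import re

# Pattern derived from the error quoted in this report. The process suffix
# (e.g. "DP0") is matched loosely, which is an assumption about other ranks.
ENGINE_DEATH = re.compile(r"Engine core proc (EngineCore_\w+) died unexpectedly")


def classify_failure(log_text: str) -> str:
    """Return 'engine_crash', 'assertion', or 'unknown' for a captured test log."""
    # Check for the crash first: a dead engine can also produce secondary errors.
    if ENGINE_DEATH.search(log_text):
        return "engine_crash"
    if "AssertionError" in log_text:
        return "assertion"
    return "unknown"


if __name__ == "__main__":
    sample = "RuntimeError: Engine core proc EngineCore_DP0 died unexpectedly"
    print(classify_failure(sample))
```

Checking for the crash marker before `AssertionError` is a deliberate choice here, so that a log containing both is attributed to the engine crash.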

📝 History of failing test

AMD-CI build Buildkite references:

  • 1041
  • 1077
  • 1088
  • 1109
  • 1111

CC List.

No response

Metadata

Assignees

No one assigned

    Labels

    ci-failure: Issue about an unexpected test failure in CI

    Type

    No type

    Projects

    Status

    Done

    Status

    In review

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
