Description
Name of failing test
pip install git+https://github.com/TIGER-AI-Lab/Mantis.git && pip freeze | grep -E 'torch' && pytest -v -s models/multimodal -m core_model --ignore models/multimodal/generation/test_whisper.py --ignore models/multimodal/processing && cd .. && VLLM_WORKER_MULTIPROC_METHOD=spawn pytest -v -s tests/models/multimodal/generation/test_whisper.py -m core_model
Basic information
- Flaky test
- Can reproduce locally
- Caused by external libraries (e.g. bug in transformers)
🧪 Describe the failing test
Failing Tests Summary:
test_single_image_models in test_common.py (6 failures)
- Tests: qwen2_5_vl (test_case60-62), qwen3_vl (test_case65-67)
- Failure: Engine process crash during multimodal generation
- Configuration: Single image input with various size factors and dtypes
- Likely cause: Memory or compute instability in the Qwen vision models during image encoding, possibly related to EVS (Efficient Video Sampling) or to differences in vision tower processing between the vLLM and HF implementations.
test_multi_image_models in test_common.py (1 failure)
- Tests: qwen3_vl (test_case66)
- Failure: Engine core process died unexpectedly
- Configuration: Multiple images with batched processing
- Likely cause: Similar to the single-image failures: memory pressure or processing errors when multiple images pass through the vision encoder, potentially exacerbated by batching.
Note: The error "Engine core proc EngineCore_DP0 died unexpectedly" indicates that the vLLM engine process crashed, not that a test assertion failed. This suggests numerical instability or resource exhaustion specific to the Qwen VL models on this configuration.
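To reproduce only the failing cases rather than the whole suite, the summary above can be turned into a pytest `-k` filter. The following is a minimal sketch, not part of the CI configuration. It assumes the parametrized test IDs contain the `test_caseNN` strings as written above, and that `test_common.py` sits next to `test_whisper.py` under `models/multimodal/generation/`; both are assumptions, so adjust the path and IDs to your checkout. The helper names `build_k_expression` and `build_command` are hypothetical.

```python
import shlex

# Failing parametrized cases, copied from the summary above.
FAILING = {
    "test_single_image_models": [60, 61, 62, 65, 66, 67],
    "test_multi_image_models": [66],
}


def build_k_expression(failing):
    """Build a pytest -k expression selecting only the listed cases."""
    clauses = []
    for test_name, case_ids in failing.items():
        cases = " or ".join(f"test_case{i}" for i in case_ids)
        clauses.append(f"({test_name} and ({cases}))")
    return " or ".join(clauses)


def build_command(
    failing,
    # Assumed location of test_common.py; not stated in the report.
    test_file="models/multimodal/generation/test_common.py",
):
    """Return the pytest invocation as an argument list."""
    return ["pytest", "-v", "-s", test_file, "-k", build_k_expression(failing)]


if __name__ == "__main__":
    # Print a shell-safe command line that can be pasted into a terminal.
    print(shlex.join(build_command(FAILING)))
```

Because `-k` matches by substring, an expression such as `test_case60` would also select a hypothetical `test_case600`. Check the selection with `pytest --collect-only -k "<expression>"` before running.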
📝 History of failing test
AMD-CI build Buildkite references:
- 1041
- 1077
- 1088
- 1109
- 1111
CC List.
No response