@pnthao commented Sep 26, 2025

Two deadlock situations, both triggered by a cascading drop of a table or cagg while one of the relation's jobs is running.

First situation: drop table while one of its cagg policies is running.
The refresh job (P1) started and acquired RowShareLock on its job_id (advisory lock). Drop table cascade (P2) acquired AccessExclusiveLock on the table, then proceeded to drop the dependent jobs, including the refresh job. Before deleting that job, P2 tried to acquire AccessExclusiveLock on the refresh job's job_id (advisory lock), but was blocked by the RowShareLock that P1 held. P1 then tried to acquire AccessShareLock on the table, but was blocked by the AccessExclusiveLock that P2 held. Each process was now waiting on a lock held by the other.
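The first situation can be read as a two-node cycle in a wait-for graph. The sketch below is only an illustration of that reading: the edges are copied from the description above, the names `WAITS_FOR` and `find_cycle` are hypothetical, and none of this is TimescaleDB or PostgreSQL code.

```python
from typing import Dict, List, Optional

# Wait-for edges for the first situation, taken from the description:
# each key is blocked on a lock held by the process it maps to.
WAITS_FOR: Dict[str, str] = {
    # P2 wants AccessExclusiveLock on the job_id advisory lock; P1 holds RowShareLock.
    "P2": "P1",
    # P1 wants AccessShareLock on the table; P2 holds AccessExclusiveLock.
    "P1": "P2",
}


def find_cycle(waits_for: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from `start`; return the cycle reached, or None."""
    path: List[str] = []
    node: Optional[str] = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        # The chain ended at a process that is not waiting on anyone.
        return None
    # `node` was seen before: the cycle runs from its first occurrence back to it.
    return path[path.index(node):] + [node]


cycle = find_cycle(WAITS_FOR, "P1")
print(" -> ".join(cycle) if cycle else "no deadlock")
```

Removing either edge (for example, if P1 never needed the table lock while holding the advisory lock) makes `find_cycle` return `None`.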

Second situation (same as the one reported in #8636): drop table/cagg with at least two jobs to be dropped, while one of its policies is running and the running one is not the first to be dropped.
The drop process (P2) started its cascade drop and, after dropping the first job, was holding ShareRowExclusiveLock on bgw_job_stat. The refresh job (P1), which had not been dropped yet, acquired RowShareLock on its job_id (advisory lock), then asked for AccessShareLock on bgw_job_stat, but was blocked by the ShareRowExclusiveLock that P2 held. P2 continued on to drop the refresh job and tried to acquire AccessExclusiveLock on the refresh job's job_id (advisory lock), but was blocked by the RowShareLock that P1 held.
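Another way to look at the second situation is as a lock-order inversion: P2 takes bgw_job_stat first and the job_id advisory lock second, while P1 takes them in the opposite order. The sketch below only checks for that inversion. The resource labels come from the description above; `ACQUISITION_ORDER` and `order_inversions` are hypothetical names, and the check says nothing about how the PR resolves the deadlock.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Order in which each process requests the two contended resources in the
# second situation, as described above. The strings are labels only.
ACQUISITION_ORDER: Dict[str, List[str]] = {
    "P2": ["bgw_job_stat", "job_id advisory lock"],
    "P1": ["job_id advisory lock", "bgw_job_stat"],
}


def order_inversions(
    orders: Dict[str, List[str]],
) -> List[Tuple[str, str, str, str]]:
    """Return (proc_a, proc_b, x, y) where proc_a requests x before y
    and proc_b requests y before x. Assumes no resource repeats in a list."""
    found: List[Tuple[str, str, str, str]] = []
    for (a, seq_a), (b, seq_b) in combinations(orders.items(), 2):
        # Resources both processes touch, kept in proc_a's request order.
        shared = [r for r in seq_a if r in seq_b]
        for x, y in combinations(shared, 2):
            # x precedes y for proc_a by construction; check proc_b.
            if seq_b.index(x) > seq_b.index(y):
                found.append((a, b, x, y))
    return found


for a, b, x, y in order_inversions(ACQUISITION_ORDER):
    print(f"{a} takes {x} before {y}; {b} takes them in the opposite order")
```

An inversion is only a precondition: it turns into a deadlock when the requested lock modes conflict with the ones already held, which is what the description reports for both resources here.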

@github-actions

This pull request has been automatically marked as stale due to lack of activity. This pull request will be closed in 30 days.
