Reproduce two deadlocks during drop cascade #8697
Draft
This PR reproduces two deadlock situations, both caused by a drop table/cagg cascade running while one of the relation's jobs is running.
First situation: drop table while one of its cagg policies was running.
The refresh job (P1) started running and acquired RowShareLock on the refresh job's job_id (advisory lock). Drop table cascade (P2) acquired AccessExclusiveLock on the table, then proceeded to drop the dependent jobs, including the refresh job. Before deleting the job, P2 tried to acquire AccessExclusiveLock on the refresh job's job_id (advisory lock), but was blocked by the RowShareLock that P1 held on it. P1 then continued and tried to acquire AccessShareLock on the table, but was blocked by the AccessExclusiveLock that P2 held. Each process was now waiting on a lock held by the other.
Second situation (same as the one reported in #8636): drop table/cagg with at least 2 jobs to be dropped, while one of its policies is running and the running one is not the first one dropped.
The drop process (P2) started its cascade drop and, after dropping the first job, was holding ShareRowExclusiveLock on bgw_job_stat. The refresh job (P1), which had not been dropped yet, acquired RowShareLock on the refresh job's job_id (advisory lock), then asked for AccessShareLock on bgw_job_stat, but was blocked by the ShareRowExclusiveLock that P2 held. P2 continued on to drop the refresh job and tried to acquire AccessExclusiveLock on the refresh job's job_id (advisory lock), but was blocked by the RowShareLock that P1 held. Again, each process was waiting on a lock held by the other.
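Both situations reduce to the same two-process cycle in a wait-for graph. The following is a minimal Python sketch of that reasoning, not code from this PR or from TimescaleDB: the `find_cycle` helper and the dictionary encoding are hypothetical, and the edges are transcribed directly from the descriptions above rather than derived from a lock-conflict table.

```python
from typing import Dict, List, Optional

# Wait-for graph: each blocked process maps to the process that holds
# the lock it is waiting for. (Hypothetical encoding for illustration.)
WaitFor = Dict[str, str]


def find_cycle(wait_for: WaitFor) -> Optional[List[str]]:
    """Return the processes forming a wait-for cycle, or None if there is none."""
    for start in wait_for:
        path = [start]
        node = start
        # Follow "waits for" edges until we run out or revisit a process.
        while node in wait_for:
            node = wait_for[node]
            if node in path:
                return path[path.index(node):]
            path.append(node)
    return None


# First situation: drop table while a cagg refresh policy is running.
situation_1: WaitFor = {
    # P2 wants AccessExclusiveLock on job_id; P1 holds RowShareLock on it.
    "P2": "P1",
    # P1 wants AccessShareLock on the table; P2 holds AccessExclusiveLock.
    "P1": "P2",
}

# Second situation: cascade drop with at least two jobs.
situation_2: WaitFor = {
    # P1 wants a lock on bgw_job_stat; P2 holds ShareRowExclusiveLock on it.
    "P1": "P2",
    # P2 wants AccessExclusiveLock on job_id; P1 holds RowShareLock on it.
    "P2": "P1",
}

# Control case: only one process is blocked, so there is no cycle.
no_deadlock: WaitFor = {"P1": "P2"}

if __name__ == "__main__":
    print(find_cycle(situation_1))
    print(find_cycle(situation_2))
    print(find_cycle(no_deadlock))
```

In both situations the cycle involves the job_id advisory lock on one side and a relation lock (the table or bgw_job_stat) on the other, taken in opposite orders by P1 and P2.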