Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

Goel, Dhruv

Hi,

Currently any DDL operations (Create Indexes, Drop Indexes etc.) when run during an existing concurrent index build on the same table causes the index build to fail with “deadlock detected”. This is a pain-point specially when we want to kick-off multiple concurrent index builds on the same table; the index build will reach phase 3 (consuming resources) and then fail with deadlock errors.

 

I have a patch that might improve the build times and reduce deadlock occurrences. Is this something the community would be interested in? I might be missing some documentation changes in the patch but wanted to get some feedback on the functional aspect of the patch first.

 

Problem:

In the Concurrent Index creation implementation there are three waits that are relevant:

  1. Wait 1 at start of Phase 2: Postgres waits for all transactions that started before this transaction and conflict with “Share Lock” on this relation. This is to make sure from this point forward all HOT updates to the table will be compatible with the new index. 
  2. Wait 2 at the start of Phase 3: Postgres waits for all transactions that started before this transaction and conflict with “Share Lock” on this relation. 
  3. Wait 3 at the end of Phase 3: PG waits for all transactions that started before this transaction primarily because they should not start using the index as they might be using an older snapshot and the index does not have all the entries (missing deleted tuples) for snapshot.

 

Typically, all the three wait states can cause deadlocks. Deadlocks due to the third wait state is reproduced by transactions that are waiting for a lock to be freed from “CREATE INDEX CONCURRENTLY” will cause deadlocks (primarily DDLs). The former 2 waits are much harder to reproduce with the test case being a Insert/Update/Delete as first statement of the transaction and then another DDL which causes lock escalation.

 

Proposed Solution:

We remove the third wait state completely from the concurrent index build. When we mark the index as ready, we also mark “indcheckxmin” to true which essentially enforces Postgres to not use this index for older snapshots.

 

Tests:

Added an isolation test which breaks without the patch. Manual test with a Repeatable Read Transaction that has an older snapshot with a tuple that has been deleted since and not part of the index.

 

 

May the force be with you,

Dhruv

 


patch-cic.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

Kuntal Ghosh
Hello,

On Wed, May 15, 2019 at 1:45 PM Goel, Dhruv <[hidden email]> wrote:

 

Proposed Solution:

We remove the third wait state completely from the concurrent index build. When we mark the index as ready, we also mark “indcheckxmin” to true which essentially enforces Postgres to not use this index for older snapshots.

 

I think there is a problem in the proposed solution. When phase 3 is reached, the index is valid. But it might not contain tuples deleted just before the reference snapshot was taken. Hence, we wait for those transactions that might have older snapshot. The TransactionXmin of these transactions can be greater than the xmin of the pg_index entry for this index.
Instead of waiting in the third phase, if we just set indcheckxmin as true, the above transactions will be able to use the index which is wrong. (because they won't find the recently deleted tuples from the index that are still live according to their snapshots)

The respective code from get_relation_info:
if (index->indcheckxmin &&                                                                                                                                                                 
                 !TransactionIdPrecedes(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data), TransactionXmin))                                         
 { /* don't use this index */ }

Please let me know if I'm missing something.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com