[DOC] Document concurrent index builds waiting on each other

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[DOC] Document concurrent index builds waiting on each other

James Coleman
In my experience it's not immediately obvious (even after reading the
documentation) the implications of how concurrent index builds manage
transactions with respect to multiple concurrent index builds in
flight at the same time.

Specifically, as I understand multiple concurrent index builds running
at the same time will all return at the same time as the longest
running one.

I've attached a small patch to call this caveat out specifically in
the documentation. I think the description in the patch is accurate,
but please let me know if there's some intricacies around how the
various stages might change the results.

James Coleman

create-index-concurrently-docs-v1.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Bruce Momjian
On Wed, Sep 18, 2019 at 01:51:00PM -0400, James Coleman wrote:

> In my experience it's not immediately obvious (even after reading the
> documentation) the implications of how concurrent index builds manage
> transactions with respect to multiple concurrent index builds in
> flight at the same time.
>
> Specifically, as I understand multiple concurrent index builds running
> at the same time will all return at the same time as the longest
> running one.
>
> I've attached a small patch to call this caveat out specifically in
> the documentation. I think the description in the patch is accurate,
> but please let me know if there's some intricacies around how the
> various stages might change the results.

The CREATE INDEX docs already say:

    In a concurrent index build, the index is actually entered into
    the system catalogs in one transaction, then two table scans occur in
    two more transactions.  Before each table scan, the index build must
    wait for existing transactions that have modified the table to terminate.
    After the second scan, the index build must wait for any transactions
--> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
--> scan to terminate.  Then finally the index can be marked ready for use,

So, having multiple concurrent index scans is just a special case of
having to "wait for any transactions that have a snapshot", no?  I am
not sure adding a doc mention of other index builds really is helpful.

---------------------------------------------------------------------------

> commit 9e28e704820eebb81ff94c1c3cbfb7db087b2c45
> Author: James Coleman <[hidden email]>
> Date:   Wed Sep 18 13:36:22 2019 -0400
>
>     Document concurrent indexes waiting on each other
>    
>     It's not immediately obvious that because concurrent index building
>     waits on previously running transactions to complete, running multiple
>     concurrent index builds at the same time will result in each of them
>     taking as long to return as the longest takes, so, document this caveat.
>
> diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
> index 629a31ef79..35f15abb0e 100644
> --- a/doc/src/sgml/ref/create_index.sgml
> +++ b/doc/src/sgml/ref/create_index.sgml
> @@ -616,6 +616,13 @@ Indexes:
>      cannot.
>     </para>
>  
> +   <para>
> +    Because the second table scan must wait for any transactions having a
> +    snapshot preceding the start of that scan to finish before completing the
> +    scan, concurrent index builds on multiple tables at the same time will
> +    not return on any one table until all have completed.
> +   </para>
> +
>     <para>
>      Concurrent builds for indexes on partitioned tables are currently not
>      supported.  However, you may concurrently build the index on each


--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Alvaro Herrera-9
On 2019-Sep-28, Bruce Momjian wrote:

> The CREATE INDEX docs already say:
>
>     In a concurrent index build, the index is actually entered into
>     the system catalogs in one transaction, then two table scans occur in
>     two more transactions.  Before each table scan, the index build must
>     wait for existing transactions that have modified the table to terminate.
>     After the second scan, the index build must wait for any transactions
> --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> --> scan to terminate.  Then finally the index can be marked ready for use,
>
> So, having multiple concurrent index scans is just a special case of
> having to "wait for any transactions that have a snapshot", no?  I am
> not sure adding a doc mention of other index builds really is helpful.

I always thought that create index concurrently was prevented from
running concurrently in a table by the ShareUpdateExclusive lock that's
held during the operation.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <[hidden email]> wrote:

>
> On 2019-Sep-28, Bruce Momjian wrote:
>
> > The CREATE INDEX docs already say:
> >
> >     In a concurrent index build, the index is actually entered into
> >     the system catalogs in one transaction, then two table scans occur in
> >     two more transactions.  Before each table scan, the index build must
> >     wait for existing transactions that have modified the table to terminate.
> >     After the second scan, the index build must wait for any transactions
> > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > --> scan to terminate.  Then finally the index can be marked ready for use,
> >
> > So, having multiple concurrent index scans is just a special case of
> > having to "wait for any transactions that have a snapshot", no?  I am
> > not sure adding a doc mention of other index builds really is helpful.
>
> I always thought that create index concurrently was prevented from
> running concurrently in a table by the ShareUpdateExclusive lock that's
> held during the operation.

You mean multiple CICs on a single table at the same time? Yes, that
(unfortunately) isn't possible, but I'm concerned in the patch with
the fact that CIC on table X blocks CIC on table Y.

James


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Bruce Momjian
On Sat, Sep 28, 2019 at 09:54:48PM -0400, James Coleman wrote:

> On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <[hidden email]> wrote:
> >
> > On 2019-Sep-28, Bruce Momjian wrote:
> >
> > > The CREATE INDEX docs already say:
> > >
> > >     In a concurrent index build, the index is actually entered into
> > >     the system catalogs in one transaction, then two table scans occur in
> > >     two more transactions.  Before each table scan, the index build must
> > >     wait for existing transactions that have modified the table to terminate.
> > >     After the second scan, the index build must wait for any transactions
> > > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > > --> scan to terminate.  Then finally the index can be marked ready for use,
> > >
> > > So, having multiple concurrent index scans is just a special case of
> > > having to "wait for any transactions that have a snapshot", no?  I am
> > > not sure adding a doc mention of other index builds really is helpful.
> >
> > I always thought that create index concurrently was prevented from
> > running concurrently in a table by the ShareUpdateExclusive lock that's
> > held during the operation.
>
> You mean multiple CICs on a single table at the same time? Yes, that
> (unfortunately) isn't possible, but I'm concerned in the patch with
> the fact that CIC on table X blocks CIC on table Y.

I think any open transaction will block CIC, which is my point.

--
  Bruce Momjian  <[hidden email]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Sat, Sep 28, 2019 at 9:56 PM Bruce Momjian <[hidden email]> wrote:

>
> On Sat, Sep 28, 2019 at 09:54:48PM -0400, James Coleman wrote:
> > On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <[hidden email]> wrote:
> > >
> > > On 2019-Sep-28, Bruce Momjian wrote:
> > >
> > > > The CREATE INDEX docs already say:
> > > >
> > > >     In a concurrent index build, the index is actually entered into
> > > >     the system catalogs in one transaction, then two table scans occur in
> > > >     two more transactions.  Before each table scan, the index build must
> > > >     wait for existing transactions that have modified the table to terminate.
> > > >     After the second scan, the index build must wait for any transactions
> > > > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > > > --> scan to terminate.  Then finally the index can be marked ready for use,
> > > >
> > > > So, having multiple concurrent index scans is just a special case of
> > > > having to "wait for any transactions that have a snapshot", no?  I am
> > > > not sure adding a doc mention of other index builds really is helpful.

While that may be technically true, as a co-worker of mine likes to
point out, being "technically correct" is the worst kind of correct.

Here's what I mean:

First, I believe the docs should aim to be as useful as possible to
even those with more entry-level understanding of PostgreSQL. The fact
the paragraph you cite actually links to the entire chapter on
concurrency control in Postgres demonstrates that there's some
not-so-immediate stuff here to consider. For one: is it obvious to all
users that the transaction held by CIC (or even that all transactions)
has an open snapshot?

Second, this is a difference from a regular CREATE INDEX, and we
already call out as caveats differences between CREATE INDEX
CONCURRENTLY and regular CREATE INDEX as I point out below re:
Alvaro's comment.

Third, related to the above point, many DDL commands only block DDL
against the table being operated on. The fact that CIC here is
different is, in my opinion, a fairly surprising break from that
pattern, and as such likely to catch users off guard. I can attest
that this surprised at least one entire database team a while back :)
including many people who've been operating Postgres at a large scale
for a long time.

I believe caveats like this are worth calling out rather than
expecting users to have to understand the implementation details an
work out the implications on their own.

> > > I always thought that create index concurrently was prevented from
> > > running concurrently in a table by the ShareUpdateExclusive lock that's
> > > held during the operation.
> >
> > You mean multiple CICs on a single table at the same time? Yes, that
> > (unfortunately) isn't possible, but I'm concerned in the patch with
> > the fact that CIC on table X blocks CIC on table Y.
>
> I think any open transaction will block CIC, which is my point.

I read Alvaro as referring to the fact that the docs already call out
the following:

> Regular index builds permit other regular index builds on the same table to occur simultaneously, but only one concurrent index build can occur on a table at a time.

James


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Alvaro Herrera-9
On 2019-Sep-28, James Coleman wrote:

> I believe caveats like this are worth calling out rather than
> expecting users to have to understand the implementation details an
> work out the implications on their own.

I agree.

> I read Alvaro as referring to the fact that the docs already call out
> the following:
>
> > Regular index builds permit other regular index builds on the same
> > table to occur simultaneously, but only one concurrent index build
> > can occur on a table at a time.

Yeah, that's what I was understanding.

BTW I think there's an approach that could alleviate part of this
problem, at least some of the time: whenever CIC runs for an index
that's not on expression and not partial, we could set the
PROC_IN_VACUUM flag.  That would cause it to get ignored by other
processes for snapshot purposes (including CIC itself), as well as by
vacuum.  I need to take some time to research the safety of this, but
intuitively it seems safe.

Even further, I think we could also do it for regular CREATE INDEX
(under the same conditions) provided that it's not run in a transaction
block.  But that requires even more research/proof.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Michael Paquier-2
In reply to this post by Alvaro Herrera-9
On Sat, Sep 28, 2019 at 10:22:28PM -0300, Alvaro Herrera wrote:
> I always thought that create index concurrently was prevented from
> running concurrently in a table by the ShareUpdateExclusive lock that's
> held during the operation.

REINDEX CONCURRENTLY and CIC can deadlock while waiting for each other
to finish after their validation phase, see:
https://www.postgresql.org/message-id/20190507030756.GD1499@...
https://www.postgresql.org/message-id/20190507032543.GH1499@...
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
I went ahead and registered this in the current only CF as
https://commitfest.postgresql.org/27/2454/

Alvaro: Would you like to be added as a reviewer?

James


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
In reply to this post by Michael Paquier-2
On Sun, Sep 29, 2019 at 9:24 PM Michael Paquier <[hidden email]> wrote:

>
> On Sat, Sep 28, 2019 at 10:22:28PM -0300, Alvaro Herrera wrote:
> > I always thought that create index concurrently was prevented from
> > running concurrently in a table by the ShareUpdateExclusive lock that's
> > held during the operation.
>
> REINDEX CONCURRENTLY and CIC can deadlock while waiting for each other
> to finish after their validation phase, see:
> https://www.postgresql.org/message-id/20190507030756.GD1499@...
> https://www.postgresql.org/message-id/20190507032543.GH1499@...

Michael,

Thanks for the cross-link. Do you think this would be valuable to
document at the same time? Or did you just want to ensure we were also
aware of this particular downfall? If the latter, I appreciate it,
it's helpful info. If the latter, let me know, and I'll try to update
the patch.

Thanks,
James


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Andres Freund
In reply to this post by James Coleman
Hi,

On 2019-09-18 13:51:00 -0400, James Coleman wrote:

> In my experience it's not immediately obvious (even after reading the
> documentation) the implications of how concurrent index builds manage
> transactions with respect to multiple concurrent index builds in
> flight at the same time.
>
> Specifically, as I understand multiple concurrent index builds running
> at the same time will all return at the same time as the longest
> running one.
>
> I've attached a small patch to call this caveat out specifically in
> the documentation. I think the description in the patch is accurate,
> but please let me know if there's some intricacies around how the
> various stages might change the results.
>
> James Coleman

I'd much rather see effort spent fixing this issue as far as it relates
to concurrent CICs. For the snapshot waits we can add a procarray flag
(alongside PROCARRAY_VACUUM_FLAG) indicating that the backend is
doing. Which WaitForOlderSnapshots() can then use to ignore those CICs,
which is safe, because those transactions definitely don't insert into
relations targeted by CIC. The change to WaitForOlderSnapshots() would
just be to pass the new flag to GetCurrentVirtualXIDs, I think.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Wed, Mar 25, 2020 at 3:19 PM Andres Freund <[hidden email]> wrote:

>
> Hi,
>
> On 2019-09-18 13:51:00 -0400, James Coleman wrote:
> > In my experience it's not immediately obvious (even after reading the
> > documentation) the implications of how concurrent index builds manage
> > transactions with respect to multiple concurrent index builds in
> > flight at the same time.
> >
> > Specifically, as I understand multiple concurrent index builds running
> > at the same time will all return at the same time as the longest
> > running one.
> >
> > I've attached a small patch to call this caveat out specifically in
> > the documentation. I think the description in the patch is accurate,
> > but please let me know if there's some intricacies around how the
> > various stages might change the results.
> >
> > James Coleman
>
> I'd much rather see effort spent fixing this issue as far as it relates
> to concurrent CICs. For the snapshot waits we can add a procarray flag
> (alongside PROCARRAY_VACUUM_FLAG) indicating that the backend is
> doing. Which WaitForOlderSnapshots() can then use to ignore those CICs,
> which is safe, because those transactions definitely don't insert into
> relations targeted by CIC. The change to WaitForOlderSnapshots() would
> just be to pass the new flag to GetCurrentVirtualXIDs, I think.

Alvaro: I think you had some ideas on this too; any chance you've know
of a patch that anyone's got cooking?

Andres: If we got this fixed in current PG would you be opposed to
documenting the caveat in previous versions?

Thanks,
James


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Alvaro Herrera-9
On 2020-Mar-25, James Coleman wrote:

> Alvaro: I think you had some ideas on this too; any chance you've know
> of a patch that anyone's got cooking?

I posted this in November
https://postgr.es/m/20191101203310.GA12239@... but I didn't
put time to go through the issues there.  I don't know if my approach is
exactly what Andres has in mind, but I was discouraged by the number of
gotchas for which the optimization I propose has to be turned off.

Maybe that preliminary patch can serve as a discussion starter, if
nothing else.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Andres Freund
In reply to this post by James Coleman
Hi,

On 2020-03-25 15:24:44 -0400, James Coleman wrote:
> Andres: If we got this fixed in current PG would you be opposed to
> documenting the caveat in previous versions?

Not really. I'm just not confident it's going to be useful, given the
amount of details needed to be provided to really make sense of the
issue (the earlier CIC phases don't wait for snapshots, but just
relation locks etc).

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Andres Freund
In reply to this post by Alvaro Herrera-9
Hi,

On 2020-03-25 16:30:10 -0300, Alvaro Herrera wrote:
> I posted this in November
> https://postgr.es/m/20191101203310.GA12239@... but I didn't
> put time to go through the issues there.

Oh, missed that.


> I don't know if my approach is exactly what Andres has in mind

Not quite. I don't think it's generally correct for CIC to set
PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
we don't want rows to be pruned away from under us. I also think we'd
want to set such a flag during all of the CIC phases?

What I was thinking of was a new flag, with a distinct value from
PROC_IN_VACUUM. It'd currently just be specified in the
GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
needing to wait for other CICs on different relations. Since CIC is not
permitted on system tables, and CIC doesn't do DML on normal tables, it
seems fairly obviously correct to exclude other CICs.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Alvaro Herrera-9
On 2020-Mar-25, Andres Freund wrote:

> > I don't know if my approach is exactly what Andres has in mind
>
> Not quite. I don't think it's generally correct for CIC to set
> PROC_IN_VACUUM. I'm doubtful it's the case even just for plain indexes -
> we don't want rows to be pruned away from under us. I also think we'd
> want to set such a flag during all of the CIC phases?
>
> What I was thinking of was a new flag, with a distinct value from
> PROC_IN_VACUUM. It'd currently just be specified in the
> GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid
> needing to wait for other CICs on different relations. Since CIC is not
> permitted on system tables, and CIC doesn't do DML on normal tables, it
> seems fairly obviously correct to exclude other CICs.

Hmm, that sounds more promising.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: [DOC] Document concurrent index builds waiting on each other

Michael Paquier-2
On Wed, Mar 25, 2020 at 05:12:48PM -0300, Alvaro Herrera wrote:
> Hmm, that sounds more promising.

Haven't looked at that myself in details.  But as I doubt that this
would be backpatched, wouldn't it be better to document the issue for
now?
--
Michael

signature.asc (849 bytes) Download Attachment