[DOC] Document concurrent index builds waiting on each other


Re: PROC_IN_ANALYZE stillborn 13 years ago

Tom Lane-2
Robert Haas <[hidden email]> writes:
> Thinking about it more, there are really two ways to think about an
> estimated row count.

> On the one hand, if you think of the row count estimate as the number
> of rows that are going to pop out of a node, then it's always right to
> think of a unique index as limiting the number of occurrences of a
> given value to 1. But, if you think of the row count estimate as a way
> of estimating the amount of work that the node has to do to produce
> that output, then it isn't.

The planner intends its row counts to be interpreted in the first way.
We do have a rather indirect way of accounting for the cost of scanning
dead tuples and such, which is that we scale scanning costs according
to the measured physical size of the relation.  That works better for
I/O costs than it does for CPU costs, but it's not completely useless
for the latter.  In any case, we'd certainly not want to increase the
scan's row count estimate for that, because that would falsely inflate
our estimate of how much work upper plan levels have to do.  Whatever
happens at the scan level, the upper levels aren't going to see those
dead tuples.

                        regards, tom lane
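
For reference, the physical-size scaling described above can be approximated from the catalog. This is a rough sketch, not the planner's actual code path: it assumes an unfiltered sequential scan, ignores the planner's rescaling of reltuples to the relation's current size, and uses a hypothetical table name (items).

```sql
-- Approximate cost of an unfiltered sequential scan on a hypothetical
-- table "items": one page fetch per physical page, plus per-row CPU cost.
SELECT relname,
       relpages,                      -- measured physical size, in pages
       reltuples,                     -- estimated number of live rows
       relpages  * current_setting('seq_page_cost')::float8
     + reltuples * current_setting('cpu_tuple_cost')::float8
         AS approx_seqscan_cost
FROM pg_class
WHERE relname = 'items';
```

A bloated table has more pages for the same number of live rows, so the page term grows while the row count estimate handed to upper plan levels does not.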



Re: PROC_IN_ANALYZE stillborn 13 years ago

Alvaro Herrera-9
In reply to this post by Andres Freund
On 2020-Aug-05, Andres Freund wrote:

> I'm mildly against that, because I'd really like to start making use of
> the flag. Not so much for cancellations, but to avoid the drastic impact
> analyze has on bloat.  In OLTP workloads with big tables, and without
> disabled cost limiting for analyze (or slow IO), the snapshot that
> analyze holds is often by far the transaction with the oldest xmin.

I pushed despite the objection because it seemed that downstream
discussion was largely favorable to the change, and there's a different
proposal to solve the bloat problem for analyze; and also:

> Only mildly against because it'd not be hard to reintroduce once we need
> it.

Thanks for the discussion!

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
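
The backend holding the oldest xmin, which is the concern raised above for long ANALYZE runs, can be looked up in pg_stat_activity. A minimal sketch, assuming a role that is allowed to see other sessions' activity; the LIMIT and the 60-character truncation are arbitrary choices.

```sql
-- List the backends whose snapshots hold back the xmin horizon the most.
SELECT pid,
       backend_xmin,
       age(backend_xmin) AS xmin_age,   -- distance from the current xid
       xact_start,
       state,
       left(query, 60)   AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 5;
```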



Re: PROC_IN_ANALYZE stillborn 13 years ago

Andres Freund
In reply to this post by Tom Lane-2
Hi,

On 2020-08-06 18:02:26 -0400, Tom Lane wrote:

> Andres Freund <[hidden email]> writes:
> > In fact using conceptually like a new snapshot for each sample tuple
> > actually seems like it'd be somewhat of an improvement over using a
> > single snapshot.
>
> Dunno, that feels like a fairly bad idea to me.  It seems like it would
> overemphasize the behavior of whatever queries happened to be running
> concurrently with the ANALYZE.  I do follow the argument that using a
> single snapshot for the whole ANALYZE overemphasizes a single instant
> in time, but I don't think that leads to the conclusion that we shouldn't
> use a snapshot at all.

I didn't actually want to suggest that we should take a separate
snapshot for every sampled row - that'd be excessively costly. What I
wanted to say was that I don't see a clear accuracy
benefit. E.g. not seeing any of the values inserted more recently will
under-emphasize those in the histogram.

What precisely do you mean by "overemphasize" above? I mean those will
be the rows most likely to live after the analyze is done, so including
them doesn't seem like a bad thing to me?


> Another angle that would be worth considering, aside from the issue
> of whether the sample used for pg_statistic becomes more or less
> representative, is what impact all this would have on the tuple count
> estimates that go to the stats collector and pg_class.reltuples.
> Right now, we don't have a great story at all on how the stats collector's
> count is affected by combining VACUUM/ANALYZE table-wide counts with
> the incremental deltas reported by transactions happening concurrently
> with VACUUM/ANALYZE.  Would changing this behavior make that better,
> or worse, or about the same?

Hm. Vacuum already counts rows that are inserted concurrently with the
vacuum scan, if it encounters them. Analyze doesn't. Seems like we'd at
least be wrong in a more consistent manner than before...

IIUC both analyze and vacuum will overwrite concurrent changes to
n_live_tuples. So taking concurrently committed changes into account
seems like it'd be the right thing?

We probably could make this more accurate by accounting separately for
"recently inserted and committed" rows, and taking the difference of
n_live_tuples before/after into account.  But I'm a bit doubtful that
it's worth it?

Greetings,

Andres Freund
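
The two counts under discussion can be compared side by side. A minimal sketch, assuming a hypothetical table named items; note that the statistics view's column is named n_live_tup.

```sql
-- Compare the planner's row estimate (pg_class.reltuples) with the
-- cumulative statistics counters for one table.
SELECT c.relname,
       c.reltuples,          -- written by VACUUM / ANALYZE
       s.n_live_tup,         -- statistics counter for live rows
       s.n_dead_tup,
       s.last_vacuum,
       s.last_autovacuum,
       s.last_analyze,
       s.last_autoanalyze
FROM pg_class AS c
JOIN pg_stat_user_tables AS s ON s.relid = c.oid
WHERE c.relname = 'items';
```

Running this before and after a VACUUM or ANALYZE that overlaps with concurrent inserts is one way to see how the table-wide count and the incremental deltas combine.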



Re: PROC_IN_ANALYZE stillborn 13 years ago

Tom Lane-2
In reply to this post by Alvaro Herrera-9
Alvaro Herrera <[hidden email]> writes:
> I pushed despite the objection because it seemed that downstream
> discussion was largely favorable to the change, and there's a different
> proposal to solve the bloat problem for analyze; and also:

Note that this quasi-related patch has pretty thoroughly hijacked
the CF entry for James' original docs patch proposal.  The cfbot
thinks that that's the latest patch in the original thread, and
unsurprisingly is failing to apply it.

Since the discussion was all over the place, I'm not sure whether
there's still a live docs patch proposal or not; but if so, somebody
should repost that patch (and go back to the original thread title).

                        regards, tom lane



Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
In reply to this post by James Coleman
On Fri, Jul 31, 2020 at 2:51 PM James Coleman <[hidden email]> wrote:

>
> On Thu, Jul 16, 2020 at 7:34 PM David Johnston
> <[hidden email]> wrote:
> >
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  not tested
> > Implements feature:       not tested
> > Spec compliant:           not tested
> > Documentation:            tested, passed
> >
> > James,
> >
> > I'm on board with the point of pointing out explicitly the "concurrent index builds on multiple tables at the same time will not return on any one table until all have completed", with back-patching.  I do not believe the new paragraph is necessary though.  I'd suggest trying to weave it into the existing paragraph ending "Even then, however, the index may not be immediately usable for queries: in the worst case, it cannot be used as long as transactions exist that predate the start of the index build."  Adding "Notably, " in front of the existing sentence fragment above and tacking it onto the end probably suffices.
>
> I'm not sure "the index may not be immediately usable for queries" is
> really accurate/sufficient: it seems to imply the CREATE INDEX has
> returned but for some reason the index isn't yet valid. The issue I'm
> trying to describe here is that the CREATE INDEX query itself will not
> return until all preceding queries have completed *including*
> concurrent index creations on unrelated tables.
>
> > I don't actually know whether this is true behavior, though.  Is it something our tests do, or could, demonstrate?
>
> It'd take tests that exercise parallelism, but it's pretty simple to
> demonstrate (but you do have to catch the first index build in a scan
> phase, so you either need lots of data or a hack). Here's an example
> that uses a bit of a hack to simulate a slow scan phase:
>
> Setup:
> create table items(i int);
> create table others(i int);
> create function slow_expr() returns text as $$
>   select pg_sleep(15);
>   select '5';
> $$ language sql immutable;
> insert into items(i) values (1), (2);
> insert into others(i) values (1), (2);
>
> Then the following in order:
> 1. In session A: create index concurrently on items((i::text || slow_expr()));
> 2. In session B (at the same time): create index concurrently on others(i);
>
> You'll notice that the 2nd command, which should be practically
> instantaneous, waits on the first ~30s scan phase of (1) before it
> returns. The same is true if after (2) completes you immediately run
> it again -- it waits on the second ~30s scan phase of (1).
>
> That does reveal a bit of complexity, though, that the current
> patch doesn't address, which is that this can be phase dependent (and
> that complexity gets a lot more non-obvious when there's real live
> activity (particularly long-running transactions) in the system as
> well).
>
> I've attached a new patch series with two items:
> 1. A simpler (and I believe more correct) doc change for "cic blocks
> cic on other tables".
> 2. A patch to document that all index builds can prevent tuples from
> being vacuumed away on other tables.
>
> If it's preferable we could commit the first and discuss the second
> separately, but since that limitation was also discussed up-thread, I
> decided to include them both here for now.

Álvaro's patch confused the current state of this thread, so I'm
reattaching (rebased) v2 as v3.

James

v3-0002-Document-vacuum-on-one-table-depending-on-concurr.patch (1K) Download Attachment
v3-0001-Document-concurrent-indexes-waiting-on-each-other.patch (2K) Download Attachment
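
The wait described in the reproduction above can be watched from a third session. A minimal sketch, assuming PostgreSQL 12 or later (for pg_stat_progress_create_index) and that sessions A and B from the reproduction are still running; "session C" is just a label for that third session.

```sql
-- Session C: progress of all in-flight index builds.
-- While session B is blocked, its row is expected to be in a waiting
-- phase, with current_locker_pid pointing at the backend it waits for.
SELECT pid,
       relid::regclass AS table_name,
       phase,
       lockers_total,
       lockers_done,
       current_locker_pid
FROM pg_stat_progress_create_index;

-- Session C: which backends each concurrent index build is blocked behind.
SELECT a.pid,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) AS blocked_by,
       a.query
FROM pg_stat_activity AS a
WHERE a.query ILIKE 'create index concurrently%';
```

If the build on others(i) lists session A's pid in blocked_by, that is the cross-table dependency the patch documents.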

Re: PROC_IN_ANALYZE stillborn 13 years ago

James Coleman
In reply to this post by Tom Lane-2
On Sat, Aug 29, 2020 at 8:06 PM Tom Lane <[hidden email]> wrote:

>
> Alvaro Herrera <[hidden email]> writes:
> > I pushed despite the objection because it seemed that downstream
> > discussion was largely favorable to the change, and there's a different
> > proposal to solve the bloat problem for analyze; and also:
>
> Note that this quasi-related patch has pretty thoroughly hijacked
> the CF entry for James' original docs patch proposal.  The cfbot
> thinks that that's the latest patch in the original thread, and
> unsurprisingly is failing to apply it.
>
> Since the discussion was all over the place, I'm not sure whether
> there's still a live docs patch proposal or not; but if so, somebody
> should repost that patch (and go back to the original thread title).

I replied to the original email thread with reposted patches.

James



Re: [DOC] Document concurrent index builds waiting on each other

Michael Paquier-2
In reply to this post by James Coleman
On Tue, Sep 08, 2020 at 01:25:21PM -0400, James Coleman wrote:
> Álvaro's patch confused the current state of this thread, so I'm
> reattaching (rebased) v2 as v3.

+  <para>
+   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
+   option) commands are included when <command>VACUUM</command> calculates what
+   dead tuples are safe to remove even on tables other than the one being indexed.
+  </para>
FWIW, this is true as well for REINDEX CONCURRENTLY because both use
the same code paths for index builds and validation, with basically
the same waiting phases.  But is CREATE INDEX the correct place for
that?  Wouldn't it be better to tell about such things on the VACUUM
doc?

0001 sounds fine to me.
--
Michael

signature.asc (849 bytes) Download Attachment

Re: [DOC] Document concurrent index builds waiting on each other

David G Johnston
On Wed, Sep 30, 2020 at 2:10 AM Michael Paquier <[hidden email]> wrote:
> On Tue, Sep 08, 2020 at 01:25:21PM -0400, James Coleman wrote:
> > Álvaro's patch confused the current state of this thread, so I'm
> > reattaching (rebased) v2 as v3.
>
> +  <para>
> +   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
> +   option) commands are included when <command>VACUUM</command> calculates what
> +   dead tuples are safe to remove even on tables other than the one being indexed.
> +  </para>
> FWIW, this is true as well for REINDEX CONCURRENTLY because both use
> the same code paths for index builds and validation, with basically
> the same waiting phases.  But is CREATE INDEX the correct place for
> that?  Wouldn't it be better to tell about such things on the VACUUM
> doc?
>
> 0001 sounds fine to me.


v3-0002 needs a rebase over the create_index.sgml page due to the change of the nearby xref to link.  Attached as v4-0002 along with the original v3-0001.

I resisted the temptation to commit my word-smithing thoughts to the affected paragraph.  The word "phase" appearing out of nowhere struck me a bit oddly.  "Then finally the" feels like it is missing a couple of commas - or just drop the finally.  "then two table scans occur in separate transactions" reads better than "two more transactions" IMO.

For 0002 maybe focus on the fact that CREATE INDEX is a global concern even though it only names a single table in any one invocation.  As a consequence, while it is running, vacuum cannot bring the system's oldest xid more current than the oldest xid on any index-in-progress table (I don't know exactly how this works).  And, rehashing 0001, all concurrent indexing will finish at the same time.

In short maybe focus less on procedure and specific waiting states and more on the user-visible consequences.  0001 didn't really clear things up much in that regard.  It reads like we are introducing a deadlock situation even though that evidently is not the case.

I concur that vacuum's perspective on the create index global reach needs to be addressed there if it is not already.

<starts looking at vacuum>

I'm a bit confused as to why/whether create index transactions are somehow special in this regard, compared to other transactions.  I infer from the existence of 0002 that they somehow are...

My conclusion thus far is that with respect to the original complaint:

On 2019-09-18 13:51:00 -0400, James Coleman wrote:
> In my experience it's not immediately obvious (even after reading the
> documentation) the implications of how concurrent index builds manage
> transactions with respect to multiple concurrent index builds in
> flight at the same time.

These two limited scope patches have not materially moved the needle in understanding.  They are too technical when the underlying issue is comprehension by non-technical people in terms of how they see their system behave.

David J.


Re: [DOC] Document concurrent index builds waiting on each other

David G Johnston
On Wed, Oct 21, 2020 at 3:25 PM David G. Johnston <[hidden email]> wrote:

> v3-0002 needs a rebase over the create_index.sgml page due to the change of the nearby xref to link.  Attached as v4-0002 along with the original v3-0001.


attached...

Reading the commit message on 0002 - vacuum isn't a transaction-taking command so it wouldn't interfere with itself, create index does use transactions and thus it's not surprising that it interferes with vacuum - which looks at transactions, not commands (as most of the internals would I'd presume).

David J.


v3-0001-Document-concurrent-indexes-waiting-on-each-other.patch (2K) Download Attachment
v4-0002-Document-vacuum-on-one-table-depending-on-concurr.patch (1K) Download Attachment

Re: [DOC] Document concurrent index builds waiting on each other

Anastasia Lubennikova
Status update for a commitfest entry.

The commitfest is nearing the end and I wonder what this discussion is waiting for.
It looks like the proposed patch received its fair share of review, so I mark it as ReadyForCommitter and lay responsibility for the final decision on them.

The new status of this patch is: Ready for Committer

Re: [DOC] Document concurrent index builds waiting on each other

Álvaro Herrera
On 2020-Nov-30, Anastasia Lubennikova wrote:

> The commitfest is nearing the end and I wonder what this discussion is waiting for.
> It looks like the proposed patch received its fair share of review, so
> I mark it as ReadyForCommitter and lay responsibility for the final
> decision on them.

I'll get these pushed now, thanks for the reminder.



Re: [DOC] Document concurrent index builds waiting on each other

Álvaro Herrera
In reply to this post by Michael Paquier-2
On 2020-Sep-30, Michael Paquier wrote:

> +  <para>
> +   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
> +   option) commands are included when <command>VACUUM</command> calculates what
> +   dead tuples are safe to remove even on tables other than the one being indexed.
> +  </para>
> FWIW, this is true as well for REINDEX CONCURRENTLY because both use
> the same code paths for index builds and validation, with basically
> the same waiting phases.  But is CREATE INDEX the correct place for
> that?  Wouldn't it be better to tell about such things on the VACUUM
> doc?
Yeah, I think it might be more sensible to document this in
maintenance.sgml, as part of the paragraph that discusses removing
tuples "to save space".  But making it inline with the rest of the flow,
it seems to distract from higher-level considerations, so I suggest to
make it a footnote instead.

I'm not sure on the wording to use; what about this?


v5-0001-Note-CIC-and-RC-in-vacuum-s-doc.patch (1K) Download Attachment

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <[hidden email]> wrote:

>
> On 2020-Sep-30, Michael Paquier wrote:
>
> > +  <para>
> > +   <command>CREATE INDEX</command> (including the <literal>CONCURRENTLY</literal>
> > +   option) commands are included when <command>VACUUM</command> calculates what
> > +   dead tuples are safe to remove even on tables other than the one being indexed.
> > +  </para>
> > FWIW, this is true as well for REINDEX CONCURRENTLY because both use
> > the same code paths for index builds and validation, with basically
> > the same waiting phases.  But is CREATE INDEX the correct place for
> > that?  Wouldn't it be better to tell about such things on the VACUUM
> > doc?
>
> Yeah, I think it might be more sensible to document this in
> maintenance.sgml, as part of the paragraph that discusses removing
> tuples "to save space".  But making it inline with the rest of the flow,
> it seems to distract from higher-level considerations, so I suggest to
> make it a footnote instead.

I have mixed feelings about wholesale moving it; users aren't likely
to read the vacuum doc when considering how running CIC might impact
their system, though I do understand why it otherwise fits there. Even
if the primary details are in the vacuum docs, I tend to think a reference
note (or link to the vacuum docs) in the create index docs would be
useful. The principle here is that 1.) vacuum is automatic/part of the
background of the system, not just something people trigger manually,
and 2.) we ought to document things where the user action triggering
the behavior is documented.

> I'm not sure on the wording to use; what about this?

The wording seems fine to me.

This is a replacement for what was 0002 earlier? And 0001 from earlier
still seems to be a useful standalone patch?

James



Re: [DOC] Document concurrent index builds waiting on each other

Álvaro Herrera
On 2020-Nov-30, James Coleman wrote:

> On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <[hidden email]> wrote:
> >
> > On 2020-Sep-30, Michael Paquier wrote:

> > Yeah, I think it might be more sensible to document this in
> > maintenance.sgml, as part of the paragraph that discusses removing
> > tuples "to save space".  But making it inline with the rest of the flow,
> > it seems to distract from higher-level considerations, so I suggest to
> > make it a footnote instead.
>
> I have mixed feelings about wholesale moving it; users aren't likely
> to read the vacuum doc when considering how running CIC might impact
> their system, though I do understand why it otherwise fits there.

Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
it should go in REINDEX CONCURRENTLY as well.

> > I'm not sure on the wording to use; what about this?
>
> The wording seems fine to me.

Great, thanks.

> This is a replacement for what was 0002 earlier? And 0001 from earlier
> still seems to be a useful standalone patch?

0001 is the one that I got pushed yesterday, I think -- correct?
src/tools/git_changelog says:

Author: Alvaro Herrera <[hidden email]>
Branch: master [58ebe967f] 2020-11-30 18:24:55 -0300
Branch: REL_13_STABLE [3fe0e7c3f] 2020-11-30 18:24:55 -0300
Branch: REL_12_STABLE [b2603f16a] 2020-11-30 18:24:55 -0300
Branch: REL_11_STABLE [ed9c9b033] 2020-11-30 18:24:55 -0300
Branch: REL_10_STABLE [d3bd36a63] 2020-11-30 18:24:55 -0300
Branch: REL9_6_STABLE [b3d33bf59] 2020-11-30 18:24:55 -0300
Branch: REL9_5_STABLE [968a537b4] 2020-11-30 18:24:55 -0300

    Document concurrent indexes waiting on each other
   
    Because regular CREATE INDEX commands are independent, and there's no
    logical data dependency, it's not immediately obvious that transactions
    held by concurrent index builds on one table will block the second phase
    of concurrent index creation on an unrelated table, so document this
    caveat.
   
    Backpatch this all the way back.  In branch master, mention that only
    some indexes are involved.
   
    Author: James Coleman <[hidden email]>
    Reviewed-by: David Johnston <[hidden email]>
    Discussion: https://postgr.es/m/CAAaqYe994=PUrn8CJZ4UEo_S-FfRr_3ogERyhtdgHAb2WG_Ufg@...




Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Tue, Dec 1, 2020 at 6:51 PM Alvaro Herrera <[hidden email]> wrote:

>
> On 2020-Nov-30, James Coleman wrote:
>
> > On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <[hidden email]> wrote:
> > >
> > > On 2020-Sep-30, Michael Paquier wrote:
>
> > > Yeah, I think it might be more sensible to document this in
> > > maintenance.sgml, as part of the paragraph that discusses removing
> > > tuples "to save space".  But making it inline with the rest of the flow,
> > > it seems to distract from higher-level considerations, so I suggest to
> > > make it a footnote instead.
> >
> > I have mixed feelings about wholesale moving it; users aren't likely
> > to read the vacuum doc when considering how running CIC might impact
> > their system, though I do understand why it otherwise fits there.
>
> Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
> it should go in REINDEX CONCURRENTLY as well.

Agreed. Or, alternatively, a blurb something like "Please note how CIC
interacts with VACUUM <link>...", and then the primary language in
maintenance.sgml. That would have the benefit of maintaining the core
language in only one place.

> > > I'm not sure on the wording to use; what about this?
> >
> > The wording seems fine to me.
>
> Great, thanks.
>
> > This is a replacement for what was 0002 earlier? And 0001 from earlier
> > still seems to be a useful standalone patch?
>
> 0001 is the one that I got pushed yesterday, I think -- correct?
> src/tools/git_changelog says:
>
> Author: Alvaro Herrera <[hidden email]>
> Branch: master [58ebe967f] 2020-11-30 18:24:55 -0300
> Branch: REL_13_STABLE [3fe0e7c3f] 2020-11-30 18:24:55 -0300
> Branch: REL_12_STABLE [b2603f16a] 2020-11-30 18:24:55 -0300
> Branch: REL_11_STABLE [ed9c9b033] 2020-11-30 18:24:55 -0300
> Branch: REL_10_STABLE [d3bd36a63] 2020-11-30 18:24:55 -0300
> Branch: REL9_6_STABLE [b3d33bf59] 2020-11-30 18:24:55 -0300
> Branch: REL9_5_STABLE [968a537b4] 2020-11-30 18:24:55 -0300
>
>     Document concurrent indexes waiting on each other
>
>     Because regular CREATE INDEX commands are independent, and there's no
>     logical data dependency, it's not immediately obvious that transactions
>     held by concurrent index builds on one table will block the second phase
>     of concurrent index creation on an unrelated table, so document this
>     caveat.
>
>     Backpatch this all the way back.  In branch master, mention that only
>     some indexes are involved.
>
>     Author: James Coleman <[hidden email]>
>     Reviewed-by: David Johnston <[hidden email]>
>     Discussion: https://postgr.es/m/CAAaqYe994=PUrn8CJZ4UEo_S-FfRr_3ogERyhtdgHAb2WG_Ufg@...

Ah, yes, somehow I'd missed that that had been pushed.

James



Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Tue, Dec 1, 2020 at 8:05 PM James Coleman <[hidden email]> wrote:

>
> On Tue, Dec 1, 2020 at 6:51 PM Alvaro Herrera <[hidden email]> wrote:
> >
> > On 2020-Nov-30, James Coleman wrote:
> >
> > > On Mon, Nov 30, 2020 at 4:53 PM Alvaro Herrera <[hidden email]> wrote:
> > > >
> > > > On 2020-Sep-30, Michael Paquier wrote:
> >
> > > > Yeah, I think it might be more sensible to document this in
> > > > maintenance.sgml, as part of the paragraph that discusses removing
> > > > tuples "to save space".  But making it inline with the rest of the flow,
> > > > it seems to distract from higher-level considerations, so I suggest to
> > > > make it a footnote instead.
> > >
> > > I have mixed feelings about wholesale moving it; users aren't likely
> > > to read the vacuum doc when considering how running CIC might impact
> > > their system, though I do understand why it otherwise fits there.
> >
> > Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
> > it should go in REINDEX CONCURRENTLY as well.
>
> Agreed. Or, alternatively, a blurb something like "Please note how CIC
> interacts with VACUUM <link>...", and then the primary language in
> maintenance.sgml. That would have the benefit of maintaining the core
> language in only one place.

Any thoughts on this?

James



Re: [DOC] Document concurrent index builds waiting on each other

Álvaro Herrera
In reply to this post by James Coleman
On 2020-Dec-01, James Coleman wrote:

> On Tue, Dec 1, 2020 at 6:51 PM Alvaro Herrera <[hidden email]> wrote:

> > Makes sense.  ISTM that if we want to have a cautionary blurb in the CIC docs,
> > it should go in REINDEX CONCURRENTLY as well.
>
> Agreed. Or, alternatively, a blurb something like "Please note how CIC
> interacts with VACUUM <link>...", and then the primary language in
> maintenance.sgml. That would have the benefit of maintaining the core
> language in only one place.

I looked into this again, and I didn't like what I had added to
maintenance.sgml at all.  It seems out of place where I put it; and I
couldn't find any great spots.  Going back to your original proposal,
what about something like this?  It's just one more para in the "notes"
section in CREATE INDEX and REINDEX pages, without any additions to the
VACUUM pages.

--
Álvaro Herrera                            39°49'30"S 73°17'W

v6-0001-Highlight-vacuum-consideration-in-create-index-do.patch (1K) Download Attachment

Re: [DOC] Document concurrent index builds waiting on each other

Michael Paquier-2
On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> I looked into this again, and I didn't like what I had added to
> maintenance.sgml at all.  It seems out of place where I put it; and I
> couldn't find any great spots.  Going back to your original proposal,
> what about something like this?  It's just one more para in the "notes"
> section in CREATE INDEX and REINDEX pages, without any additions to the
> VACUUM pages.

+1.
--
Michael

signature.asc (849 bytes) Download Attachment

Re: [DOC] Document concurrent index builds waiting on each other

James Coleman
On Wed, Jan 13, 2021 at 12:58 AM Michael Paquier <[hidden email]> wrote:

>
> On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> > I looked into this again, and I didn't like what I had added to
> > maintenance.sgml at all.  It seems out of place where I put it; and I
> > couldn't find any great spots.  Going back to your original proposal,
> > what about something like this?  It's just one more para in the "notes"
> > section in CREATE INDEX and REINDEX pages, without any additions to the
> > VACUUM pages.
>
> +1.

I think one more para in the notes is good. But shouldn't we still
clarify the issue is specific to CONCURRENTLY?

Also that it's not just the table being indexed seems fairly significant.

How about something like:

---
Like any long-running transaction, <command>REINDEX CONCURRENTLY</command> can
affect which tuples can be removed by concurrent
<command>VACUUM</command> on any table.
---

James



Re: [DOC] Document concurrent index builds waiting on each other

Álvaro Herrera
On 2021-Jan-13, James Coleman wrote:

> On Wed, Jan 13, 2021 at 12:58 AM Michael Paquier <[hidden email]> wrote:
> >
> > On Tue, Jan 12, 2021 at 04:51:39PM -0300, Alvaro Herrera wrote:
> > > I looked into this again, and I didn't like what I had added to
> > > maintenance.sgml at all.  It seems out of place where I put it; and I
> > > couldn't find any great spots.  Going back to your original proposal,
> > > what about something like this?  It's just one more para in the "notes"
> > > section in CREATE INDEX and REINDEX pages, without any additions to the
> > > VACUUM pages.
> >
> > +1.
>
> I think one more para in the notes is good. But shouldn't we still
> clarify the issue is specific to CONCURRENTLY?

How is it specific to concurrent builds?  What we're documenting here is
the behavior of vacuum, and that one is identical in both regular builds
and concurrent builds (since vacuum has to avoid removing rows from
under either of them).  The only reason concurrent builds are
interesting is because they take longer.

What was specific to concurrent builds was the fact that you can't have
more than one at a time, and that one is what was added in 58ebe967f.

> Also that it's not just the table being indexed seems fairly significant.

This is true.  So I propose

    Like any long-running transaction, <command>REINDEX</command> can
    affect which tuples can be removed by concurrent <command>VACUUM</command>
    on any table.

--
Álvaro Herrera                            39°49'30"S 73°17'W
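
The "like any long-running transaction" wording can be checked with a plain transaction standing in for the index build. A minimal sketch, assuming two sessions in the same database and PostgreSQL versions contemporary with this thread; the table name vac_demo is hypothetical, and the open REPEATABLE READ transaction plays the role of a long-running build on some other table.

```sql
-- Setup (either session).
CREATE TABLE vac_demo(i int);
INSERT INTO vac_demo SELECT generate_series(1, 1000);

-- Session A: take a snapshot and keep the transaction open.
-- It never touches vac_demo.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM pg_class;

-- Session B: create dead row versions, then try to vacuum them.
DELETE FROM vac_demo;
VACUUM (VERBOSE) vac_demo;
-- While session A's snapshot is held, the VERBOSE output is expected to
-- report dead row versions that cannot be removed yet.

-- Session A: release the snapshot.
COMMIT;

-- Session B: with no older snapshot left, the rows become removable.
VACUUM (VERBOSE) vac_demo;
```

Replacing session A's transaction with a slow CREATE INDEX or REINDEX on an unrelated table should show the same effect, which is the behavior the proposed paragraph describes.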

