REINDEX CONCURRENTLY unexpectedly fails

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
32 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Thu, Dec 12, 2019 at 01:37:09PM -0800, Andres Freund wrote:

> On 2019-11-20 12:54:08 +0900, Michael Paquier wrote:
>> ON COMMIT DELETE ROWS does a physical truncation of the relation
>> files. And as DROP INDEX CONCURRENTLY cannot be run in a transaction
>> block, you would never face a case where you have no past TIDs which
>> could be referred to when setting the index as invalid.
>
> It's probably not reachable, but it strikes me as really fragile and
> dangerous. If e.g. somehow ON COMMIT DROP tables could exist when DROP
> CONCURRENTLY were run, the index_concurrently_set_dead() could very well
> target a row that's since been deleted in an earlier transaction.
Hmm.  That joins with your point downthread about future changes..

>> Now I don't actually object to enforce the non-concurrent path in
>> index_drop() for *all* temporary relations.  Anyway, that makes sense
>> in itself on performance grounds, similarly to the create path, so did
>> that by enforcing the flag in index_drop() (doDeletion would be
>> tempting but I took the problem at its root).  And added some tests
>> for the drop path and an extra assertion.
>
> Cool.
>
> I still think we'd be well served to add a few CheckTableNotInUse() type
> checks...
Sure.  We have already one in drop_index, so DROP INDEX is covered, as
well as reindex_index() which is taken by all non-concurrent REINDEX
commands.  Adding one in ReindexRelationConcurrently() may make
sense..

>> Considered that, but ON COMMIT DROP does not make sense because it
>> requires a transaction context, which is why I did not add one.  And
>> it seems to me that there is not much value to just check after CIC or
>> REINDEX's restriction to not run in a transaction block?  I added
>> tests for these two, but I am of the opinion that they don't bring
>> much.
>
> I think because CIC now falls back to non-concurrent mode, it's
> worthwhile to exercise this path. It seems far from unlikely that the
> code gets moved around enough that suddenly CIC is allowed in
> transactions when targetting temp tables.
That's a good point, we have no guarantee that nobody would play with
this area in the future.  Well, the patch has those tests anyway since
the last version, so I have not touched them.

>> I think that documenting it is good for the end-user as well.
>
> Why?

Even if using a temporary table, the commands are not allowed within a
transaction block, but we still track them in wait events so seeing
an event related only to a non-concurrent path when using CONCURRENTLY
can be confusing.

>> + /*
>> + * Enforce non-concurrent drop if the relation does not support this
>> + * option.
>> + */
>> + if (!RelationSupportsConcurrently(get_rel_persistence(indexId)))
>> + concurrent = false;
>> +
>
> Echoing Alvaro, I'm less than convinced by this name.

I would really keep "Relation" in this part of the naming as this can
be used for an index or its parent table, so in the updated attached I
have gone with RelationSupportsConcurrentIndexing(), which is a
suggestion from Alvaro.

> Copying this to some, but not all, the places where
> RelationSupportsConcurrently() is called doesn't seem helpful...

Not sure I follow your point here.  The following code paths are
currently checked in the patch using this routine:
- index_drop, both used by DROP INDEX and REINDEX CONCURRENTLY.  This
routine is called basically via performMultipleDeletions().  For
REINDEX CONCURRENTLY, this cannot be actually reached, but not for
DROP INDEX CONCURRENTLY.  The logic to decide which drop behavior to
choose is done in RemoveRelations().  And while we don't support
dropping multiple objects with CONCURRENTLY, we have no way to say now
for each object which lock level should be used for the drop, so it
seems safer to me now to enforce non-concurrent to be used directly in
index_drop rather than doing so at a higher level.
- ReindexIndex, ReindexTable and ReindexMultipleTables, to check if
the non-concurrent or concurrent paths need to be called for
respectively REINDEX INDEX, TABLE and SCHEMA/DATABASE/SYSTEM.
- RelationSupportsConcurrentIndexing, as the entry point for CREATE
INDEX.

+   /*
+    * Enforce non-concurrent build if the relation does not support this
+    * option.
+    */
Or are you suggesting to remove this comment from the two places where
it is used because it does not prove to help much?

> If we want to add docs, I'd say at most something like "For temporary
> tables index creation is always non-concurrent, as no other session can
> access them, and non-concurrent index creation is cheaper.".

Sounds like a better wording to me.  Documenting it still seems rather
important to me as I suspect that it could surprise some users.
--
Michael

reindex-conc-temp-v3.patch (15K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Fri, Dec 13, 2019 at 12:45:36PM +0900, Michael Paquier wrote:
> Sounds like a better wording to me.  Documenting it still seems rather
> important to me as I suspect that it could surprise some users.

So, are there any more comments/objections about this stuff?  I have
plans to look at that again and wrap it at the beginning of January.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Heikki Linnakangas
On 23/12/2019 03:59, Michael Paquier wrote:

> +/*
> + * RelationSupportsConcurrentIndexing
> + *
> + * Check if a relation supports concurrent builds or not.  This is
> + * used as a sanity check prior processing CREATE INDEX, DROP INDEX
> + * or REINDEX when using CONCURRENTLY.
> + */
> +bool
> +RelationSupportsConcurrentIndexing(char relpersistence)
> +{
> + /*
> + * Build indexes non-concurrently for temporary relations.  Such
> + * relations only work with the session assigned to them, so they are
> + * not subject to concurrent concerns, and a concurrent build would
> + * cause issues with ON COMMIT actions triggered by the transactions
> + * of the concurrent build.  A non-concurrent reindex is also more
> + * efficient in this case.
> + */
> + if (relpersistence == RELPERSISTENCE_TEMP)
> + return false;
> +
> + return true;
> +}
> +

The comment says "this is used as a sanity check". "Sanity check"
implies that it should never happen, but it is perfectly normal for it
to return false.

The caller in DefineIndex() calls RelationSupportsConcurrentIndexing()
only after choosing the lock mode. That's fine for temporary tables, but
if wouldn't work right if RelationSupportsConcurrentIndexing() started
to return false for some other tables. Maybe it would be clearer to just
check "relpersistence == RELPERSISTENCE_TEMP" directly in the callers,
and not have the RelationSupportsConcurrentIndexing() function.

The new text in drop_index.sgml talks about index creation, copy-pasted
from create_index.sgml.

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Tue, Jan 07, 2020 at 05:33:23PM +0200, Heikki Linnakangas wrote:
> The comment says "this is used as a sanity check". "Sanity check" implies
> that it should never happen, but it is perfectly normal for it to return
> false.

Fixed, thanks.

> The caller in DefineIndex() calls RelationSupportsConcurrentIndexing() only
> after choosing the lock mode. That's fine for temporary tables, but if
> wouldn't work right if RelationSupportsConcurrentIndexing() started to
> return false for some other tables. Maybe it would be clearer to just check
> "relpersistence == RELPERSISTENCE_TEMP" directly in the callers, and not
> have the RelationSupportsConcurrentIndexing() function.

The routine has the advantage of avoiding extra duplication of
comments to justify the choice of enforcing the non-concurrent path as
mentioned upthread.  So I'd rather keep it.

> The new text in drop_index.sgml talks about index creation, copy-pasted from
> create_index.sgml.

Thanks.  Fixed.

I have spent a couple of hours poking at this code, and found two
problems:
1) The error reporting for PROGRESS_CREATEIDX_COMMAND would report a
concurrent build, but that's not the case if the work happens for a
temporary table in DefineIndex(), so the call to
RelationSupportsConcurrentIndexing needs to happen before any look at
the concurrent flag is done.  That's easy enough to fix.
2) The handling of the patch within index_drop is too weak.  As
presented, the patch first locks the OID using a RangeVar.  However
for a temporary relation we would first take ShareUpdateExclusiveLock
RemoveRelations() and then upgrade to a AccessExclusiveLock in
index_drop().  I think that actually the check in index_drop() is not
necessary, and that instead we had better do three things:
a) In RangeVarCallbackForDropRelation(), if the relation is temporary,
use AccessExclusiveLock all the time, and we know the OID of the
relation here.
b) After locking the OID with the RangeVar, re-check if the relation
is temporary, and then remove PERFORM_DELETION_CONCURRENTLY is.
c) Add an assertion in index_drop() to be sure that this code path is
never invoked concurrently with a temporary relation.

I am lacking of time today, I'll continue tomorrow.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Wed, Jan 08, 2020 at 05:19:30PM +0900, Michael Paquier wrote:

> I have spent a couple of hours poking at this code, and found two
> problems:
> 1) The error reporting for PROGRESS_CREATEIDX_COMMAND would report a
> concurrent build, but that's not the case if the work happens for a
> temporary table in DefineIndex(), so the call to
> RelationSupportsConcurrentIndexing needs to happen before any look at
> the concurrent flag is done.  That's easy enough to fix.
> 2) The handling of the patch within index_drop is too weak.  As
> presented, the patch first locks the OID using a RangeVar.  However
> for a temporary relation we would first take ShareUpdateExclusiveLock
> RemoveRelations() and then upgrade to a AccessExclusiveLock in
> index_drop().  I think that actually the check in index_drop() is not
> necessary, and that instead we had better do three things:
> a) In RangeVarCallbackForDropRelation(), if the relation is temporary,
> use AccessExclusiveLock all the time, and we know the OID of the
> relation here.
> b) After locking the OID with the RangeVar, re-check if the relation
> is temporary, and then remove PERFORM_DELETION_CONCURRENTLY is.
> c) Add an assertion in index_drop() to be sure that this code path is
> never invoked concurrently with a temporary relation.
>
> I am lacking of time today, I'll continue tomorrow.
Okay, so here is an updated patch fixing those issues, with several
modifications done to the patch (docs, updates for the assertions,
some redesign).  Considering again those aspects, I have come up with
the same conclusion as what's stated above, though you actually need
to make sure that it is RangeVarGetRelidExtended() that has to be
careful about the lock to use on the temporary relation, before
anything else is done.  The callback RangeVarCallbackForDropRelation()
also needs to be careful about the relation it looks at and check if
the relation supports concurrent indexing.  On the other hand, we
could also say that we don't care about lock upgrade risks when
working on temporary tables because these are not accessed by other
sessions, though that's not a sane base to rely on IMO.  A solution
involving RangeVarGetRelidExtended() feels also like a sledgehammer to
smash a peanut, because it has a wider impact.  If lock upgrade risks
are not worth bothering, this needs to be clearly documented in the
patch with more comments.

As the patch has been heavily modified, I am switching it back to
"Needs Review" for now and I'd like to discuss more about the lock
upgrade risks, particularly if it is considered worth the effort for
temporary relations.  Thoughts are welcome.
--
Michael

reindex-conc-temp-v4.patch (17K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
Hi Heikki,

On Thu, Jan 09, 2020 at 12:06:19PM +0900, Michael Paquier wrote:
> As the patch has been heavily modified, I am switching it back to
> "Needs Review" for now and I'd like to discuss more about the lock
> upgrade risks, particularly if it is considered worth the effort for
> temporary relations.  Thoughts are welcome.

You are registered as a reviewer for this patch:
https://commitfest.postgresql.org/26/2358/

Are you planning to look at it?  Do you have some thoughts to share
about what I wrote previously?

Discarding the lock upgrade considerations is possible.  Another
approach would be, instead of ignoring CONCURRENTLY for temporary
tables, to fail if the operation is run, and ignore these when a
dabatase-wide reindex concurrently happens.  This was not liked much
at the beginning of the thread though.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Heikki Linnakangas
In reply to this post by Michael Paquier-2
On 09/01/2020 05:06, Michael Paquier wrote:

> On Wed, Jan 08, 2020 at 05:19:30PM +0900, Michael Paquier wrote:
>> 2) The handling of the patch within index_drop is too weak.  As
>> presented, the patch first locks the OID using a RangeVar.  However
>> for a temporary relation we would first take ShareUpdateExclusiveLock
>> RemoveRelations() and then upgrade to a AccessExclusiveLock in
>> index_drop().  I think that actually the check in index_drop() is not
>> necessary, and that instead we had better do three things:
>> a) In RangeVarCallbackForDropRelation(), if the relation is temporary,
>> use AccessExclusiveLock all the time, and we know the OID of the
>> relation here.
>> b) After locking the OID with the RangeVar, re-check if the relation
>> is temporary, and then remove PERFORM_DELETION_CONCURRENTLY is.
>> c) Add an assertion in index_drop() to be sure that this code path is
>> never invoked concurrently with a temporary relation.
>>
>> I am lacking of time today, I'll continue tomorrow.
>
> Okay, so here is an updated patch fixing those issues, with several
> modifications done to the patch (docs, updates for the assertions,
> some redesign).  Considering again those aspects, I have come up with
> the same conclusion as what's stated above, though you actually need
> to make sure that it is RangeVarGetRelidExtended() that has to be
> careful about the lock to use on the temporary relation, before
> anything else is done.  The callback RangeVarCallbackForDropRelation()
> also needs to be careful about the relation it looks at and check if
> the relation supports concurrent indexing.  On the other hand, we
> could also say that we don't care about lock upgrade risks when
> working on temporary tables because these are not accessed by other
> sessions, though that's not a sane base to rely on IMO.  A solution
> involving RangeVarGetRelidExtended() feels also like a sledgehammer to
> smash a peanut, because it has a wider impact.

I'm not a fan of all those changes in RangeVarCallbackForDropRelation()
to ensure that you get an AccessExclusiveLock to begin with. It gets
pretty complicated, and it feels like you need to special-case temporary
tables in dozen different places. I liked the v3 of this patch better.
It's true that you're upgrading the ShareUpdateExclusiveLock to
AccessExclusiveLock, but since it's a temporary table, there really
should be no other backend holding a lock on it.

> If lock upgrade risks are not worth bothering, this needs to be
> clearly documented in the patch with more comments.
Yes, definitely.

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Tue, Jan 14, 2020 at 11:41:11PM +0200, Heikki Linnakangas wrote:
> I'm not a fan of all those changes in RangeVarCallbackForDropRelation() to
> ensure that you get an AccessExclusiveLock to begin with. It gets pretty
> complicated, and it feels like you need to special-case temporary tables in
> dozen different places. I liked the v3 of this patch better. It's true that
> you're upgrading the ShareUpdateExclusiveLock to AccessExclusiveLock, but
> since it's a temporary table, there really should be no other backend
> holding a lock on it.

Thanks for taking the time to share your opinion.  That was as well my
feeling with the peanut and the sledgehammer.  I liked the peanuts,
but not the hammer part.

There are still some parts I liked about v4 (doc wording, tweaks about
the shape of RelationSupportsConcurrentIndexing and its use in
assertions, setting up the concurrent flag in RemoveRelation and use an
assert in index_drop is also cleaner), so I kept a good portion of
v4.  Attached is an updated patch, v5, that removes the parts
enforcing the lock when looking at the relation OID based on its
RangeVar.

Any thoughts?
--
Michael

reindex-conc-temp-v5.patch (16K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Heikki Linnakangas
On 15/01/2020 03:39, Michael Paquier wrote:
> Thanks for taking the time to share your opinion.  That was as well my
> feeling with the peanut and the sledgehammer.  I liked the peanuts,
> but not the hammer part.

Heh, yeah :-).

> There are still some parts I liked about v4 (doc wording, tweaks about
> the shape of RelationSupportsConcurrentIndexing and its use in
> assertions, setting up the concurrent flag in RemoveRelation and use an
> assert in index_drop is also cleaner), so I kept a good portion of
> v4.  Attached is an updated patch, v5, that removes the parts
> enforcing the lock when looking at the relation OID based on its
> RangeVar.
>
> Any thoughts?

Some comments below:

> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
> index 3e59e647e5..4139232b51 100644
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
> @@ -2016,6 +2016,13 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
>   LOCKTAG heaplocktag;
>   LOCKMODE lockmode;
>  
> + /*
> + * A relation not supporting concurrent indexing should never do
> + * a concurrent index drop or try to use a concurrent lock mode.
> + */
> + Assert(RelationSupportsConcurrentIndexing(indexId) ||
> +   (!concurrent && !concurrent_lock_mode));
> +
>   /*
>   * To drop an index safely, we must grab exclusive lock on its parent
>   * table.  Exclusive lock on the index alone is insufficient because
Or maybe decide to do non-current drop within index_drop() itself,
instead of requiring the caller to set 'concurrent' differently for
temporary tables?

> @@ -2490,6 +2500,30 @@ CompareIndexInfo(IndexInfo *info1, IndexInfo *info2,
>   return true;
>  }
>  
> +/*
> + * RelationSupportsConcurrentIndexing
> + *
> + * Check if a relation supports concurrent builds or not.  This is used
> + * before processing CREATE INDEX, DROP INDEX or REINDEX when using
> + * CONCURRENTLY to decide if the operation is supported.
> + */
> +bool
> +RelationSupportsConcurrentIndexing(Oid relid)
> +{
> + /*
> + * Build indexes non-concurrently for temporary relations.  Such
> + * relations only work with the session assigned to them, so they are
> + * not subject to concurrent concerns, and a concurrent build would
> + * cause issues with ON COMMIT actions triggered by the transactions
> + * of the concurrent build.  A non-concurrent reindex is also more
> + * efficient in this case.
> + */
> + if (get_rel_persistence(relid) == RELPERSISTENCE_TEMP)
> + return false;
> +
> + return true;
> +}
> +
Sorry to beat a dead hore, but I still don't like this function:

* Does it take a table OID or index OID? (Answer: both)

* There's a hidden assumption that if
RelationSupportsConcurrentIndexing() returns false, then it's OK to
upgrade the lock. That's true today, but if we added other conditions
when RelationSupportsConcurrentIndexing() returned false, there would be
trouble. It seems like a bad abstraction.

This would be better if the function was renamed to something like "Is
it OK to upgrade a CONCURRENTLY build to non-CONCURRENTLY?", but meh. I
understand that it's nice to have a place for this comment, so that it
doesn't need to be repeated in so many places. But I feel that a little
bit of repetition is better in this case. The reasoning isn't exactly
the same for CREATE INDEX, DROP INDEX, and REINDEX anyway.

> diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
> index 52ce02f898..d63a885638 100644
> --- a/src/backend/commands/indexcmds.c
> +++ b/src/backend/commands/indexcmds.c
> @@ -485,6 +485,13 @@ DefineIndex(Oid relationId,
>   GUC_ACTION_SAVE, true, 0, false);
>   }
>  
> + /*
> + * Enforce non-concurrent build if the relation does not support this
> + * option.  Do this before any use of the concurrent option is done.
> + */
> + if (!RelationSupportsConcurrentIndexing(relationId))
> + stmt->concurrent = false;
> +
Is it OK to scribble on the original 'stmt' here? Doesn't seem kosher,
although it probably works fine in practice.

> @@ -2769,6 +2778,9 @@ ReindexRelationConcurrently(Oid relationOid, int options)
>   /* Open relation to get its indexes */
>   heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
>  
> + /* Relation had better support concurrent indexing */
> + Assert(RelationSupportsConcurrentIndexing(relationOid));
> +
>   /* Add all the valid indexes of relation to list */
>   foreach(lc, RelationGetIndexList(heapRelation))
>   {
Do we care whether the *table* supports concurrent indexing, rather than
individual indexes? I guess that's academic, since you can't have
temporary indexes on a permanent table, or vice versa.

> @@ -2937,6 +2952,13 @@ ReindexRelationConcurrently(Oid relationOid, int options)
>   heapRel = table_open(indexRel->rd_index->indrelid,
>   ShareUpdateExclusiveLock);
>  
> + /*
> + * Also check for active uses of the relation in the current
> + * transaction, including open scans and pending AFTER trigger
> + * events.
> + */
> + CheckTableNotInUse(indexRel, "REINDEX");
> +
>   pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
>    RelationGetRelid(heapRel));
>   pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
I don't understand why this is required for this patch. It seems like a
good thing to check, I think otherwise you get an error from the
CheckTableNotInUse() call in index_drop(), in phase 6 where the old
indexes are dropped. But it seems unrelated to the rest of the patch.
Maybe commit it as a separate patch?

I came up with the attached version. It seems a bit more clear to me.
I'm not 100% wedded to this, though, so if you want to proceed based on
your version instead, feel free. The docs and the tests are unchanged.

- Heikki

reindex-conc-temp-v5-heikki.patch (18K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Fri, Jan 17, 2020 at 02:53:11PM +0200, Heikki Linnakangas wrote:

> On 15/01/2020 03:39, Michael Paquier wrote:
>> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
>> index 3e59e647e5..4139232b51 100644
>> --- a/src/backend/catalog/index.c
>> +++ b/src/backend/catalog/index.c
>> @@ -2016,6 +2016,13 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
>>   LOCKTAG heaplocktag;
>>   LOCKMODE lockmode;
>> + /*
>> + * A relation not supporting concurrent indexing should never do
>> + * a concurrent index drop or try to use a concurrent lock mode.
>> + */
>> + Assert(RelationSupportsConcurrentIndexing(indexId) ||
>> +   (!concurrent && !concurrent_lock_mode));
>> +
>>   /*
>>   * To drop an index safely, we must grab exclusive lock on its parent
>>   * table.  Exclusive lock on the index alone is insufficient because
>
> Or maybe decide to do non-current drop within index_drop() itself, instead
> of requiring the caller to set 'concurrent' differently for temporary
> tables?
A portion which stresses me with your approach (and that I actually
used in the first versions of my patch upthread), is that we have some
extra work related to PERFORM_DELETION_CONCURRENTLY being done in
dependency.c, which becomes basically useless if you enforce the flag
only in index_drop() without making sure that the root flag is set in
RemoveRelations() (see for example the top of deleteOneObject()),
and this causes the index drop to actually mix more concurrent and
non-concurrent concepts than just the lock upgrade issue.

>> +bool
>> +RelationSupportsConcurrentIndexing(Oid relid)
>> +{
>
> Sorry to beat a dead hore, but I still don't like this function:
>
> * Does it take a table OID or index OID? (Answer: both)

Yes, the persistency is inherited.  The function mentioned a relation,
so that applied to any relkind actually.  Perhaps the function should
have made that clearer with a assertion using a relkind check or
such.  But the original sounded pretty clear to me.

> * There's a hidden assumption that if RelationSupportsConcurrentIndexing()
> returns false, then it's OK to upgrade the lock. That's true today, but if
> we added other conditions when RelationSupportsConcurrentIndexing() returned
> false, there would be trouble. It seems like a bad abstraction.
>
> This would be better if the function was renamed to something like "Is it OK
> to upgrade a CONCURRENTLY build to non-CONCURRENTLY?", but meh. I understand
> that it's nice to have a place for this comment, so that it doesn't need to
> be repeated in so many places. But I feel that a little bit of repetition is
> better in this case. The reasoning isn't exactly the same for CREATE INDEX,
> DROP INDEX, and REINDEX anyway.
Okay, I see your points.  So I am fine with your line of arguments.

>> + /*
>> + * Enforce non-concurrent build if the relation does not support this
>> + * option.  Do this before any use of the concurrent option is done.
>> + */
>> + if (!RelationSupportsConcurrentIndexing(relationId))
>> + stmt->concurrent = false;
>> +
>
> Is it OK to scribble on the original 'stmt' here? Doesn't seem kosher,
> although it probably works fine in practice.
The idea was to avoid any future issues if any code refactored in this
area passes down IndexStmt to a subroutine and uses the concurrent
flag.  I guess that would be hard to miss if using a local variable as
you do anyway..

> Do we care whether the *table* supports concurrent indexing, rather than
> individual indexes? I guess that's academic, since you can't have temporary
> indexes on a permanent table, or vice versa.

I cared about that enough, but that's a very defensive position :)
But I agree that having a check only for individual indexes would be
just but fine.

>> + /*
>> + * Also check for active uses of the relation in the current
>> + * transaction, including open scans and pending AFTER trigger
>> + * events.
>> + */
>> + CheckTableNotInUse(indexRel, "REINDEX");
>> +
>>   pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
>
> I don't understand why this is required for this patch. It seems like a good
> thing to check, I think otherwise you get an error from the
> CheckTableNotInUse() call in index_drop(), in phase 6 where the old indexes
> are dropped. But it seems unrelated to the rest of the patch. Maybe commit
> it as a separate patch?
I added that per a suggestion from Andres while touching this area of
the code.  But you are right that it would make sense to remove it
from this patch, and get that committed separately.  I don't mind
starting a new thread for this matter instead of having this
discussing buried within this thread.  Does it make sense?

> @@ -943,8 +962,11 @@ DefineIndex(Oid relationId,
>   /*
>   * A valid stmt->oldNode implies that we already have a built form of the
>   * index.  The caller should also decline any index build.
> + *
> + * FIXME: should this check 'concurrent' or 'stmt->concurrent'? I don't
> + * quite understand the conditions here.
>   */
> - Assert(!OidIsValid(stmt->oldNode) || (skip_build && !stmt->concurrent));
> + Assert(!OidIsValid(stmt->oldNode) || (skip_build && !concurrent));
[ .. thinks .. ]
It seems to me that this should be using the local variable.
SKIP_BUILD is used in some cases for ALTER TABLE, CIC and REINDEX
CONCURRENTLY (for the creation of the duplicate index entry).  oldNode
gets used in ALTER TABLE when attempting to reuse an existing index
when changing a column type for example.

> - if (concurrent)
> + /*
> + * Obtain the current persistence of the existing table.  We already hold
> + * lock on it.
> + */
> + heapRel = table_open(heapOid, NoLock);
> + persistence = heapRel->rd_rel->relpersistence;
> + table_close(heapRel, NoLock);

Why not just using get_rel_persistence() here, as done in
ReindexMultipleTables()?

> + /* This function shouldn't be called for temporary relations. */
> + if (indexRel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
> + elog(ERROR, "cannot reindex a temporary table concurrently");

You are right that an elog() is better than an assertion here as this
uses a catalog data.

> I came up with the attached version. It seems a bit more clear to me. I'm
> not 100% wedded to this, though, so if you want to proceed based on your
> version instead, feel free. The docs and the tests are unchanged.

Except for the part with index_drop() where I would rather do the
decision-making for the concurrent behavior in RemoveRelations()
rather than index_drop() (as I did in v4), what you have here looks
fine to me.  Would you prefer wrapping up this stuff yourself or
should I?  This needs a backpatch down to 9.4 for the CIC/DIC part.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Mon, Jan 20, 2020 at 10:59:13AM +0900, Michael Paquier wrote:
> Except for the part with index_drop() where I would rather do the
> decision-making for the concurrent behavior in RemoveRelations()
> rather than index_drop() (as I did in v4), what you have here looks
> fine to me.  Would you prefer wrapping up this stuff yourself or
> should I?  This needs a backpatch down to 9.4 for the CIC/DIC part.

Same feeling after sleeping on it.  I have worked more this morning on
this stuff and I am finishing with the attached, which is a gathering
of everything that has been discussed, and based on Heikki's v5:
- Changed the part for DROP INDEX CONCURRENTLY to do the
decision-making in RemoveRelations() at the earliest stage possible.
- Removed the call to CheckTableNotInUse() in
ReindexRelationConcurrently().  Let's use a separate patch/thread for
that.
- Found one typo in the docs of REINDEX.

If there are no objections, I would like to wrap that in the next day
or so (still need to do the work for the back-branches).
--
Michael

reindex-conc-temp-v6.patch (18K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: REINDEX CONCURRENTLY unexpectedly fails

Michael Paquier-2
On Tue, Jan 21, 2020 at 11:43:03AM +0900, Michael Paquier wrote:
> If there are no objections, I would like to wrap that in the next day
> or so (still need to do the work for the back-branches).

There were various conflicts across the various back-branches, but
nothing depressing either.  Committed and back-patched down to 9.4.
--
Michael

signature.asc (849 bytes) Download Attachment
12