Inadequate executor locking of indexes

Inadequate executor locking of indexes

Tom Lane-2
I discovered that it's possible to trigger relation_open's new assertion
about having a lock on the relation by the simple expedient of running
the core regression tests with plan_cache_mode = force_generic_plan.
(This doesn't give me a terribly warm feeling about how thoroughly we've
exercised the relevant code, but I don't know what to do about that.)

The reason for the crash is that nodeIndexscan.c and friends all open
their index-to-scan with code like

    /*
     * Open the index relation.
     *
     * If the parent table is one of the target relations of the query, then
     * InitPlan already opened and write-locked the index, so we can avoid
     * taking another lock here.  Otherwise we need a normal reader's lock.
     */
    relistarget = ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
    indexstate->iss_RelationDesc = index_open(node->indexid,
                                              relistarget ? NoLock : AccessShareLock);

Now, at the time this code was written, InitPlan did indeed ensure that
indexes of a plan's result relation were already opened and locked,
because it calls InitResultRelInfo which used to call ExecOpenIndices.
Nowadays, the ExecOpenIndices part is postponed to ExecInitModifyTable,
which figures it can optimize matters:

        /*
         * If there are indices on the result relation, open them and save
         * descriptors in the result relation info, so that we can add new
         * index entries for the tuples we add/update.  We need not do this
         * for a DELETE, however, since deletion doesn't affect indexes. Also,
         * inside an EvalPlanQual operation, the indexes might be open
         * already, since we share the resultrel state with the original
         * query.
         */
        if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex &&
            operation != CMD_DELETE &&
            resultRelInfo->ri_IndexRelationDescs == NULL)
            ExecOpenIndices(resultRelInfo,
                            node->onConflictAction != ONCONFLICT_NONE);

Therefore, in a plan consisting of a DELETE ModifyTable atop an indexscan
of the target table, we are opening the index without the executor
acquiring any lock on the index.  The only thing that saves us from
triggering that Assert is that in most cases the planner has already
taken a lock on any index it considered (and not released that lock).
But using a prepared plan breaks that.

This problem is ancient; it's unrelated to the recent changes to reduce
executor locking, because that only concerned table locks not index locks.
I'm not certain how much real-world impact it has, since holding a lock
on the index's parent table is probably enough to prevent dire things
from happening in most cases.  Still, it seems like a bug.

There are several things we might do to fix this:

1. Drop the "operation != CMD_DELETE" condition from the above-quoted bit
in ExecInitModifyTable.  We might be forced into that someday anyway if
we want to support non-heap-style tables, since most other people's
versions of indexes do want to know about deletions.

2. Drop the target-table optimization from nodeIndexscan.c and friends,
and just always open the scan index with AccessShareLock.  (BTW, it's
a bit inconsistent that these nodes don't release their index locks
at the end; ExecCloseIndices does.)

3. Teach nodeIndexscan.c and friends about the operation != CMD_DELETE
exception, so that they get the lock for themselves in that case.  This
seems pretty ugly/fragile, but it's about the only option that won't end
in adding index lock-acquisition overhead in some cases.  (But given the
planner's behavior, it's not clear to me how often that really matters.)

BTW, another thing that is bothering me about this is that, both with
the current code and options #1 and #3, there's an assumption that any
indexscan on a query's target table must be underneath the relevant
ModifyTable plan node.  Given some other plan tree structure it might
be possible to initialize the indexscan node before the ModifyTable,
breaking the assumption that the latter already got index locks.  I'm
not sure if that's actually possible at present, but it doesn't seem
like I'd want to bet on it always being so in the future.  This might
be an argument for going with option #2, which eliminates all such
assumptions.

Another idea is to make things work more like we just made them work for
tables, which is to ensure that all index locks are taken before reaching
the executor, or at least as part of some other processing than the plan
tree walk.  That would be a good deal more work than any of the options
listed above, though, and it would definitely not be back-patchable if we
decide we need a back-patched fix.

Thoughts?

                        regards, tom lane


Re: Inadequate executor locking of indexes

David Rowley-3
On Thu, 8 Nov 2018 at 13:14, Tom Lane <[hidden email]> wrote:
>
> I discovered that it's possible to trigger relation_open's new assertion
> about having a lock on the relation by the simple expedient of running
> the core regression tests with plan_cache_mode = force_generic_plan.


> There are several things we might do to fix this:
>
> 1. Drop the "operation != CMD_DELETE" condition from the above-quoted bit
> in ExecInitModifyTable.  We might be forced into that someday anyway if
> we want to support non-heap-style tables, since most other peoples'
> versions of indexes do want to know about deletions.
>
> 2. Drop the target-table optimization from nodeIndexscan.c and friends,
> and just always open the scan index with AccessShareLock.  (BTW, it's
> a bit inconsistent that these nodes don't release their index locks
> at the end; ExecCloseIndices does.)
>
> 3. Teach nodeIndexscan.c and friends about the operation != CMD_DELETE
> exception, so that they get the lock for themselves in that case.  This
> seems pretty ugly/fragile, but it's about the only option that won't end
> in adding index lock-acquisition overhead in some cases.  (But given the
> planner's behavior, it's not clear to me how often that really matters.)

Since I missed this and only discovered it was a problem when someone
else reported it today, and since I already did my own analysis
separately in [1], my vote is for #2.  For partitioned table updates,
there may be many result relations, which can cause
ExecRelationIsTargetRelation() to become very slow; in such a case any
additional redundant lock would be cheap by comparison.

Ideally, the locking code would realise we already hold a stronger
lock and skip the lock, but I don't see how that's realistically
possible without probing the hash table for all stronger lock types
first, which would likely damage the performance of locking.
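That cost concern can be sketched in miniature.  The following is an illustrative toy, not PostgreSQL's actual lock.c code: the enum values mirror the numbering in lockdefs.h, but `held`, `lock_held`, and `holds_same_or_stronger` are invented names.  The point is that skipping a weak lock request requires one local-lock-table probe per equal-or-stronger mode:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the backend-local lock table; lock modes are numbered
 * as in PostgreSQL's lockdefs.h, higher = stronger.  All names below
 * are invented for this sketch. */
enum
{
    ACCESS_SHARE_LOCK = 1,
    ROW_SHARE_LOCK = 2,
    ROW_EXCLUSIVE_LOCK = 3,
    SHARE_UPDATE_EXCLUSIVE_LOCK = 4,
    SHARE_LOCK = 5,
    SHARE_ROW_EXCLUSIVE_LOCK = 6,
    EXCLUSIVE_LOCK = 7,
    ACCESS_EXCLUSIVE_LOCK = 8,
    MAX_LOCKMODE = 8
};

#define MAX_RELS 16

static bool held[MAX_RELS][MAX_LOCKMODE + 1];   /* held[rel][mode] */
static int probe_count;                         /* simulated hash probes */

/* One probe of the "local lock table" for a single (rel, mode) entry. */
static bool lock_held(int rel, int mode)
{
    probe_count++;
    return held[rel][mode];
}

/* To skip a weak lock request because a stronger lock is already held,
 * we must probe every equal-or-stronger mode individually: one extra
 * hash lookup per mode. */
static bool holds_same_or_stronger(int rel, int mode)
{
    for (int m = mode; m <= MAX_LOCKMODE; m++)
        if (lock_held(rel, m))
            return true;
    return false;
}
```

With eight lock modes, an AccessShareLock request that finds nothing can take up to eight probes instead of one, on every index open; hence the worry that this would damage locking performance.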

[1] https://www.postgresql.org/message-id/CAKJS1f-DyKTYyMf9oxn1PQ%3DWyEOOjfVcV-dCc6eB9eat_MYPeA%40mail.gmail.com

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Inadequate executor locking of indexes

Tom Lane-2
David Rowley <[hidden email]> writes:

> On Thu, 8 Nov 2018 at 13:14, Tom Lane <[hidden email]> wrote:
>> There are several things we might do to fix this:
>>
>> 1. Drop the "operation != CMD_DELETE" condition from the above-quoted bit
>> in ExecInitModifyTable.  We might be forced into that someday anyway if
>> we want to support non-heap-style tables, since most other peoples'
>> versions of indexes do want to know about deletions.
>>
>> 2. Drop the target-table optimization from nodeIndexscan.c and friends,
>> and just always open the scan index with AccessShareLock.  (BTW, it's
>> a bit inconsistent that these nodes don't release their index locks
>> at the end; ExecCloseIndices does.)
>>
>> 3. Teach nodeIndexscan.c and friends about the operation != CMD_DELETE
>> exception, so that they get the lock for themselves in that case.  This
>> seems pretty ugly/fragile, but it's about the only option that won't end
>> in adding index lock-acquisition overhead in some cases.  (But given the
>> planner's behavior, it's not clear to me how often that really matters.)

> Since I missed this and only discovered this was a problem when
> someone else reported it today, and since I already did my own
> analysis separately in [1], then my vote is for #2.

Thinking more about this, the problem I noted previously about two of
these solutions not working if the index scan node is not physically
underneath the ModifyTable node actually applies to all three :-(.
It's a slightly different issue for #2, namely that what we risk is
first taking AccessShareLock and then upgrading to RowExclusiveLock.
Since there are places (not many) that take ShareLock on indexes,
this would pose a deadlock risk.

Now, after more thought, I believe that it's probably impossible
to have a plan tree in which ExecRelationIsTargetRelation would
return true at an indexscan node that's not underneath the ModifyTable
node.  What *is* possible is that we have such an indexscan on a
different RTE for the same table.  I constructed this admittedly
artificial example in the regression database:

# explain with x1 as (select * from tenk1 t1 where unique1 = 42),
x2 as (update tenk1 t2 set two = 2 where unique2 = 11 returning *)
select * from x1,x2 where x1.ten = x2.ten;
                                          QUERY PLAN                                          
----------------------------------------------------------------------------------------------
 Nested Loop  (cost=16.61..16.66 rows=1 width=488)
   Join Filter: (x1.ten = x2.ten)
   CTE x1
     ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.29..8.30 rows=1 width=244)
           Index Cond: (unique1 = 42)
   CTE x2
     ->  Update on tenk1 t2  (cost=0.29..8.30 rows=1 width=250)
           ->  Index Scan using tenk1_unique2 on tenk1 t2  (cost=0.29..8.30 rows=1 width=250)
                 Index Cond: (unique2 = 11)
   ->  CTE Scan on x1  (cost=0.00..0.02 rows=1 width=244)
   ->  CTE Scan on x2  (cost=0.00..0.02 rows=1 width=244)
(11 rows)

Because the CTEs will be initialized in order, this presents a case
where the lock-upgrade hazard exists today: the x1 indexscan will
open tenk1_unique1 with AccessShareLock and then x2's ModifyTable
node will upgrade that to RowExclusiveLock.  None of the proposed
fixes improve this.

I'm beginning to think that postponing target-index locking to
ExecInitModifyTable was a damfool idea and we should undo it.

> For partitioned
> table updates, there may be many result relations which can cause
> ExecRelationIsTargetRelation() to become very slow, in such a case any
> additional redundant lock would be cheap by comparison.

Yeah, it'd be nice to get rid of the need for that.

                        regards, tom lane


Re: Inadequate executor locking of indexes

David Rowley-3
On Sat, 24 Nov 2018 at 05:25, Tom Lane <[hidden email]> wrote:

> Now, after more thought, I believe that it's probably impossible
> to have a plan tree in which ExecRelationIsTargetRelation would
> return true at an indexscan node that's not underneath the ModifyTable
> node.  What *is* possible is that we have such an indexscan on a
> different RTE for the same table.  I constructed this admittedly
> artificial example in the regression database:
>
> # explain with x1 as (select * from tenk1 t1 where unique1 = 42),
> x2 as (update tenk1 t2 set two = 2 where unique2 = 11 returning *)
> select * from x1,x2 where x1.ten = x2.ten;

Well, that problem exists with more than just indexes. Relations could
be problematic too. An even simpler artificial example would be:

select * from t1 inner join t1 t2 on t1.a=t2.a for update of t2;

We could fix that in the executor by having the planner process the
rangetable: first throw the whole thing into a hash table by OID,
find the strongest lock level for each relation, and apply that level
to each (non-dummy) matching RangeTblEntry's rellockmode.  While we're
there we could add a new field for indexlockmode and do the same
process on that.  However, there might not be much point, as this does
nothing for the same problem that exists in parse analysis.  That may
be much harder or even impossible to fix.
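That rangetable pass can be sketched as follows.  This is a minimal illustration with invented names (`RangeTblEntrySim`, `finalize_lockmodes_sim`); the real planner code would use a hash table keyed by OID rather than a nested loop, and the real lock-mode constants live in lockdefs.h:

```c
#include <assert.h>
#include <stddef.h>

/* Invented, simplified stand-ins for the real structures. */
enum { ACCESS_SHARE_LOCK = 1, ROW_SHARE_LOCK = 2, ROW_EXCLUSIVE_LOCK = 3 };

typedef struct RangeTblEntrySim
{
    unsigned relid;       /* relation OID */
    int rellockmode;      /* lock level this RTE requests */
} RangeTblEntrySim;

/* For each entry, find the strongest lock requested anywhere in the
 * rangetable for the same OID and apply it to that entry.  A nested
 * loop keeps the sketch self-contained. */
static void finalize_lockmodes_sim(RangeTblEntrySim *rtable, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        int strongest = rtable[i].rellockmode;

        for (size_t j = 0; j < n; j++)
            if (rtable[j].relid == rtable[i].relid &&
                rtable[j].rellockmode > strongest)
                strongest = rtable[j].rellockmode;

        rtable[i].rellockmode = strongest;
    }
}
```

For the `t1 inner join t1 t2 ... for update of t2` example above, both entries for the table would end up with the stronger RowShareLock, so the executor never upgrades a lock it took earlier in plan initialization.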

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Inadequate executor locking of indexes

Amit Langote-2
In reply to this post by Tom Lane-2
Sorry for jumping in late on this.

On 2018/11/24 1:25, Tom Lane wrote:

> David Rowley <[hidden email]> writes:
> Thinking more about this, the problem I noted previously about two of
> these solutions not working if the index scan node is not physically
> underneath the ModifyTable node actually applies to all three :-(.
> It's a slightly different issue for #2, namely that what we risk is
> first taking AccessShareLock and then upgrading to RowExclusiveLock.
> Since there are places (not many) that take ShareLock on indexes,
> this would pose a deadlock risk.
>
> Now, after more thought, I believe that it's probably impossible
> to have a plan tree in which ExecRelationIsTargetRelation would
> return true at an indexscan node that's not underneath the ModifyTable
> node.  What *is* possible is that we have such an indexscan on a
> different RTE for the same table.  I constructed this admittedly
> artificial example in the regression database:
>
> # explain with x1 as (select * from tenk1 t1 where unique1 = 42),
> x2 as (update tenk1 t2 set two = 2 where unique2 = 11 returning *)
> select * from x1,x2 where x1.ten = x2.ten;
>                                           QUERY PLAN                                          
> ----------------------------------------------------------------------------------------------
>  Nested Loop  (cost=16.61..16.66 rows=1 width=488)
>    Join Filter: (x1.ten = x2.ten)
>    CTE x1
>      ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.29..8.30 rows=1 width=244)
>            Index Cond: (unique1 = 42)
>    CTE x2
>      ->  Update on tenk1 t2  (cost=0.29..8.30 rows=1 width=250)
>            ->  Index Scan using tenk1_unique2 on tenk1 t2  (cost=0.29..8.30 rows=1 width=250)
>                  Index Cond: (unique2 = 11)
>    ->  CTE Scan on x1  (cost=0.00..0.02 rows=1 width=244)
>    ->  CTE Scan on x2  (cost=0.00..0.02 rows=1 width=244)
> (11 rows)
>
> Because the CTEs will be initialized in order, this presents a case
> where the lock-upgrade hazard exists today: the x1 indexscan will
> open tenk1_unique1 with AccessShareLock and then x2's ModifyTable
> node will upgrade that to RowExclusiveLock.  None of the proposed
> fixes improve this.

Provided we want to keep ExecRelationIsTargetRelation going forward, how
about modifying it such that we compare the scan relation's OID with that
of the result relations, not the RT index?  As in the attached.
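The idea can be sketched like this (a toy stand-in, not the actual patch; `ResultRelInfoSim` and the function name are invented): target-ness is decided by OID equality, so a second RTE referencing the same table is also recognised as a target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-in: each result relation is identified by its OID. */
typedef struct ResultRelInfoSim
{
    unsigned ri_RelationOid;
} ResultRelInfoSim;

/* Report "is a target relation" whenever the scanned relation's OID
 * matches any result relation's OID, regardless of which RT entry the
 * scan came in through. */
static bool exec_relation_is_target_by_oid(const ResultRelInfoSim *resultRels,
                                           size_t numResultRels,
                                           unsigned scanRelOid)
{
    for (size_t i = 0; i < numResultRels; i++)
        if (resultRels[i].ri_RelationOid == scanRelOid)
            return true;
    return false;
}
```

Note that this is still a linear scan over the result relations, so it keeps the cost David complained about for partitioned-table updates with many result relations.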

> I'm beginning to think that postponing target-index locking to
> ExecInitModifyTable was a damfool idea and we should undo it.

+1

Also as already proposed, InitPlan should lock result relation indexes
even for DELETEs.

>> For partitioned
>> table updates, there may be many result relations which can cause
>> ExecRelationIsTargetRelation() to become very slow, in such a case any
>> additional redundant lock would be cheap by comparison.
>
> Yeah, it'd be nice to get rid of the need for that.

As David mentioned elsewhere, can we add the ResultRelInfos to a hash
table if there are at least a certain number of result relations?
3f2393edefa did that for UPDATE tuple routing efficiency.

Thanks,
Amit

ExecRelationIsTargetRelation-compare-OID.patch (1K)

Re: Inadequate executor locking of indexes

David Rowley-3
On Mon, 26 Nov 2018 at 17:37, Amit Langote
<[hidden email]> wrote:
> On 2018/11/24 1:25, Tom Lane wrote:
> > I'm beginning to think that postponing target-index locking to
> > ExecInitModifyTable was a damfool idea and we should undo it.
>
> +1
>
> Also as already proposed, InitPlan should lock result relation indexes
> even for DELETEs.

I'd rather see it fixed another way.  The reason being, if we get [1],
then that opens the door to run-time partition pruning for
UPDATE/DELETE, which is currently blocked due to lack of knowledge
about baserestrictinfos for the base partitioned relation, because
inheritance_planner() does not plan for the partitioned table, only
its partitions.  Doing the index opening work during InitPlan() means
we do that for all partitions, even the ones that will later be
run-time pruned.  If we can do it during init of a ModifyTable node,
then we can likely do it after the run-time pruning takes place.
Since Amit and I are both working towards making partitioning faster,
I imagined he would also not want to do something that could slow it
down significantly, at least if there was some alternative way to fix
it.

I'm making efforts to delay per-partition work even further in the
executor, for example locking of the per-partition result relations
until after run-time pruning would be a significant win for
partitioned tables with many partitions when generic plans are in use.
Moving things back to InitPlan() flies in the face of that work.

[1] https://commitfest.postgresql.org/20/1778/

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Inadequate executor locking of indexes

Amit Langote-2
On 2018/11/26 14:25, David Rowley wrote:

> On Mon, 26 Nov 2018 at 17:37, Amit Langote
> <[hidden email]> wrote:
>> On 2018/11/24 1:25, Tom Lane wrote:
>>> I'm beginning to think that postponing target-index locking to
>>> ExecInitModifyTable was a damfool idea and we should undo it.
>>
>> +1
>>
>> Also as already proposed, InitPlan should lock result relation indexes
>> even for DELETEs.
>
> I'd rather see it fixed another way.  The reason being, if we get [1],
> then that opens the door to run-time partition pruning for
> UPDATE/DELETE, which is currently blocked due to lack of knowledge
> about baserestrictinfos for the base partitioned relation because
> inheritance_planner() does not plan for the partitioned table, only
> its partitions.  Doing the index opening work during InitPlan() means
> we do that for all partitions, even the ones that will later be
> run-time pruned. If we can doing it during init of a ModifyTable node,
> then we can likely do it after the run-time pruning takes place.
> Since Amit and I are both working towards making partitioning faster,
> I imagined he would also not want to do something that could slow it
> down significantly, if there was some alternative way to fix it, at
> least.
>
> I'm making efforts to delay per-partition work even further in the
> executor, for example locking of the per-partition result relations
> until after run-time pruning would be a significant win for
> partitioned tables with many partitions when generic plans are in use.
> Moving things back to InitPlan() flies in the face of that work.
>
> [1] https://commitfest.postgresql.org/20/1778/

That's an interesting point.  Although, considering the concerns that Tom
raised about the same index relation being locked such that a
lock-strength upgrade occurs (his example containing two CTEs), we'll
have to find a way to do the ModifyTable run-time pruning such that
result relations and their indexes (possibly under multiple ModifyTable
nodes) are locked with RowExclusiveLock before they're locked with
AccessShareLock as scan relations.  For example, we might be able to
find a way to do the ModifyTable run-time pruning in InitPlan before
initializing plan trees.

That said, I don't quite understand how the lock-strength upgrade
occurring in the way being discussed here (AccessShareLock for scanning
upgraded to RowExclusiveLock for modifying) leads to a deadlock hazard.

Thanks,
Amit



Re: Inadequate executor locking of indexes

David Rowley-3
On Mon, 26 Nov 2018 at 18:57, Amit Langote
<[hidden email]> wrote:

> On 2018/11/26 14:25, David Rowley wrote:
> > I'm making efforts to delay per-partition work even further in the
> > executor, for example locking of the per-partition result relations
> > until after run-time pruning would be a significant win for
> > partitioned tables with many partitions when generic plans are in use.
> > Moving things back to InitPlan() flies in the face of that work.
> >
> > [1] https://commitfest.postgresql.org/20/1778/
>
> That's an interesting point.  Although, considering the concerns that Tom
> raised about the same index relation being locked such that lock-strength
> upgrade occurs (his example containing two CTEs), we'll have to find a way
> to do the ModifyTable run-time pruning such that result relations and
> their indexes (possibly under multiple ModifyTable nodes) are locked with
> RowExclusiveLock before they're locked with AccessShareLock lock as scan
> relations.  For example, we might be able to find a way to do the
> ModifyTable run-time pruning in InitPlan before initializing plan trees.

I thought my idea of processing the rangetable at the end of
planning to determine the strongest lock per relation would have
solved that.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Inadequate executor locking of indexes

Amit Langote-2
On 2018/11/27 6:19, David Rowley wrote:

> On Mon, 26 Nov 2018 at 18:57, Amit Langote
> <[hidden email]> wrote:
>> On 2018/11/26 14:25, David Rowley wrote:
>>> I'm making efforts to delay per-partition work even further in the
>>> executor, for example locking of the per-partition result relations
>>> until after run-time pruning would be a significant win for
>>> partitioned tables with many partitions when generic plans are in use.
>>> Moving things back to InitPlan() flies in the face of that work.
>>>
>>> [1] https://commitfest.postgresql.org/20/1778/
>>
>> That's an interesting point.  Although, considering the concerns that Tom
>> raised about the same index relation being locked such that lock-strength
>> upgrade occurs (his example containing two CTEs), we'll have to find a way
>> to do the ModifyTable run-time pruning such that result relations and
>> their indexes (possibly under multiple ModifyTable nodes) are locked with
>> RowExclusiveLock before they're locked with AccessShareLock lock as scan
>> relations.  For example, we might be able to find a way to do the
>> ModifyTable run-time pruning in InitPlan before initializing plan trees.
>
> I thought my idea of the processing the rangetable at the end of
> planning to determine the strongest lock per relation would have
> solved that.

Yeah, would be nice to make that work.

Thanks,
Amit



Re: Inadequate executor locking of indexes

David Rowley-3
On Tue, 27 Nov 2018 at 19:00, Amit Langote
<[hidden email]> wrote:

> On 2018/11/27 6:19, David Rowley wrote:
> > On Mon, 26 Nov 2018 at 18:57, Amit Langote
> >> That's an interesting point.  Although, considering the concerns that Tom
> >> raised about the same index relation being locked such that lock-strength
> >> upgrade occurs (his example containing two CTEs), we'll have to find a way
> >> to do the ModifyTable run-time pruning such that result relations and
> >> their indexes (possibly under multiple ModifyTable nodes) are locked with
> >> RowExclusiveLock before they're locked with AccessShareLock lock as scan
> >> relations.  For example, we might be able to find a way to do the
> >> ModifyTable run-time pruning in InitPlan before initializing plan trees.
> >
> > I thought my idea of the processing the rangetable at the end of
> > planning to determine the strongest lock per relation would have
> > solved that.
>
> Yeah, would be nice to make that work.
Here's a very rough and incomplete patch just to demo what I had in
mind. The finalize_lockmodes() function is likely more or less
complete, minus my having tested that it works.  What's mostly missing
is changing all the places that grab index locks to use the new
idxlockmode field. I've really just changed index scan and index-only
scan, and stared a bit at nodeModifyTable.c wondering what I should do
with that operation != CMD_DELETE test before the ExecOpenIndices()
call.

The patch also includes code to determine the strongest rellockmode
per relation.  Perhaps it's not really worth doing that, since parse
analysis could still cause some lock upgrade hazards. The code that's
there just fixes things for the executor, so it would really only have
an effect when executing cached plans.

If this looks like a good path to go in, then I can produce something
a bit more finished. I'm just a bit unsure when exactly I can do that
as I'm on leave and have other commitments to take care of.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

wip_idxlockmode_and_lock_upgrade_fix.patch (12K)

Re: Inadequate executor locking of indexes

akapila
In reply to this post by Tom Lane-2
On Fri, Nov 23, 2018 at 9:55 PM Tom Lane <[hidden email]> wrote:
>
> Thinking more about this, the problem I noted previously about two of
> these solutions not working if the index scan node is not physically
> underneath the ModifyTable node actually applies to all three :-(.
> It's a slightly different issue for #2, namely that what we risk is
> first taking AccessShareLock and then upgrading to RowExclusiveLock.
> Since there are places (not many) that take ShareLock on indexes,
> this would pose a deadlock risk.
>

Can you be a bit more specific about the exact deadlock risk you are
seeing here?  Amit L asked about it and I am also curious to know.
One way I could see it happening is:

Session-1
begin;
Lock table foo in Access Share Mode;

Session-2
begin;
Lock table foo in Share Mode;

Session-1
Lock table foo in Row Exclusive Mode;  --here it will wait for session-2

Session-2
Lock table foo in Access Exclusive Mode;  --here it will lead to deadlock


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Inadequate executor locking of indexes

David Rowley-3
In reply to this post by David Rowley-3
On Wed, 28 Nov 2018 at 01:55, David Rowley <[hidden email]> wrote:
> If this looks like a good path to go in, then I can produce something
> a bit more finished. I'm just a bit unsure when exactly I can do that
> as I'm on leave and have other commitments to take care of.

This patch is still on my list, so I had another look at what I did
back in November...

I've changed a couple of things:

1. Changed nodeBitmapIndexscan.c so that it now properly uses the
RangeTblEntry's idxlockmode field.
2. Renamed a few variables in finalize_lockmodes().

I'm keen to get some feedback on whether we should go about fixing
things this way.  One thing that's still on my mind is that the parser
is still at risk of lock upgrade hazards; this patch only fixes the
executor.  I don't quite see how it would be possible to fix the same
in the parser.

I was also looking at each call site that calls ExecOpenIndices(). I
don't think it's great that ExecInitModifyTable() has its own logic to
skip calling that function for DELETE.  I wondered if it shouldn't
somehow depend on what the idxlockmode is set to.  I also saw that
apply_handle_delete() makes a call to ExecOpenIndices(). I don't think
that one is needed, but I didn't test anything to make sure. Maybe
that's for another thread anyway.

Updated patch is attached.

Adding to the March commitfest as a bug fix.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

idxlockmode_and_lock_upgrade_fix_v2.patch (14K)

Re: Inadequate executor locking of indexes

Julien Rouhaud
Hi,

On Tue, Feb 5, 2019 at 5:16 AM David Rowley
<[hidden email]> wrote:

>
> I've changed a couple of things:
>
> 1. Changed nodeBitmapIndexscan.c now properly uses the RangeTblEntry's
> idxlockmode field.
> 2. Renamed a few variables in finalize_lockmodes().
>
> I'm keen to get some feedback if we should go about fixing things this
> way.  One thing that's still on my mind is that the parser is still at
> risk of lock upgrade hazards. This patch only fixes the executor.  I
> don't quite see how it would be possible to fix the same in the
> parser.

+        /*
+         * If there are multiple instances of the same rel with varying lock
+         * strengths then set the strongest lock level to each instance of
+         * that relation.
+         */
+        if (applystrongest)
[...]

The patch is quite straightforward, so I don't have general comments
on it.  However, I think that the idxlockmode initialization is
problematic: you're using the statement's commandType, so this doesn't
work with CTEs.  For instance, this artificial query

WITH s AS (UPDATE test SET id = 1 WHERE id = 1) SELECT 1;

will take an AccessShareLock on test's index while it should take a
RowExclusiveLock.  I guess that you have to code similar lock-upgrade
logic for the indexes, inspecting the planTree and subplans to find
the correct command type.
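That plan-tree inspection might look roughly like this.  Every name here is invented for illustration (the real plan tree has many more node kinds, and ModifyTable carries a list of result RT indexes, not one); the point is only that the operation is read from the node that actually modifies each relation, not from the top-level statement:

```c
#include <assert.h>
#include <stddef.h>

/* Invented, minimal plan-tree shape: only ModifyTable-like nodes carry
 * a valid result RT index; everything else has resultRelIndex == -1. */
enum CmdSim { CMD_SELECT_SIM, CMD_UPDATE_SIM, CMD_DELETE_SIM };

typedef struct PlanSim
{
    enum CmdSim operation;      /* meaningful when resultRelIndex >= 0 */
    int resultRelIndex;         /* RT index this node modifies, or -1 */
    struct PlanSim *lefttree;
    struct PlanSim *righttree;
} PlanSim;

/* Walk the tree and record, per RT index, which modifying operation
 * (if any) targets it; RT indexes not mentioned by any ModifyTable
 * node keep their initial (scan-only) value. */
static void collect_result_ops(const PlanSim *plan, enum CmdSim *ops_by_rti)
{
    if (plan == NULL)
        return;
    if (plan->resultRelIndex >= 0)
        ops_by_rti[plan->resultRelIndex] = plan->operation;
    collect_result_ops(plan->lefttree, ops_by_rti);
    collect_result_ops(plan->righttree, ops_by_rti);
}
```

For the WITH ... UPDATE example, the ModifyTable node buried in the CTE would mark its RT index as an UPDATE, so its index lock level could be set to RowExclusiveLock even though the statement's commandType is CMD_SELECT.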

>
> I was also looking at each call site that calls ExecOpenIndices(). I
> don't think it's great that ExecInitModifyTable() has its own logic to
> skip calling that function for DELETE.  I wondered if it shouldn't
> somehow depend on what the idxlockmode is set to.

I don't think that it's great either.  However, for DELETE we shouldn't
simply call ExecOpenIndices(), but open only the used indexes, right?


Re: Inadequate executor locking of indexes

David Rowley-3
On Mon, 11 Feb 2019 at 01:22, Julien Rouhaud <[hidden email]> wrote:

> The patch is quite straightforward, so I don't have general comments
> on it.  However, I think that the idxlockmode initialization is
> problematic: you're using the statement's commandType so this doesn't
> work with CTE.  For instance, with this artificial query
>
> WITH s AS (UPDATE test set id = 1 WHERE id =1) select 1;
>
> will take an AccessShareLock on test's index while it should have an
> RowExclusiveLock.  I guess that you have to code similar lock upgrade
> logic for the indexes, inspecting the planTree and subplans to find
> the correct command type.

Good catch. I'm a bit stuck on the best way to fix this.  So far I can
only think of either:

1. Adding a new field to RangeTblEntry to indicate the operation type
that's being performed on the relation; or
2. Adding a Bitmapset field to PlannerGlobal that sets the rtable
indexes of RangeTblEntry items that belong to DELETEs, and ignoring
these when setting resultRelids in finalize_lockmodes().

For #2, the only place I can see to do this is
add_rtes_to_flat_rtable(), which would require either passing the
PlannerInfo into the function, or at least its parse's commandType.

I don't really like either, but don't have any other ideas at the moment.
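Option #2 could be sketched like this, with a plain bitmask standing in for PostgreSQL's Bitmapset and all names invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* A plain bitmask stands in for PostgreSQL's Bitmapset; all names are
 * invented for this sketch, which only covers RT indexes 0..63. */
typedef unsigned long BitmaskSim;

/* Remember that this RT index is the result relation of a DELETE. */
static void mark_delete_rti(BitmaskSim *deleteRtis, int rti)
{
    *deleteRtis |= 1UL << rti;
}

/* Result relations normally get RowExclusiveLock on their indexes, but
 * DELETE result relations are left out of that set, since deletion
 * doesn't touch indexes (in heap-style tables, at least). */
static bool index_needs_row_exclusive(BitmaskSim deleteRtis,
                                      int rti, bool is_result_rel)
{
    return is_result_rel && !((deleteRtis >> rti) & 1UL);
}
```

finalize_lockmodes() would consult such a set when computing resultRelids, so a DELETE target's indexes keep the weaker lock while UPDATE/INSERT targets get the stronger one.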

> > I was also looking at each call site that calls ExecOpenIndices(). I
> > don't think it's great that ExecInitModifyTable() has its own logic to
> > skip calling that function for DELETE.  I wondered if it shouldn't
> > somehow depend on what the idxlockmode is set to.
>
> I don't think it's great either.  However, for DELETE we shouldn't
> simply call ExecOpenIndices(), but open only the indexes that are
> actually used, right?

No, I don't think so.  The "used" index(es) will be opened in the scan
node(s).  The reason I didn't like it much is that I wanted to keep the
logic for deciding what lock level to use in the planner.  The executor
seems to have more knowledge than I think it should.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Inadequate executor locking of indexes

Julien Rouhaud
On Mon, Feb 11, 2019 at 5:32 AM David Rowley
<[hidden email]> wrote:

>
> On Mon, 11 Feb 2019 at 01:22, Julien Rouhaud <[hidden email]> wrote:
> > The patch is quite straightforward, so I don't have general comments
> > on it.  However, I think that the idxlockmode initialization is
> > problematic: you're using the statement's commandType so this doesn't
> > work with CTE.  For instance, with this artificial query
> >
> > WITH s AS (UPDATE test SET id = 1 WHERE id = 1) SELECT 1;
> >
> > will take an AccessShareLock on test's index while it should have an
> > RowExclusiveLock.  I guess that you have to code similar lock upgrade
> > logic for the indexes, inspecting the planTree and subplans to find
> > the correct command type.
>
> Good catch.  I'm a bit stuck on the best way to fix this.  So far I can
> only think of either:
>
> 1. Adding a new field to RangeTblEntry to indicate the operation type
> that's being performed on the relation; or
> 2. Adding a Bitmapset field to PlannerGlobal that sets the rtable
> indexes of RangeTblEntry items that belong to DELETEs and ignore these
> when setting resultRelids in finalize_lockmodes().
>
> For #2, the only place I can see to do this is
> add_rtes_to_flat_rtable(), which would require either passing the
> PlannerInfo into the function, or at least its parse's commandType.
>
> I don't really like either, but don't have any other ideas at the moment.

But we would still need the same lock-level upgrade logic on indexes
for cases like a CTE with a mix of INSERT, UPDATE and DELETE on the
same relation, I think.  #1 seems like a better solution to me.

> > > I was also looking at each call site that calls ExecOpenIndices(). I
> > > don't think it's great that ExecInitModifyTable() has its own logic to
> > > skip calling that function for DELETE.  I wondered if it shouldn't
> > > somehow depend on what the idxlockmode is set to.
> >
> > I don't think it's great either.  However, for DELETE we shouldn't
> > simply call ExecOpenIndices(), but open only the indexes that are
> > actually used, right?
>
> No, I don't think so.  The "used" index(es) will be opened in the scan
> node(s).  The reason I didn't like it much is that I wanted to keep the
> logic for deciding what lock level to use in the planner.  The executor
> seems to have more knowledge than I think it should.

Ah, I see.  Thanks.


Re: Inadequate executor locking of indexes

David Rowley-3
On Tue, 12 Feb 2019 at 09:58, Julien Rouhaud <[hidden email]> wrote:

>
> On Mon, Feb 11, 2019 at 5:32 AM David Rowley
> <[hidden email]> wrote:
> > 1. Adding a new field to RangeTblEntry to indicate the operation type
> > that's being performed on the relation; or
> > 2. Adding a Bitmapset field to PlannerGlobal that sets the rtable
> > indexes of RangeTblEntry items that belong to DELETEs and ignore these
> > when setting resultRelids in finalize_lockmodes().
> >
> > For #2, the only place I can see to do this is
> > add_rtes_to_flat_rtable(), which would require either passing the
> > PlannerInfo into the function, or at least its parse's commandType.
> >
> > I don't really like either, but don't have any other ideas at the moment.
>
> But we would still need the same lock level upgrade logic on indexes
> for cases like CTE with a mix of INSERT, UPDATE and DELETE on the same
> relation I think.  #1 seems like a better solution to me.

I think I'd rather find some way to do it that doesn't denormalise the
parse nodes like that.  It seems very strange to have a CmdType in the
Query struct and then another set of them in RangeTblEntry.  Besides
bloating the RangeTblEntry struct a bit, it could also lead to
inconsistency bugs where the two CmdTypes differ for some reason.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services