Parallel INSERT (INTO ... SELECT ...)


Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
Hi Hackers,

Following on from Dilip Kumar's POC patch for allowing parallelism of
the SELECT part of "INSERT INTO ... SELECT ...", I have attached a POC
patch for allowing parallelism of both the INSERT and SELECT parts,
where it can be allowed.
For cases where it can't be allowed (e.g. INSERT into a table with
foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
...), it at least allows parallelism of the SELECT part.
Obviously I've had to update the planner and executor and
parallel-worker code to make this happen, hopefully not breaking too
many things along the way.

Examples with patch applied:


(1) non-parallel:

test=# explain analyze insert into primary_tbl select * from third_tbl;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Insert on primary_tbl  (cost=0.00..154.99 rows=9999 width=12) (actual time=108.445..108.446 rows=0 loops=1)
   ->  Seq Scan on third_tbl  (cost=0.00..154.99 rows=9999 width=12) (actual time=0.009..5.282 rows=9999 loops=1)
 Planning Time: 0.132 ms
 Execution Time: 108.596 ms
(4 rows)


(2) parallel:

test=# explain analyze insert into primary_tbl select * from third_tbl;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..16.00 rows=9999 width=12) (actual time=69.870..70.310 rows=0 loops=1)
   Workers Planned: 5
   Workers Launched: 5
   ->  Parallel Insert on primary_tbl  (cost=0.00..16.00 rows=500 width=12) (actual time=59.948..59.949 rows=0 loops=6)
         ->  Parallel Seq Scan on third_tbl  (cost=0.00..80.00 rows=2500 width=12) (actual time=0.014..0.922 rows=1666 loops=6)
 Planning Time: 0.121 ms
 Execution Time: 70.438 ms
(7 rows)


(3) parallel SELECT only (INSERT into a table with a foreign key):

test=# explain analyze insert into secondary_tbl select * from third_tbl;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Insert on secondary_tbl  (cost=0.00..80.00 rows=9999 width=12) (actual time=33.864..33.926 rows=0 loops=1)
   ->  Gather  (cost=0.00..80.00 rows=9999 width=12) (actual time=0.451..5.201 rows=9999 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Parallel Seq Scan on third_tbl  (cost=0.00..80.00 rows=2500 width=12) (actual time=0.013..0.717 rows=2000 loops=5)
 Planning Time: 0.127 ms
 Trigger for constraint secondary_tbl_index_fkey: time=331.834 calls=9999
 Execution Time: 367.342 ms
(8 rows)


Known issues/TODOs:
- Currently only for "INSERT INTO ... SELECT ...". To support "INSERT
INTO ... VALUES ...", additional Table AM functions would be needed for
dividing up the INSERT work amongst the workers (these currently exist
only for scans).
- When INSERTs are made parallel, currently the reported row-count in
the "INSERT 0 <row-count>" status only reflects the rows that the
leader has processed (not the workers) - so it is obviously less than
the actual number of rows inserted.
- Functions relating to computing the number of parallel workers for
an INSERT, and the cost of an INSERT, need work.
- "force_parallel_mode" handling was updated so that it only affects
SELECT (not INSERT) - it can't be allowed for INSERT because we're only
supporting "INSERT INTO ... SELECT ..." and don't support other types
of INSERTs, and we also can't allow attempted parallel UPDATEs resulting
from "INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE" etc.


Thoughts and feedback?

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment: 0001-ParallelInsertSelect.patch (38K)

Re: Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
> - When INSERTs are made parallel, currently the reported row-count in
> the "INSERT 0 <row-count>" status only reflects the rows that the
> leader has processed (not the workers) - so it is obviously less than
> the actual number of rows inserted.

Attached is an updated patch which fixes this issue (for parallel
INSERTs, each worker's processed-tuple count is communicated via shared
memory back to the leader, where it is added to the global
"es_processed" count).

Attachment: 0002-ParallelInsertSelect.patch (44K)

Re: Parallel INSERT (INTO ... SELECT ...)

vignesh C
In reply to this post by Greg Nancarrow
On Tue, Sep 22, 2020 at 10:26 AM Greg Nancarrow <[hidden email]> wrote:

>
> Hi Hackers,
>
> Following on from Dilip Kumar's POC patch for allowing parallelism of
> the SELECT part of "INSERT INTO ... SELECT ...", I have attached a POC
> patch for allowing parallelism of both the INSERT and SELECT parts,
> where it can be allowed.
> For cases where it can't be allowed (e.g. INSERT into a table with
> foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
> ...), it at least allows parallelism of the SELECT part.
> Obviously I've had to update the planner and executor and
> parallel-worker code to make this happen, hopefully not breaking too
> many things along the way.

I feel this will be a very good performance improvement. +1 for this.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Parallel INSERT (INTO ... SELECT ...)

Andres Freund
In reply to this post by Greg Nancarrow
Hi,

On 2020-09-22 14:55:21 +1000, Greg Nancarrow wrote:
> Following on from Dilip Kumar's POC patch for allowing parallelism of
> the SELECT part of "INSERT INTO ... SELECT ...", I have attached a POC
> patch for allowing parallelism of both the INSERT and SELECT parts,
> where it can be allowed.

Cool!

I think it'd be good if you outlined what your approach is to make this
safe.


> For cases where it can't be allowed (e.g. INSERT into a table with
> foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
> ...), it at least allows parallelism of the SELECT part.

I think it'd be good to do this part separately and first, independent
of whether the insert part can be parallelized.


> Obviously I've had to update the planner and executor and
> parallel-worker code to make this happen, hopefully not breaking too
> many things along the way.

Hm, it looks like you've removed a fair bit of checks, it's not clear to
me why that's safe in each instance.


> - Currently only for "INSERT INTO ... SELECT ...". To support "INSERT
> INTO ... VALUES ...", additional Table AM functions would be needed for
> dividing up the INSERT work amongst the workers (these currently exist
> only for scans).

Hm, not entirely following. What precisely are you thinking of here?

I doubt it's really worth adding parallelism support for INSERT
... VALUES, the cost of spawning workers will almost always be higher
than the benefit.





> @@ -116,7 +117,7 @@ toast_save_datum(Relation rel, Datum value,
>   TupleDesc toasttupDesc;
>   Datum t_values[3];
>   bool t_isnull[3];
> - CommandId mycid = GetCurrentCommandId(true);
> + CommandId mycid = GetCurrentCommandId(!IsParallelWorker());
>   struct varlena *result;
>   struct varatt_external toast_pointer;
>   union

Hm? Why do we need this in the various places you have made this change?


> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 1585861..94c8507 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -2049,11 +2049,6 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
>   * inserts in general except for the cases where inserts generate a new
>   * CommandId (eg. inserts into a table having a foreign key column).
>   */
> - if (IsParallelWorker())
> - ereport(ERROR,
> - (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
> - errmsg("cannot insert tuples in a parallel worker")));
> -

I'm afraid that this weakens our checks more than I'd like. What if this
ends up being invoked from inside C code?


> @@ -822,19 +822,14 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
>   isdead = false;
>   break;
>   case HEAPTUPLE_INSERT_IN_PROGRESS:
> -
>   /*
>   * Since we hold exclusive lock on the relation, normally the
>   * only way to see this is if it was inserted earlier in our
>   * own transaction.  However, it can happen in system
>   * catalogs, since we tend to release write lock before commit
> - * there.  Give a warning if neither case applies; but in any
> - * case we had better copy it.
> + * there. In any case we had better copy it.
>   */
> - if (!is_system_catalog &&
> - !TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
> - elog(WARNING, "concurrent insert in progress within table \"%s\"",
> - RelationGetRelationName(OldHeap));
> +
>   /* treat as live */
>   isdead = false;
>   break;
> @@ -1434,16 +1429,11 @@ heapam_index_build_range_scan(Relation heapRelation,
>   * the only way to see this is if it was inserted earlier
>   * in our own transaction.  However, it can happen in
>   * system catalogs, since we tend to release write lock
> - * before commit there.  Give a warning if neither case
> - * applies.
> + * before commit there.
>   */
>   xwait = HeapTupleHeaderGetXmin(heapTuple->t_data);
>   if (!TransactionIdIsCurrentTransactionId(xwait))
>   {
> - if (!is_system_catalog)
> - elog(WARNING, "concurrent insert in progress within table \"%s\"",
> - RelationGetRelationName(heapRelation));
> -
>   /*
>   * If we are performing uniqueness checks, indexing
>   * such a tuple could lead to a bogus uniqueness

Huh, I don't think this should be necessary?


> diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
> index a4944fa..9d3f100 100644
> --- a/src/backend/access/transam/varsup.c
> +++ b/src/backend/access/transam/varsup.c
> @@ -53,13 +53,6 @@ GetNewTransactionId(bool isSubXact)
>   TransactionId xid;
>  
>   /*
> - * Workers synchronize transaction state at the beginning of each parallel
> - * operation, so we can't account for new XIDs after that point.
> - */
> - if (IsInParallelMode())
> - elog(ERROR, "cannot assign TransactionIds during a parallel operation");
> -
> - /*
>   * During bootstrap initialization, we return the special bootstrap
>   * transaction id.
>   */

Same thing, this code cannot just be allowed to be reachable. What
prevents you from assigning two different xids from different workers
etc?


> @@ -577,13 +608,6 @@ AssignTransactionId(TransactionState s)
>   Assert(s->state == TRANS_INPROGRESS);
>  
>   /*
> - * Workers synchronize transaction state at the beginning of each parallel
> - * operation, so we can't account for new XIDs at this point.
> - */
> - if (IsInParallelMode() || IsParallelWorker())
> - elog(ERROR, "cannot assign XIDs during a parallel operation");
> -
> - /*
>   * Ensure parent(s) have XIDs, so that a child always has an XID later
>   * than its parent.  Mustn't recurse here, or we might get a stack
>   * overflow if we're at the bottom of a huge stack of subtransactions none

Dito.


Greetings,

Andres Freund



Re: Parallel INSERT (INTO ... SELECT ...)

Thomas Munro
In reply to this post by Greg Nancarrow
On Tue, Sep 22, 2020 at 4:56 PM Greg Nancarrow <[hidden email]> wrote:
>  Gather  (cost=0.00..16.00 rows=9999 width=12) (actual
> time=69.870..70.310 rows=0 loops=1)
>    Workers Planned: 5
>    Workers Launched: 5
>    ->  Parallel Insert on primary_tbl  (cost=0.00..16.00 rows=500
> width=12) (actual time=59.948..59.949 rows=0 loops=6)

Nice.  I took it for a quick spin.  I was initially surprised to see
Gather.  I suppose I thought that Parallel {Insert|Update|Delete}
might be a top level node itself, because in such a plan there is no
need to gather tuples per se.  I understand exactly why you have it
that way though: Gather is needed to control workers and handle their
errors etc, and we don't want to have to terminate parallelism anyway
(thinking of some kind of plan with multiple write subqueries).



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
On Thu, Sep 24, 2020 at 7:57 AM Thomas Munro <[hidden email]> wrote:

>
> On Tue, Sep 22, 2020 at 4:56 PM Greg Nancarrow <[hidden email]> wrote:
> >  Gather  (cost=0.00..16.00 rows=9999 width=12) (actual
> > time=69.870..70.310 rows=0 loops=1)
> >    Workers Planned: 5
> >    Workers Launched: 5
> >    ->  Parallel Insert on primary_tbl  (cost=0.00..16.00 rows=500
> > width=12) (actual time=59.948..59.949 rows=0 loops=6)
>
> Nice.  I took it for a quick spin.  I was initially surprised to see
> Gather.  I suppose I thought that Parallel {Insert|Update|Delete}
> might be a top level node itself, because in such a plan there is no
> need to gather tuples per se.  I understand exactly why you have it
> that way though: Gather is needed to control workers and handle their
> errors etc, and we don't want to have to terminate parallelism anyway
> (thinking of some kind of plan with multiple write subqueries).
>

I have not checked the patch, but I guess if we parallelise INSERTs
with RETURNING, then isn't it better to have a Gather node above the
Parallel Insert?

--
With Regards,
Amit Kapila.



Re: Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
On Thu, Sep 24, 2020 at 12:38 PM Amit Kapila <[hidden email]> wrote:
>
> I have not checked the patch, but I guess if we parallelise INSERTs
> with RETURNING, then isn't it better to have a Gather node above the
> Parallel Insert?
>

This is indeed the case with the patch applied.

For example:

test=# explain insert into primary_tbl select * from third_tbl
returning index, height;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Gather  (cost=0.00..28.15 rows=9999 width=12)
   Workers Planned: 3
   ->  Parallel Insert on primary_tbl  (cost=0.00..28.15 rows=1040 width=12)
         ->  Parallel Seq Scan on third_tbl  (cost=0.00..87.25 rows=3225 width=12)
(4 rows)

test=# insert into primary_tbl select * from third_tbl returning index, height;
 index | height
-------+--------
     1 |    1.2
     2 |    1.2
     3 |    1.2
     4 |    1.2
     5 |    1.2
     6 |    1.2
     7 |    1.2

...

  9435 |    1.2
  9619 |    1.2
  9620 |    1.2
(9999 rows)

INSERT 0 9999


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
In reply to this post by Andres Freund
On Thu, Sep 24, 2020 at 7:51 AM Andres Freund <[hidden email]> wrote:

>
> On 2020-09-22 14:55:21 +1000, Greg Nancarrow wrote:
>
>
> > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> > index 1585861..94c8507 100644
> > --- a/src/backend/access/heap/heapam.c
> > +++ b/src/backend/access/heap/heapam.c
> > @@ -2049,11 +2049,6 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
> >        * inserts in general except for the cases where inserts generate a new
> >        * CommandId (eg. inserts into a table having a foreign key column).
> >        */
> > -     if (IsParallelWorker())
> > -             ereport(ERROR,
> > -                             (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
> > -                              errmsg("cannot insert tuples in a parallel worker")));
> > -
>
> I'm afraid that this weakens our checks more than I'd like.
>

I think we need to change/remove this check to allow inserts by
parallel workers. I am not sure, but maybe we can add an Assert to
ensure that it is safe to perform an insert via a parallel worker.

> What if this
> ends up being invoked from inside C code?
>

I think it shouldn't be a problem unless one is trying to do something
like an insert into a table with a foreign key. So, probably we can have
an Assert to catch it if possible. Do you have any other idea?

--
With Regards,
Amit Kapila.



Re: Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
> > What if this
> > ends up being invoked from inside C code?
> >
>
> I think it shouldn't be a problem unless one is trying to do something
> like an insert into a table with a foreign key. So, probably we can have
> an Assert to catch it if possible. Do you have any other idea?
>

Note that the planner code updated by the patch does avoid creating a
Parallel INSERT plan in the case of inserting into a table with a
foreign key (so commandIds won't be created in the parallel-worker
code).
I'm not sure how to distinguish the "invoked from inside C code" case though.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
In reply to this post by Andres Freund
Hi Andres,

On Thu, Sep 24, 2020 at 12:21 PM Andres Freund <[hidden email]> wrote:
>
>I think it'd be good if you outlined what your approach is to make this
>safe.

Some prior work has already been done to establish the necessary
infrastructure to allow parallel INSERTs, in general, to be safe,
except for cases where new commandIds would be generated in the
parallel-worker code (such as inserts into a table having a foreign
key) - these cases need to be avoided.
See the following commits.

85f6b49 Allow relation extension lock to conflict among parallel group members
3ba59cc Allow page lock to conflict among parallel group members

The planner code updated by the patch avoids creating a Parallel
INSERT plan in the case of inserting into a table that has a foreign
key.


>> For cases where it can't be allowed (e.g. INSERT into a table with
>> foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
>> ...), it at least allows parallelism of the SELECT part.
>
>I think it'd be good to do this part separately and first, independent
>of whether the insert part can be parallelized.

OK then, I'll try to extract that as a separate patch.


>> Obviously I've had to update the planner and executor and
>> parallel-worker code to make this happen, hopefully not breaking too
>> many things along the way.
>
>Hm, it looks like you've removed a fair bit of checks, it's not clear to
>me why that's safe in each instance.

It should be safe for Parallel INSERT - but you are right, these are
brute-force removals (for the purpose of a POC patch) that should be
tightened up wherever possible to disallow unsafe paths into that
code. The problem is, currently there's not a lot of context information
available to easily allow that, so some work needs to be done.


>> - Currently only for "INSERT INTO ... SELECT ...". To support "INSERT
>> INTO ... VALUES ...", additional Table AM functions would be needed for
>> dividing up the INSERT work amongst the workers (these currently exist
>> only for scans).
>
>Hm, not entirely following. What precisely are you thinking of here?

All I was saying is that for SELECTs, the work done by each parallel
worker is effectively divided up by parallel-worker-related functions
in tableam.c and indexam.c, and no such mechanism currently exists
for dividing up work for the "INSERT ... VALUES" case.


>I doubt it's really worth adding parallelism support for INSERT
>... VALUES, the cost of spawning workers will almost always be higher
>than the benefit.

You're probably right in doubting any benefit, but I wasn't entirely sure.


>> @@ -116,7 +117,7 @@ toast_save_datum(Relation rel, Datum value,
>>       TupleDesc       toasttupDesc;
>>       Datum           t_values[3];
>>       bool            t_isnull[3];
>> -     CommandId       mycid = GetCurrentCommandId(true);
>> +     CommandId       mycid = GetCurrentCommandId(!IsParallelWorker());
>>       struct varlena *result;
>>       struct varatt_external toast_pointer;
>>       union
>
>Hm? Why do we need this in the various places you have made this change?

It's because for Parallel INSERT, we're assigning the same command-id
to each worker up-front during worker initialization (the commandId
has been retrieved by the leader and passed through to each worker)
and "currentCommandIdUsed" has been set true. See the
AssignCommandIdForWorker() function in the patch.
If you see the code of GetCurrentCommandId(), you'll see it Assert
that it's not being run by a parallel worker if the parameter is true.
I didn't want to remove yet another check, without being able to know
the context of the caller, because only for Parallel INSERT do I know
that "currentCommandIdUsed was already true at the start of the
parallel operation". See the comment in that function. Anyway, that's
why I'm passing "false" to relevant GetCurrentCommandId() calls if
they're being run by a parallel (INSERT) worker.


>> @@ -822,19 +822,14 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
>>                               isdead = false;
>>                               break;
>>                       case HEAPTUPLE_INSERT_IN_PROGRESS:
>> -
>>                               /*
>>                                * Since we hold exclusive lock on the relation, normally the
>>                                * only way to see this is if it was inserted earlier in our
>>                                * own transaction.  However, it can happen in system
>>                                * catalogs, since we tend to release write lock before commit
>> -                              * there.  Give a warning if neither case applies; but in any
>> -                              * case we had better copy it.
>> +                              * there. In any case we had better copy it.
>>                                */
>> -                             if (!is_system_catalog &&
>> -                                     !TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
>> -                                     elog(WARNING, "concurrent insert in progress within table \"%s\"",
>> -                                              RelationGetRelationName(OldHeap));
>> +
>>                               /* treat as live */
>>                               isdead = false;
>>                               break;
>> @@ -1434,16 +1429,11 @@ heapam_index_build_range_scan(Relation heapRelation,
>>                                        * the only way to see this is if it was inserted earlier
>>                                        * in our own transaction.  However, it can happen in
>>                                        * system catalogs, since we tend to release write lock
>> -                                      * before commit there.  Give a warning if neither case
>> -                                      * applies.
>> +                                      * before commit there.
>>                                        */
>>                                       xwait = HeapTupleHeaderGetXmin(heapTuple->t_data);
>>                                       if (!TransactionIdIsCurrentTransactionId(xwait))
>>                                       {
>> -                                             if (!is_system_catalog)
>> -                                                     elog(WARNING, "concurrent insert in progress within table \"%s\"",
>> -                                                              RelationGetRelationName(heapRelation));
>> -
>>                                               /*
>>                                                * If we are performing uniqueness checks, indexing
>>                                                * such a tuple could lead to a bogus uniqueness
>
>Huh, I don't think this should be necessary?

Yes, I think you're right, I perhaps got carried away removing checks
on concurrent inserts. I will revert those changes.


>> diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
>> index a4944fa..9d3f100 100644
>> --- a/src/backend/access/transam/varsup.c
>> +++ b/src/backend/access/transam/varsup.c
>> @@ -53,13 +53,6 @@ GetNewTransactionId(bool isSubXact)
>>       TransactionId xid;
>>
>>       /*
>> -      * Workers synchronize transaction state at the beginning of each parallel
>> -      * operation, so we can't account for new XIDs after that point.
>> -      */
>> -     if (IsInParallelMode())
>> -             elog(ERROR, "cannot assign TransactionIds during a parallel operation");
>> -
>> -     /*
>>        * During bootstrap initialization, we return the special bootstrap
>>        * transaction id.
>>        */
>
>Same thing, this code cannot just be allowed to be reachable. What
>prevents you from assigning two different xids from different workers
>etc?

At least in the case of Parallel INSERT, the leader for the Parallel
INSERT gets a new xid (GetCurrentFullTransactionId) and it is passed
through and assigned to each of the workers during their
initialization (so they are assigned the same xid).


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Parallel INSERT (INTO ... SELECT ...)

Bharath Rupireddy
In reply to this post by Greg Nancarrow
On Tue, Sep 22, 2020 at 10:26 AM Greg Nancarrow <[hidden email]> wrote:
>
> For cases where it can't be allowed (e.g. INSERT into a table with
> foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
> ...), it at least allows parallelism of the SELECT part.
>

Thanks Greg for the patch.

I have a few points (inspired by the parallel copy feature work) to mention:

1. What if the target table is a foreign table or partitioned table?
2. What happens if the target table has triggers (before statement,
after statement, before row, after row) that are parallel unsafe?
3. Will each worker be doing single-row insertions or multi-inserts?
If single-row insertions, will the buffer lock contention be higher?
4. How does it behave with toast column values?
5. How does it behave if we have a RETURNING clause with INSERT INTO ... SELECT?

I'm looking forward to seeing some initial numbers on execution times
with and without the patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
On Fri, Sep 25, 2020 at 7:01 PM Bharath Rupireddy
<[hidden email]> wrote:

> I have a few points (inspired by the parallel copy feature work) to mention:
>
> 1. What if the target table is a foreign table or partitioned table?
> 2. What happens if the target table has triggers (before statement,
> after statement, before row, after row) that are parallel unsafe?
> 3. Will each worker be doing single-row insertions or multi-inserts?
> If single-row insertions, will the buffer lock contention be higher?
> 4. How does it behave with toast column values?
> 5. How does it behave if we have a RETURNING clause with INSERT INTO ... SELECT?
>

Hi Bharath,

Thanks for pointing out more cases I need to exclude and things I need
to investigate further.
I have taken note of them, and will do more testing and improvement.
At least the RETURNING clause with INSERT INTO ... SELECT is working!

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
In reply to this post by Greg Nancarrow
On Fri, Sep 25, 2020 at 10:02 AM Greg Nancarrow <[hidden email]> wrote:

>
> Hi Andres,
>
> On Thu, Sep 24, 2020 at 12:21 PM Andres Freund <[hidden email]> wrote:
> >
>
>
> >> @@ -116,7 +117,7 @@ toast_save_datum(Relation rel, Datum value,
> >>       TupleDesc       toasttupDesc;
> >>       Datum           t_values[3];
> >>       bool            t_isnull[3];
> >> -     CommandId       mycid = GetCurrentCommandId(true);
> >> +     CommandId       mycid = GetCurrentCommandId(!IsParallelWorker());
> >>       struct varlena *result;
> >>       struct varatt_external toast_pointer;
> >>       union
> >
> >Hm? Why do we need this in the various places you have made this change?
>
> It's because for Parallel INSERT, we're assigning the same command-id
> to each worker up-front during worker initialization (the commandId
> has been retrieved by the leader and passed through to each worker)
> and "currentCommandIdUsed" has been set true. See the
> AssignCommandIdForWorker() function in the patch.
> If you see the code of GetCurrentCommandId(), you'll see it Assert
> that it's not being run by a parallel worker if the parameter is true.
> I didn't want to remove yet another check, without being able to know
> the context of the caller, because only for Parallel INSERT do I know
> that "currentCommandIdUsed was already true at the start of the
> parallel operation". See the comment in that function. Anyway, that's
> why I'm passing "false" to relevant GetCurrentCommandId() calls if
> they're being run by a parallel (INSERT) worker.
>

But we can tighten the condition in GetCurrentCommandId() such that it
Asserts for a parallel worker only when currentCommandIdUsed was not set
before the start of the parallel operation. I also find these changes in
the callers of GetCurrentCommandId() quite ad hoc and ugly, even if they
are correct. Also, why don't we face a similar problem for parallel copy?

>
> >> diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
> >> index a4944fa..9d3f100 100644
> >> --- a/src/backend/access/transam/varsup.c
> >> +++ b/src/backend/access/transam/varsup.c
> >> @@ -53,13 +53,6 @@ GetNewTransactionId(bool isSubXact)
> >>       TransactionId xid;
> >>
> >>       /*
> >> -      * Workers synchronize transaction state at the beginning of each parallel
> >> -      * operation, so we can't account for new XIDs after that point.
> >> -      */
> >> -     if (IsInParallelMode())
> >> -             elog(ERROR, "cannot assign TransactionIds during a parallel operation");
> >> -
> >> -     /*
> >>        * During bootstrap initialization, we return the special bootstrap
> >>        * transaction id.
> >>        */
> >
> >Same thing, this code cannot just be allowed to be reachable. What
> >prevents you from assigning two different xids from different workers
> >etc?
>
> At least in the case of Parallel INSERT, the leader for the Parallel
> INSERT gets a new xid (GetCurrentFullTransactionId) and it is passed
> through and assigned to each of the workers during their
> initialization (so they are assigned the same xid).
>

So are you facing problems in this area because we EnterParallelMode
before even assigning the xid in the leader? Because I don't think we
should ever reach this code in the worker. If so, there are two
possibilities that come to my mind: (a) assign the xid in the leader
before entering parallel mode, or (b) change the check so that we don't
assign a new xid in workers. In this case, I am again wondering how
parallel copy deals with this?

--
With Regards,
Amit Kapila.



Re: Parallel INSERT (INTO ... SELECT ...)

Bharath Rupireddy
On Fri, Sep 25, 2020 at 5:47 PM Amit Kapila <[hidden email]> wrote:

>
> >
> > At least in the case of Parallel INSERT, the leader for the Parallel
> > INSERT gets a new xid (GetCurrentFullTransactionId) and it is passed
> > through and assigned to each of the workers during their
> > initialization (so they are assigned the same xid).
> >
>
> So are you facing problems in this area because we EnterParallelMode
> before even assigning the xid in the leader? Because I don't think we
> should ever reach this code in the worker. If so, there are two
> possibilities that come to my mind: (a) assign the xid in the leader
> before entering parallel mode, or (b) change the check so that we don't
> assign a new xid in workers. In this case, I am again wondering how
> parallel copy deals with this?
>

In parallel copy, we are doing option (a), i.e. the leader gets the
full txn id before entering parallel mode and passes it to all
workers.
In the leader:
    full_transaction_id = GetCurrentFullTransactionId();
    EnterParallelMode();
    shared_info_ptr->full_transaction_id = full_transaction_id;
In the workers:
    AssignFullTransactionIdForWorker(pcshared_info->full_transaction_id);

Hence below part of the code doesn't get hit.
    if (IsInParallelMode() || IsParallelWorker())
        elog(ERROR, "cannot assign XIDs during a parallel operation");

We also deal with the commandid similarly i.e. the leader gets the
command id, and workers would use it while insertion.
In the leader:
    shared_info_ptr->mycid = GetCurrentCommandId(true);
In the workers:
    AssignCommandIdForWorker(pcshared_info->mycid, true);

[1]
void
AssignFullTransactionIdForWorker(FullTransactionId fullTransactionId)
{
    TransactionState s = CurrentTransactionState;

    Assert((IsInParallelMode() || IsParallelWorker()));
    s->fullTransactionId = fullTransactionId;
}

void
AssignCommandIdForWorker(CommandId commandId, bool used)
{
    Assert((IsInParallelMode() || IsParallelWorker()));

    /* this is global to a transaction, not subtransaction-local */
    if (used)
        currentCommandIdUsed = true;

    currentCommandId = commandId;
}
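Pieced together, the parallel-copy handoff described above can be modeled as a self-contained sketch. Note this is illustrative only: the types, the `SharedInfo` struct, and the helper names are simplified stand-ins, not the real PostgreSQL structures in xact.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for backend types; the real code lives in
 * src/backend/access/transam/xact.c and the parallel-copy patch. */
typedef uint64_t FullTransactionId;
typedef uint32_t CommandId;

typedef struct SharedInfo
{
    FullTransactionId full_transaction_id;  /* published via the DSM */
    CommandId   mycid;
} SharedInfo;

typedef struct TransactionStateData
{
    FullTransactionId fullTransactionId;
    CommandId   currentCommandId;
    bool        currentCommandIdUsed;
    bool        inParallelMode;
} TransactionStateData;

/* Leader side: acquire the xid/cid *before* entering parallel mode,
 * then publish them through the shared (DSM) area. */
static void
leader_setup(TransactionStateData *leader, SharedInfo *shared)
{
    shared->full_transaction_id = leader->fullTransactionId; /* GetCurrentFullTransactionId() */
    shared->mycid = leader->currentCommandId;                /* GetCurrentCommandId(true) */
    leader->currentCommandIdUsed = true;
    leader->inParallelMode = true;                           /* EnterParallelMode() */
}

/* Worker side: adopt the leader's xid and cid instead of assigning
 * new ones, mirroring AssignFullTransactionIdForWorker() and
 * AssignCommandIdForWorker() above. */
static void
worker_attach(TransactionStateData *worker, const SharedInfo *shared)
{
    assert(worker->inParallelMode);          /* mirrors the Asserts above */
    worker->fullTransactionId = shared->full_transaction_id;
    worker->currentCommandId = shared->mycid;
    worker->currentCommandIdUsed = true;     /* the worker will insert */
}
```

The key ordering constraint the sketch encodes is that `leader_setup()` reads the xid/cid before flipping `inParallelMode`, which is exactly option (a) above.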

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Parallel INSERT (INTO ... SELECT ...)

Greg Nancarrow
In reply to this post by akapila
On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <[hidden email]> wrote:
> But we can tighten the condition in GetCurrentCommandId() such that it
> Asserts for a parallel worker only when currentCommandIdUsed was not
> set before the start of the parallel operation. I also find these
> changes in the callers of GetCurrentCommandId() quite ad hoc and ugly
> even if they are correct. Also, why don't we face a similar problem
> for parallel copy?
>

For Parallel INSERT, GetCurrentCommandId(true) is called during query
plan execution, as part of executing the INSERT statement.
Parallel Copy of course doesn't have to deal with this; that's why
there's a difference. Also, it has its own parallel entry point
(ParallelCopyMain), so it's in full control; it's not trying to fit in
with the infrastructure for plan execution.

> So are you facing problems in this area because we EnterParallelMode
> before even assigning the xid in the leader? Because I don't think we
> should ever reach this code in the worker. If so, there are two
> possibilities that come to my mind: (a) assign the xid in the leader
> before entering parallel mode, or (b) change the check so that we
> don't assign a new xid in workers. In this case, I am again wondering
> how parallel copy deals with this?
>

Again, there's a fundamental difference in the Parallel INSERT case.
Right at the top of ExecutePlan() it calls EnterParallelMode().
For Parallel Copy, there is no such problem: EnterParallelMode() is
only called just before ParallelCopyMain(), so the xid can easily be
acquired before that point, because parallel mode is not yet set.

As it turns out, I think I have solved the commandId issue (and almost
the xid issue) by realising that both the xid and cid are ALREADY
being included as part of the serialized transaction state in the
Parallel DSM. So actually I don't believe that there is any need for
separately passing them in the DSM, and having to use those
AssignXXXXForWorker() functions in the worker code - not even in the
Parallel Copy case (? - need to check). GetCurrentCommandId(true) and
GetCurrentFullTransactionId() need to be called prior to Parallel DSM
initialization, so that they are included in the serialized
transaction state.
I just needed to add a function to set currentCommandIdUsed=true in
the worker initialization (for the INSERT case) and make a small tweak
to the Assert in GetCurrentCommandId() to ensure that
currentCommandIdUsed, in a parallel worker, never gets set to true
when it is false. This is in line with the comment in that function,
because we know that "currentCommandIdUsed was already true at the
start of the parallel operation". With this in place, I don't need to
change any of the original calls to GetCurrentCommandId(), so this
addresses that issue raised by Andres.
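The tweak to the Assert that Greg describes can be sketched as a small model. To be clear, this is not the actual GetCurrentCommandId() from xact.c; the struct and function below are simplified stand-ins that only capture the invariant: in a parallel worker, currentCommandIdUsed may never flip from false to true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t CommandId;

/* Simplified per-backend state; field names follow the snippets in
 * this thread, the rest is invented scaffolding. */
typedef struct XactStateModel
{
    CommandId   currentCommandId;
    bool        currentCommandIdUsed;
    bool        isParallelWorker;
} XactStateModel;

/* Model of the tweaked GetCurrentCommandId(): a worker must not be the
 * first backend to "use" the commandId; it must already have been
 * marked used (restored from the leader) at worker initialization. */
static CommandId
get_current_command_id(XactStateModel *s, bool used)
{
    if (used)
    {
        /* Tweaked Assert: fires only if a worker tries to flip
         * currentCommandIdUsed from false to true. */
        assert(!(s->isParallelWorker && !s->currentCommandIdUsed));
        s->currentCommandIdUsed = true;
    }
    return s->currentCommandId;
}
```

Under this model, the original leader-side callers need no change: a leader (isParallelWorker == false) can still set the flag, and a worker whose initialization already set it passes the Assert.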

I am not sure yet how to get past the issue of parallel mode being
set at the top of ExecutePlan(). With that in place, it doesn't allow
an xid to be acquired for the leader, without removing/changing that
parallel-mode check in GetNewTransactionId().

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <[hidden email]> wrote:

>
> On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <[hidden email]> wrote:
>
> As it turns out, I think I have solved the commandId issue (and almost
> the xid issue) by realising that both the xid and cid are ALREADY
> being included as part of the serialized transaction state in the
> Parallel DSM. So actually I don't believe that there is any need for
> separately passing them in the DSM, and having to use those
> AssignXXXXForWorker() functions in the worker code - not even in the
> Parallel Copy case (? - need to check). GetCurrentCommandId(true) and
> GetCurrentFullTransactionId() need to be called prior to Parallel DSM
> initialization, so they are included in the serialized transaction
> state.
> I just needed to add a function to set currentCommandIdUsed=true in
> the worker initialization (for INSERT case) and make a small tweak to
> the Assert in GetCurrentCommandId() to ensure that
> currentCommandIdUsed, in a parallel worker, never gets set to true
> when it is false. This is in line with the comment in that function,
> because we know that "currentCommandIdUsed was already true at the start
> of the parallel operation". With this in place, I don't need to change
> any of the original calls to GetCurrentCommandId(), so this addresses
> that issue raised by Andres.
>
> I am not sure yet how to get past the issue of the parallel mode being
> set at the top of ExecutePlan(). With that in place, it doesn't allow
> a xid to be acquired for the leader, without removing/changing that
> parallel-mode check in GetNewTransactionId().
>

I think now there is no fundamental problem in allocating the xid in
the leader and then sharing it with the workers, who can use it to
perform the inserts. So we can probably tweak that check so that it is
true only for parallel workers.
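Reducing both versions of that guard to pure predicates makes the suggested tweak concrete. This is a hedged illustration only, not the actual code at the top of GetNewTransactionId():

```c
#include <assert.h>
#include <stdbool.h>

/* Current behavior: any backend in parallel mode is refused a new xid
 * ("cannot assign TransactionIds during a parallel operation"). */
static bool
xid_assignment_allowed_original(bool in_parallel_mode, bool is_parallel_worker)
{
    (void) is_parallel_worker;
    return !in_parallel_mode;
}

/* Tweaked check (sketch): only actual workers are refused, so a leader
 * that has already entered parallel mode can still allocate the xid it
 * will share with its workers through the DSM. */
static bool
xid_assignment_allowed_tweaked(bool in_parallel_mode, bool is_parallel_worker)
{
    (void) in_parallel_mode;
    return !is_parallel_worker;
}
```

The interesting case is the leader inside ExecutePlan(): in parallel mode but not a worker, it is rejected by the original check and allowed by the tweaked one; workers remain rejected by both.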

--
With Regards,
Amit Kapila.



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
In reply to this post by Bharath Rupireddy
On Fri, Sep 25, 2020 at 2:31 PM Bharath Rupireddy
<[hidden email]> wrote:

>
> On Tue, Sep 22, 2020 at 10:26 AM Greg Nancarrow <[hidden email]> wrote:
> >
> > For cases where it can't be allowed (e.g. INSERT into a table with
> > foreign keys, or INSERT INTO ... SELECT ... ON CONFLICT ... DO UPDATE
> > ...") it at least allows parallelism of the SELECT part.
> >
>
> Thanks Greg for the patch.
>
> 2. What happens if the target table has triggers(before statement,
> after statement, before row, after row) that are parallel unsafe?
>

In such a case, the parallel insert shouldn't be selected. However, we
should still be able to execute the SELECT part in parallel.

> 3. Will each worker be doing single row insertions or multi inserts?
> If single row insertions, will the buffer lock contentions be more?
>

I don't think the purpose of this patch is to change the basic flow of
how INSERT works, and I am not sure it would be worth the effort
anyway. I have answered this earlier in a bit more detail [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1Ks8Sqs29VHPS6koNj5E9YQdkGCzgGsSrQMeUbQfe28yg%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: Parallel INSERT (INTO ... SELECT ...)

Bharath Rupireddy
In reply to this post by Greg Nancarrow
On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <[hidden email]> wrote:

>
> On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <[hidden email]> wrote:
> >
>
> Again, there's a fundamental difference in the Parallel Insert case.
> Right at the top of ExecutePlan it calls EnterParallelMode().
> For ParallelCopy(), there is no such problem. EnterParallelMode() is
> only called just before ParallelCopyMain() is called. So it can easily
> acquire the xid before this, because then parallel mode is not set.
>
> As it turns out, I think I have solved the commandId issue (and almost
> the xid issue) by realising that both the xid and cid are ALREADY
> being included as part of the serialized transaction state in the
> Parallel DSM. So actually I don't believe that there is any need for
> separately passing them in the DSM, and having to use those
> AssignXXXXForWorker() functions in the worker code - not even in the
> Parallel Copy case (? - need to check).
>

Thanks Greg for the detailed points.

I further checked on the full txn id and command id. Yes, these are
getting passed to workers via InitializeParallelDSM() ->
SerializeTransactionState(). I tried to summarize what we need to do
for parallel inserts in general, i.e. parallel COPY, parallel inserts
in INSERT INTO, and parallel inserts in CTAS.

In the leader:
    GetCurrentFullTransactionId()
    GetCurrentCommandId(true)
    EnterParallelMode();
    InitializeParallelDSM() --> calls SerializeTransactionState()
(both full txn id and command id are serialized into parallel DSM)

In the workers:
ParallelWorkerMain() -->  calls StartParallelWorkerTransaction() (both
full txn id and command id are restored into workers'
CurrentTransactionState->fullTransactionId and currentCommandId)
If the parallel workers are meant for insertions, then we need to set
currentCommandIdUsed = true. Maybe we can lift the assert in
GetCurrentCommandId(); if we don't want to touch that function, then
we can have a new function GetCurrentCommandIdInWorker() whose
functionality will be the same as GetCurrentCommandId() without the
Assert(!IsParallelWorker());.
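The leader/worker steps summarized above can be modeled as a minimal serialize/restore round trip. The struct layout and function names below are simplified stand-ins: the real SerializeTransactionState() carries considerably more state, and the restore happens inside StartParallelWorkerTransaction().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the transaction state placed in the
 * parallel DSM. */
typedef struct SerializedXactState
{
    uint64_t fullTransactionId;
    uint32_t currentCommandId;
} SerializedXactState;

/* Leader side, cf. InitializeParallelDSM() -> SerializeTransactionState():
 * the xid and cid acquired before EnterParallelMode() ride along in the
 * serialized state, so no separate Assign...ForWorker() calls are needed. */
static void
serialize_xact_state(uint64_t xid, uint32_t cid, char *dsm_buf)
{
    SerializedXactState s = { xid, cid };
    memcpy(dsm_buf, &s, sizeof(s));
}

/* Worker side, cf. ParallelWorkerMain() -> StartParallelWorkerTransaction().
 * The one insert-specific extra step is marking the command id used. */
static void
restore_xact_state(const char *dsm_buf, uint64_t *xid, uint32_t *cid,
                   bool *currentCommandIdUsed)
{
    SerializedXactState s;
    memcpy(&s, dsm_buf, sizeof(s));
    *xid = s.fullTransactionId;
    *cid = s.currentCommandId;
    *currentCommandIdUsed = true;   /* workers will perform inserts */
}
```

The point of the model is that the transport already exists; only the `currentCommandIdUsed = true` step is new for inserting workers.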

Am I missing something?

If the above points are true, we might have to update the parallel
copy patch set, test the use cases and post separately in the parallel
copy thread in coming days.

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
On Sat, Sep 26, 2020 at 11:00 AM Bharath Rupireddy
<[hidden email]> wrote:

>
> On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <[hidden email]> wrote:
> >
> > On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <[hidden email]> wrote:
> > >
> >
> > Again, there's a fundamental difference in the Parallel Insert case.
> > Right at the top of ExecutePlan it calls EnterParallelMode().
> > For ParallelCopy(), there is no such problem. EnterParallelMode() is
> > only called just before ParallelCopyMain() is called. So it can easily
> > acquire the xid before this, because then parallel mode is not set.
> >
> > As it turns out, I think I have solved the commandId issue (and almost
> > the xid issue) by realising that both the xid and cid are ALREADY
> > being included as part of the serialized transaction state in the
> > Parallel DSM. So actually I don't believe that there is any need for
> > separately passing them in the DSM, and having to use those
> > AssignXXXXForWorker() functions in the worker code - not even in the
> > Parallel Copy case (? - need to check).
> >
>
> Thanks Greg for the detailed points.
>
> I further checked on full txn id and command id. Yes, these are
> getting passed to workers  via InitializeParallelDSM() ->
> SerializeTransactionState(). I tried to summarize what we need to do
> in case of parallel inserts in general i.e. parallel COPY, parallel
> inserts in INSERT INTO and parallel inserts in CTAS.
>
> In the leader:
>     GetCurrentFullTransactionId()
>     GetCurrentCommandId(true)
>     EnterParallelMode();
>     InitializeParallelDSM() --> calls SerializeTransactionState()
> (both full txn id and command id are serialized into parallel DSM)
>

This won't be true for the Parallel INSERT patch, as explained by
Greg, because we enter parallel mode much before we assign the xid.


--
With Regards,
Amit Kapila.



Re: Parallel INSERT (INTO ... SELECT ...)

akapila
In reply to this post by Greg Nancarrow
On Fri, Sep 25, 2020 at 9:11 AM Greg Nancarrow <[hidden email]> wrote:

>
> > > What if this
> > > ends up being invoked from inside C code?
> > >
> >
> > I think it shouldn't be a problem unless one is trying to do something
> > like insert into foreign key table. So, probably we can have an Assert
> > to catch it if possible. Do you have any other idea?
> >
>
> Note that the planner code updated by the patch does avoid creating a
> Parallel INSERT plan in the case of inserting into a table with a
> foreign key (so commandIds won't be created in the parallel-worker
> code).
> I'm not sure how to distinguish the "invoked from inside C code" case though.
>

I think, if possible, we can have an Assert to check whether it is a
parallel worker and the relation has a foreign key. Similarly, we can
enhance the check for any other unsafe use. This will prevent illegal
use of inserts via parallel workers.

--
With Regards,
Amit Kapila.

