Parallel copy

Re: Parallel copy

Ashutosh Sharma
I think, when doing the performance testing for a partitioned table, it would be good to also mention the distribution of data in the input file. One possible data distribution could be that we have, let's say, 100 tuples in the input file and every consecutive tuple belongs to a different partition.

On Thu, Jul 23, 2020 at 8:51 AM Bharath Rupireddy <[hidden email]> wrote:
On Wed, Jul 22, 2020 at 7:56 PM vignesh C <[hidden email]> wrote:

>
> Thanks for reviewing and providing the comments Ashutosh.
> Please find my thoughts below:
>
> On Fri, Jul 17, 2020 at 7:18 PM Ashutosh Sharma <[hidden email]> wrote:
> >
> > Some review comments (mostly) from the leader side code changes:  
> >
> > 3) Should we allow Parallel Copy when the insert method is CIM_MULTI_CONDITIONAL?
> >
> > +   /* Check if the insertion mode is single. */
> > +   if (FindInsertMethod(cstate) == CIM_SINGLE)
> > +       return false;
> >
> > I know we have added checks in CopyFrom() to ensure that if any trigger (before row or instead of) is found on any of the partitions being loaded with data, then the COPY FROM operation would fail, but does it mean that we are okay to perform parallel copy on a partitioned table? Have we done some performance testing with a partitioned table where the data in the input file needs to be routed to different partitions?
> >
>
> Partition data is handled as Amit described in one of the earlier mails [1].  My colleague Bharath has run performance tests with a partitioned table; he will be sharing the results.
>

I ran tests for partitioned use cases - results are similar to those of the non-partitioned cases [1].

Test case 1 (exec time in sec): COPY FROM a csv file, 5.1GB, 10 million tuples, 4 range partitions, 3 indexes on integer columns, unique data.
Test case 2 (exec time in sec): COPY FROM a csv file, 5.1GB, 10 million tuples, 4 range partitions, unique data.

parallel workers   test case 1        test case 2
0                  205.403 (1X)       135 (1X)
2                  114.724 (1.79X)    59.388 (2.27X)
4                  99.017 (2.07X)     56.742 (2.34X)
8                  99.722 (2.06X)     66.323 (2.03X)
16                 98.147 (2.09X)     66.054 (2.04X)
20                 97.723 (2.1X)      66.389 (2.03X)
30                 97.048 (2.11X)     70.568 (1.91X)

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Parallel copy

Bharath Rupireddy
In reply to this post by akapila
On Thu, Jul 23, 2020 at 9:22 AM Amit Kapila <[hidden email]> wrote:
>
>>
>> I ran tests for partitioned use cases - results are similar to that of non partitioned cases[1].
>
>
> I could see a gain of up to 10-11 times for non-partitioned cases [1]; can we use a similar test case here as well (with one of the indexes on a text column, or having a gist index) to see its impact?
>
> [1] - https://www.postgresql.org/message-id/CALj2ACVR4WE98Per1H7ajosW8vafN16548O2UV8bG3p4D3XnPg%40mail.gmail.com
>

Thanks Amit! Please find the results of detailed testing done for partitioned use cases:

Range partitions: consecutive rows go into the same partition.

Test case 1 (exec time in sec): COPY FROM csv file, 2 indexes on integer columns and 1 index on a text column, 4 range partitions.
Test case 2 (exec time in sec): COPY FROM csv file, 1 gist index on a text column, 4 range partitions.
Test case 3 (exec time in sec): COPY FROM csv file, 3 indexes on integer columns, 4 range partitions.

parallel workers   test case 1         test case 2        test case 3
0                  1051.924 (1X)       785.052 (1X)       205.403 (1X)
2                  589.576 (1.78X)     421.974 (1.86X)    114.724 (1.79X)
4                  321.960 (3.27X)     230.997 (3.4X)     99.017 (2.07X)
8                  199.245 (5.23X)     156.132 (5.02X)    99.722 (2.06X)
16                 127.343 (8.26X)     173.696 (4.52X)    98.147 (2.09X)
20                 122.029 (8.62X)     186.418 (4.21X)    97.723 (2.1X)
30                 142.876 (7.36X)     214.598 (3.66X)    97.048 (2.11X)

On Thu, Jul 23, 2020 at 10:21 AM Ashutosh Sharma <[hidden email]> wrote:
>
> I think, when doing the performance testing for partitioned table, it would be good to also mention about the distribution of data in the input file. One possible data distribution could be that we have let's say 100 tuples in the input file, and every consecutive tuple belongs to a different partition.
>

To address Ashutosh's point, I also ran the tests with hash partitioning. Hope this helps clear the doubt.

Hash partitions: consecutive rows are likely to go into different partitions.

Test case 1 (exec time in sec): COPY FROM csv file, 2 indexes on integer columns and 1 index on a text column, 4 hash partitions.
Test case 2 (exec time in sec): COPY FROM csv file, 1 gist index on a text column, 4 hash partitions.
Test case 3 (exec time in sec): COPY FROM csv file, 3 indexes on integer columns, 4 hash partitions.

parallel workers   test case 1         test case 2        test case 3
0                  1060.884 (1X)       812.283 (1X)       207.745 (1X)
2                  572.542 (1.85X)     418.454 (1.94X)    107.850 (1.93X)
4                  298.132 (3.56X)     227.367 (3.57X)    83.895 (2.48X)
8                  169.449 (6.26X)     137.993 (5.89X)    85.411 (2.43X)
16                 112.297 (9.45X)     95.167 (8.53X)     96.136 (2.16X)
20                 101.546 (10.45X)    90.552 (8.97X)     97.066 (2.14X)
30                 113.877 (9.32X)     127.17 (6.38X)     96.819 (2.14X)


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Parallel copy

vignesh C
The patches were not applying because of recent commits.
I have rebased the patches on top of HEAD and attached them.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com


v2-0001-Copy-code-readjustment-to-support-parallel-copy.patch (22K) Download Attachment
v2-0002-Framework-for-leader-worker-in-parallel-copy.patch (43K) Download Attachment
v2-0003-Allow-copy-from-command-to-process-data-from-file.patch (58K) Download Attachment
v2-0004-Documentation-for-parallel-copy.patch (2K) Download Attachment
v2-0005-Tests-for-parallel-copy.patch (27K) Download Attachment

Re: Parallel copy

Bharath Rupireddy
On Sat, Aug 1, 2020 at 9:55 AM vignesh C <[hidden email]> wrote:
>
> The patches were not applying because of the recent commits.
> I have rebased the patch over head & attached.
>
I rebased v2-0006-Parallel-Copy-For-Binary-Format-Files.patch.

Attached is the complete patch set, rebased onto the latest commit
b8fdee7d0ca8bd2165d46fb1468f75571b706a01. Patches 0001 to 0005 were
rebased by Vignesh (they are from the previous mail), and patch 0006
was rebased by me.

Please consider this patch set for further review.


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

v2-0001-Copy-code-readjustment-to-support-parallel-copy.patch (22K) Download Attachment
v2-0002-Framework-for-leader-worker-in-parallel-copy.patch (43K) Download Attachment
v2-0003-Allow-copy-from-command-to-process-data-from-file.patch (58K) Download Attachment
v2-0004-Documentation-for-parallel-copy.patch (2K) Download Attachment
v2-0005-Tests-for-parallel-copy.patch (27K) Download Attachment
v2-0006-Parallel-Copy-For-Binary-Format-Files.patch (37K) Download Attachment

Re: Parallel copy

Tomas Vondra-4
On Mon, Aug 03, 2020 at 12:33:48PM +0530, Bharath Rupireddy wrote:

>On Sat, Aug 1, 2020 at 9:55 AM vignesh C <[hidden email]> wrote:
>>
>> The patches were not applying because of the recent commits.
>> I have rebased the patch over head & attached.
>>
>I rebased v2-0006-Parallel-Copy-For-Binary-Format-Files.patch.
>
>Putting together all the patches rebased on to the latest commit
>b8fdee7d0ca8bd2165d46fb1468f75571b706a01. Patches from 0001 to 0005
>are rebased by Vignesh, that are from the previous mail and the patch
>0006 is rebased by me.
>
>Please consider this patch set for further review.
>

I'd suggest incrementing the version every time an updated version is
submitted, even if it's just a rebased version. It makes it clearer
which version of the code is being discussed, etc.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Parallel copy

vignesh C
On Tue, Aug 4, 2020 at 9:51 PM Tomas Vondra
<[hidden email]> wrote:

>
> On Mon, Aug 03, 2020 at 12:33:48PM +0530, Bharath Rupireddy wrote:
> >On Sat, Aug 1, 2020 at 9:55 AM vignesh C <[hidden email]> wrote:
> >>
> >> The patches were not applying because of the recent commits.
> >> I have rebased the patch over head & attached.
> >>
> >I rebased v2-0006-Parallel-Copy-For-Binary-Format-Files.patch.
> >
> >Putting together all the patches rebased on to the latest commit
> >b8fdee7d0ca8bd2165d46fb1468f75571b706a01. Patches from 0001 to 0005
> >are rebased by Vignesh, that are from the previous mail and the patch
> >0006 is rebased by me.
> >
> >Please consider this patch set for further review.
> >
>
> I'd suggest incrementing the version every time an updated version is
> submitted, even if it's just a rebased version. It makes it clearer
> which version of the code is being discussed, etc.

Sure, we will take care of this when sending the next set of patches.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Parallel copy

Greg Nancarrow
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, failed

Hi,

I don't claim to yet understand all of the Postgres internals that this patch is updating and interacting with, so I'm still testing and debugging portions of this patch, but would like to give feedback on what I've noticed so far.
I have done some ad-hoc testing of the patch using parallel copies from text/csv/binary files and have not yet struck any execution problems other than some option validation and associated error messages on boundary cases.

One general question that I have: is there a user benefit (over the normal non-parallel COPY) to allowing "COPY ... FROM ... WITH (PARALLEL 1)"?


My following comments are broken down by patch:

(1) v2-0001-Copy-code-readjustment-to-support-parallel-copy.patch

(i) Whilst I can't entirely blame these patches for it (as they are following what is already there), I can't help noticing the use of numerous macros in src/backend/commands/copy.c which paste multiple lines of code into various places.
It's getting a little out of hand. Surely the majority of these would be better as inline functions?
Perhaps this hasn't been done because too many parameters would need to be passed - thoughts?
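
For illustration, here is the shape of such a conversion, using the existing single-line RAW_BUF_BYTES macro from copy.c as the simplest example (a sketch only; the struct is trimmed to the two fields involved):

/*
 * Macro version, as in copy.c today: arguments are unchecked and the
 * expansion is invisible to a debugger.
 */
typedef struct CopyStateData
{
	int			raw_buf_index;	/* next byte to process in raw_buf */
	int			raw_buf_len;	/* total number of bytes in raw_buf */
} CopyStateData;

#define RAW_BUF_BYTES(cstate) \
	((cstate)->raw_buf_len - (cstate)->raw_buf_index)

/*
 * Inline-function equivalent: type-checked and steppable in a debugger,
 * compiling to the same code at any reasonable optimization level.
 */
static inline int
RawBufBytes(const CopyStateData *cstate)
{
	return cstate->raw_buf_len - cstate->raw_buf_index;
}

The multi-line macros added by these patches would translate the same way; the awkward ones are those that "return" from or "continue" the enclosing code (like GETPROCESSED below), which genuinely can't become functions without some restructuring.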


(2) v2-0002-Framework-for-leader-worker-in-parallel-copy.patch

(i) minor point: there are some tabbing/spacing issues in this patch (and the other patches), affecting alignment.
e.g. mixed tabs/spaces and misalignment in PARALLEL_COPY_KEY_xxx definitions

(ii)

+/*
+ * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data
+ * block to process to avoid lock contention. This value should be mode of
+ * RINGSIZE, as wrap around cases is currently not handled while selecting the
+ * WORKER_CHUNK_COUNT by the worker.
+ */
+#define WORKER_CHUNK_COUNT 50


"This value should be mode of RINGSIZE ..."

-> typo: mode  (mod?  should evenly divide into RINGSIZE?)
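
If the intended constraint is that WORKER_CHUNK_COUNT evenly divides RINGSIZE, it may be worth enforcing at compile time instead of only in a comment. A sketch, assuming both are plain #define constants as in the patch (the RINGSIZE value below is only illustrative):

#define RINGSIZE           10000	/* illustrative value */
#define WORKER_CHUNK_COUNT 50

/* Fail the build if the divisibility assumption is ever broken. */
StaticAssertStmt(RINGSIZE % WORKER_CHUNK_COUNT == 0,
				 "WORKER_CHUNK_COUNT must evenly divide RINGSIZE");

(StaticAssertStmt is the existing helper in c.h; on compilers without declaration-level static asserts it has to sit inside a function body.)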


(iii)
+ *    using pg_atomic_compare_exchange_u32, worker will change the sate to

->typo: sate  (should be "state")


(iv)

+ errmsg("parallel option supported only for copy from"),

-> suggest change to: errmsg("parallel option is supported only for COPY FROM"),

(v)

+ errno = 0; /* To distinguish success/failure after call */
+ val = strtol(str, &endptr, 10);
+
+ /* Check for various possible errors */
+ if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN)) ||
+     (errno != 0 && val == 0) ||
+     *endptr)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("improper use of argument to option \"%s\"",
+                     defel->defname),
+              parser_errposition(pstate, defel->location)));
+
+ if (endptr == str)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("no digits were found in argument to option \"%s\"",
+                     defel->defname),
+              parser_errposition(pstate, defel->location)));
+
+ cstate->nworkers = (int) val;
+
+ if (cstate->nworkers <= 0)
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("argument to option \"%s\" must be a positive integer greater than zero",
+                     defel->defname),
+              parser_errposition(pstate, defel->location)));


I think this validation code needs to be improved, including the error messages (e.g. when can a "positive integer" NOT be greater than zero?)

There is some overlap in the "no digits were found" case between the two conditions above, depending, for example, on whether the argument is quoted.
Also, "improper use of argument to option" sounds a bit odd and vague to me.
Finally, not range-checking before casting long to int can lead to allowing out-of-range int values, as in the following case:

test=# copy mytable from '/myspace/test_pcopy/tmp.dat' (parallel '-2147483648');
ERROR:  argument to option "parallel" must be a positive integer greater than zero
LINE 1: copy mytable from '/myspace/test_pcopy/tmp.dat' (parallel '-2...
                                                        ^
BUT the following is allowed...

test=# copy mytable from '/myspace/test_pcopy/tmp.dat' (parallel '-2147483649');
COPY 1000000


I'd suggest changing the above validation code to do validation similar to that for the CREATE TABLE parallel_workers storage parameter (case RELOPT_TYPE_INT in reloptions.c). Like that code, wouldn't it be best to range-check the integer option value to be within a reasonable range, say 1 to 1024, with a corresponding errdetail message if possible?
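
Something along these lines, perhaps (a sketch of a hypothetical helper, not patch code; it relies on the existing defGetInt64() and uses the 1..1024 bounds suggested above):

/*
 * Sketch: parse and range-check the PARALLEL option in one place.
 * defGetInt64() already rejects non-integer arguments with a sensible
 * error, which would replace the hand-rolled strtol() handling.
 */
static int
ParseParallelOption(ParseState *pstate, DefElem *defel)
{
	int64		val = defGetInt64(defel);

	if (val < 1 || val > 1024)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("value %lld out of bounds for option \"%s\"",
						(long long) val, defel->defname),
				 errdetail("Valid values are between %d and %d.", 1, 1024),
				 parser_errposition(pstate, defel->location)));

	return (int) val;
}

Because the bounds are checked on the int64 value before the cast, the long-to-int wraparound shown above can no longer slip through.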


(3) v2-0003-Allow-copy-from-command-to-process-data-from-file.patch

(i)

Patch comment says:

"This feature allows the copy from to leverage multiple CPUs in order to copy
data from file/STDIN to a table. This adds a PARALLEL option to COPY FROM
command where the user can specify the number of workers that can be used
to perform the COPY FROM command. Specifying zero as number of workers will
disable parallelism."

BUT - the changes to ProcessCopyOptions() specified in "v2-0002-Framework-for-leader-worker-in-parallel-copy.patch" do not allow zero workers to be specified - you get an error in that case. The patch comment should be updated accordingly.

(ii)

 #define GETPROCESSED(processed) \
-return processed;
+if (!IsParallelCopy()) \
+	return processed; \
+else \
+	return pg_atomic_read_u64(&cstate->pcdata->pcshared_info->processed);

I think GETPROCESSED would be better named "RETURNPROCESSED".

(iii)

The below comment seems out of date with the current code - is it referring to the loop embedded at the bottom of the current loop that the comment is within?

+ /*
+ * There is a possibility that the above loop has come out because
+ * data_blk_ptr->curr_blk_completed is set, but dataSize read might
+ * be an old value, if data_blk_ptr->curr_blk_completed and the line is
+ * completed, line_size will be set. Read the line_size again to be
+ * sure if it is complete or partial block.
+ */

(iv)

I may be wrong here, but in the following block of code, isn't there a window of opportunity (however small) in which the line_state might be updated (LINE_WORKER_PROCESSED) by another worker just AFTER pg_atomic_read_u32() returns the current line_state that is put into curr_line_state, such that a write_pos update might be missed? And then a race condition exists for reading/setting line_size (since line_size gets atomically set after line_state is set)?
If I am wrong in thinking this synchronization might not be correct, maybe the comments could be improved here to explain how this code is safe in that respect.


+ /* Get the current line information. */
+ lineInfo = &pcshared_info->line_boundaries.ring[write_pos];
+ curr_line_state = pg_atomic_read_u32(&lineInfo->line_state);
+ if ((write_pos % WORKER_CHUNK_COUNT == 0) &&
+     (curr_line_state == LINE_WORKER_PROCESSED ||
+      curr_line_state == LINE_WORKER_PROCESSING))
+ {
+     pcdata->worker_processed_pos = write_pos;
+     write_pos = (write_pos + WORKER_CHUNK_COUNT) % RINGSIZE;
+     continue;
+ }
+
+ /* Get the size of this line. */
+ dataSize = pg_atomic_read_u32(&lineInfo->line_size);
+
+ if (dataSize != 0)            /* If not an empty line. */
+ {
+     /* Get the block information. */
+     data_blk_ptr = &pcshared_info->data_blocks[lineInfo->first_block];
+
+     if (!data_blk_ptr->curr_blk_completed && (dataSize == -1))
+     {
+         /* Wait till the current line or block is added. */
+         COPY_WAIT_TO_PROCESS()
+         continue;
+     }
+ }
+
+ /* Make sure that no worker has consumed this element. */
+ if (pg_atomic_compare_exchange_u32(&lineInfo->line_state,
+                                    &line_state, LINE_WORKER_PROCESSING))
+     break;
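
For what it's worth, the usual way to close that kind of window is to let the compare-exchange do the re-read: pg_atomic_compare_exchange_u32() writes the current value back into its "expected" argument when it fails, so every decision is made on a value that was current at the claim attempt. A minimal sketch of the pattern (illustrative only, not the patch's code; LINE_LEADER_POPULATED is a guess at the "ready to claim" state name):

	bool		claimed = false;
	uint32		expected = LINE_LEADER_POPULATED;

	while (!claimed)
	{
		/* Test and claim in one atomic step. */
		if (pg_atomic_compare_exchange_u32(&lineInfo->line_state,
										   &expected, LINE_WORKER_PROCESSING))
			claimed = true;		/* we own this line now */
		else if (expected == LINE_WORKER_PROCESSING ||
				 expected == LINE_WORKER_PROCESSED)
			break;				/* another worker owns it; move on */
		else
			expected = LINE_LEADER_POPULATED;	/* still being filled; retry */
	}

If the patch's state machine already guarantees that the plain read can't go stale here, a comment spelling that out would help.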


(4) v2-0004-Documentation-for-parallel-copy.patch

(i) I think that it is necessary to mention the "max_worker_processes" option in the description of the COPY statement PARALLEL option.

For example, something like:

+      Perform <command>COPY FROM</command> in parallel using <replaceable
+      class="parameter"> integer</replaceable> background workers.  Please
+      note that it is not guaranteed that the number of parallel workers
+      specified in <replaceable class="parameter">integer</replaceable> will
+      be used during execution.  It is possible for a copy to run with fewer
+      workers than specified, or even with no workers at all (for example,
+      due to the setting of max_worker_processes).  This option is allowed
+      only in <command>COPY FROM</command>.
 

(5) v2-0005-Tests-for-parallel-copy.patch

(i) None of the provided tests seem to test beyond "PARALLEL 2"


(6) v2-0006-Parallel-Copy-For-Binary-Format-Files.patch

(i) In the ParallelCopyFrom() function, "cstate->raw_buf" is pfree()d:

+ /* raw_buf is not used in parallel copy, instead data blocks are used.*/
+ pfree(cstate->raw_buf);


This comment doesn't seem to be entirely true.
At least for text/csv file COPY FROM, cstate->raw_buf is subsequently referenced in the SetRawBufForLoad() function, which is called by CopyReadLineText():

    cur_data_blk_ptr = (cstate->raw_buf) ? &pcshared_info->data_blocks[cur_block_pos] : NULL;

So I think cstate->raw_buf should be set to NULL after being pfree()d, and the comment fixed/adjusted.
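
i.e. something like this at the pfree() site (a sketch of the suggested fix):

	/*
	 * raw_buf is not used for reading in parallel copy (data blocks are
	 * used instead), but SetRawBufForLoad() still tests the pointer, so
	 * don't leave it dangling.
	 */
	pfree(cstate->raw_buf);
	cstate->raw_buf = NULL;

(Whether SetRawBufForLoad()'s test then needs its own adjustment is a follow-up question for the patch authors.)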


(ii) This patch adds some macros (involving parallel copy checks) AFTER the comment:

/* End parallel copy Macros */


Regards,
Greg Nancarrow
Fujitsu Australia