[POC] Fast COPY FROM command for the table with foreign partitions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
Hi, hackers!

Currently i see, COPY FROM insertion into the partitioned table with
foreign partitions is not optimal: even if table constraints allows can
do multi insert copy, we will flush the buffers and prepare new INSERT
query for each tuple, routed into the foreign partition.
To solve this problem i tried to use the multi insert buffers for
foreign tuples too. Flushing of these buffers performs by the analogy
with 'COPY .. FROM STDIN' machinery as it is done by the psql '\copy'
command.
The patch in attachment was prepared from the private scratch developed
by Arseny Sher a couple of years ago.
Benchmarks shows that it speeds up COPY FROM operation:
Command "COPY pgbench_accounts FROM ..." (test file contains 1e7 tuples,
copy to three partitions) executes on my laptop in 14 minutes without
the patch and in 1.5 minutes with the patch. Theoretical minimum here
(with infinite buffer size) is 40 seconds.

A couple of questions:
1. Can this feature be interesting for the PostgreSQL core or not?
2. If this is a useful feature, is the correct way chosen?

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Etsuro Fujita-2
Hi Andrey,

On Mon, Jun 1, 2020 at 6:29 PM Andrey Lepikhov
<[hidden email]> wrote:

> Currently i see, COPY FROM insertion into the partitioned table with
> foreign partitions is not optimal: even if table constraints allows can
> do multi insert copy, we will flush the buffers and prepare new INSERT
> query for each tuple, routed into the foreign partition.
> To solve this problem i tried to use the multi insert buffers for
> foreign tuples too. Flushing of these buffers performs by the analogy
> with 'COPY .. FROM STDIN' machinery as it is done by the psql '\copy'
> command.
> The patch in attachment was prepared from the private scratch developed
> by Arseny Sher a couple of years ago.
> Benchmarks shows that it speeds up COPY FROM operation:
> Command "COPY pgbench_accounts FROM ..." (test file contains 1e7 tuples,
> copy to three partitions) executes on my laptop in 14 minutes without
> the patch and in 1.5 minutes with the patch. Theoretical minimum here
> (with infinite buffer size) is 40 seconds.

Great!

> A couple of questions:
> 1. Can this feature be interesting for the PostgreSQL core or not?

Yeah, I think this is especially useful for sharding.

> 2. If this is a useful feature, is the correct way chosen?

I think I also thought something similar to this before [1].  Will take a look.

Thanks!

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
Thank you for the answer,

02.06.2020 05:02, Etsuro Fujita пишет:
> I think I also thought something similar to this before [1].  Will take a look.

> [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp
>
I have looked into the thread.
My first version of the patch was like your idea. But when developing
the “COPY FROM” code, the following features were discovered:
1. Two or more partitions can be placed at the same node. We need to
finish COPY into one partition before start COPY into another partition
at the same node.
2. On any error we need to send EOF to all started "COPY .. FROM STDIN"
operations. Otherwise FDW can't cancel operation.

Hiding the COPY code under the buffers management machinery allows us to
generalize buffers machinery, execute one COPY operation on each buffer
and simplify error handling.

As i understand, main idea of the thread, mentioned by you, is to add
"COPY FROM" support without changes in FDW API.
It is possible to remove BeginForeignCopy() and EndForeignCopy() from
the patch. But it is not trivial to change ExecForeignInsert() for the
COPY purposes.
All that I can offer in this place now is to introduce one new
ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
STDIN" operation, send tuples and close the operation. We can use the
ExecForeignInsert() routine for each buffer tuple if
ExecForeignBulkInsert() is not supported.

One of main questions here is to use COPY TO machinery for serializing a
tuple. It is needed (if you will take a look into the patch) to
transform the CopyTo() routine to an iterative representation:
start/next/finish. May it be acceptable?

In the attachment there is a patch with the correction of a stupid error.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Ashutosh Bapat-2
Thanks Andrey for the patch. I am glad that the patch has taken care
of some corner cases already but there exist still more.

COPY command constructed doesn't take care of dropped columns. There
is code in deparseAnalyzeSql which constructs list of columns for a
given foreign relation. 0002 patch attached here, moves that code to a
separate function and reuses it for COPY. If you find that code change
useful please include it in the main patch.

While working on that, I found two issues
1. The COPY command constructed an empty columns list when there were
no non-dropped columns in the relation. This caused a syntax error.
Fixed that in 0002.
2. In the same case, if the foreign table declared locally didn't have
any non-dropped columns but the relation that it referred to on the
foreign server had some non-dropped columns, COPY command fails. I
added a test case for this in 0002 but haven't fixed it.

I think this work is useful. Please add it to the next commitfest so
that it's tracked.

On Tue, Jun 2, 2020 at 11:21 AM Andrey Lepikhov
<[hidden email]> wrote:

>
> Thank you for the answer,
>
> 02.06.2020 05:02, Etsuro Fujita пишет:
> > I think I also thought something similar to this before [1].  Will take a look.
>
> > [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp
> >
> I have looked into the thread.
> My first version of the patch was like your idea. But when developing
> the “COPY FROM” code, the following features were discovered:
> 1. Two or more partitions can be placed at the same node. We need to
> finish COPY into one partition before start COPY into another partition
> at the same node.
> 2. On any error we need to send EOF to all started "COPY .. FROM STDIN"
> operations. Otherwise FDW can't cancel operation.
>
> Hiding the COPY code under the buffers management machinery allows us to
> generalize buffers machinery, execute one COPY operation on each buffer
> and simplify error handling.
>
> As i understand, main idea of the thread, mentioned by you, is to add
> "COPY FROM" support without changes in FDW API.
> It is possible to remove BeginForeignCopy() and EndForeignCopy() from
> the patch. But it is not trivial to change ExecForeignInsert() for the
> COPY purposes.
> All that I can offer in this place now is to introduce one new
> ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
> STDIN" operation, send tuples and close the operation. We can use the
> ExecForeignInsert() routine for each buffer tuple if
> ExecForeignBulkInsert() is not supported.
>
> One of main questions here is to use COPY TO machinery for serializing a
> tuple. It is needed (if you will take a look into the patch) to
> transform the CopyTo() routine to an iterative representation:
> start/next/finish. May it be acceptable?
>
> In the attachment there is a patch with the correction of a stupid error.
>
> --
> Andrey Lepikhov
> Postgres Professional
> https://postgrespro.com
> The Russian Postgres Company


--
Best Wishes,
Ashutosh Bapat

0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (27K) Download Attachment
0002-Separate-code-to-list-all-columns-of-a-foreign-relat.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
On 6/15/20 10:26 AM, Ashutosh Bapat wrote:
> Thanks Andrey for the patch. I am glad that the patch has taken care
> of some corner cases already but there exist still more.
>
> COPY command constructed doesn't take care of dropped columns. There
> is code in deparseAnalyzeSql which constructs list of columns for a
> given foreign relation. 0002 patch attached here, moves that code to a
> separate function and reuses it for COPY. If you find that code change
> useful please include it in the main patch.

Thanks, i included it.

> 2. In the same case, if the foreign table declared locally didn't have
> any non-dropped columns but the relation that it referred to on the
> foreign server had some non-dropped columns, COPY command fails. I
> added a test case for this in 0002 but haven't fixed it.

I fixed it.
This is very special corner case. The problem was that COPY FROM does
not support semantics like the "INSERT INTO .. DEFAULT VALUES". To
simplify the solution, i switched off bulk copying for this case.

 > I think this work is useful. Please add it to the next commitfest so
 > that it's tracked.
Ok.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com

v2-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (25K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Etsuro Fujita-2
In reply to this post by Andrey V. Lepikhov
On Tue, Jun 2, 2020 at 2:51 PM Andrey Lepikhov
<[hidden email]> wrote:
> 02.06.2020 05:02, Etsuro Fujita пишет:
> > I think I also thought something similar to this before [1].  Will take a look.

I'm still reviewing the patch, but let me comment on it.

> > [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp

> I have looked into the thread.
> My first version of the patch was like your idea. But when developing
> the “COPY FROM” code, the following features were discovered:
> 1. Two or more partitions can be placed at the same node. We need to
> finish COPY into one partition before start COPY into another partition
> at the same node.
> 2. On any error we need to send EOF to all started "COPY .. FROM STDIN"
> operations. Otherwise FDW can't cancel operation.
>
> Hiding the COPY code under the buffers management machinery allows us to
> generalize buffers machinery, execute one COPY operation on each buffer
> and simplify error handling.

I'm not sure that it's really a good idea that the bulk-insert API is
designed the way it's tightly coupled with the bulk-insert machinery
in the core, because 1) some FDWs might want to send tuples provided
by the core to the remote, one by one, without storing them in a
buffer, or 2) some other FDWs might want to store the tuples in the
buffer and send them in a lump as postgres_fdw in the proposed patch
but might want to do so independently of MAX_BUFFERED_TUPLES and/or
MAX_BUFFERED_BYTES defined in the bulk-insert machinery.

I agree that we would need special handling for cases you mentioned
above if we design this API based on something like the idea I
proposed in that thread.

> As i understand, main idea of the thread, mentioned by you, is to add
> "COPY FROM" support without changes in FDW API.

I don't think so; I think we should introduce new API for this feature
to keep the ExecForeignInsert() API simple.

> All that I can offer in this place now is to introduce one new
> ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
> STDIN" operation, send tuples and close the operation. We can use the
> ExecForeignInsert() routine for each buffer tuple if
> ExecForeignBulkInsert() is not supported.

Agreed.

> One of main questions here is to use COPY TO machinery for serializing a
> tuple. It is needed (if you will take a look into the patch) to
> transform the CopyTo() routine to an iterative representation:
> start/next/finish. May it be acceptable?

+1 for the general idea.

> In the attachment there is a patch with the correction of a stupid error.

Thanks for the patch!

Sorry for the delay.

Best regards,
Etsuro Fujita


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov


19.06.2020 19:58, Etsuro Fujita пишет:

> On Tue, Jun 2, 2020 at 2:51 PM Andrey Lepikhov
> <[hidden email]> wrote:
>> Hiding the COPY code under the buffers management machinery allows us to
>> generalize buffers machinery, execute one COPY operation on each buffer
>> and simplify error handling.
>
> I'm not sure that it's really a good idea that the bulk-insert API is
> designed the way it's tightly coupled with the bulk-insert machinery
> in the core, because 1) some FDWs might want to send tuples provided
> by the core to the remote, one by one, without storing them in a
> buffer, or 2) some other FDWs might want to store the tuples in the
> buffer and send them in a lump as postgres_fdw in the proposed patch
> but might want to do so independently of MAX_BUFFERED_TUPLES and/or
> MAX_BUFFERED_BYTES defined in the bulk-insert machinery.
>
> I agree that we would need special handling for cases you mentioned
> above if we design this API based on something like the idea I
> proposed in that thread.
Agreed
>
>> As i understand, main idea of the thread, mentioned by you, is to add
>> "COPY FROM" support without changes in FDW API.
>
> I don't think so; I think we should introduce new API for this feature
> to keep the ExecForeignInsert() API simple.
Ok
>
>> All that I can offer in this place now is to introduce one new
>> ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
>> STDIN" operation, send tuples and close the operation. We can use the
>> ExecForeignInsert() routine for each buffer tuple if
>> ExecForeignBulkInsert() is not supported.
>
> Agreed.
In the next version (see attachment) of the patch i removed Begin/End
fdwapi routines. Now we have only the ExecForeignBulkInsert() routine.

--
Andrey Lepikhov
Postgres Professional

v3-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (24K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Ashutosh Bapat-3
In reply to this post by Andrey V. Lepikhov


On Wed, 17 Jun 2020 at 11:54, Andrey V. Lepikhov <[hidden email]> wrote:
On 6/15/20 10:26 AM, Ashutosh Bapat wrote:
> Thanks Andrey for the patch. I am glad that the patch has taken care
> of some corner cases already but there exist still more.
>
> COPY command constructed doesn't take care of dropped columns. There
> is code in deparseAnalyzeSql which constructs list of columns for a
> given foreign relation. 0002 patch attached here, moves that code to a
> separate function and reuses it for COPY. If you find that code change
> useful please include it in the main patch.

Thanks, i included it.

> 2. In the same case, if the foreign table declared locally didn't have
> any non-dropped columns but the relation that it referred to on the
> foreign server had some non-dropped columns, COPY command fails. I
> added a test case for this in 0002 but haven't fixed it.

I fixed it.
This is very special corner case. The problem was that COPY FROM does
not support semantics like the "INSERT INTO .. DEFAULT VALUES". To
simplify the solution, i switched off bulk copying for this case.

 > I think this work is useful. Please add it to the next commitfest so
 > that it's tracked.
Ok.

It looks like we call BeginForeignInsert and EndForeignInsert even though actual copy is performed using BeginForeignCopy, ExecForeignCopy and EndForeignCopy. BeginForeignInsert constructs the INSERT query which looks unnecessary. Also some of the other PgFdwModifyState members are initialized unnecessarily. It also gives an impression that we are using INSERT underneath the copy. Instead a better way would be to call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy instead of EndForeignInsert, if we are going to use COPY protocol to copy data to the foreign server. Corresponding postgres_fdw implementations need to change in order to do that.

This isn't a full review. I will continue reviewing this patch further.
--
Best Wishes,
Ashutosh
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
On 6/22/20 5:11 PM, Ashutosh Bapat wrote:

> <[hidden email] <mailto:[hidden email]>> wrote:
> It looks like we call BeginForeignInsert and EndForeignInsert even
> though actual copy is performed using BeginForeignCopy, ExecForeignCopy
> and EndForeignCopy. BeginForeignInsert constructs the INSERT query which
> looks unnecessary. Also some of the other PgFdwModifyState members are
> initialized unnecessarily. It also gives an impression that we are using
> INSERT underneath the copy. Instead a better way would be to
> call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy
> instead of EndForeignInsert, if we are going to use COPY protocol to
> copy data to the foreign server. Corresponding postgres_fdw
> implementations need to change in order to do that.
>
I did not answer for a long time, because of waiting for the results of
the discussion on Tomas approach to bulk INSERT/UPDATE/DELETE. It seems
more general.
I can move the query construction into the first execution of INSERT or
COPY operation. But another changes seems more invasive because
BeginForeignInsert/EndForeignInsert are used in the execPartition.c
module. We will need to pass copy/insert state of operation into
ExecFindPartition() and ExecCleanupTupleRouting().

--
regards,
Andrey Lepikhov
Postgres Professional


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
In reply to this post by Ashutosh Bapat-3


22.06.2020 17:11, Ashutosh Bapat пишет:

>
>
> On Wed, 17 Jun 2020 at 11:54, Andrey V. Lepikhov
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     On 6/15/20 10:26 AM, Ashutosh Bapat wrote:
>      > Thanks Andrey for the patch. I am glad that the patch has taken care
>      > of some corner cases already but there exist still more.
>      >
>      > COPY command constructed doesn't take care of dropped columns. There
>      > is code in deparseAnalyzeSql which constructs list of columns for a
>      > given foreign relation. 0002 patch attached here, moves that code
>     to a
>      > separate function and reuses it for COPY. If you find that code
>     change
>      > useful please include it in the main patch.
>
>     Thanks, i included it.
>
>      > 2. In the same case, if the foreign table declared locally didn't
>     have
>      > any non-dropped columns but the relation that it referred to on the
>      > foreign server had some non-dropped columns, COPY command fails. I
>      > added a test case for this in 0002 but haven't fixed it.
>
>     I fixed it.
>     This is very special corner case. The problem was that COPY FROM does
>     not support semantics like the "INSERT INTO .. DEFAULT VALUES". To
>     simplify the solution, i switched off bulk copying for this case.
>
>       > I think this work is useful. Please add it to the next commitfest so
>       > that it's tracked.
>     Ok.
>
>
> It looks like we call BeginForeignInsert and EndForeignInsert even
> though actual copy is performed using BeginForeignCopy, ExecForeignCopy
> and EndForeignCopy. BeginForeignInsert constructs the INSERT query which
> looks unnecessary. Also some of the other PgFdwModifyState members are
> initialized unnecessarily. It also gives an impression that we are using
> INSERT underneath the copy. Instead a better way would be to
> call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy
> instead of EndForeignInsert, if we are going to use COPY protocol to
> copy data to the foreign server. Corresponding postgres_fdw
> implementations need to change in order to do that.
Fixed.
I replaced names of CopyIn FDW API. Also the partition routing
initializer calls BeginForeignInsert or BeginForeignCopyIn routines in
accordance with value of ResultRelInfo::UseBulkModifying.
I introduced this parameter because foreign partitions can be placed at
foreign servers with different types of foreign wrapper. Not all
wrappers can support CopyIn API.
Also I ran the Tomas Vondra benchmark. At my laptop we have results:
* regular: 5000 ms.
* Tomas buffering patch: 11000 ms.
* This CopyIn patch: 8000 ms.

--
regards,
Andrey Lepikhov
Postgres Professional

v4-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (31K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Amit Langote
Hi Andrey,

Thanks for this work.  I have been reading through your patch and
here's a what I understand it does and how:

The patch aims to fix the restriction that COPYing into a foreign
table can't use multi-insert buffer mechanism effectively.  That's
because copy.c currently uses the ExecForeignInsert() FDW API which
can be passed only 1 row at a time.  postgres_fdw's implementation
issues an `INSERT INTO remote_table VALUES (...)` statement to the
remote side for each row, which is pretty inefficient for bulk loads.
The patch introduces a new FDW API ExecForeignCopyIn() that can
receive multiple rows and copy.c now calls it every time it flushes
the multi-insert buffer so that all the flushed rows can be sent to
the remote side in one go.  postgres_fdw's now issues a `COPY
remote_table FROM STDIN` to the remote server and
postgresExecForeignCopyIn() funnels the tuples flushed by the local
copy to the server side waiting for tuples on the COPY protocol.

Here are some comments on the patch.

* Why the "In" in these API names?

+   /* COPY a bulk of tuples into a foreign relation */
+   BeginForeignCopyIn_function BeginForeignCopyIn;
+   EndForeignCopyIn_function EndForeignCopyIn;
+   ExecForeignCopyIn_function ExecForeignCopyIn;

* fdwhandler.sgml should be updated with the description of these new APIs.

* As far as I can tell, the following copy.h additions are for an FDW
to use copy.c to obtain an external representation (char string) to
send to the remote side of the individual rows that are passed to
ExecForeignCopyIn():

+typedef void (*copy_data_dest_cb) (void *outbuf, int len);
+extern CopyState BeginForeignCopyTo(Relation rel);
+extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot);
+extern void EndForeignCopyTo(CopyState cstate);

So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow()
which in turn calls copy.c: CopyOneRowTo() which fills
CopyState.fe_msgbuf.  The data_dest_cb() callback that runs after
fe_msgbuf contains the full row simply copies it into a palloc'd char
buffer whose pointer is returned back to ExecForeignCopyIn().  I
wonder why not let FDWs implement the callback and pass it to copy.c
through BeginForeignCopyTo()?  For example, you could implement a
pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct
pointer of fe_msgbuf to send it to the remote server.

Do you think all FDWs would want to use copy,c like above?  If not,
maybe the above APIs are really postgres_fdw-specific?  Anyway, adding
comments above the definitions of these functions would be helpful.

* I see that the remote copy is performed from scratch on every call
of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
send the `COPY remote_table FROM STDIN` in
postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
when there are no errors during the copy?

I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
and sending `COPY remote_table FROM STDIN` only once instead of on
every flush -- and I see significant speedup.  Please check the
attached patch that applies on top of yours.  One problem I spotted
when trying my patch but didn't spend much time debugging is that
local COPY cannot be interrupted by Ctrl+C anymore, but that should be
fixable by adjusting PG_TRY() blocks.

* ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

pgfdw-copy-buffering-amit-suggests.patch (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
On 7/16/20 2:14 PM, Amit Langote wrote:

> Hi Andrey,
>
> Thanks for this work.  I have been reading through your patch and
> here's a what I understand it does and how:
>
> The patch aims to fix the restriction that COPYing into a foreign
> table can't use multi-insert buffer mechanism effectively.  That's
> because copy.c currently uses the ExecForeignInsert() FDW API which
> can be passed only 1 row at a time.  postgres_fdw's implementation
> issues an `INSERT INTO remote_table VALUES (...)` statement to the
> remote side for each row, which is pretty inefficient for bulk loads.
> The patch introduces a new FDW API ExecForeignCopyIn() that can
> receive multiple rows and copy.c now calls it every time it flushes
> the multi-insert buffer so that all the flushed rows can be sent to
> the remote side in one go.  postgres_fdw's now issues a `COPY
> remote_table FROM STDIN` to the remote server and
> postgresExecForeignCopyIn() funnels the tuples flushed by the local
> copy to the server side waiting for tuples on the COPY protocol.

Fine

> Here are some comments on the patch.
>
> * Why the "In" in these API names?
>
> +   /* COPY a bulk of tuples into a foreign relation */
> +   BeginForeignCopyIn_function BeginForeignCopyIn;
> +   EndForeignCopyIn_function EndForeignCopyIn;
> +   ExecForeignCopyIn_function ExecForeignCopyIn;

I used an analogy from copy.c.

> * fdwhandler.sgml should be updated with the description of these new APIs.


> * As far as I can tell, the following copy.h additions are for an FDW
> to use copy.c to obtain an external representation (char string) to
> send to the remote side of the individual rows that are passed to
> ExecForeignCopyIn():
>
> +typedef void (*copy_data_dest_cb) (void *outbuf, int len);
> +extern CopyState BeginForeignCopyTo(Relation rel);
> +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot);
> +extern void EndForeignCopyTo(CopyState cstate);
>
> So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow()
> which in turn calls copy.c: CopyOneRowTo() which fills
> CopyState.fe_msgbuf.  The data_dest_cb() callback that runs after
> fe_msgbuf contains the full row simply copies it into a palloc'd char
> buffer whose pointer is returned back to ExecForeignCopyIn().  I
> wonder why not let FDWs implement the callback and pass it to copy.c
> through BeginForeignCopyTo()?  For example, you could implement a
> pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct
> pointer of fe_msgbuf to send it to the remote server.
It is good point! Thank you.

> Do you think all FDWs would want to use copy,c like above?  If not,
> maybe the above APIs are really postgres_fdw-specific?  Anyway, adding
> comments above the definitions of these functions would be helpful.
Agreed
>
> * I see that the remote copy is performed from scratch on every call
> of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> send the `COPY remote_table FROM STDIN` in
> postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> when there are no errors during the copy?
It is not possible. FDW share one connection between all foreign
relations from a server. If two or more partitions will be placed at one
foreign server you will have problems with concurrent COPY command. May
be we can create new connection for each partition?
>
> I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> and sending `COPY remote_table FROM STDIN` only once instead of on
> every flush -- and I see significant speedup.  Please check the
> attached patch that applies on top of yours.
I integrated first change and rejected the second by the reason as above.
   One problem I spotted
> when trying my patch but didn't spend much time debugging is that
> local COPY cannot be interrupted by Ctrl+C anymore, but that should be
> fixable by adjusting PG_TRY() blocks.
Thanks
>
> * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency.
+1

I will post a new version of the patch a little bit later.

--
regards,
Andrey Lepikhov
Postgres Professional


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
In reply to this post by Amit Langote
On 7/16/20 2:14 PM, Amit Langote wrote:
> Amit Langote
> EnterpriseDB: http://www.enterprisedb.com
>

Version 5 of the patch. With changes caused by Amit's comments.

--
regards,
Andrey Lepikhov
Postgres Professional

v5-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch (38K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Alexey Kondratov-2
Hi Andrey,

On 2020-07-23 09:23, Andrey V. Lepikhov wrote:
> On 7/16/20 2:14 PM, Amit Langote wrote:
>> Amit Langote
>> EnterpriseDB: http://www.enterprisedb.com
>>
>
> Version 5 of the patch. With changes caused by Amit's comments.

Just got a segfault with your v5 patch by deleting from a foreign table.
Here is a part of backtrace:

   * frame #0: 0x00000001029069ec
postgres`ExecShutdownForeignScan(node=0x00007ff28c8909b0) at
nodeForeignscan.c:385:3
     frame #1: 0x00000001028e7b06
postgres`ExecShutdownNode(node=0x00007ff28c8909b0) at
execProcnode.c:779:4
     frame #2: 0x000000010299b3fa
postgres`planstate_walk_members(planstates=0x00007ff28c8906d8, nplans=1,
walker=(postgres`ExecShutdownNode at execProcnode.c:752),
context=0x0000000000000000) at nodeFuncs.c:3998:7
     frame #3: 0x000000010299b010
postgres`planstate_tree_walker(planstate=0x00007ff28c8904c0,
walker=(postgres`ExecShutdownNode at execProcnode.c:752),
context=0x0000000000000000) at nodeFuncs.c:3914:8
     frame #4: 0x00000001028e7ab7
postgres`ExecShutdownNode(node=0x00007ff28c8904c0) at
execProcnode.c:771:2

(lldb) f 0
frame #0: 0x00000001029069ec
postgres`ExecShutdownForeignScan(node=0x00007ff28c8909b0) at
nodeForeignscan.c:385:3
    382 FdwRoutine *fdwroutine = node->fdwroutine;
    383
    384 if (fdwroutine->ShutdownForeignScan)
-> 385 fdwroutine->ShutdownForeignScan(node);
    386 }
(lldb) p node->fdwroutine->ShutdownForeignScan
(ShutdownForeignScan_function) $1 = 0x7f7f7f7f7f7f7f7f

It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a
correct pointer to the required function.

I haven't had a chance to look closer on the code, but you can easily
reproduce this error with the attached script (patched Postgres binaries
should be available in the PATH). It works well with master and fails
with your patch applied.


Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

setup2.sh (886 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov


27.07.2020 21:34, Alexey Kondratov пишет:

> Hi Andrey,
>
> On 2020-07-23 09:23, Andrey V. Lepikhov wrote:
>> On 7/16/20 2:14 PM, Amit Langote wrote:
>>> Amit Langote
>>> EnterpriseDB: http://www.enterprisedb.com
>>>
>>
>> Version 5 of the patch. With changes caused by Amit's comments.
>
> Just got a segfault with your v5 patch by deleting from a foreign table.
> It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a
> correct pointer to the required function.
>
> I haven't had a chance to look closer on the code, but you can easily
> reproduce this error with the attached script (patched Postgres binaries
> should be available in the PATH). It works well with master and fails
> with your patch applied.

I used master a3ab7a707d and v5 version of the patch with your script.
No errors found. Can you check your test case?

--
regards,
Andrey Lepikhov
Postgres Professional


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Alexey Kondratov-2
On 2020-07-28 03:33, Andrey Lepikhov wrote:

> 27.07.2020 21:34, Alexey Kondratov пишет:
>> Hi Andrey,
>>
>> On 2020-07-23 09:23, Andrey V. Lepikhov wrote:
>>> On 7/16/20 2:14 PM, Amit Langote wrote:
>>>> Amit Langote
>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>
>>>
>>> Version 5 of the patch. With changes caused by Amit's comments.
>>
>> Just got a segfault with your v5 patch by deleting from a foreign
>> table. It seems that ShutdownForeignScan inside node->fdwroutine
>> doesn't have a correct pointer to the required function.
>>
>> I haven't had a chance to look closer on the code, but you can easily
>> reproduce this error with the attached script (patched Postgres
>> binaries should be available in the PATH). It works well with master
>> and fails with your patch applied.
>
> I used master a3ab7a707d and v5 version of the patch with your script.
> No errors found. Can you check your test case?

Yes, my bad. I forgot to re-install postgres_fdw extension, only did it
for postgres core, sorry for disturb.


Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Amit Langote
In reply to this post by Andrey V. Lepikhov
Hi Andrey,

Thanks for updating the patch.  I will try to take a look later.

On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov
<[hidden email]> wrote:
> On 7/16/20 2:14 PM, Amit Langote wrote:
> > * Why the "In" in these API names?
> >
> > +   /* COPY a bulk of tuples into a foreign relation */
> > +   BeginForeignCopyIn_function BeginForeignCopyIn;
> > +   EndForeignCopyIn_function EndForeignCopyIn;
> > +   ExecForeignCopyIn_function ExecForeignCopyIn;
>
> I used an analogy from copy.c.

Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it
makes sense to have "In" here, but maybe we don't, so how about
leaving out the "In" for clarity?

> > * I see that the remote copy is performed from scratch on every call
> > of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> > send the `COPY remote_table FROM STDIN` in
> > postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> > when there are no errors during the copy?
>
> It is not possible. FDW share one connection between all foreign
> relations from a server. If two or more partitions will be placed at one
> foreign server you will have problems with concurrent COPY command.

Ah, you're right.  I didn't consider multiple foreign partitions
pointing to the same server.  Indeed, we would need separate
connections to a given server to COPY to multiple remote relations on
that server in parallel.

> May be we can create new connection for each partition?

Yeah, perhaps, although it sounds like something that might be more
generally useful and so we should work on that separately if at all.

> > I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> > and sending `COPY remote_table FROM STDIN` only once instead of on
> > every flush -- and I see significant speedup.  Please check the
> > attached patch that applies on top of yours.
>
> I integrated first change and rejected the second by the reason as above.

Thanks.

Will send more comments after reading the v5 patch.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Andrey V. Lepikhov
On 7/29/20 1:03 PM, Amit Langote wrote:

> Hi Andrey,
>
> Thanks for updating the patch.  I will try to take a look later.
>
> On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov
> <[hidden email]> wrote:
>> On 7/16/20 2:14 PM, Amit Langote wrote:
>>> * Why the "In" in these API names?
>>>
>>> +   /* COPY a bulk of tuples into a foreign relation */
>>> +   BeginForeignCopyIn_function BeginForeignCopyIn;
>>> +   EndForeignCopyIn_function EndForeignCopyIn;
>>> +   ExecForeignCopyIn_function ExecForeignCopyIn;
>>
>> I used an analogy from copy.c.
>
> Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it
> makes sense to have "In" here, but maybe we don't, so how about
> leaving out the "In" for clarity?
Ok, sounds good.

>
>>> * I see that the remote copy is performed from scratch on every call
>>> of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
>>> send the `COPY remote_table FROM STDIN` in
>>> postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
>>> when there are no errors during the copy?
>>
>> It is not possible. FDW share one connection between all foreign
>> relations from a server. If two or more partitions will be placed at one
>> foreign server you will have problems with concurrent COPY command.
>
> Ah, you're right.  I didn't consider multiple foreign partitions
> pointing to the same server.  Indeed, we would need separate
> connections to a given server to COPY to multiple remote relations on
> that server in parallel.
>
>> May be we can create new connection for each partition?
>
> Yeah, perhaps, although it sounds like something that might be more
> generally useful and so we should work on that separately if at all.
I will try to prepare a separate patch.

>
>>> I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
>>> and sending `COPY remote_table FROM STDIN` only once instead of on
>>> every flush -- and I see significant speedup.  Please check the
>>> attached patch that applies on top of yours.
>>
>> I integrated first change and rejected the second by the reason as above.
>
> Thanks.
>
> Will send more comments after reading the v5 patch.
>
Ok. I'll be waiting for the end of your review.

--
regards,
Andrey Lepikhov
Postgres Professional


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Amit Langote
Hi Andrey,

On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov
<[hidden email]> wrote:
> > Will send more comments after reading the v5 patch.
> >
> Ok. I'll be waiting for the end of your review.

Sorry about the late reply.

If you'd like to send a new version for other reviewers, please feel
free.  I haven't managed to take more than a brief look at the v5
patch, but will try to look at it (or maybe the new version if you
post) more closely this week.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: [POC] Fast COPY FROM command for the table with foreign partitions

Amit Langote
On Mon, Aug 3, 2020 at 8:38 PM Amit Langote <[hidden email]> wrote:

> On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov
> <[hidden email]> wrote:
> > > Will send more comments after reading the v5 patch.
> > >
> > Ok. I'll be waiting for the end of your review.
>
> Sorry about the late reply.
>
> If you'd like to send a new version for other reviewers, please feel
> free.  I haven't managed to take more than a brief look at the v5
> patch, but will try to look at it (or maybe the new version if you
> post) more closely this week.
I was playing around with v5 and I noticed an assertion failure which
I concluded is due to improper setting of ri_usesBulkModify.  You can
reproduce it with these steps.

create extension postgres_fdw;
create server lb foreign data wrapper postgres_fdw ;
create user mapping for current_user server lb;
create table foo (a int, b int) partition by list (a);
create table foo1 (like foo);
create foreign table ffoo1 partition of foo for values in (1) server
lb options (table_name 'foo1');
create table foo2 (like foo);
create foreign table ffoo2 partition of foo for values in (2) server
lb options (table_name 'foo2');
create function print_new_row() returns trigger language plpgsql as $$
begin raise notice '%', new; return new; end; $$;
create trigger ffoo1_br_trig before insert on ffoo1 for each row
execute function print_new_row();
copy foo from stdin csv;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,2
>> 2,3
>> \.
NOTICE:  (1,2)
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

#0  0x00007f2d5e266337 in raise () from /lib64/libc.so.6
#1  0x00007f2d5e267a28 in abort () from /lib64/libc.so.6
#2  0x0000000000aafd5d in ExceptionalCondition
(conditionName=0x7f2d37b468d0 "!resultRelInfo->ri_usesBulkModify ||
resultRelInfo->ri_FdwRoutine->BeginForeignCopyIn == NULL",
    errorType=0x7f2d37b46680 "FailedAssertion",
fileName=0x7f2d37b4677f "postgres_fdw.c", lineNumber=1863) at
assert.c:67
#3  0x00007f2d37b3b0fe in postgresExecForeignInsert (estate=0x2456320,
resultRelInfo=0x23a8f58, slot=0x23a9480, planSlot=0x0) at
postgres_fdw.c:1862
#4  0x000000000066362a in CopyFrom (cstate=0x23a8d40) at copy.c:3331

The problem is that partition ffoo1's BR trigger prevents it from
using multi-insert, but its ResultRelInfo.ri_usesBulkModify is true,
which is copied from its parent.  We should really check the same
things for a partition that CopyFrom() checks for the main target
relation (root parent) when deciding whether to use multi-insert.

However instead of duplicating the same logic to do so in two places
(CopyFrom and ExecInitPartitionInfo), I think it might be a good idea
to refactor the code to decide if multi-insert mode can be used for a
given relation by checking its properties and put it in some place
that both the main target relation and partitions need to invoke.
InitResultRelInfo() seems to be one such place.

Also, it might be a good idea to use ri_usesBulkModify more generally
than only for foreign relations as the patch currently does, because I
can see that it can replace the variable insertMethod in CopyFrom().
Having both insertMethod and ri_usesBulkModify in each ResultRelInfo
seems confusing and bug-prone.

Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to
reflect its scope.

Please check the attached delta patch that applies on top of v5 to see
what that would look like.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

ri_usesMultiInsert.patch (26K) Download Attachment
12