pgsql: Allow insert and update tuple routing and COPY for foreign table

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

pgsql: Allow insert and update tuple routing and COPY for foreign table

Robert Haas-5
Allow insert and update tuple routing and COPY for foreign tables.

Also enable this for postgres_fdw.

Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
patch series of which this is a part has been reviewed by Amit
Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
and me.  Minor documentation changes to the final version by me.

Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/3d956d9562aa4811b5eaaaf5314d361c61be2ae0

Modified Files
--------------
contrib/file_fdw/input/file_fdw.source         |   5 +
contrib/file_fdw/output/file_fdw.source        |   9 +-
contrib/postgres_fdw/expected/postgres_fdw.out | 334 +++++++++++++++++++++++++
contrib/postgres_fdw/postgres_fdw.c            |  96 +++++++
contrib/postgres_fdw/sql/postgres_fdw.sql      | 237 ++++++++++++++++++
doc/src/sgml/ddl.sgml                          |   8 +-
doc/src/sgml/fdwhandler.sgml                   |  66 +++++
doc/src/sgml/ref/copy.sgml                     |   5 +-
doc/src/sgml/ref/update.sgml                   |   3 +
src/backend/commands/copy.c                    |  96 +++++--
src/backend/executor/execMain.c                |   8 +-
src/backend/executor/execPartition.c           | 103 +++++---
src/backend/executor/nodeModifyTable.c         |  23 +-
src/include/executor/execPartition.h           |   8 +-
src/include/foreign/fdwapi.h                   |   8 +
src/include/nodes/execnodes.h                  |   6 +
16 files changed, 924 insertions(+), 91 deletions(-)

Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
On Fri, 2018-04-06 at 23:24 +0000, Robert Haas wrote:

> Allow insert and update tuple routing and COPY for foreign tables.
>
> Also enable this for postgres_fdw.
>
> Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
> patch series of which this is a part has been reviewed by Amit
> Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
> and me.  Minor documentation changes to the final version by me.
>
> Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...
This commit makes life hard for foreign data wrappers that support
data modifications.

If a FDW implements ExecForeignInsert, this commit automatically assumes
that it also supports COPY FROM.  It will call ExecForeignInsert without
calling PlanForeignModify and BeginForeignModify, and a FDW that does not
expect that will probably fail.

So this commit silently turns a functioning FDW into a broken FDW.
That is not nice.  Probably not every FDW is aware of this change, and
maybe there are FDWs that support INSERT but don't want to support COPY
for some reason.

I propose that PostgreSQL only allows COPY FROM on a foreign table if the FDW
implements BeginForeignInsert.  The attached patch implements that.

I think this should be backpatched to v11.

Yours,
Laurenz Albe

0001-COPY-FROM-on-foreign-tables-requires-BeginForeignIns.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
On Mon, 2019-04-22 at 21:45 +0900, Etsuro Fujita wrote:

Thanks for looking into this!

> (2019/04/20 20:53), Laurenz Albe wrote:
> > On Fri, 2018-04-06 at 23:24 +0000, Robert Haas wrote:
> > > Allow insert and update tuple routing and COPY for foreign tables.
> > >
> > > Also enable this for postgres_fdw.
> > >
> > > Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
> > > patch series of which this is a part has been reviewed by Amit
> > > Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
> > > and me.  Minor documentation changes to the final version by me.
> > >
> > > Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...
> >
> > This commit makes life hard for foreign data wrappers that support
> > data modifications.
> >
> > If a FDW implements ExecForeignInsert, this commit automatically assumes
> > that it also supports COPY FROM.  It will call ExecForeignInsert without
> > calling PlanForeignModify and BeginForeignModify, and a FDW that does not
> > expect that will probably fail.
>
> This is not 100% correct; the FDW documentation says:
>
>      <para>
>       Tuples inserted into a partitioned table by
> <command>INSERT</command> or
>       <command>COPY FROM</command> are routed to partitions.  If an FDW
>       supports routable foreign-table partitions, it should also provide the
>       following callback functions.  These functions are also called when
>       <command>COPY FROM</command> is executed on a foreign table.
>      </para>
I don't see the difference between the documentation and what I wrote above.

Before v11, a FDW could expect that ExecForeignInsert is only called if
BeginForeignModify was called earlier.
That has silently changed with v11.

> > maybe there are FDWs that support INSERT but don't want to support COPY
> > for some reason.
>
> I agree on that point.
>
> > I propose that PostgreSQL only allows COPY FROM on a foreign table if the FDW
> > implements BeginForeignInsert.  The attached patch implements that.
>
> I don't think that is a good idea, because there might be some FDWs that
> want to support COPY FROM on foreign tables without providing
> BeginForeignInsert.  (As for INSERT into foreign tables, we actually
> allow FDWs to support it without providing PlanForeignModify,
> BeginForeignModify, or EndForeignModify.)
>
> It's permissible to throw an error in BeginForeignInsert, so what I was
> thinking for FDWs that don't want to support COPY FROM and
> INSERT/UPDATE/COPY FROM tuple routing was to provide BeginForeignInsert
> implementing something like this:
>
> static void
> fooBeginForeignInsert(ModifyTableState *mtstate,
>                        ResultRelInfo *resultRelInfo)
> {
>      Relation    rel = resultRelInfo->ri_RelationDesc;
>
>      if (mtstate->ps.plan == NULL)
>          ereport(ERROR,
>                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                   errmsg("cannot copy to foreign table \"%s\"",
>                          RelationGetRelationName(rel))));
>      else
>          ereport(ERROR,
>                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                   errmsg("cannot route tuples into foreign table \"%s\"",
>                          RelationGetRelationName(rel))));
> }
Sure, it is not hard to modify a FDW to continue working with v11.

My point is that this should not be necessary.

If a FDW worked well with v10, it should continue to work with v11
unless there is a necessity for a compatibility-breaking change.

On the other hand, if a FDW wants to support COPY in v11 and has no
need for BeginForeignInsert to support that, it is a simple exercise
for it to provide an empty BeginForeignInsert just to signal that it
wants to support COPY.

I realized that my previous patch forgot to check for tuple routing,
updated patch attached.

Yours,
Laurenz Albe

0001-Foreign-table-COPY-FROM-and-tuple-routing-requires-B.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Robert Haas
On Mon, Apr 22, 2019 at 3:37 PM Laurenz Albe <[hidden email]> wrote:
> Sure, it is not hard to modify a FDW to continue working with v11.
>
> My point is that this should not be necessary.

I'm not sure whether this proposal is a good idea or a bad idea, but I
think that it's inevitable that FDWs are going to need some updating
for new releases as the API evolves.  That has happened before and
will continue to happen.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Andres Freund
In reply to this post by Laurenz Albe
Hi,

On 2019-04-22 21:37:25 +0200, Laurenz Albe wrote:

> Subject: [PATCH] Foreign table COPY FROM and tuple routing requires
>  BeginForeignInsert
>
> Commit 3d956d956a introduced support for foreign tables as partitions
> and COPY FROM on foreign tables.
>
> If a foreign data wrapper supports data modifications, but either has
> not adapted to this change yet or doesn't want to support it
> for other reasons, it probably got broken by the above commit,
> because COPY will just call ExecForeignInsert anyway, which might not
> work because neither PlanForeignModify nor BeginForeignModify have
> been called.
>
> To avoid breaking third-party foreign data wrappers in that way, allow
> COPY FROM and tuple routing for foreign tables only if the foreign data
> wrapper implements BeginForeignInsert.

Isn't this worse though? Before this it's an API change between major
versions. With this it's an API change in a *minor* version. Sure, it's
one that doesn't crash, but it's still a pretty substantial function
regression, no?

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
In reply to this post by Robert Haas
On Mon, 2019-04-22 at 16:24 -0400, Robert Haas wrote:
> On Mon, Apr 22, 2019 at 3:37 PM Laurenz Albe <[hidden email]> wrote:
> > Sure, it is not hard to modify a FDW to continue working with v11.
> >
> > My point is that this should not be necessary.
>
> I'm not sure whether this proposal is a good idea or a bad idea, but I
> think that it's inevitable that FDWs are going to need some updating
> for new releases as the API evolves.  That has happened before and
> will continue to happen.

Absolutely.
I am just unhappy that this change caused unnecessary breakage.

If you developed a read-only FDW for 9.2, it didn't break with the
write support introduced in 9.3, because that used different API
functions.  That's how it should be IMHO.

If you developed a FDW for 9.1, it got broken in 9.2, because the
API had to change to allow returning multiple paths.
That was unfortunate but necessary, so it is ok.

Nothing in this patch required an incompatible change.

Yours,
Laurenz Albe



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
In reply to this post by Andres Freund
On Mon, 2019-04-22 at 13:27 -0700, Andres Freund wrote:

> On 2019-04-22 21:37:25 +0200, Laurenz Albe wrote:
> > Commit 3d956d956a introduced support for foreign tables as partitions
> > and COPY FROM on foreign tables.
> >
> > If a foreign data wrapper supports data modifications, but either has
> > not adapted to this change yet or doesn't want to support it
> > for other reasons, it probably got broken by the above commit,
> > because COPY will just call ExecForeignInsert anyway, which might not
> > work because neither PlanForeignModify nor BeginForeignModify have
> > been called.
> >
> > To avoid breaking third-party foreign data wrappers in that way, allow
> > COPY FROM and tuple routing for foreign tables only if the foreign data
> > wrapper implements BeginForeignInsert.
>
> Isn't this worse though? Before this it's an API change between major
> versions. With this it's an API change in a *minor* version. Sure, it's
> one that doesn't crash, but it's still a pretty substantial function
> regression, no?

Hm, that's a good point.
You could say that this patch is too late, because a FDW might already be
relying on COPY FROM to work without BeginForeignInsert in v11.

How about just applying the patch from v12 on?
Then it is a behavior change in a major release, which is acceptable.
It requires the imaginary FDW above to add an empty BeginForeignInsert
callback function, but will unbreak FDW that slept through the change
that added COPY support.

Yours,
Laurenz Albe



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Andres Freund
Hi,

On 2019-04-22 22:56:20 +0200, Laurenz Albe wrote:

> On Mon, 2019-04-22 at 13:27 -0700, Andres Freund wrote:
> > On 2019-04-22 21:37:25 +0200, Laurenz Albe wrote:
> > > Commit 3d956d956a introduced support for foreign tables as partitions
> > > and COPY FROM on foreign tables.
> > >
> > > If a foreign data wrapper supports data modifications, but either has
> > > not adapted to this change yet or doesn't want to support it
> > > for other reasons, it probably got broken by the above commit,
> > > because COPY will just call ExecForeignInsert anyway, which might not
> > > work because neither PlanForeignModify nor BeginForeignModify have
> > > been called.
> > >
> > > To avoid breaking third-party foreign data wrappers in that way, allow
> > > COPY FROM and tuple routing for foreign tables only if the foreign data
> > > wrapper implements BeginForeignInsert.
> >
> > Isn't this worse though? Before this it's an API change between major
> > versions. With this it's an API change in a *minor* version. Sure, it's
> > one that doesn't crash, but it's still a pretty substantial function
> > regression, no?
>
> Hm, that's a good point.
> You could say that this patch is too late, because a FDW might already be
> relying on COPY FROM to work without BeginForeignInsert in v11.

I think that's the case.


> How about just applying the patch from v12 on?
> Then it is a behavior change in a major release, which is acceptable.
> It requires the imaginary FDW above to add an empty BeginForeignInsert
> callback function, but will unbreak FDW that slept through the change
> that added COPY support.

I fail to see the advantage. It'll still require FDWs to be fixed to
work correctly for v11, but additionally adds another set of API
differences that needs to be fixed by another set of FDWs.  I think this
ship simply has sailed.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
On Mon, 2019-04-22 at 14:07 -0700, Andres Freund wrote:

> How about just applying the patch from v12 on?
> > Then it is a behavior change in a major release, which is acceptable.
> > It requires the imaginary FDW above to add an empty BeginForeignInsert
> > callback function, but will unbreak FDW that slept through the change
> > that added COPY support.
>
> I fail to see the advantage. It'll still require FDWs to be fixed to
> work correctly for v11, but additionally adds another set of API
> differences that needs to be fixed by another set of FDWs.  I think this
> ship simply has sailed.

I can accept that (having fixed my own FDW), but I am worried that it will
cause problems for FDW users.  Well, I guess they can always avoid COPY if
they don't want FDWs to crash.

Yours,
Laurenz Albe



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
In reply to this post by Laurenz Albe
On Wed, 2019-04-24 at 20:54 +0900, Etsuro Fujita wrote:
> How about adding to the documentation for BeginForeignInsert a mention
> that if an FDW doesn't support COPY FROM and/or routable foreign tables,
> it must throw an error in BeginForeignInsert accordingly.

Sure, some more documentation would be good.

The documentation of ExecForeignInsert should mention something like:

  ExecForeignInsert is called for INSERT statements as well
  as for COPY FROM and tuples that are inserted into a foreign table
  because it is a partition of a partitioned table.

  In the case of a normal INSERT, BeginForeignModify will be called
  before ExecForeignInsert to perform any necessary setup.
  In the other cases, this setup has to happen in BeginForeignInsert.

  Before PostgreSQL v11, a foreign data wrapper could be certain that
  BeginForeignModify is always called before ExecForeignInsert.
  This is no longer true.

> > On the other hand, if a FDW wants to support COPY in v11 and has no
> > need for BeginForeignInsert to support that, it is a simple exercise
> > for it to provide an empty BeginForeignInsert just to signal that it
> > wants to support COPY.
>
> That seems to me inconsistent with the concept of the existing APIs for
> updating foreign tables, because for an FDW that wants to support
> INSERT/UPDATE/DELETE and has no need for
> PlanForeignModify/BeginForeignModify, those APIs don't require the FDW
> to provide empty PlanForeignModify/BeginForeignModify to tell the core
> that it wants to support INSERT/UPDATE/DELETE.

That is true, but so far there hasn't been a change to the FDW API that
caused a callback to be invoked in a different fashion than it used to be.

Perhaps it would have been better to create a new callback like
ExecForeignCopy with the same signature as ExecForeignInsert so that
you can use the same callback function for both if you want.
That would also have avoided the breakage.
But, of course it is too late for that now.

Note that postgres_fdw would have been broken by that API change as well
if it hasn't been patched.

At the very least, this should have been mentioned in the list of
incompatible changes for v11.

Yours,
Laurenz Albe



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Simon Riggs
In reply to this post by Laurenz Albe
On Wed, 24 Apr 2019 at 12:55, Etsuro Fujita <[hidden email]> wrote:
 
> My point is that this should not be necessary.

In my opinion, I think this is necessary...

Could we decide by looking at what FDWs are known to exist?  I hope we are trying to avoid breakage in the smallest number of FDWs.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
On Wed, 2019-04-24 at 14:25 +0100, Simon Riggs wrote:
> On Wed, 24 Apr 2019 at 12:55, Etsuro Fujita <[hidden email]> wrote:
>  
> > > My point is that this should not be necessary.
> >
> > In my opinion, I think this is necessary...
>
> Could we decide by looking at what FDWs are known to exist?
> I hope we are trying to avoid breakage in the smallest number of FDWs.

A good idea.  I don't volunteer to go through the list, but I had a look
at Multicorn, which is a FDW framework used by many FDWs, and it seems
to rely on multicornBeginForeignModify being called before
multicornExecForeignInsert (the former sets up a MulticornModifyState
used by the latter).

https://github.com/Kozea/Multicorn/blob/master/src/multicorn.c

Multicorn obviously hasn't got the message yet that the API has
changed in an incompatible fashion, so I'd argue that every
Multicorn FDW with write support is currently broken.


As Andres has argued above, it is too late to do anything more about
it than to document this and warn FDW authors as good as we can.

Yours,
Laurenz Albe



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Laurenz Albe
In reply to this post by Laurenz Albe
On Thu, 2019-04-25 at 22:17 +0900, Etsuro Fujita wrote:

> > The documentation of ExecForeignInsert should mention something like:
> >
> >    ExecForeignInsert is called for INSERT statements as well
> >    as for COPY FROM and tuples that are inserted into a foreign table
> >    because it is a partition of a partitioned table.
> >
> >    In the case of a normal INSERT, BeginForeignModify will be called
> >    before ExecForeignInsert to perform any necessary setup.
> >    In the other cases, this setup has to happen in BeginForeignInsert.
>
> These seem a bit redundant to me [...]
>
> OK, how about something like the attached?  I reworded this a bit, though.

I like your patch better than my wording.

Thanks for the effort!

Yours,
Laurenz Albe