Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Etsuro Fujita
(2019/04/20 20:53), Laurenz Albe wrote:

> On Fri, 2018-04-06 at 23:24 +0000, Robert Haas wrote:
>> Allow insert and update tuple routing and COPY for foreign tables.
>>
>> Also enable this for postgres_fdw.
>>
>> Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
>> patch series of which this is a part has been reviewed by Amit
>> Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
>> and me.  Minor documentation changes to the final version by me.
>>
>> Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...
>
> This commit makes life hard for foreign data wrappers that support
> data modifications.

Will look into this.

Best regards,
Etsuro Fujita



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Amit Langote-2
Hi,

On 2019/04/20 20:53, Laurenz Albe wrote:

> On Fri, 2018-04-06 at 23:24 +0000, Robert Haas wrote:
>> Allow insert and update tuple routing and COPY for foreign tables.
>>
>> Also enable this for postgres_fdw.
>>
>> Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
>> patch series of which this is a part has been reviewed by Amit
>> Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
>> and me.  Minor documentation changes to the final version by me.
>>
>> Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...
>
> This commit makes life hard for foreign data wrappers that support
> data modifications.
>
> If a FDW implements ExecForeignInsert, this commit automatically assumes
> that it also supports COPY FROM.  It will call ExecForeignInsert without
> calling PlanForeignModify and BeginForeignModify, and a FDW that does not
> expect that will probably fail.
>
> So this commit silently turns a functioning FDW into a broken FDW.
> That is not nice.  Probably not every FDW is aware of this change, and
> maybe there are FDWs that support INSERT but don't want to support COPY
> for some reason.

That seems like an oversight to me.  I agree that we had better checked
that a table's FDW provides BeginForeignInsert() before proceeding with
copying into the table, as your patch teaches CopyFrom() to do.

> I propose that PostgreSQL only allows COPY FROM on a foreign table if the FDW
> implements BeginForeignInsert.  The attached patch implements that.

Looks good to me, including the documentation change.

> I think this should be backpatched to v11.

Agreed.

Thanks,
Amit



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Etsuro Fujita
In reply to this post by Etsuro Fujita
(2019/04/20 20:53), Laurenz Albe wrote:

> On Fri, 2018-04-06 at 23:24 +0000, Robert Haas wrote:
>> Allow insert and update tuple routing and COPY for foreign tables.
>>
>> Also enable this for postgres_fdw.
>>
>> Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
>> patch series of which this is a part has been reviewed by Amit
>> Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
>> and me.  Minor documentation changes to the final version by me.
>>
>> Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...
>
> This commit makes life hard for foreign data wrappers that support
> data modifications.
>
> If a FDW implements ExecForeignInsert, this commit automatically assumes
> that it also supports COPY FROM.  It will call ExecForeignInsert without
> calling PlanForeignModify and BeginForeignModify, and a FDW that does not
> expect that will probably fail.

This is not 100% correct; the FDW documentation says:

     <para>
      Tuples inserted into a partitioned table by
<command>INSERT</command> or
      <command>COPY FROM</command> are routed to partitions.  If an FDW
      supports routable foreign-table partitions, it should also provide the
      following callback functions.  These functions are also called when
      <command>COPY FROM</command> is executed on a foreign table.
     </para>

> maybe there are FDWs that support INSERT but don't want to support COPY
> for some reason.

I agree on that point.

> I propose that PostgreSQL only allows COPY FROM on a foreign table if the FDW
> implements BeginForeignInsert.  The attached patch implements that.

I don't think that is a good idea, because there might be some FDWs that
want to support COPY FROM on foreign tables without providing
BeginForeignInsert.  (As for INSERT into foreign tables, we actually
allow FDWs to support it without providing PlanForeignModify,
BeginForeignModify, or EndForeignModify.)

It's permissible to throw an error in BeginForeignInsert, so what I was
thinking for FDWs that don't want to support COPY FROM and
INSERT/UPDATE/COPY FROM tuple routing was to provide BeginForeignInsert
implementing something like this:

static void
fooBeginForeignInsert(ModifyTableState *mtstate,
                       ResultRelInfo *resultRelInfo)
{
     Relation    rel = resultRelInfo->ri_RelationDesc;

     if (mtstate->ps.plan == NULL)
         ereport(ERROR,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot copy to foreign table \"%s\"",
                         RelationGetRelationName(rel))));
     else
         ereport(ERROR,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot route tuples into foreign table \"%s\"",
                         RelationGetRelationName(rel))));
}

Best regards,
Etsuro Fujita



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Amit Langote-2
On 2019/04/22 21:45, Etsuro Fujita wrote:

> (2019/04/20 20:53), Laurenz Albe wrote:
>> I propose that PostgreSQL only allows COPY FROM on a foreign table if
>> the FDW
>> implements BeginForeignInsert.  The attached patch implements that.
>
> I don't think that is a good idea, because there might be some FDWs that
> want to support COPY FROM on foreign tables without providing
> BeginForeignInsert.  (As for INSERT into foreign tables, we actually allow
> FDWs to support it without providing PlanForeignModify,
> BeginForeignModify, or EndForeignModify.)

I now understand why Laurenz's patch would in fact be a regression for
FDWs that do support COPY FROM and partition tuple routing without
providing BeginForeignInsert, although my first reaction was the opposite,
which was based on thinking (without confirming) that it's the core that
would crash due to initialization step being absent, but that's not the case.

The documentation [1] also says:

  If the BeginForeignInsert pointer is set to NULL, no action is taken for
  the initialization.

[1]
https://www.postgresql.org/docs/current/fdw-callbacks.html#FDW-CALLBACKS-UPDATE

> It's permissible to throw an error in BeginForeignInsert, so what I was
> thinking for FDWs that don't want to support COPY FROM and
> INSERT/UPDATE/COPY FROM tuple routing was to provide BeginForeignInsert
> implementing something like this:
>
> static void
> fooBeginForeignInsert(ModifyTableState *mtstate,
>                       ResultRelInfo *resultRelInfo)
> {
>     Relation    rel = resultRelInfo->ri_RelationDesc;
>
>     if (mtstate->ps.plan == NULL)
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("cannot copy to foreign table \"%s\"",
>                         RelationGetRelationName(rel))));
>     else
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("cannot route tuples into foreign table \"%s\"",
>                         RelationGetRelationName(rel))));
> }

+1

Thanks,
Amit



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Etsuro Fujita
In reply to this post by Etsuro Fujita
(2019/04/23 4:37), Laurenz Albe wrote:

> On Mon, 2019-04-22 at 21:45 +0900, Etsuro Fujita wrote:
>> (2019/04/20 20:53), Laurenz Albe wrote:
>>> On Fri, 2018-04-06 at 23:24 +0000, Robert Haas wrote:
>>>> Allow insert and update tuple routing and COPY for foreign tables.
>>>>
>>>> Also enable this for postgres_fdw.
>>>>
>>>> Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
>>>> patch series of which this is a part has been reviewed by Amit
>>>> Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
>>>> and me.  Minor documentation changes to the final version by me.
>>>>
>>>> Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@...

>>> If a FDW implements ExecForeignInsert, this commit automatically assumes
>>> that it also supports COPY FROM.  It will call ExecForeignInsert without
>>> calling PlanForeignModify and BeginForeignModify, and a FDW that does not
>>> expect that will probably fail.
>>
>> This is not 100% correct; the FDW documentation says:
>>
>>       <para>
>>        Tuples inserted into a partitioned table by
>> <command>INSERT</command>  or
>>        <command>COPY FROM</command>  are routed to partitions.  If an FDW
>>        supports routable foreign-table partitions, it should also provide the
>>        following callback functions.  These functions are also called when
>>        <command>COPY FROM</command>  is executed on a foreign table.
>>       </para>
>
> I don't see the difference between the documentation and what I wrote above.
>
> Before v11, a FDW could expect that ExecForeignInsert is only called if
> BeginForeignModify was called earlier.
> That has silently changed with v11.

I have to admit that the documentation is not sufficient.

>> It's permissible to throw an error in BeginForeignInsert, so what I was
>> thinking for FDWs that don't want to support COPY FROM and
>> INSERT/UPDATE/COPY FROM tuple routing was to provide BeginForeignInsert
>> implementing something like this:
>>
>> static void
>> fooBeginForeignInsert(ModifyTableState *mtstate,
>>                         ResultRelInfo *resultRelInfo)
>> {
>>       Relation    rel = resultRelInfo->ri_RelationDesc;
>>
>>       if (mtstate->ps.plan == NULL)
>>           ereport(ERROR,
>>                   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                    errmsg("cannot copy to foreign table \"%s\"",
>>                           RelationGetRelationName(rel))));
>>       else
>>           ereport(ERROR,
>>                   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                    errmsg("cannot route tuples into foreign table \"%s\"",
>>                           RelationGetRelationName(rel))));
>> }
>
> Sure, it is not hard to modify a FDW to continue working with v11.

How about adding to the documentation for BeginForeignInsert a mention
that if an FDW doesn't support COPY FROM and/or routable foreign tables,
it must throw an error in BeginForeignInsert accordingly.

> My point is that this should not be necessary.

In my opinion, I think this is necessary...

> On the other hand, if a FDW wants to support COPY in v11 and has no
> need for BeginForeignInsert to support that, it is a simple exercise
> for it to provide an empty BeginForeignInsert just to signal that it
> wants to support COPY.

That seems to me inconsistent with the concept of the existing APIs for
updating foreign tables, because for an FDW that wants to support
INSERT/UPDATE/DELETE and has no need for
PlanForeignModify/BeginForeignModify, those APIs don't require the FDW
to provide empty PlanForeignModify/BeginForeignModify to tell the core
that it wants to support INSERT/UPDATE/DELETE.

Best regards,
Etsuro Fujita



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Etsuro Fujita
(2019/04/24 22:04), Laurenz Albe wrote:

> On Wed, 2019-04-24 at 20:54 +0900, Etsuro Fujita wrote:
>> How about adding to the documentation for BeginForeignInsert a mention
>> that if an FDW doesn't support COPY FROM and/or routable foreign tables,
>> it must throw an error in BeginForeignInsert accordingly.
>
> Sure, some more documentation would be good.
>
> The documentation of ExecForeignInsert should mention something like:
>
>    ExecForeignInsert is called for INSERT statements as well
>    as for COPY FROM and tuples that are inserted into a foreign table
>    because it is a partition of a partitioned table.
>
>    In the case of a normal INSERT, BeginForeignModify will be called
>    before ExecForeignInsert to perform any necessary setup.
>    In the other cases, this setup has to happen in BeginForeignInsert.
These seem a bit redundant to me because the documentation already says:

<programlisting>
void
BeginForeignModify(ModifyTableState *mtstate,
                    ResultRelInfo *rinfo,
                    List *fdw_private,
                    int subplan_index,
                    int eflags);
</programlisting>

      Begin executing a foreign table modification operation.  This
routine is
      called during executor startup.  It should perform any initialization
      needed prior to the actual table modifications.  Subsequently,
      <function>*ExecForeignInsert*</function>,
<function>ExecForeignUpdate</funct\
ion> or
      <function>ExecForeignDelete</function> will be called for each
tuple to be
      inserted, updated, or deleted.

And

<programlisting>
void
BeginForeignInsert(ModifyTableState *mtstate,
                    ResultRelInfo *rinfo);
</programlisting>

      Begin executing an insert operation on a foreign table.  This
routine is
      called right before the first tuple is inserted into the foreign table
      in both cases when it is the partition chosen for tuple routing
and the
      target specified in a <command>COPY FROM</command> command.  It should
      perform any initialization needed prior to the actual insertion.
      Subsequently, <function>*ExecForeignInsert*</function> will be
called for
      each tuple to be inserted into the foreign table.

>    Before PostgreSQL v11, a foreign data wrapper could be certain that
>    BeginForeignModify is always called before ExecForeignInsert.
>    This is no longer true.

OK, how about something like the attached?  I reworded this a bit, though.

>>> On the other hand, if a FDW wants to support COPY in v11 and has no
>>> need for BeginForeignInsert to support that, it is a simple exercise
>>> for it to provide an empty BeginForeignInsert just to signal that it
>>> wants to support COPY.
>>
>> That seems to me inconsistent with the concept of the existing APIs for
>> updating foreign tables, because for an FDW that wants to support
>> INSERT/UPDATE/DELETE and has no need for
>> PlanForeignModify/BeginForeignModify, those APIs don't require the FDW
>> to provide empty PlanForeignModify/BeginForeignModify to tell the core
>> that it wants to support INSERT/UPDATE/DELETE.
>
> That is true, but so far there hasn't been a change to the FDW API that
> caused a callback to be invoked in a different fashion than it used to be.
I agree on that point.

> Perhaps it would have been better to create a new callback like
> ExecForeignCopy with the same signature as ExecForeignInsert so that
> you can use the same callback function for both if you want.
> That would also have avoided the breakage.

If so, we would actually need another new callback
ExecForeignTupleRouting since ExecForeignInsert is also called for
INSERT/UPDATE tuple routing (or another two new callbacks
ExecForeignInsertTupleRouting and ExecForeignUpdateTupleRouting in case
an FDW wants to support either of the tuple routing).  My concern about
that is: introducing such a concept might lead to an increase in the
number of callbacks as FDW evolves, increasing the maintenance cost of
the core.  So I think it would be better to just have ExecForeignInsert
as a foreign-table alternative for heap_insert, as that would keep the
core much simple.

> At the very least, this should have been mentioned in the list of
> incompatible changes for v11.

Agreed.  In the attached, I added a mention to the release notes for PG11.

Best regards,
Etsuro Fujita

FDW-documentation.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Amit Langote-2
Fujita-san,

On 2019/04/25 22:17, Etsuro Fujita wrote:
> (2019/04/24 22:04), Laurenz Albe wrote:
>>    Before PostgreSQL v11, a foreign data wrapper could be certain that
>>    BeginForeignModify is always called before ExecForeignInsert.
>>    This is no longer true.
>
> OK, how about something like the attached?  I reworded this a bit, though.

Thanks for the patch.

+     Note that this function is also called when inserting routed tuples into
+     a foreign-table partition or executing <command>COPY FROM</command> on
+     a foreign table, in which case it is called in a different way than it
+     is in the <command>INSERT</command> case.

Maybe minor, but should the last part of this sentence read as:

...in which case it is called in a different way than it is in the case
<command>INSERT</command> is operating directly on the foreign table.

?

Thanks,
Amit



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Etsuro Fujita
Amit-san,

(2019/04/26 13:20), Amit Langote wrote:

> On 2019/04/25 22:17, Etsuro Fujita wrote:
>> (2019/04/24 22:04), Laurenz Albe wrote:
>>>     Before PostgreSQL v11, a foreign data wrapper could be certain that
>>>     BeginForeignModify is always called before ExecForeignInsert.
>>>     This is no longer true.
>>
>> OK, how about something like the attached?  I reworded this a bit, though.
>
> Thanks for the patch.
>
> +     Note that this function is also called when inserting routed tuples into
> +     a foreign-table partition or executing<command>COPY FROM</command>  on
> +     a foreign table, in which case it is called in a different way than it
> +     is in the<command>INSERT</command>  case.
>
> Maybe minor, but should the last part of this sentence read as:
>
> ...in which case it is called in a different way than it is in the case
> <command>INSERT</command>  is operating directly on the foreign table.
>
> ?

Yeah, but I think it would be OK to just say "the INSERT case" because
this note is added to the docs for ExecForeignInsert(), which allows the
FDW to directly insert into foreign tables as you know, so users will
read "the INSERT case" as "the case <command>INSERT</command> is
operating directly on the foreign table".

Thanks for the comment!

Best regards,
Etsuro Fujita



Reply | Threaded
Open this post in threaded view
|

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

Etsuro Fujita
(2019/04/26 13:58), Etsuro Fujita wrote:

> (2019/04/26 13:20), Amit Langote wrote:
>> + Note that this function is also called when inserting routed tuples
>> into
>> + a foreign-table partition or executing<command>COPY FROM</command> on
>> + a foreign table, in which case it is called in a different way than it
>> + is in the<command>INSERT</command> case.
>>
>> Maybe minor, but should the last part of this sentence read as:
>>
>> ...in which case it is called in a different way than it is in the case
>> <command>INSERT</command> is operating directly on the foreign table.
>>
>> ?
>
> Yeah, but I think it would be OK to just say "the INSERT case" because
> this note is added to the docs for ExecForeignInsert(), which allows the
> FDW to directly insert into foreign tables as you know, so users will
> read "the INSERT case" as "the case <command>INSERT</command> is
> operating directly on the foreign table".

Pushed as-is.  I think we can change that later if necessary.

Best regards,
Etsuro Fujita