CREATE ROUTINE MAPPING

Corey Huinker
A few months ago, I was researching ways to formalize calling functions on one postgres instance from another. RPC, basically. In doing so, I stumbled across an obscure part of the SQL Standard called ROUTINE MAPPING, which is exactly what I'm looking for.

The syntax specified is, roughly:

CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
SERVER my_server [ OPTIONS( ... ) ]

Which isn't too different from CREATE USER MAPPING.
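
For comparison, here is roughly what the existing SQL/MED setup looks like with postgres_fdw. This is only a sketch: the host, database name, and credentials are placeholders and not values from this thread; my_server is the server name used in the syntax above.

```sql
-- Assumes the postgres_fdw extension is available on the local server.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Placeholder connection details; substitute real ones.
CREATE SERVER my_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', port '5432', dbname 'remotedb');

-- Maps the current local role to a role on the remote side.
CREATE USER MAPPING FOR CURRENT_USER
    SERVER my_server
    OPTIONS (user 'remote_user', password 'secret');
```

A routine mapping would sit alongside these objects: one more named thing attached to a server, with an optional list of generic options.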

The idea here is that if I had a local query:

SELECT t.x, remote_func1(),  remote_func2(t.y)
FROM remote_table t
WHERE t.active = true;

that would become this query on the remote side:

SELECT t.x, local_func1(), local_func2(t.y)
FROM local_table t
WHERE t.active = true;
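
The table half of that renaming already exists today: postgres_fdw's table_name option lets the local name differ from the remote one. A minimal sketch, assuming column types that the thread does not specify:

```sql
-- Local name "remote_table" points at the remote relation "local_table".
-- Column types are assumptions made for illustration.
CREATE FOREIGN TABLE remote_table (
    x      integer,
    y      integer,
    active boolean
)
SERVER my_server
OPTIONS (schema_name 'public', table_name 'local_table');
```

A routine mapping would play the same role for function names that table_name plays here for relation names.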


That was probably the main intention of this feature, but I see a different possibility there. Consider the cases:

SELECT remote_func(1,'a');

and

SELECT * FROM remote_srf(10, true);

Now we could have written remote_func() and remote_srf() in plpythonu, and they could access whatever remote data we wanted to see, but that exposes our local server to the untrusted pl/python module as well as python process overhead.

We could create a specialized foreign data wrapper that requires a WHERE clause to include all the required parameters as predicates, essentially making every function a table, but that's awkward and unclear to an end user.
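
To make the awkwardness concrete, here is a sketch of the function-as-table workaround. Everything in it is hypothetical: func_server stands for a server backed by a custom FDW that treats some columns as inputs, and the column names and types are invented for illustration.

```sql
-- HYPOTHETICAL: func_server is a server for an imagined custom FDW that
-- treats n and flag as input parameters and id and label as outputs.
CREATE FOREIGN TABLE remote_srf_as_table (
    n     integer,   -- input parameter; must appear in WHERE
    flag  boolean,   -- input parameter; must appear in WHERE
    id    integer,   -- output column
    label text       -- output column
)
SERVER func_server;

-- Stands in for: SELECT * FROM remote_srf(10, true);
SELECT id, label
FROM remote_srf_as_table
WHERE n = 10 AND flag = true;
```

Nothing in the table definition tells the user which columns are mandatory predicates, which is the unclear part.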

Having the ability to import functions from other servers allows us to write foreign servers that expose functions to the local database, and those foreign servers handle the bloat and risks associated with accessing that remote data.

Moreover, it would allow hosted environments (AWS, etc) that restrict the extensions that can be added to the database to still connect to those foreign data sources.

I'm hoping to submit a patch for this someday, but it touches on several areas of the codebase where I have no familiarity, so I'm putting this forward to spark interest in the feature, to see if any similar work is underway, or if anyone can offer guidance.

Thanks in advance.

Re: CREATE ROUTINE MAPPING

David Fetter
On Thu, Jan 11, 2018 at 09:37:43PM -0500, Corey Huinker wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]

[neat use cases elided]

For what it's worth, the now-defunct DBI-Link I wrote had
remote_execute(), which did many of the things you describe here, only
with no help from the rest of PostgreSQL, as it was implemented
strictly in userland.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

Ashutosh Bapat
In reply to this post by Corey Huinker
On Fri, Jan 12, 2018 at 8:07 AM, Corey Huinker <[hidden email]> wrote:

> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>

I think this is a desirable feature. Being able to call a function on
a remote server through the local server is often useful.

PostgreSQL allows function overloading, which means that there can be
multiple functions with the same name that differ in argument types. So
the syntax has to include the input parameters, or at least their types.
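
The overloading point can be seen with plain local functions. The function name below is borrowed from the examples earlier in the thread; the bodies are made up.

```sql
-- Two functions share one name and differ only in argument type.
CREATE FUNCTION remote_func(a integer) RETURNS text
    LANGUAGE sql AS $$ SELECT 'integer version' $$;

CREATE FUNCTION remote_func(a text) RETURNS text
    LANGUAGE sql AS $$ SELECT 'text version' $$;

-- The argument types select which one runs.
SELECT remote_func(1), remote_func('a'::text);

-- A bare name is ambiguous here, so DDL has to spell out the signature.
DROP FUNCTION remote_func(integer);
DROP FUNCTION remote_func(text);
```

Any mapping syntax that names only the routine would hit the same ambiguity as a DROP FUNCTION without an argument list.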

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: CREATE ROUTINE MAPPING

Pavel Stehule


2018-01-12 10:02 GMT+01:00 Ashutosh Bapat <[hidden email]>:
On Fri, Jan 12, 2018 at 8:07 AM, Corey Huinker <[hidden email]> wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>

I think this is a desired feature. Being able to call a function on
remote server through local server is often useful.

PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

+1

Pavel


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: CREATE ROUTINE MAPPING

Corey Huinker
In reply to this post by Ashutosh Bapat
> PostgreSQL allows function overloading, which means that there can be
> multiple functions with same name differing in argument types. So, the
> syntax has to include the input parameters or their types at least.

"local_routine_name" and "remote_routine_spec" were my own paraphrasings of what the spec implies. I'm nearly certain that the local routine name, which the spec says is just an identifier, cannot have a parameter spec on it, which leaves only one other place to define it, remote_routine_spec, which wasn't defined at all. I _suppose_ parameter definitions could be pushed into options, but that'd be ugly.

Re: CREATE ROUTINE MAPPING

David Fetter
On Fri, Jan 12, 2018 at 11:11:26AM -0500, Corey Huinker wrote:

> >
> > PostgreSQL allows function overloading, which means that there can
> > be multiple functions with same name differing in argument types.
> > So, the syntax has to include the input parameters or their types
> > at least.
>
> "local_routine_name" and "remote_routine_spec" were my own
> paraphrasings of what the spec implies. I'm nearly certain that the
> local routine name, which the spec says is just an identifier,
> cannot have a parameter spec on it, which leaves only one other
> place to define it, remote_routine_spec, which wasn't defined at
> all. I _suppose_ parameter definitions could be pushed into options,
> but that'd be ugly.

In my draft of SQL:2011, which I don't think has substantive changes
to what's either in the official SQL:2011 or SQL:2016, it says:

<routine mapping definition> ::=
CREATE ROUTINE MAPPING <routine mapping name> FOR <specific routine designator>
SERVER <foreign server name> [ <generic options> ]
Syntax Rules
1) Let FSN be the <foreign server name>. Let RMN be the <routine mapping name>.
2) The catalog identified by the explicit or implicit catalog name of FSN shall include a foreign server
descriptor whose foreign server name is equivalent to FSN.
3) The SQL-environment shall not include a routine mapping descriptor whose routine mapping name is
RMN.
4) Let R be the SQL-invoked routine identified by the <specific
routine designator>. R shall identify an SQL-invoked regular function.

It goes on from there, but I think there's a reasonable interpretation
of this which allows us to use the same syntax as CREATE
(FUNCTION|PROCEDURE), apart from the body, e.g.:

CREATE ROUTINE MAPPING local_routine_name
FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
   [ RETURNS rettype
     | RETURNS TABLE ( column_name column_type [, ...] ) ]
SERVER foreign_server_name
   [ (option [, ...]) ]

Does that seem like too broad an interpretation?
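
One concrete instance of the grammar above, purely as an illustration. The statement is hypothetical: CREATE ROUTINE MAPPING is not implemented in PostgreSQL, and the mapping name, argument names, and result columns are invented.

```sql
-- HYPOTHETICAL: not accepted by any PostgreSQL release; it only
-- instantiates the proposed grammar with made-up names and types.
CREATE ROUTINE MAPPING local_srf
FOR FUNCTION remote_srf(n integer, flag boolean)
    RETURNS TABLE (id integer, label text)
SERVER my_server;
```

Because the argument list is part of the statement, two overloaded remote functions would simply get two mappings.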

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

Corey Huinker


> It goes on from there, but I think there's a reasonable interpretation
> of this which allows us to use the same syntax as CREATE
> (FUNCTION|PROCEDURE), apart from the body, e.g.:
>
> CREATE ROUTINE MAPPING local_routine_name
> FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
>    [ RETURNS rettype
>      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> SERVER foreign_server_name
>    [ (option [, ...]) ]
>
> Does that seem like too broad an interpretation?

That's really interesting. I didn't think to look in the definition of CREATE FUNCTION to see if a SERVER option popped in there, but it seems like a more accessible way to introduce the notion of remote functions. I talked to a few developers about this before posting to the list, and only one had ever heard of ROUTINE MAPPING, and even then had no clear recollection of it. An option on CREATE FUNCTION is going to get noticed (and used!) a lot sooner.

Having said that, I think syntactically we have to implement CREATE ROUTINE MAPPING, even if it is just translated to a CREATE FUNCTION call.

In either case, I suspected that pg_proc would need a nullable srvid column pointing to pg_foreign_server, and possibly a new row in pg_language for 'external'. I had entertained having a pg_routine_mappings table like pg_user_mappings, and we still could, if the proc's language of 'external' clued the planner to look for the mapping. I can see arguments for either approach.
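
For reference, this is how the existing user-mapping catalog data can be inspected today, followed by a commented-out guess at the equivalent lookup for the pg_proc approach. The srvid column on pg_proc is hypothetical and does not exist.

```sql
-- Real: pg_user_mappings is an existing system view.
SELECT um.srvname, um.usename, um.umoptions
FROM pg_user_mappings AS um
ORDER BY um.srvname, um.usename;

-- HYPOTHETICAL: pg_proc has no srvid column; this only shows what the
-- "nullable srvid" idea might look like when queried.
-- SELECT p.proname, s.srvname
-- FROM pg_proc AS p
-- JOIN pg_foreign_server AS s ON s.oid = p.srvid;
```

A separate pg_routine_mappings catalog would move that join target out of pg_proc, at the cost of one more lookup during planning.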

Before anyone asks, I looked for, and did not find, any suggestion of IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there wouldn't be any way to grab all the functions that a foreign server is offering up.


Re: CREATE ROUTINE MAPPING

David Fetter
On Fri, Jan 12, 2018 at 02:29:53PM -0500, Corey Huinker wrote:

> >
> >
> >
> > It goes on from there, but I think there's a reasonable interpretation
> > of this which allows us to use the same syntax as CREATE
> > (FUNCTION|PROCEDURE), apart from the body, e.g.:
> >
> > CREATE ROUTINE MAPPING local_routine_name
> > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ]
> > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> >    [ RETURNS rettype
> >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > SERVER foreign_server_name
> >    [ (option [, ...]) ]
> >
> > Does that seem like too broad an interpretation?
> >
>
> That's really interesting. I didn't think to look in the definition of
> CREATE FUNCTION to see if a SERVER option popped in there, but seems like a
> more accessible way to introduce the notion of remote functions,

It does indeed.  Adding the functionality to CREATE
(FUNCTION|PROCEDURE) seems like a *much* better idea than trying to
wedge it into the CREATE ROUTINE MAPPING syntax.

> because I talked to a few developers about this before posting to
> the list, and only one had ever heard of ROUTINE MAPPING and had no
> clear recollection of it.  An option on CREATE FUNCTION is going to
> get noticed (and used!) a lot sooner.

+1

> Having said that, I think syntactically we have to implement CREATE ROUTINE
> MAPPING, even if it is just translated to a CREATE FUNCTION call.
>
> In either case, I suspected that pg_proc would need a nullable srvid column
> pointing to pg_foreign_server, and possibly a new row in pg_language for
> 'external'.

Makes a lot of sense.

> I had entertained having a pg_routine_mappings table like
> pg_user_mappings, and we still could, if the proc's language of
> 'external' clued the planner to look for the mapping. I can see
> arguments for either approach.

It would be good to have them in the catalog somehow if we make CREATE
ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
parts of information_schema which come into play for those routine
mappings.

> Before anyone asks, I looked for, and did not find, any suggestion of
> IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> wouldn't be any way to grab all the functions that a foreign server is
> offering up.

How about making an option to IMPORT FOREIGN SCHEMA do it?
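
A sketch of what that could look like. The first statement is existing syntax; the second is commented out because import_routines is an invented option name that postgres_fdw does not recognize today. The local schema name is also an assumption.

```sql
-- Real syntax today: imports foreign *tables* only.
CREATE SCHEMA IF NOT EXISTS imported;

IMPORT FOREIGN SCHEMA public
    LIMIT TO (local_table)
    FROM SERVER my_server
    INTO imported;

-- HYPOTHETICAL: import_routines is not a recognized option; shown only
-- to illustrate the suggestion above.
-- IMPORT FOREIGN SCHEMA public
--     FROM SERVER my_server
--     INTO imported
--     OPTIONS (import_routines 'true');
```

postgres_fdw already accepts per-import options such as import_default and import_not_null, so an extra option would fit the existing pattern.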

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

Corey Huinker


> > CREATE ROUTINE MAPPING local_routine_name

> > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ]
> > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> >    [ RETURNS rettype
> >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > SERVER foreign_server_name
> >    [ (option [, ...]) ]
> >
> > Does that seem like too broad an interpretation?
> >
>
> I had entertained having a pg_routine_mappings table like
> pg_user_mappings, and we still could, if the proc's language of
> 'external' clued the planner to look for the mapping. I can see
> arguments for either approach.

It would be good to have them in the catalog somehow if we make CREATE
ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
parts of information_schema which come into play for those routine
mappings.

> Before anyone asks, I looked for, and did not find, any suggestion of
> IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> wouldn't be any way to grab all the functions that a foreign server is
> offering up.

How about making an option to IMPORT FOREIGN SCHEMA do it?



Ok, so the steps seem to be:
1. settle on syntax.
2. determine data dictionary structures
3. parse and create those structures
4. "handle" external functions locally
5. provide structures passed to FDW handlers so that they can handle external functions
6. implement those handlers in postgres_fdw

#1 is largely prescribed for us, though I'm curious as to how the CRM statements I've made up in examples above would look as CREATE FUNCTION ... SERVER ...

#2 deserves a lot of debate, but probably mostly hinges on the new "language" and how to associate a pg_proc entry with a pg_foreign_server.

#3 I'm guessing this is a lot of borrowing code from CREATE ROUTINE MAPPING but is otherwise pretty straightforward.

#4 An external function obviously cannot be executed locally. Doing so means that the planner failed to push it down, so these are probably stub functions that raise an error.

#5 These functions would essentially be passed in the same as foreign columns with the "name" as "f(a,b,4)", and the burden of forming the remote query is on the FDW

Which gets tricky. What should happen in simple situations is obvious:

SELECT t.x, remote_func1(),  remote_func2(t.y)
FROM remote_table t
WHERE t.active = true;

that would become this query on the remote side:

SELECT t.x, local_func1(), local_func2(t.y)
FROM local_table t
WHERE t.active = true;

And it's still simple when local functions consume remote input


SELECT local_func1(remote_func1(r.x)) FROM remote_table r WHERE r.active = true;


But other situations seem un-handle-able to me:

SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;


In those cases, at least initially, I think the FDW handler is right to raise an error, because the function inputs are unknowable at query time, and the inputs cannot also be pushed down to the remote server. That might not be common, but I can see situations like this:

SELECT r.*
FROM remote_srf( ( SELECT remote_code_value FROM local_table_of_remote_codes WHERE local_code_value = 'xyz' ) ) r;

and we would want things like that to work. Currently, in similar situations with tables, the FDW has no choice but to fetch the entire table and filter locally. That's fine for tables, whose contents are knowable, but the set of possible function inputs is unreasonably large. The current workaround in table-land is to run the inner query locally and present the result as a constant to a follow-up query, so maybe that's what we have to do here, at least initially.

#6 is where the FDW either does the translation or rejects the notion that functions can be pushed down, either outright or based on the usage of the function in the query.
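
The stub-error functions from step #4 can be approximated today with an ordinary PL/pgSQL function. This is only a stand-in for what the proposed "external" language would do; the signature and message text are assumptions.

```sql
-- Stand-in for a local stub: if the planner ever evaluates this function
-- locally instead of pushing it down, the query fails loudly.
CREATE FUNCTION remote_func1(x integer) RETURNS integer
LANGUAGE plpgsql AS $$
BEGIN
    RAISE EXCEPTION 'remote_func1() is a foreign function and cannot be executed locally'
        USING ERRCODE = 'feature_not_supported';
END;
$$;
```

With a stub like this, any local call such as SELECT remote_func1(1) raises the error rather than returning a wrong result.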


I'm doing this thinking on the mailing list in the hopes that it evokes suggestions, warnings, suggested code samples, and of course, help.
 

Re: CREATE ROUTINE MAPPING

David Fetter
On Wed, Jan 17, 2018 at 11:09:19AM -0500, Corey Huinker wrote:

> > > CREATE ROUTINE MAPPING local_routine_name
> > > > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [
> > argname ]
> > > > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> > > >    [ RETURNS rettype
> > > >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > > > SERVER foreign_server_name
> > > >    [ (option [, ...]) ]
> > > >
> > > > Does that seem like too broad an interpretation?
> > > >
> > >
> > > I had entertained having a pg_routine_mappings table like
> > > pg_user_mappings, and we still could, if the proc's language of
> > > 'external' clued the planner to look for the mapping. I can see
> > > arguments for either approach.
> >
> > It would be good to have them in the catalog somehow if we make CREATE
> > ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
> > parts of information_schema which come into play for those routine
> > mappings.
> >
> > > Before anyone asks, I looked for, and did not find, any suggestion of
> > > IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> > > wouldn't be any way to grab all the functions that a foreign server is
> > > offering up.
> >
> > How about making an option to IMPORT FOREIGN SCHEMA do it?
> >
> >
>
> Ok, so the steps seem to be:
> 1. settle on syntax.
> 2. determine data dictionary structures
> 3. parse and create those structures
> 4. "handle" external functions locally
> 5. provide structures passed to FDW handlers so that they can handle
> external functions
> 6. implement those handlers in postgres_fdw
>
> #1 is largely prescribed for us, though I'm curious as to how the CRM
> statements I've made up in examples above would look like as CREATE
> FUNCTION ... SERVER ...
>
> #2 deserves a lot of debate, but probably mostly hinges on the new
> "language" and how to associate a pg_proc entry with a pg_foreign_server
>
> #3 i'm guessing this is a lot of borrowing code from CREATE ROUTINE MAPPING
> but is otherwise pretty straightforward.
>
> #4 an external function obviously cannot be executed locally, doing so
> means that the planner failed to push it down, so this is probably
> stub-error functions
>
> #5 These functions would essentially be passed in the same as foreign
> columns with the "name" as "f(a,b,4)", and the burden of forming the remote
> query is on the FDW
>
> Which gets tricky. What should happen in simple situations is obvious:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>
> And it's still simple when local functions consume remote input
>
>
> SELECT local_func1(remote_func1(r.x)) FROM remote_table r WHERE r.active =
> true;
>
>
> But other situations seem un-handle-able to me:
>
> SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;

Do we have any way, or any plan to make a way, to push the set (SELECT
x FROM local_table WHERE active = true) to the remote side for
execution there?  Obviously, there are foreign DBs that couldn't
support this, but I'm guessing they wouldn't have much by way of UDFs
either.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

Corey Huinker

>
> But other situations seem un-handle-able to me:
>
> SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;

> Do we have any way, or any plan to make a way, to push the set (SELECT
> x FROM local_table WHERE active = true) to the remote side for
> execution there?  Obviously, there are foreign DBs that couldn't
> support this, but I'm guessing they wouldn't have much by way of UDFs
> either.

No. The remote query has to be generated at planning time, so it can't make predicates out of anything that can't be resolved into constants by the planner itself. The complexities of doing so would be excessive; it is far better to let the application developer split the queries up, because they know better which parts have to resolve first.
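
One way an application developer might split such a query, sketched with psql's \gset. The table and function names come from the earlier example in this thread; remote_srf is still a hypothetical foreign function, and the sketch assumes the lookup returns exactly one row.

```sql
-- Step 1 (runs locally): resolve the lookup and store the result in the
-- psql variable "code". \gset is a psql meta-command, not server-side SQL.
SELECT remote_code_value AS code
FROM local_table_of_remote_codes
WHERE local_code_value = 'xyz'
\gset

-- Step 2: the value is now a constant at planning time, so the call
-- could in principle be pushed to the remote server.
SELECT r.*
FROM remote_srf(:'code') r;
```

The same two-step pattern applies in any client: fetch the value first, then pass it as a literal or bind parameter to the second query.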



Re: CREATE ROUTINE MAPPING

David Fetter
On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:

> >
> >
> > >
> > > But other situations seem un-handle-able to me:
> > >
> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
> >
> > Do we have any way, or any plan to make a way, to push the set (SELECT
> > x FROM local_table WHERE active = true) to the remote side for
> > execution there?  Obviously, there are foreign DBs that couldn't
> > support this, but I'm guessing they wouldn't have much by way of UDFs
> > either.
> >
>
> No. The remote query has to be generated at planning time, so it can't make
> predicates out of anything that can't be resolved into constants by the
> planner itself. The complexities of doing so would be excessive, far better
> to let the application developer split the queries up because they know
> better which parts have to resolve first.

So Corey and I, with lots of inputs from Andrew Gierth and Matheus
Oliveira, have come up with a sketch of how to do this, to wit:

- Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
  LANGUAGE as parameters, but not both. This seems simpler, at least
  in a proof of concept, than creating SQL standard compliant grammar
  out of whole cloth.  The SQL standard grammar could be layered in
  later via the rewriter if this turns out to work.

- In pg_proc, store foreign functions as having a new language,
  sql_med, which doesn't actually exist.  This "language" would
  function as a hint to the planner.

- Add a new system catalog for foreign functions that references
  pg_proc and pg_foreign_server. Writing to it would also do the usual
  stuff with pg_depend.

- During planning, at least to start, we'd ensure that foreign
  functions can only take arguments on the same server.

- Once it's established that the combinations could actually work,
  execution gets pushed to the foreign server(s)

What say?
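
A guess at how the first and fourth bullets might look in practice. The thread does not spell out the exact keywords, so the CREATE FUNCTION form below is hypothetical, as are the argument and return types.

```sql
-- HYPOTHETICAL syntax: FOREIGN SERVER replaces the AS ... LANGUAGE ... pair.
-- No PostgreSQL release accepts this statement.
CREATE FUNCTION remote_func1(x integer)
    RETURNS integer
    FOREIGN SERVER my_server;

-- Would be plannable under the "arguments on the same server" rule:
-- the argument comes from a foreign table on my_server.
SELECT remote_func1(r.x) FROM remote_table r WHERE r.active = true;

-- Would be rejected at planning time: the argument comes from a local table.
SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
```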

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

Ashutosh Bapat
On Thu, Jan 25, 2018 at 10:43 AM, David Fetter <[hidden email]> wrote:

> On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
>> >
>> >
>> > >
>> > > But other situations seem un-handle-able to me:
>> > >
>> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
>> >
>> > Do we have any way, or any plan to make a way, to push the set (SELECT
>> > x FROM local_table WHERE active = true) to the remote side for
>> > execution there?  Obviously, there are foreign DBs that couldn't
>> > support this, but I'm guessing they wouldn't have much by way of UDFs
>> > either.
>> >
>>
>> No. The remote query has to be generated at planning time, so it can't make
>> predicates out of anything that can't be resolved into constants by the
>> planner itself. The complexities of doing so would be excessive, far better
>> to let the application developer split the queries up because they know
>> better which parts have to resolve first.
>
> So Corey and I, with lots of inputs from Andrew Gierth and Matheus
> Oliveira, have come up with a sketch of how to do this, to wit:
>
> - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
>   LANGUAGE as parameters, but not both. This seems simpler, at least
>   in a proof of concept, than creating SQL standard compliant grammar
>   out of whole cloth.  The SQL standard grammar could be layered in
>   later via the rewriter if this turns out to work.
>
> - In pg_proc, store foreign functions as having a new language,
>   sql_med, which doesn't actually exist.  This "language" would
>   function as a hint to the planner.
>
> - Add a new system catalog for foreign functions that references
>   pg_proc and pg_foreign_server. Writing to it would also do the usual
>   stuff with pg_depend.
>
> - During planning, at least to start, we'd ensure that foreign
>   functions can only take arguments on the same server.

Maybe I am going into details not expected at this stage. Right now
FDWs have a notion of shippability, i.e. certain expressions can be
evaluated on the remote server. Shippable expressions may be pushed
down to the foreign server, but that's optional. Unshippable
expressions, however, cannot be pushed down to the foreign server. With
this change, we will have a new notion of shippability where a
function/expression must be shipped to the foreign server. As long as
these strict-shippable expressions are part of shippable expressions,
things work as they do today, but as an earlier mail by Corey shows,
if those expressions are not part of shippable expressions, they
need to be evaluated on the foreign server separately from the query
that gets pushed down. You seem to be suggesting that we do not
implement that right now, which is fine. But whatever design we choose
should be extensible to do that.
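
The existing, optional kind of shippability can be observed with postgres_fdw today. A small sketch, assuming the foreign table remote_table from the earlier examples and that pg_trgm is installed on both servers (the extension name is just an example):

```sql
-- Real postgres_fdw server option: immutable functions and operators
-- from the listed extensions are treated as shippable.
ALTER SERVER my_server OPTIONS (ADD extensions 'pg_trgm');

-- abs() is a built-in immutable function, so postgres_fdw may ship it.
-- A shipped condition shows up inside the "Remote SQL:" line of the
-- plan; an unshipped one shows up as a local "Filter:" instead.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t.x
FROM remote_table t
WHERE abs(t.x) = 1;
```

In both outcomes the query still runs; the must-ship notion described above differs precisely in that the local fallback would not exist.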

A possible way to implement this may be to implement a sql-med language
handler which takes responsibility for interacting with the FDW and
evaluating the function. That way we can use the existing function
evaluation infrastructure.

>
> - Once it's established that the combinations could actually work,
>   execution gets pushed to the foreign server(s)
>

Overall this structure looks ok to me.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
