CREATE ROUTINE MAPPING

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

CREATE ROUTINE MAPPING

Corey Huinker
A few months ago, I was researching ways for formalizing calling functions on one postgres instance from another. RPC, basically. In doing so, I stumbled across an obscure part of the the SQL Standard called ROUTINE MAPPING, which is exactly what I'm looking for.

The syntax specified is, roughly:

CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
SERVER my_server [ OPTIONS( ... ) ]

Which isn't too different from CREATE USER MAPPING.

The idea here is that if I had a local query:

SELECT t.x, remote_func1(),  remote_func2(t.y)
FROM remote_table t
WHERE t.active = true;

that would become this query on the remote side:

SELECT t.x, local_func1(), local_func2(t.y)
FROM local_table t
WHERE t.active = true;


That was probably the main intention of this feature, but I see a different possibility there. Consider the cases:

SELECT remote_func(1,'a');

and

SELECT * FROM remote_srf(10, true);

Now we could have written remote_func() and remote_srf() in plpythonu, and it could access whatever remote data that we wanted to see, but that exposes our local server to the untrusted pl/python module as well as python process overhead.

We could create a specialized foreign data wrapper that requires a WHERE clause to include all the require parameters as predicates, essentially making every function a table, but that's awkward and unclear to an end user.

Having the ability to import functions from other servers allows us to write foreign servers that expose functions to the local database, and those foreign servers handle the bloat and risks associated with accessing that remote data.

Moreover, it would allow hosted environments (AWS, etc) that restrict the extensions that can be added to the database to still connect to those foreign data sources.

I'm hoping to submit a patch for this someday, but it touches on several areas of the codebase where I have no familiarity, so I've put forth to spark interest in the feature, to see if any similar work is underway, or if anyone can offer guidance.

Thanks in advance.
Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

David Fetter
On Thu, Jan 11, 2018 at 09:37:43PM -0500, Corey Huinker wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]

[neat use cases elided]

For what it's worth, the now-defunct DBI-Link I wrote had
remote_execute(), which did many of the things you describe here, only
with no help from the rest of PostgreSQL, as it was implemented
strictly in userland.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

Ashutosh Bapat
In reply to this post by Corey Huinker
On Fri, Jan 12, 2018 at 8:07 AM, Corey Huinker <[hidden email]> wrote:

> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>

I think this is a desired feature. Being able to call a function on
remote server through local server is often useful.

PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

Pavel Stehule


2018-01-12 10:02 GMT+01:00 Ashutosh Bapat <[hidden email]>:
On Fri, Jan 12, 2018 at 8:07 AM, Corey Huinker <[hidden email]> wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>

I think this is a desired feature. Being able to call a function on
remote server through local server is often useful.

PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

+1

Pavel


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

Corey Huinker
In reply to this post by Ashutosh Bapat
PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

"local_routine_name" and "remote_routine_spec" were my own paraphrasings of what the spec implies. I'm nearly certain that the local routine name, which the spec says is just an identifier, cannot have a parameter spec on it, which leaves only one other place to define it, remote_routine_spec, which wasn't defined at all. I _suppose_ parameter definitions could be pushed into options, but that'd be ugly.
Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

David Fetter
On Fri, Jan 12, 2018 at 11:11:26AM -0500, Corey Huinker wrote:

> >
> > PostgreSQL allows function overloading, which means that there can
> > be multiple functions with same name differing in argument types.
> > So, the syntax has to include the input parameters or their types
> > at least.
>
> "local_routine_name" and "remote_routine_spec" were my own
> paraphrasings of what the spec implies. I'm nearly certain that the
> local routine name, which the spec says is just an identifier,
> cannot have a parameter spec on it, which leaves only one other
> place to define it, remote_routine_spec, which wasn't defined at
> all. I _suppose_ parameter definitions could be pushed into options,
> but that'd be ugly.

In my draft of SQL:2011, which I don't think has substantive changes
to what's either in the official SQL:2011 or SQL:2016, it says:

<routine mapping definition> ::=
CREATE ROUTINE MAPPING <routine mapping name> FOR <specific routine designator>
SERVER <foreign server name> [ <generic options> ]
Syntax Rules
1) Let FSN be the <foreign server name>. Let RMN be the <routine mapping name>.
2) The catalog identified by the explicit or implicit catalog name of FSN shall include a foreign server
descriptor whose foreign server name is equivalent to FSN.
3) The SQL-environment shall not include a routine mapping descriptor whose routine mapping name is
RMN.
4) Let R be the SQL-invoked routine identified by the <specific
routine designator>. R shall identify an SQL-invoked regular function.

It goes on from there, but I think there's a reasonable interpretation
of this which allows us to use the same syntax as CREATE
(FUNCTION|PROCEDURE), apart from the body, e.g.:

CREATE ROUTINE MAPPING local_routine_name
FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
   [ RETURNS rettype
     | RETURNS TABLE ( column_name column_type [, ...] ) ]
SERVER foreign_server_name
   [ (option [, ...]) ]

Does that seem like too broad an interpretation?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

Corey Huinker


It goes on from there, but I think there's a reasonable interpretation
of this which allows us to use the same syntax as CREATE
(FUNCTION|PROCEDURE), apart from the body, e.g.:

CREATE ROUTINE MAPPING local_routine_name
FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
   [ RETURNS rettype
     | RETURNS TABLE ( column_name column_type [, ...] ) ]
SERVER foreign_server_name
   [ (option [, ...]) ]

Does that seem like too broad an interpretation?

That's really interesting. I didn't think to look in the definition of CREATE FUNCTION to see if a SERVER option popped in there, but seems like a more accessible way to introduce the notion of remote functions, because I talked to a few developers about this before posting to the list, and only one had ever heard of ROUTINE MAPPING and had no clear recollection of it. An option on CREATE FUNCTION is going to get noticed (and used!) a lot sooner.

Having said that, I think syntactically we have to implement CREATE ROUTINE MAPPING, even if it is just translated to a CREATE FUNCTION call.

In either case, I suspected that pg_proc would need a nullable srvid column pointing to pg_foreign_server, and possibly a new row in pg_language for 'external'. I had entertained having a pg_routine_mappings table like pg_user_mappings, and we still could, if the proc's language of 'external' clued the planner to look for the mapping. I can see arguments for either approach.

Before anyone asks, I looked for, and did not find, any suggestion of IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there wouldn't be any way to grab all the functions that .a foreign server is offering up.

Reply | Threaded
Open this post in threaded view
|

Re: CREATE ROUTINE MAPPING

David Fetter
On Fri, Jan 12, 2018 at 02:29:53PM -0500, Corey Huinker wrote:

> >
> >
> >
> > It goes on from there, but I think there's a reasonable interpretation
> > of this which allows us to use the same syntax as CREATE
> > (FUNCTION|PROCEDURE), apart from the body, e.g.:
> >
> > CREATE ROUTINE MAPPING local_routine_name
> > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ]
> > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> >    [ RETURNS rettype
> >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > SERVER foreign_server_name
> >    [ (option [, ...]) ]
> >
> > Does that seem like too broad an interpretation?
> >
>
> That's really interesting. I didn't think to look in the definition of
> CREATE FUNCTION to see if a SERVER option popped in there, but seems like a
> more accessible way to introduce the notion of remote functions,

It does indeed.  Adding the functionality to CREATE
(FUNCTION|PROCEDURE) seems like a *much* better idea than trying to
wedge it into the CREATE ROUTINE MAPPING syntax.

> because I talked to a few developers about this before posting to
> the list, and only one had ever heard of ROUTINE MAPPING and had no
> clear recollection of it.  An option on CREATE FUNCTION is going to
> get noticed (and used!) a lot sooner.

+1

> Having said that, I think syntactically we have to implement CREATE ROUTINE
> MAPPING, even if it is just translated to a CREATE FUNCTION call.
>
> In either case, I suspected that pg_proc would need a nullable srvid column
> pointing to pg_foreign_server, and possibly a new row in pg_language for
> 'external'.

Makes a lot of sense.

> I had entertained having a pg_routine_mappings table like
> pg_user_mappings, and we still could, if the proc's language of
> 'external' clued the planner to look for the mapping. I can see
> arguments for either approach.

It would be good to have them in the catalog somehow if we make CREATE
ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
parts of information_schema which come into play for those routine
mappings.

> Before anyone asks, I looked for, and did not find, any suggestion of
> IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> wouldn't be any way to grab all the functions that .a foreign server is
> offering up.

How about making an option to IMPORT FOREIGN SCHEMA do it?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Previous Thread Next Thread