Keeping state in a foreign data wrapper

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Keeping state in a foreign data wrapper

Stelios Sfakianakis
Hi,

I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for example the total number of requests submiited. Ideally these data should be kept “per-user” and of course with the proper locking to eliminate race conditions. So the question I have is how to design such a global (but per user and database) state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private fields in the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another possiblity seems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is even less clear how to do it.

So is there any simple way to do implement such shared thread/process safe state?
Thank you
Stelios





Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Ian Lawrence Barwick
2020年8月4日(火) 1:24 Stelios Sfakianakis <[hidden email]>:
>
> Hi,
>
> I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for example the total number of requests submiited. Ideally these data should be kept “per-user” and of course with the proper locking to eliminate race conditions. So the question I have is how to design such a global (but per user and database) state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private fields in the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another possiblity seems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is even less clear how to do it.

Shared memory would probably work; you'd need to load the FDW via
"shared_preload_libraries" and have the FDW handler function(s) update
shared memory with whatever statistics you want to track. You could
then define SQL functions to retrieve the stored values, and possibly
persist them over server restarts by storing/retrieving them from a
file.

Look at "pg_stat_statements" for a good example of how to do that kind of thing.

Regards

Ian Barwick

--
Ian Barwick                   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Stelios Sfakianakis
Thank you Ian for the prompt reply! I will certainly have a look at pg_stat_statements

I also came across mysql_fdw (https://github.com/EnterpriseDB/mysql_fdw) that features a connection pool shared across queries. It uses a hash table with the serverid and userid as lookup key : https://github.com/EnterpriseDB/mysql_fdw/blob/REL-2_5_4/connection.c#L55 The hash table is allocated in the cache memory context but it worries me that 1) no locks are used, 2) the "ConnectionHash" variable is declared static so in the multi-process architecture of Postgres could have been the case that multiple copies of this exist when the shared library of mysql_fdw is loaded?

Best regards
Stelios

On 4 Aug 2020, at 06:25, Ian Lawrence Barwick <[hidden email]> wrote:

2020年8月4日(火) 1:24 Stelios Sfakianakis <[hidden email]>:

Hi,

I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for example the total number of requests submiited. Ideally these data should be kept “per-user” and of course with the proper locking to eliminate race conditions. So the question I have is how to design such a global (but per user and database) state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private fields in the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another possiblity seems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is even less clear how to do it.

Shared memory would probably work; you'd need to load the FDW via
"shared_preload_libraries" and have the FDW handler function(s) update
shared memory with whatever statistics you want to track. You could
then define SQL functions to retrieve the stored values, and possibly
persist them over server restarts by storing/retrieving them from a
file.

Look at "pg_stat_statements" for a good example of how to do that kind of thing.

Regards

Ian Barwick

--
Ian Barwick                   https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Ian Lawrence Barwick
2020年8月4日(火) 14:54 Stelios Sfakianakis <[hidden email]>:

> On 4 Aug 2020, at 06:25, Ian Lawrence Barwick <[hidden email]> wrote:
>
> 2020年8月4日(火) 1:24 Stelios Sfakianakis <[hidden email]>:
>
> Hi,
>
>>> I am trying to  implement a FDW in Postgres for accessing a web api and I would like to keep information like for example the total number of requests submiited. Ideally these data should be kept “per-user” and of course with the proper locking to eliminate race conditions. So the question I have is how to design such a global (but per user and database) state, using the C FDW API of Postgres. I have gone through the API and for example I see various fdw_private fields in the query planning structures and callbacks but these do not seem to be relevant to my requirements. Another possiblity seems to be to use shared memory (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is even less clear how to do it.
>
>
>> Shared memory would probably work; you'd need to load the FDW via
>> "shared_preload_libraries" and have the FDW handler function(s) update
>> shared memory with whatever statistics you want to track. You could
>> then define SQL functions to retrieve the stored values, and possibly
>> persist them over server restarts by storing/retrieving them from a
>> file.
>
>> Look at "pg_stat_statements" for a good example of how to do that kind of thing.
>
> Thank you Ian for the prompt reply! I will certainly have a look at pg_stat_statements
>
> I also came across mysql_fdw (https://github.com/EnterpriseDB/mysql_fdw) that features a connection pool shared across queries. It uses a hash table with the serverid and userid as lookup key : https://github.com/EnterpriseDB/mysql_fdw/blob/REL-2_5_4/connection.c#L55

This is essentially the same as what "postgres_fdw" and similar FDW
implementations do.

> The hash table is allocated in the cache memory context but it worries me that 1) no locks are used, 2) the "ConnectionHash" variable is declared static so in the multi-process architecture of Postgres could have been the case that multiple copies of this exist when the shared library of mysql_fdw is loaded?

The hash table is specific to each running backend so will only be
accessed by that process.

Pre-loading a shared library just gives the library an opportunity to
set up shared memory etc. You can always try adding one of the FDW
libraries to "shared_preload_libraries" and see what happens
(theoretically nothing).

Regards

Ian Barwick



>
> Best regards
> Stelios
>
>
> Regards
>
> Ian Barwick
>
> --
> Ian Barwick                   https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>


Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Stelios Sfakianakis
Thank you again, I have another question in order to make sure I have a clear understanding:


> On 4 Aug 2020, at 11:24, Ian Lawrence Barwick <[hidden email]> wrote:
>
> The hash table is specific to each running backend so will only be
> accessed by that process.
>
> Pre-loading a shared library just gives the library an opportunity to
> set up shared memory etc. You can always try adding one of the FDW
> libraries to "shared_preload_libraries" and see what happens
> (theoretically nothing).
>

My impression was that since each client (e.g. libpq) connection results in the creation of a Postgres process in the backend (https://www.postgresql.org/developer/backend/) then this  (mysql) "connection pool" hash table is not global per se and shared among the different client / users sessions. But that defeats the purpose, no?

Thank you
Best
Stelios




Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Ian Barwick-3
On 2020/08/04 19:21, Stelios Sfakianakis wrote:
 > Thank you again, I have another question in order to make sure I have a clear understanding:
 >
 >
 >> On 4 Aug 2020, at 11:24, Ian Lawrence Barwick <[hidden email]> wrote:
 >>
 >> The hash table is specific to each running backend so will only be
 >> accessed by that process.
 >>
 >> Pre-loading a shared library just gives the library an opportunity to
 >> set up shared memory etc. You can always try adding one of the FDW
 >> libraries to "shared_preload_libraries" and see what happens
 >> (theoretically nothing).
 >>
 >
 > My impression was that since each client (e.g. libpq) connection results in the creation of a Postgres process in the backend (https://www.postgresql.org/developer/backend/) then this  (mysql) "connection pool" hash table is not global per se and shared among the different client / users sessions.

Correct, the connections are specific to each individual backend.

 > But that defeats the purpose, no?

The purpose is to cache connections within the session, to avoid the overhead
of reconnecting to the remote server each time a query for that server is issued
in that session.


Regards

Ian Barwick




--
Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Ian Barwick-3
In reply to this post by Stelios Sfakianakis
On 2020/08/04 19:21, Stelios Sfakianakis wrote:
 > Thank you again, I have another question in order to make sure I have a clear understanding:
 >
 >
 >> On 4 Aug 2020, at 11:24, Ian Lawrence Barwick <[hidden email]> wrote:
 >>
 >> The hash table is specific to each running backend so will only be
 >> accessed by that process.
 >>
 >> Pre-loading a shared library just gives the library an opportunity to
 >> set up shared memory etc. You can always try adding one of the FDW
 >> libraries to "shared_preload_libraries" and see what happens
 >> (theoretically nothing).
 >>
 >
 > My impression was that since each client (e.g. libpq) connection results in the creation of a Postgres process in the backend (https://www.postgresql.org/developer/backend/) then this  (mysql) "connection pool" hash table is not global per se and shared among the different client / users sessions.

Correct, the connections are specific to each individual backend.

 > But that defeats the purpose, no?

The purpose is to cache connections within the session, to avoid the overhead
of reconnecting to the remote server each time a query for that server is issued
in that session.


Regards

Ian Barwick



--
Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Keeping state in a foreign data wrapper

Laurenz Albe
In reply to this post by Stelios Sfakianakis
On Mon, 2020-08-03 at 19:23 +0300, Stelios Sfakianakis wrote:

> I am trying to  implement a FDW in Postgres for accessing a web api and I would
>  like to keep information like for example the total number of requests submiited.
>  Ideally these data should be kept “per-user” and of course with the proper locking
>  to eliminate race conditions. So the question I have is how to design such a global
>  (but per user and database) state, using the C FDW API of Postgres. I have gone
>  through the API and for example I see various fdw_private fields in the query
>  planning structures and callbacks but these do not seem to be relevant to my requirements.
>  Another possiblity seems to be to use shared memory
>  (https://www.postgresql.org/docs/12/xfunc-c.html#id-1.8.3.13.14) but it is even
>  less clear how to do it.
>
> So is there any simple way to do implement such shared thread/process safe state?

I would create a table as part of the extension and use SPI to store
the data there.

Yours,
Laurenz Albe