Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

classic Classic list List threaded Threaded
30 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
Hi all,

Since 2dedf4d, recovery.conf is dead and all recovery parameters are now
GUCs.  While looking at a patch to switch primary_conninfo and
primary_slot_name to be reloadable, Sergei Kornilov had a very good
point that the WAL receiver uses a connection string and a physical slot
name based on what the startup process wants the WAL receiver to use:
https://www.postgresql.org/message-id/20181212043208.GI17695@...

It seems to me that doing so is now strange as the WAL receiver knows
about the GUC context, and actually it knows the parameters it should
use, so there is no point to pass down the values when requesting the
WAL receiver to start.

What do you think about the attached to simplify the logic?  Even if
primary_conninfo and primary_slot_name are not switched to SIGHUP this
cleanup looks like a good thing to me.

Thoughts?
--
Michael

walreceiver-gucs-v1.patch (4K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Andres Freund


On December 11, 2018 9:30:42 PM PST, Michael Paquier <[hidden email]> wrote:

>Hi all,
>
>Since 2dedf4d, recovery.conf is dead and all recovery parameters are
>now
>GUCs.  While looking at a patch to switch primary_conninfo and
>primary_slot_name to be reloadable, Sergei Kornilov had a very good
>point that the WAL receiver uses a connection string and a physical
>slot
>name based on what the startup process wants the WAL receiver to use:
>https://www.postgresql.org/message-id/20181212043208.GI17695@...
>
>It seems to me that doing so is now strange as the WAL receiver knows
>about the GUC context, and actually it knows the parameters it should
>use, so there is no point to pass down the values when requesting the
>WAL receiver to start.
>
>What do you think about the attached to simplify the logic?  Even if
>primary_conninfo and primary_slot_name are not switched to SIGHUP this
>cleanup looks like a good thing to me.

I am not convinced this is a good idea. This allows the state of walrcv and startup to diverge, they could e.g. have different configuration, due to differently time config reloads.  And they need to communicate via shmem anyway, so there's not a lot of complexity avoided.  And I think it's good to be able to show the active connection via functions, rather than the one currently in pg.conf, which might be different.

Andres

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Tue, Dec 11, 2018 at 09:34:58PM -0800, Andres Freund wrote:
> I am not convinced this is a good idea. This allows the state of
> walrcv and startup to diverge, they could e.g. have different
> configuration, due to differently time config reloads.  And they need
> to communicate via shmem anyway, so there's not a lot of complexity
> avoided.  And I think it's good to be able to show the active
> connection via functions, rather than the one currently in pg.conf,
> which might be different.

Well, the conninfo and slot name accessible to the user are the values
available only once the information of the WAL receiver in shared memory
is ready to be displayed.  Relying more on the GUC context has the
advantage to never have in shared memory the original string and only
store the clobbered one, which actually simplifies what's stored in
shared memory because you can entirely remove ready_to_display (I forgot
to remove that in the patch actually).  If those parameters become
reloadable, you actually rely only on what the param context has, and
not on what the shared memory context may have or not.

Removing entirely ready_to_display is quite appealing actually..
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Simon Riggs
In reply to this post by Andres Freund
On Wed, 12 Dec 2018 at 05:35, Andres Freund <[hidden email]> wrote:
 
>What do you think about the attached to simplify the logic?  Even if
>primary_conninfo and primary_slot_name are not switched to SIGHUP this
>cleanup looks like a good thing to me.

I am not convinced this is a good idea. This allows the state of walrcv and startup to diverge, they could e.g. have different configuration, due to differently time config reloads.
 
That sounds bad, but most parameters apply to one or the other, not both.

If there are some that apply to both, then yes, coordination would be important.

It does seem likely that the new scheme will require us to look carefully at when parameters are reloaded, since the timing of reloads was never taken into account in the previous coding.
 
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Sergei Kornilov
In reply to this post by Andres Freund
Hello

> This allows the state of walrcv and startup to diverge, they could e.g. have different configuration, due to differently time config reloads.
So we have exactly same problem with, for example, hot_standby_feedback, right?
We change hot_standby_feedback value, reload it and we can have 'show hot_standby_feedback' different to currently running walreceiver.
But why we have nothing about hot_standby_feedback in pg_stat_get_wal_receiver()?
Where is difference?

regards, Sergei

Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Wed, Dec 12, 2018 at 06:55:04PM +0300, Sergei Kornilov wrote:
>> This allows the state of walrcv and startup to diverge, they could
>> e.g. have different configuration, due to differently time config
>> reloads.
>
> So we have exactly same problem with, for example, hot_standby_feedback, right?
> We change hot_standby_feedback value, reload it and we can have 'show
> hot_standby_feedback' different to currently running walreceiver.  But
> why we have nothing about hot_standby_feedback in pg_stat_get_wal_receiver()?
> Where is difference?

The difference is in the fact that a WAL receiver uses the connection
string and the slot name when establishing the connection when using
START_STREAMING through the replication protocol, and
hot_standby_feedback gets used depending on the context of the messages
exchanged.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
In reply to this post by Michael Paquier-2
On Wed, Dec 12, 2018 at 02:55:17PM +0900, Michael Paquier wrote:

> Well, the conninfo and slot name accessible to the user are the values
> available only once the information of the WAL receiver in shared memory
> is ready to be displayed.  Relying more on the GUC context has the
> advantage to never have in shared memory the original string and only
> store the clobbered one, which actually simplifies what's stored in
> shared memory because you can entirely remove ready_to_display (I forgot
> to remove that in the patch actually).  If those parameters become
> reloadable, you actually rely only on what the param context has, and
> not on what the shared memory context may have or not.
>
> Removing entirely ready_to_display is quite appealing actually..
I have been looking at that stuff this morning, and indeed
ready_for_display can be entirely removed.  I think that's quite an
advantage as WalRcv->conninfo only stores the connection string
cloberred to hide security-sensitive fields and it never has to save the
initial state which could have sensitive data, simplifying how the WAL
receiver data is filled.  I am adding that to the next commit fest.
--
Michael

walreceiver-gucs-v2.patch (6K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Sergei Kornilov
Hi

Not sure i can be reviewer since this was initially my proposal.
I vote +1 for this patch. Code looks good and i think we have no reason to leave RequestXLogStreaming as-is.

regards, Sergei

Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Fri, Dec 14, 2018 at 12:23:09PM +0300, Sergei Kornilov wrote:
> Not sure i can be reviewer since this was initially my proposal.

No issues from me if you wish to review the code.  In my opinion, it is
not a problem if you are a reviewer as I wrote the code.

> I vote +1 for this patch. Code looks good and i think we have no
> reason to leave RequestXLogStreaming as-is.

Thanks for the input.  The take on my side is to be able to remove
ready_to_display from WalRcv so let's see what others think.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Donald Dong
In reply to this post by Michael Paquier-2
Hi Michael,

Thank you for the information!

> On Dec 11, 2018, at 9:55 PM, Michael Paquier <[hidden email]> wrote:
>
> Well, the conninfo and slot name accessible to the user are the values
> available only once the information of the WAL receiver in shared memory
> is ready to be displayed.  Relying more on the GUC context has the
> advantage to never have in shared memory the original string and only
> store the clobbered one, which actually simplifies what's stored in
> shared memory because you can entirely remove ready_to_display (I forgot
> to remove that in the patch actually).  If those parameters become
> reloadable, you actually rely only on what the param context has, and
> not on what the shared memory context may have or not.

I wonder why do we need to have this addition?

src/backend/replication/walreceiver.c
+       memset(walrcv->slotname, 0, NAMEDATALEN);
+       if (PrimarySlotName)
+               strlcpy((char *) walrcv->slotname, PrimarySlotName, NAMEDATALEN);


src/backend/replication/walreceiver.c
288: PrimaryConnInfo == NULL || PrimaryConnInfo[0] == '\0'
291: errmsg("cannot connect to the primary server as \"primary_conninfo\" is not defined")));
392: PrimarySlotName && PrimarySlotName[0] != '\0'

src/backend/access/transam/xlog.c
11824: * If primary_conninfo is set, launch walreceiver to try
11833: PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0

I notice these patterns appear in different places. Maybe it will be better to have a common method such as pg_str_empty()?

Regards,
Donald Dong
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Wed, Jan 09, 2019 at 06:00:23AM -0800, Donald Dong wrote:
> I wonder why do we need to have this addition?
>
> src/backend/replication/walreceiver.c
> +       memset(walrcv->slotname, 0, NAMEDATALEN);
> +       if (PrimarySlotName)
> +               strlcpy((char *) walrcv->slotname, PrimarySlotName, NAMEDATALEN);
>

That's much cleaner to me to clean up the field for safety before
starting the process.  When requesting a WAL receiver to start,
slotname and conninfo gets zeroed before doing anything, you are right
that we could do without it actually.

> I notice these patterns appear in different places. Maybe it will be
> better to have a common method such as pg_str_empty()?

That's a pattern used quite a lot for many GUCs, so that's quite
separate from this patch.  Perhaps it would make sense to have a macro
for that purpose for GUCs, I am not sure if that's worth it though.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Donald Dong

On Jan 9, 2019, at 4:47 PM, Michael Paquier <[hidden email]> wrote:
> That's much cleaner to me to clean up the field for safety before
> starting the process.  When requesting a WAL receiver to start,
> slotname and conninfo gets zeroed before doing anything, you are right
> that we could do without it actually.

Cool! I agree that cleaning up the field would make the logic cleaner.

> That's a pattern used quite a lot for many GUCs, so that's quite
> separate from this patch.  Perhaps it would make sense to have a macro
> for that purpose for GUCs, I am not sure if that's worth it though.

Yeah. I think having such macro would make the code a bit cleaner. Tho the
duplication of logic is not too significant.



Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Wed, Jan 09, 2019 at 05:38:53PM -0800, Donald Dong wrote:
> On Jan 9, 2019, at 4:47 PM, Michael Paquier <[hidden email]> wrote:
>> That's much cleaner to me to clean up the field for safety before
>> starting the process.  When requesting a WAL receiver to start,
>> slotname and conninfo gets zeroed before doing anything, you are right
>> that we could do without it actually.
>
> Cool! I agree that cleaning up the field would make the logic cleaner.

I was just double-checking the code, and it is possible to remove the
part from RequestXLogStreaming() which sets slotname and conninfo to
'\0', so I removed that part from my local branch to simplify the
code.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Sergei Kornilov
Hello

> I was just double-checking the code, and it is possible to remove the
> part from RequestXLogStreaming() which sets slotname and conninfo to
> '\0', so I removed that part from my local branch to simplify the
> code.

Without both ready_to_display and cleaning in RequestXLogStreaming we do not show outdated PrimaryConnInfo in case walreceiver just started, but does not connected yet? While waiting walrcv_connect for example.

regards, Sergei

Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Thu, Jan 10, 2019 at 01:10:16PM +0300, Sergei Kornilov wrote:
> Without both ready_to_display and cleaning in RequestXLogStreaming
> we do not show outdated PrimaryConnInfo in case walreceiver just
> started, but does not connected yet? While waiting walrcv_connect
> for example.

Good point.  There is a gap between the moment the WAL receiver PID is
set when the WAL receiver starts up and the moment where the different
fields are reset to 0 which is not good as it could be possible that
the user sees the information from the previous WAL receiver, so we
should move the memset calls when the PID is set, reaching a state
where the PID is alive but where there is no connection yet.  The port
number needs a similar treatment.

Looking closer at the code, it seems to me that it would be a good
thing if we update the shared memory state when the WAL receiver dies,
so as not only the PID is set to 0, but all connection-related
information gets the call.

With all those comments I am finishing with the updated attached.
Thoughts?
--
Michael

walreceiver-gucs-v3.patch (8K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Sergei Kornilov
Hi

Thank you, patch looks good for me.

regards, Sergei

Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Thu, Jan 10, 2019 at 02:20:27PM +0300, Sergei Kornilov wrote:
> Thank you, patch looks good for me.

Thanks Sergei for the review.  I have been looking at the patch again
this morning with a fresh set of eyes and the thing looks in a
committable shape (the GUC values for NULL checks are a bit
inconsistent in the last patch by the way, so I have fixed them
locally).

I'll do an extra pass on it in the next couple of days and commit.
Then let's move on with the reloadable portions.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Andres Freund
Hi,

On 2019-01-11 09:34:28 +0900, Michael Paquier wrote:

> On Thu, Jan 10, 2019 at 02:20:27PM +0300, Sergei Kornilov wrote:
> > Thank you, patch looks good for me.
>
> Thanks Sergei for the review.  I have been looking at the patch again
> this morning with a fresh set of eyes and the thing looks in a
> committable shape (the GUC values for NULL checks are a bit
> inconsistent in the last patch by the way, so I have fixed them
> locally).
>
> I'll do an extra pass on it in the next couple of days and commit.
> Then let's move on with the reloadable portions.

I still think this whole direction of accessing the GUC in walreceiver
is a bad idea and shouldn't be pursued further.  There's definite
potential for startup process and WAL receiver having different states
of GUCs, the code doesn't get meaningfully simpler, the GUC value checks
in walreceiver make for horrible reporting up the chain.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Michael Paquier-2
On Thu, Jan 10, 2019 at 04:41:47PM -0800, Andres Freund wrote:
> I still think this whole direction of accessing the GUC in walreceiver
> is a bad idea and shouldn't be pursued further.  There's definite
> potential for startup process and WAL receiver having different states
> of GUCs, the code doesn't get meaningfully simpler, the GUC value checks
> in walreceiver make for horrible reporting up the chain.

Did you notice the set of messages from upthread?  The code *gets*
simpler by removing ready_to_display and the need to manipulate the
non-clobbered connection string sent directly from the startup
process.  In my opinion that's a clear gain.  We gain also the
possibility to track down that a WAL receiver is started but not
connected yet for monitoring tools.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

Andres Freund
On 2019-01-11 09:50:49 +0900, Michael Paquier wrote:

> On Thu, Jan 10, 2019 at 04:41:47PM -0800, Andres Freund wrote:
> > I still think this whole direction of accessing the GUC in walreceiver
> > is a bad idea and shouldn't be pursued further.  There's definite
> > potential for startup process and WAL receiver having different states
> > of GUCs, the code doesn't get meaningfully simpler, the GUC value checks
> > in walreceiver make for horrible reporting up the chain.
>
> Did you notice the set of messages from upthread?  The code *gets*
> simpler by removing ready_to_display and the need to manipulate the
> non-clobbered connection string sent directly from the startup
> process.

It's a minor simplification for code that's existed for quite a
while. Not worth it.

12