Add Information during standby recovery conflicts

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
53 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Add Information during standby recovery conflicts

Drouvot, Bertrand

Hi hackers,

I've attached a patch to add information during standby recovery conflicts.
The motive behind is to be able to get some information:

  • On the apply side
  • On the blocker(s) side
Motivation:

When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.

Adding this information would make the investigations easier, it could help answering questions like:
  • On which LSN was the WAL apply blocked?
  • What was the purpose of the bocked WAL record?
  • On which relation (if any) was the blocked WAL record related to?
  • What was the blocker(s) doing?
  • When did the blocker(s) started their queries (if any)?
  • What was the blocker(s) waiting for? on which wait event?
Technical context and proposal:

There is 2 points in this patch:
  • Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)
  • Add the information about the blocker(s). This is done in postgres.c (RecoveryConflictInterrupt)
Outcome Example:
2020-06-15 06:48:23.774 UTC [6971] LOG:  wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed.
2020-06-15 06:48:23.774 UTC [6971] LOG:  blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0
2020-06-15 06:48:54.773 UTC [7088] LOG:  about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00
2020-06-15 06:48:54.773 UTC [7088] DETAIL:  statement: select *, pg_sleep(120)  from bdt;
2020-06-15 06:48:54.773 UTC [7088] STATEMENT:  select *, pg_sleep(120)  from bdt;
2020-06-15 06:48:54.773 UTC [7088] ERROR:  canceling statement due to conflict with recovery
2020-06-15 06:48:54.773 UTC [7088] DETAIL:  User query might have needed to see row versions that must be removed.
2020-06-15 06:48:54.773 UTC [7088] STATEMENT:  select *, pg_sleep(120)  from bdt;
2020-06-15 06:48:54.778 UTC [7037] LOG:  about to interrupt pid: 7037, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:13.008427+00
2020-06-15 06:48:54.778 UTC [7037] DETAIL:  statement: select *, pg_sleep(300)  from bdt;
2020-06-15 06:48:54.778 UTC [7037] STATEMENT:  select *, pg_sleep(300)  from bdt;
2020-06-15 06:48:54.778 UTC [7037] ERROR:  canceling statement due to conflict with recovery
2020-06-15 06:48:54.778 UTC [7037] DETAIL:  User query might have needed to see row versions that must be removed.
2020-06-15 06:48:54.778 UTC [7037] STATEMENT:  select *, pg_sleep(300)  from bdt;

It takes care of the other conflicts reason too.

So, should a standby recovery conflict occurs, we could see:
  • information about the blocked WAL record apply
  • information about the blocker(s)
I will add this patch to the next commitfest. I look forward to your feedback about the idea and/or implementation.

Regards,

Bertrand


v1-0001-Add-Standby-Recovery-Conflicts-Info.patch (16K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <[hidden email]> wrote:

>
> Hi hackers,
>
> I've attached a patch to add information during standby recovery conflicts.
> The motive behind is to be able to get some information:
>
> On the apply side
> On the blocker(s) side
>
> Motivation:
>
> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.

I think this is a good feature. Like log_lock_waits, it will help the
users to investigate recovery conflict issues.

>
> Adding this information would make the investigations easier, it could help answering questions like:
>
> On which LSN was the WAL apply blocked?
> What was the purpose of the bocked WAL record?
> On which relation (if any) was the blocked WAL record related to?
> What was the blocker(s) doing?
> When did the blocker(s) started their queries (if any)?
> What was the blocker(s) waiting for? on which wait event?
>
> Technical context and proposal:
>
> There is 2 points in this patch:
>
> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)

I think we already have the information about the WAL record being
applied in errcontext.

I wonder if we can show the recovery conflict information in the main
LOG message, the blocker information in errdetail, and use errcontext
with regard to WAL record information. For example:

LOG:  process 500 waiting for recovery conflict on snapshot
DETAIL:  conflicting transition id: 100, 200, 300
CONTEXT:  WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506

> Outcome Example:
>
> 2020-06-15 06:48:23.774 UTC [6971] LOG:  wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed.
> 2020-06-15 06:48:23.774 UTC [6971] LOG:  blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0
> 2020-06-15 06:48:54.773 UTC [7088] LOG:  about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00

I'm concerned that the above information on the process who is about
to be interrupted is very detailed but I'm not sure it will be helpful
for the users. If the blocker is waiting on something lock, the
information should be logged by log_lock_waits. Also the canceled
backend will emit the ERROR log with the message "canceling statement
due to conflict with recovery”, and its pid can be logged at the log
prefix. The user can know who has been canceled by recovery conflicts
and the query.

Probably we need to consider having a time threshold (or boolean to
turn on/off) to emit this information to the server logs. Otherwise,
we will end up writing every conflict information, making the log
dirty needlessly.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand
Hey,

On 6/29/20 10:55 AM, Masahiko Sawada wrote:

> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <[hidden email]> wrote:
>> Hi hackers,
>>
>> I've attached a patch to add information during standby recovery conflicts.
>> The motive behind is to be able to get some information:
>>
>> On the apply side
>> On the blocker(s) side
>>
>> Motivation:
>>
>> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.
> I think this is a good feature. Like log_lock_waits, it will help the
> users to investigate recovery conflict issues.
exactly, thanks for looking at it.

>
>> Adding this information would make the investigations easier, it could help answering questions like:
>>
>> On which LSN was the WAL apply blocked?
>> What was the purpose of the bocked WAL record?
>> On which relation (if any) was the blocked WAL record related to?
>> What was the blocker(s) doing?
>> When did the blocker(s) started their queries (if any)?
>> What was the blocker(s) waiting for? on which wait event?
>>
>> Technical context and proposal:
>>
>> There is 2 points in this patch:
>>
>> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)
> I think we already have the information about the WAL record being
> applied in errcontext.
right, but it won’t be displayed in case log_error_verbosity is set to
terse.
Or did you had in mind to try to avoid using the new
“current_replayed_xlog” in xlog.c?

>
> I wonder if we can show the recovery conflict information in the main
> LOG message, the blocker information in errdetail, and use errcontext
> with regard to WAL record information. For example:
>
> LOG:  process 500 waiting for recovery conflict on snapshot
> DETAIL:  conflicting transition id: 100, 200, 300
> CONTEXT:  WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506
Not sure at all if it would be possible to put all the information at
the same time.
For example in case of shared buffer pin conflict, the blocker is
currently known “only” during the RecoveryConflictInterrupt call (so
still not known yet when we can “already” report the blocked LSN
information).
It might also happen that the blocker(s) will never get interrupted (was
not blocking anymore before max_standby_streaming_delay has been
reached): then it would not be possible to display all the information
here (aka when it is interrupted) while we still want to be warned that
the WAL replay has been blocked.

>
>> Outcome Example:
>>
>> 2020-06-15 06:48:23.774 UTC [6971] LOG:  wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed.
>> 2020-06-15 06:48:23.774 UTC [6971] LOG:  blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0
>> 2020-06-15 06:48:54.773 UTC [7088] LOG:  about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00
> I'm concerned that the above information on the process who is about
> to be interrupted is very detailed but I'm not sure it will be helpful
> for the users. If the blocker is waiting on something lock, the
> information should be logged by log_lock_waits.
The blocker could also be in “idle in transaction”, say for example, on
the standby (with hot_standby_feedback set to off):

standby> begin;
standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
standby> select * from bdt;

on the master:

master> update bdt set generate_series = 15;
master> vacuum verbose bdt;

would produce:

2020-07-01 09:18:55.256 UTC [32751] LOG:  about to interrupt pid: 32751,
backend_type: client backend, state: idle in transaction,
wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01
09:18:17.390572+00
2020-07-01 09:18:55.256 UTC [32751] DETAIL:  statement: select * from bdt;

I think those information are useful to have (might get rid of
wait_event_type though: done in the new attached patch).

> Also the canceled
> backend will emit the ERROR log with the message "canceling statement
> due to conflict with recovery”, and its pid can be logged at the log
> prefix. The user can know who has been canceled by recovery conflicts
> and the query.

Right, we can also get rid of the pid and the query in the new log
message too (done in the new attached patch).

> Probably we need to consider having a time threshold (or boolean to
> turn on/off) to emit this information to the server logs. Otherwise,
> we will end up writing every conflict information, making the log
> dirty needlessly.

good point, I do agree (done in the new attached patch).

Bertrand

>
> Regards,
>
> --
> Masahiko Sawada            http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v1-0002-Add-Standby-Recovery-Conflicts-Info.patch (21K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Wed, 1 Jul 2020 at 21:52, Drouvot, Bertrand <[hidden email]> wrote:

>
> Hey,
>
> On 6/29/20 10:55 AM, Masahiko Sawada wrote:
> > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> >
> >
> >
> > On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <[hidden email]> wrote:
> >> Hi hackers,
> >>
> >> I've attached a patch to add information during standby recovery conflicts.
> >> The motive behind is to be able to get some information:
> >>
> >> On the apply side
> >> On the blocker(s) side
> >>
> >> Motivation:
> >>
> >> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.
> > I think this is a good feature. Like log_lock_waits, it will help the
> > users to investigate recovery conflict issues.
> exactly, thanks for looking at it.
> >
> >> Adding this information would make the investigations easier, it could help answering questions like:
> >>
> >> On which LSN was the WAL apply blocked?
> >> What was the purpose of the bocked WAL record?
> >> On which relation (if any) was the blocked WAL record related to?
> >> What was the blocker(s) doing?
> >> When did the blocker(s) started their queries (if any)?
> >> What was the blocker(s) waiting for? on which wait event?
> >>
> >> Technical context and proposal:
> >>
> >> There is 2 points in this patch:
> >>
> >> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)
> > I think we already have the information about the WAL record being
> > applied in errcontext.
> right, but it won’t be displayed in case log_error_verbosity is set to
> terse.

Yes. I think in this case errcontext or errdetail is more appropriate
for this information in this case. Otherwise, we will end up emitting
same WAL record information twice in log_error_verbosity = verbose.

For instance, here is the server logs when I tested a recovery
conflict on buffer pin but it looks redundant:

2020-07-03 11:01:15.339 JST [60585] LOG:  wal record apply is blocked
by 1 connection(s), reason: User query might have needed to see row
versions that must be removed.
2020-07-03 11:01:15.339 JST [60585] CONTEXT:  WAL redo at 0/30025E0
for Heap2/CLEAN: remxid 505
2020-07-03 11:01:15.339 JST [60585] LOG:  blocked wal record rmgr:
Heap2, lsn: 0/030025E0, received at: 2020-07-03 11:01:15.338997+09,
desc: CLEAN, relation: rel 1663/12930/16384 fork main blk 0
2020-07-03 11:01:15.339 JST [60585] CONTEXT:  WAL redo at 0/30025E0
for Heap2/CLEAN: remxid 505
2020-07-03 11:01:15.347 JST [60604] LOG:  about to be interrupted:
backend_type: client backend, state: active, wait_event: PgSleep,
query_start: 2020-07-03 11:01:14.337708+09
2020-07-03 11:01:15.347 JST [60604] STATEMENT:  select pg_sleep(30);
2020-07-03 11:01:15.347 JST [60604] ERROR:  canceling statement due to
conflict with recovery
2020-07-03 11:01:15.347 JST [60604] DETAIL:  User query might have
needed to see row versions that must be removed.
2020-07-03 11:01:15.347 JST [60604] STATEMENT:  select pg_sleep(30);

There are the information WAL record three times and the reason for
interruption twice.

> Or did you had in mind to try to avoid using the new
> “current_replayed_xlog” in xlog.c?

Regarding LogBlockedWalRecordInfo(), it seems to me that the main
message of this logging is to let users know both that the process got
stuck due to recovery conflict and its reason (lock, database,
bufferpin etc). Other information such as the details of blocked WAL,
how many processes are blocking seems like additional information. So
I think this information would be better to be shown errcontext or
errdetails and we don’t need to create a similar feature as we already
show the WAL record being replayed in errcontext.

Also, this function emits two LOG messages related to each other but
I'm concerned that it can be hard for users to associate these
separate log lines especially when server logs are written at a high
rate. And I think these two log lines don't follow our error message
style[1].

>
> >
> > I wonder if we can show the recovery conflict information in the main
> > LOG message, the blocker information in errdetail, and use errcontext
> > with regard to WAL record information. For example:
> >
> > LOG:  process 500 waiting for recovery conflict on snapshot
> > DETAIL:  conflicting transition id: 100, 200, 300
> > CONTEXT:  WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506
> Not sure at all if it would be possible to put all the information at
> the same time.
> For example in case of shared buffer pin conflict, the blocker is
> currently known “only” during the RecoveryConflictInterrupt call (so
> still not known yet when we can “already” report the blocked LSN
> information).
> It might also happen that the blocker(s) will never get interrupted (was
> not blocking anymore before max_standby_streaming_delay has been
> reached): then it would not be possible to display all the information
> here (aka when it is interrupted) while we still want to be warned that
> the WAL replay has been blocked.

I think it's okay with showing different information for different
types of recovery conflict. In buffer pin conflict case, I think we
can let the user know the process is waiting for recovery conflict on
buffer pin in the main message and the WAL record being replayed in
errdetails.

> >
> >> Outcome Example:
> >>
> >> 2020-06-15 06:48:23.774 UTC [6971] LOG:  wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed.
> >> 2020-06-15 06:48:23.774 UTC [6971] LOG:  blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0
> >> 2020-06-15 06:48:54.773 UTC [7088] LOG:  about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00
> > I'm concerned that the above information on the process who is about
> > to be interrupted is very detailed but I'm not sure it will be helpful
> > for the users. If the blocker is waiting on something lock, the
> > information should be logged by log_lock_waits.
>
> The blocker could also be in “idle in transaction”, say for example, on
> the standby (with hot_standby_feedback set to off):
>
> standby> begin;
> standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> standby> select * from bdt;
>
> on the master:
>
> master> update bdt set generate_series = 15;
> master> vacuum verbose bdt;
>
> would produce:
>
> 2020-07-01 09:18:55.256 UTC [32751] LOG:  about to interrupt pid: 32751,
> backend_type: client backend, state: idle in transaction,
> wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01
> 09:18:17.390572+00
> 2020-07-01 09:18:55.256 UTC [32751] DETAIL:  statement: select * from bdt;
>
> I think those information are useful to have (might get rid of
> wait_event_type though: done in the new attached patch).

Since the backend cancels query at a convenient time
(CHECK_FOR_INTERRUPT()), I'm concerned that the wait event information
displayed in that log might not be helpful. For example, even if the
blocker is getting stuck on disk I/O (WAIT_EVENT_BUFFILE_READ etc)
while holding a lock that the recovery process is waiting for, when
the blocker is able to cancel the query it's no longer waiting for
disk I/O. Also, regarding displaying the backend type, we already show
the backend type by setting %d in log_line_prefix.

I still think we should show this additional information (wait event,
query start etc) in errdetails even if we want to show this
information in the server logs. Perhaps we can improve
errdetail_recovery_conflict()?

Aside from the above comments from the perspective of high-level
design, I think we can split this patch into two patches: logging the
recovery process is waiting (adding log_recovery_conficts) and logging
the details of blockers who is about to be interrupted. I think it'll
make review easy.

Regards,

[1] https://www.postgresql.org/docs/devel/error-style-guide.html

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand

On 7/3/20 4:20 AM, Masahiko Sawada wrote:

> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Wed, 1 Jul 2020 at 21:52, Drouvot, Bertrand <[hidden email]> wrote:
>> Hey,
>>
>> On 6/29/20 10:55 AM, Masahiko Sawada wrote:
>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>>>
>>>
>>>
>>> On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <[hidden email]> wrote:
>>>> Hi hackers,
>>>>
>>>> I've attached a patch to add information during standby recovery conflicts.
>>>> The motive behind is to be able to get some information:
>>>>
>>>> On the apply side
>>>> On the blocker(s) side
>>>>
>>>> Motivation:
>>>>
>>>> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.
>>> I think this is a good feature. Like log_lock_waits, it will help the
>>> users to investigate recovery conflict issues.
>> exactly, thanks for looking at it.
>>>> Adding this information would make the investigations easier, it could help answering questions like:
>>>>
>>>> On which LSN was the WAL apply blocked?
>>>> What was the purpose of the bocked WAL record?
>>>> On which relation (if any) was the blocked WAL record related to?
>>>> What was the blocker(s) doing?
>>>> When did the blocker(s) started their queries (if any)?
>>>> What was the blocker(s) waiting for? on which wait event?
>>>>
>>>> Technical context and proposal:
>>>>
>>>> There is 2 points in this patch:
>>>>
>>>> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)
>>> I think we already have the information about the WAL record being
>>> applied in errcontext.
>> right, but it won’t be displayed in case log_error_verbosity is set to
>> terse.
> Yes. I think in this case errcontext or errdetail is more appropriate
> for this information in this case. Otherwise, we will end up emitting
> same WAL record information twice in log_error_verbosity = verbose.
>
> For instance, here is the server logs when I tested a recovery
> conflict on buffer pin but it looks redundant:
>
> 2020-07-03 11:01:15.339 JST [60585] LOG:  wal record apply is blocked
> by 1 connection(s), reason: User query might have needed to see row
> versions that must be removed.
> 2020-07-03 11:01:15.339 JST [60585] CONTEXT:  WAL redo at 0/30025E0
> for Heap2/CLEAN: remxid 505
> 2020-07-03 11:01:15.339 JST [60585] LOG:  blocked wal record rmgr:
> Heap2, lsn: 0/030025E0, received at: 2020-07-03 11:01:15.338997+09,
> desc: CLEAN, relation: rel 1663/12930/16384 fork main blk 0
> 2020-07-03 11:01:15.339 JST [60585] CONTEXT:  WAL redo at 0/30025E0
> for Heap2/CLEAN: remxid 505
> 2020-07-03 11:01:15.347 JST [60604] LOG:  about to be interrupted:
> backend_type: client backend, state: active, wait_event: PgSleep,
> query_start: 2020-07-03 11:01:14.337708+09
> 2020-07-03 11:01:15.347 JST [60604] STATEMENT:  select pg_sleep(30);
> 2020-07-03 11:01:15.347 JST [60604] ERROR:  canceling statement due to
> conflict with recovery
> 2020-07-03 11:01:15.347 JST [60604] DETAIL:  User query might have
> needed to see row versions that must be removed.
> 2020-07-03 11:01:15.347 JST [60604] STATEMENT:  select pg_sleep(30);
>
> There are the information WAL record three times and the reason for
> interruption twice.

Fully makes sense, the new patch version attached is now producing:

2020-07-06 06:10:36.022 UTC [14035] LOG:  waiting for recovery conflict
on snapshot
2020-07-06 06:10:36.022 UTC [14035] DETAIL:  WAL record received at
2020-07-06 06:10:36.021963+00.
         Tablespace/database/relation are 1663/13586/16672, fork is main
and block is 0.
         There is 1 blocking connection(s).
2020-07-06 06:10:36.022 UTC [14035] CONTEXT:  WAL redo at 0/3A05708 for
Heap2/CLEAN: remxid 972

It does not provide the pid(s) of the blocking processes as they'll
appear during the interruption(s).

>> Or did you had in mind to try to avoid using the new
>> “current_replayed_xlog” in xlog.c?
> Regarding LogBlockedWalRecordInfo(), it seems to me that the main
> message of this logging is to let users know both that the process got
> stuck due to recovery conflict and its reason (lock, database,
> bufferpin etc). Other information such as the details of blocked WAL,
> how many processes are blocking seems like additional information. So
> I think this information would be better to be shown errcontext or
> errdetails and we don’t need to create a similar feature as we already
> show the WAL record being replayed in errcontext.

I got rid of current_replayed_xlog in the new patch attached and make
use of the errcontext argument instead.

>
> Also, this function emits two LOG messages related to each other but
> I'm concerned that it can be hard for users to associate these
> separate log lines especially when server logs are written at a high
> rate. And I think these two log lines don't follow our error message
> style[1].
>
>>> I wonder if we can show the recovery conflict information in the main
>>> LOG message, the blocker information in errdetail, and use errcontext
>>> with regard to WAL record information. For example:
>>>
>>> LOG:  process 500 waiting for recovery conflict on snapshot
>>> DETAIL:  conflicting transition id: 100, 200, 300
>>> CONTEXT:  WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506
>> Not sure at all if it would be possible to put all the information at
>> the same time.
>> For example in case of shared buffer pin conflict, the blocker is
>> currently known “only” during the RecoveryConflictInterrupt call (so
>> still not known yet when we can “already” report the blocked LSN
>> information).
>> It might also happen that the blocker(s) will never get interrupted (was
>> not blocking anymore before max_standby_streaming_delay has been
>> reached): then it would not be possible to display all the information
>> here (aka when it is interrupted) while we still want to be warned that
>> the WAL replay has been blocked.
> I think it's okay with showing different information for different
> types of recovery conflict. In buffer pin conflict case, I think we
> can let the user know the process is waiting for recovery conflict on
> buffer pin in the main message and the WAL record being replayed in
> errdetails.
>
>>>> Outcome Example:
>>>>
>>>> 2020-06-15 06:48:23.774 UTC [6971] LOG:  wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed.
>>>> 2020-06-15 06:48:23.774 UTC [6971] LOG:  blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0
>>>> 2020-06-15 06:48:54.773 UTC [7088] LOG:  about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00
>>> I'm concerned that the above information on the process who is about
>>> to be interrupted is very detailed but I'm not sure it will be helpful
>>> for the users. If the blocker is waiting on something lock, the
>>> information should be logged by log_lock_waits.
>> The blocker could also be in “idle in transaction”, say for example, on
>> the standby (with hot_standby_feedback set to off):
>>
>> standby> begin;
>> standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
>> standby> select * from bdt;
>>
>> on the master:
>>
>> master> update bdt set generate_series = 15;
>> master> vacuum verbose bdt;
>>
>> would produce:
>>
>> 2020-07-01 09:18:55.256 UTC [32751] LOG:  about to interrupt pid: 32751,
>> backend_type: client backend, state: idle in transaction,
>> wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01
>> 09:18:17.390572+00
>> 2020-07-01 09:18:55.256 UTC [32751] DETAIL:  statement: select * from bdt;
>>
>> I think those information are useful to have (might get rid of
>> wait_event_type though: done in the new attached patch).
> Since the backend cancels query at a convenient time
> (CHECK_FOR_INTERRUPT()), I'm concerned that the wait event information
> displayed in that log might not be helpful. For example, even if the
> blocker is getting stuck on disk I/O (WAIT_EVENT_BUFFILE_READ etc)
> while holding a lock that the recovery process is waiting for, when
> the blocker is able to cancel the query it's no longer waiting for
> disk I/O. Also, regarding displaying the backend type, we already show
> the backend type by setting %d in log_line_prefix.
>
> I still think we should show this additional information (wait event,
> query start etc) in errdetails even if we want to show this
> information in the server logs. Perhaps we can improve
> errdetail_recovery_conflict()?
>
> Aside from the above comments from the perspective of high-level
> design, I think we can split this patch into two patches: logging the
> recovery process is waiting (adding log_recovery_conficts) and logging
> the details of blockers who is about to be interrupted. I think it'll
> make review easy.

Ok. Let's keep this thread for the new attached patch that focus on the
recovery process waiting.

I'll create a new one with a dedicated thread for the blockers
information once done.

Thanks

Bertrand

> Regards,
>
> [1] https://www.postgresql.org/docs/devel/error-style-guide.html
>
> --
> Masahiko Sawada            http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v1-0001-Add-Standby-Recovery-Conflicts-BlockedOnly-Info.patch (12K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Mon, 6 Jul 2020 at 15:42, Drouvot, Bertrand <[hidden email]> wrote:

>
>
> On 7/3/20 4:20 AM, Masahiko Sawada wrote:
> > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> >
> >
> >
> > On Wed, 1 Jul 2020 at 21:52, Drouvot, Bertrand <[hidden email]> wrote:
> >> Hey,
> >>
> >> On 6/29/20 10:55 AM, Masahiko Sawada wrote:
> >>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> >>>
> >>>
> >>>
> >>> On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <[hidden email]> wrote:
> >>>> Hi hackers,
> >>>>
> >>>> I've attached a patch to add information during standby recovery conflicts.
> >>>> The motive behind is to be able to get some information:
> >>>>
> >>>> On the apply side
> >>>> On the blocker(s) side
> >>>>
> >>>> Motivation:
> >>>>
> >>>> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.
> >>> I think this is a good feature. Like log_lock_waits, it will help the
> >>> users to investigate recovery conflict issues.
> >> exactly, thanks for looking at it.
> >>>> Adding this information would make the investigations easier, it could help answering questions like:
> >>>>
> >>>> On which LSN was the WAL apply blocked?
> >>>> What was the purpose of the bocked WAL record?
> >>>> On which relation (if any) was the blocked WAL record related to?
> >>>> What was the blocker(s) doing?
> >>>> When did the blocker(s) started their queries (if any)?
> >>>> What was the blocker(s) waiting for? on which wait event?
> >>>>
> >>>> Technical context and proposal:
> >>>>
> >>>> There is 2 points in this patch:
> >>>>
> >>>> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)
> >>> I think we already have the information about the WAL record being
> >>> applied in errcontext.
> >> right, but it won’t be displayed in case log_error_verbosity is set to
> >> terse.
> > Yes. I think in this case errcontext or errdetail is more appropriate
> > for this information in this case. Otherwise, we will end up emitting
> > same WAL record information twice in log_error_verbosity = verbose.
> >
> > For instance, here is the server logs when I tested a recovery
> > conflict on buffer pin but it looks redundant:
> >
> > 2020-07-03 11:01:15.339 JST [60585] LOG:  wal record apply is blocked
> > by 1 connection(s), reason: User query might have needed to see row
> > versions that must be removed.
> > 2020-07-03 11:01:15.339 JST [60585] CONTEXT:  WAL redo at 0/30025E0
> > for Heap2/CLEAN: remxid 505
> > 2020-07-03 11:01:15.339 JST [60585] LOG:  blocked wal record rmgr:
> > Heap2, lsn: 0/030025E0, received at: 2020-07-03 11:01:15.338997+09,
> > desc: CLEAN, relation: rel 1663/12930/16384 fork main blk 0
> > 2020-07-03 11:01:15.339 JST [60585] CONTEXT:  WAL redo at 0/30025E0
> > for Heap2/CLEAN: remxid 505
> > 2020-07-03 11:01:15.347 JST [60604] LOG:  about to be interrupted:
> > backend_type: client backend, state: active, wait_event: PgSleep,
> > query_start: 2020-07-03 11:01:14.337708+09
> > 2020-07-03 11:01:15.347 JST [60604] STATEMENT:  select pg_sleep(30);
> > 2020-07-03 11:01:15.347 JST [60604] ERROR:  canceling statement due to
> > conflict with recovery
> > 2020-07-03 11:01:15.347 JST [60604] DETAIL:  User query might have
> > needed to see row versions that must be removed.
> > 2020-07-03 11:01:15.347 JST [60604] STATEMENT:  select pg_sleep(30);
> >
> > There are the information WAL record three times and the reason for
> > interruption twice.
>
>
> Fully makes sense, the new patch version attached is now producing:
>
> 2020-07-06 06:10:36.022 UTC [14035] LOG:  waiting for recovery conflict
> on snapshot

How about adding the subject? that is, "recovery is waiting for
recovery conflict on %s" or "recovery process <pid> is waiting for
conflict on %s".

> 2020-07-06 06:10:36.022 UTC [14035] DETAIL:  WAL record received at
> 2020-07-06 06:10:36.021963+00.
>          Tablespace/database/relation are 1663/13586/16672, fork is main
> and block is 0.
>          There is 1 blocking connection(s).

To follow the existing log message, perhaps this can be something like
"WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d
processes". But I'm not sure the errdetail is the best place to
display the WAL information as I mentioned in the latter part of this
email.

> 2020-07-06 06:10:36.022 UTC [14035] CONTEXT:  WAL redo at 0/3A05708 for
> Heap2/CLEAN: remxid 972
>
> It does not provide the pid(s) of the blocking processes as they'll
> appear during the interruption(s).
>
> >> Or did you had in mind to try to avoid using the new
> >> “current_replayed_xlog” in xlog.c?
> > Regarding LogBlockedWalRecordInfo(), it seems to me that the main
> > message of this logging is to let users know both that the process got
> > stuck due to recovery conflict and its reason (lock, database,
> > bufferpin etc). Other information such as the details of blocked WAL,
> > how many processes are blocking seems like additional information. So
> > I think this information would be better to be shown errcontext or
> > errdetails and we don’t need to create a similar feature as we already
> > show the WAL record being replayed in errcontext.
>
>
> I got rid of current_replayed_xlog in the new patch attached and make
> use of the errcontext argument instead.
>
> >
> > Also, this function emits two LOG messages related to each other but
> > I'm concerned that it can be hard for users to associate these
> > separate log lines especially when server logs are written at a high
> > rate. And I think these two log lines don't follow our error message
> > style[1].
> >
> >>> I wonder if we can show the recovery conflict information in the main
> >>> LOG message, the blocker information in errdetail, and use errcontext
> >>> with regard to WAL record information. For example:
> >>>
> >>> LOG:  process 500 waiting for recovery conflict on snapshot
> >>> DETAIL:  conflicting transition id: 100, 200, 300
> >>> CONTEXT:  WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506
> >> Not sure at all if it would be possible to put all the information at
> >> the same time.
> >> For example in case of shared buffer pin conflict, the blocker is
> >> currently known “only” during the RecoveryConflictInterrupt call (so
> >> still not known yet when we can “already” report the blocked LSN
> >> information).
> >> It might also happen that the blocker(s) will never get interrupted (was
> >> not blocking anymore before max_standby_streaming_delay has been
> >> reached): then it would not be possible to display all the information
> >> here (aka when it is interrupted) while we still want to be warned that
> >> the WAL replay has been blocked.
> > I think it's okay with showing different information for different
> > types of recovery conflict. In buffer pin conflict case, I think we
> > can let the user know the process is waiting for recovery conflict on
> > buffer pin in the main message and the WAL record being replayed in
> > errdetails.
> >
> >>>> Outcome Example:
> >>>>
> >>>> 2020-06-15 06:48:23.774 UTC [6971] LOG:  wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed.
> >>>> 2020-06-15 06:48:23.774 UTC [6971] LOG:  blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0
> >>>> 2020-06-15 06:48:54.773 UTC [7088] LOG:  about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00
> >>> I'm concerned that the above information on the process who is about
> >>> to be interrupted is very detailed but I'm not sure it will be helpful
> >>> for the users. If the blocker is waiting on something lock, the
> >>> information should be logged by log_lock_waits.
> >> The blocker could also be in “idle in transaction”, say for example, on
> >> the standby (with hot_standby_feedback set to off):
> >>
> >> standby> begin;
> >> standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> >> standby> select * from bdt;
> >>
> >> on the master:
> >>
> >> master> update bdt set generate_series = 15;
> >> master> vacuum verbose bdt;
> >>
> >> would produce:
> >>
> >> 2020-07-01 09:18:55.256 UTC [32751] LOG:  about to interrupt pid: 32751,
> >> backend_type: client backend, state: idle in transaction,
> >> wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01
> >> 09:18:17.390572+00
> >> 2020-07-01 09:18:55.256 UTC [32751] DETAIL:  statement: select * from bdt;
> >>
> >> I think those information are useful to have (might get rid of
> >> wait_event_type though: done in the new attached patch).
> > Since the backend cancels query at a convenient time
> > (CHECK_FOR_INTERRUPT()), I'm concerned that the wait event information
> > displayed in that log might not be helpful. For example, even if the
> > blocker is getting stuck on disk I/O (WAIT_EVENT_BUFFILE_READ etc)
> > while holding a lock that the recovery process is waiting for, when
> > the blocker is able to cancel the query it's no longer waiting for
> > disk I/O. Also, regarding displaying the backend type, we already show
> > the backend type by setting %d in log_line_prefix.
> >
> > I still think we should show this additional information (wait event,
> > query start etc) in errdetails even if we want to show this
> > information in the server logs. Perhaps we can improve
> > errdetail_recovery_conflict()?
> >
> > Aside from the above comments from the perspective of high-level
> > design, I think we can split this patch into two patches: logging the
> > recovery process is waiting (adding log_recovery_conficts) and logging
> > the details of blockers who is about to be interrupted. I think it'll
> > make review easy.
>
>
> Ok. Let's keep this thread for the new attached patch that focus on the
> recovery process waiting.

Thank you for updating the patch!

I've tested the latest patch. On recovery conflict on lock and on
bufferpin, if max_standby_streaming_delay is disabled (set to -1), the
logs don't appear even if log_recovery_conflicts is true.

Here is random comments on the code:

+       recovery_conflict_main_message = psprintf("waiting for
recovery conflict on %s",
+
get_procsignal_reason_desc(reason));
:
+               ereport(LOG,
+                               (errmsg("%s", recovery_conflict_main_message),
+                               errdetail("%s\n" "There is %d blocking
connection(s).", wal_record_detail_str, num_waitlist_entries)));

It's not translation-support-friendly. I think the message "waiting
for recovery conflict on %s" should be surrounded by _(). Or we can
just put it to ereport as follows:

ereport(LOG,
    (errmsg("waiting for recovery conflicts on %s",
get_procsignal_reason_desc(reason))
    ...

---
+       oldcontext = MemoryContextSwitchTo(ErrorContext);
+       econtext = error_context_stack;
+
+       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))

I don't think it's a good idea to rely on error_context_stack because
other codes might set another error context before reaching here in
the future.

---
+       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
+               wal_record_detail_str = psprintf("WAL record received
at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and
block is %u.",
+                                               receipt_time_str,
rnode.spcNode, rnode.dbNode, rnode.relNode,
+                                               forkNames[forknum],
+                                               blknum);

There might be a block tag in block ids other than 0. I'm not sure the
errdetail is the best place where we display WAL information. For
instance, we display both the relation oid and block number depending
on RM as follows:

2020-07-07 15:50:30.733 JST [13344] LOG:  waiting for recovery conflict on lock
2020-07-07 15:50:30.733 JST [13344] DETAIL:  WAL record received at
2020-07-07 15:50:27.73378+09.
        There is 1 blocking connection(s).
2020-07-07 15:50:30.733 JST [13344] CONTEXT:  WAL redo at 0/3000028
for Standby/LOCK: xid 506 db 13586 rel 16384

I wonder if we can display the details of redo WAL information by
improving xlog_outdesc() or rm_redo_error_callback() so that it
displays relfilenode, forknum, and block number. What do you think?

---
+       /* display wal record information */
+       if (log_recovery_conflicts)
+               LogBlockedWalRecordInfo(num_waitlist_entries, reason);

I'm concerned that we should log the recovery conflict information
every time when a recovery conflict happens from the perspective of
overheads and the amount of the logs. Can we logs that information
after waiting for deadlock_timeouts secs like log_lock_waits or
waiting for the fixed duration?

---
@@ -609,6 +682,10 @@ StandbyTimeoutHandler(void)
        /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */
        disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false);

+       /* display wal record information */
+       if (log_recovery_conflicts)
+               LogBlockedWalRecordInfo(-1,
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+
        SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
 }

ResolveRecoveryConflictWithBufferPin() which sets a timer to call
StandbyTimeoutHandler() can be called multiple times even if the
recovery is waiting for one buffer pin. I think we should avoid
logging the same contents multiple times.

---
-
+       {
+               {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT,
+                       gettext_noop("Logs standby recovery conflicts."),
+                       NULL
+               },
+               &log_recovery_conflicts,
+               true,
+               NULL, NULL, NULL
+       },

Other logging parameters such as log_lock_waits is false by default. I
think this parameter can also be false by default but is there any
reason to enable it by default?

Since this parameter applies only to the startup process, I think it
should be PGC_SIGHUP.

---
+       /* display wal record information */
+       if (log_recovery_conflicts)
+               LogBlockedWalRecordInfo(CountDBBackends(dbid),
PROCSIG_RECOVERY_CONFLICT_DATABASE);

We log the recovery conflict into the server log but we don't update
the process title to append "waiting". While discussing the process
title update on recovery conflict, I got the review comment[1] that we
don't need to update the process title because no wait occurs when
recovery conflict with database happens. As the comment says, recovery
is canceling the existing processes on the database being removed, but
not waiting for something.

Regards,

[1] https://www.postgresql.org/message-id/84e4ea5f-80ec-9862-d419-c4433b3c2965%40oss.nttdata.com

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand

On 7/7/20 9:43 AM, Masahiko Sawada wrote:
> Fully makes sense, the new patch version attached is now producing:
>> 2020-07-06 06:10:36.022 UTC [14035] LOG:  waiting for recovery conflict
>> on snapshot
> How about adding the subject? that is, "recovery is waiting for
> recovery conflict on %s" or "recovery process <pid> is waiting for
> conflict on %s".


The subject is now added in the new attached patch (I did not include
the pid as it is part of the log prefix).

It now looks like:

2020-07-11 12:00:41.092 UTC [23217] LOG:  recovery is waiting for
recovery conflict on snapshot
2020-07-11 12:00:41.092 UTC [23217] DETAIL:  There is 1 blocking
connection(s).
2020-07-11 12:00:41.092 UTC [23217] CONTEXT:  WAL redo at 0/4A0A6BF0 for
Heap2/CLEAN: remxid 1128
         WAL record received at 2020-07-11 12:00:41.092231+00
         tbs 1663 db 13586 rel 16805, fork main, blkno 0

>
>> 2020-07-06 06:10:36.022 UTC [14035] DETAIL:  WAL record received at
>> 2020-07-06 06:10:36.021963+00.
>>           Tablespace/database/relation are 1663/13586/16672, fork is main
>> and block is 0.
>>           There is 1 blocking connection(s).
> To follow the existing log message, perhaps this can be something like
> "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d
> processes". But I'm not sure the errdetail is the best place to
> display the WAL information as I mentioned in the latter part of this
> email.
moved to the context and formatted the same way as the current
Standby/LOCK context.


> Ok. Let's keep this thread for the new attached patch that focus on the
>> recovery process waiting.
> Thank you for updating the patch!
>
> I've tested the latest patch.


Thank you for testing and reviewing!


> On recovery conflict on lock and on
> bufferpin, if max_standby_streaming_delay is disabled (set to -1), the
> logs don't appear even if log_recovery_conflicts is true.


Nice catch! it is fixed in the new attached patch (the log reporting has
been moved out of StandbyTimeoutHandler()).

>
> Here is random comments on the code:
>
> +       recovery_conflict_main_message = psprintf("waiting for
> recovery conflict on %s",
> +
> get_procsignal_reason_desc(reason));
> :
> +               ereport(LOG,
> +                               (errmsg("%s", recovery_conflict_main_message),
> +                               errdetail("%s\n" "There is %d blocking
> connection(s).", wal_record_detail_str, num_waitlist_entries)));
>
> It's not translation-support-friendly. I think the message "waiting
> for recovery conflict on %s" should be surrounded by _(). Or we can
> just put it to ereport as follows:
>
> ereport(LOG,
>      (errmsg("waiting for recovery conflicts on %s",
> get_procsignal_reason_desc(reason))
>      ...

changed in the new attached patch.


>
> ---
> +       oldcontext = MemoryContextSwitchTo(ErrorContext);
> +       econtext = error_context_stack;
> +
> +       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
>
> I don't think it's a good idea to rely on error_context_stack because
> other codes might set another error context before reaching here in
> the future.

right, changed in the new attached patch: this is now done in
rm_redo_error_callback() and using the XLogReaderState passed as argument.


>
> ---
> +       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
> +               wal_record_detail_str = psprintf("WAL record received
> at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and
> block is %u.",
> +                                               receipt_time_str,
> rnode.spcNode, rnode.dbNode, rnode.relNode,
> +                                               forkNames[forknum],
> +                                               blknum);
>
> There might be a block tag in block ids other than 0.

right, fixed in the new attached patch.


> I'm not sure the
> errdetail is the best place where we display WAL information.


moved to context in the new attached patch.


> For
> instance, we display both the relation oid and block number depending
> on RM as follows:
>
> 2020-07-07 15:50:30.733 JST [13344] LOG:  waiting for recovery conflict on lock
> 2020-07-07 15:50:30.733 JST [13344] DETAIL:  WAL record received at
> 2020-07-07 15:50:27.73378+09.
>          There is 1 blocking connection(s).
> 2020-07-07 15:50:30.733 JST [13344] CONTEXT:  WAL redo at 0/3000028
> for Standby/LOCK: xid 506 db 13586 rel 16384
>
> I wonder if we can display the details of redo WAL information by
> improving xlog_outdesc() or rm_redo_error_callback() so that it
> displays relfilenode, forknum, and block number. What do you think?

I think that fully make sense to move this to rm_redo_error_callback().

This is where the information is now been displayed in the new attached
patch.


>
> ---
> +       /* display wal record information */
> +       if (log_recovery_conflicts)
> +               LogBlockedWalRecordInfo(num_waitlist_entries, reason);
>
> I'm concerned that we should log the recovery conflict information
> every time when a recovery conflict happens from the perspective of
> overheads and the amount of the logs. Can we logs that information
> after waiting for deadlock_timeouts secs like log_lock_waits or
> waiting for the fixed duration?

The new attached patch is now waiting for deadlock_timeout duration.

>
> ---
> @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void)
>          /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */
>          disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false);
>
> +       /* display wal record information */
> +       if (log_recovery_conflicts)
> +               LogBlockedWalRecordInfo(-1,
> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
> +
>          SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
>   }
>
> ResolveRecoveryConflictWithBufferPin() which sets a timer to call
> StandbyTimeoutHandler() can be called multiple times even if the
> recovery is waiting for one buffer pin. I think we should avoid
> logging the same contents multiple times.

I do agree, only the first pass is now been logged.


>
> ---
> -
> +       {
> +               {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT,
> +                       gettext_noop("Logs standby recovery conflicts."),
> +                       NULL
> +               },
> +               &log_recovery_conflicts,
> +               true,
> +               NULL, NULL, NULL
> +       },
>
> Other logging parameters such as log_lock_waits is false by default. I
> think this parameter can also be false by default but is there any
> reason to enable it by default?

now set to false by default.


>
> Since this parameter applies only to the startup process, I think it
> should be PGC_SIGHUP.


changed that way.


>
> ---
> +       /* display wal record information */
> +       if (log_recovery_conflicts)
> +               LogBlockedWalRecordInfo(CountDBBackends(dbid),
> PROCSIG_RECOVERY_CONFLICT_DATABASE);
>
> We log the recovery conflict into the server log but we don't update
> the process title to append "waiting". While discussing the process
> title update on recovery conflict, I got the review comment[1] that we
> don't need to update the process title because no wait occurs when
> recovery conflict with database happens. As the comment says, recovery
> is canceling the existing processes on the database being removed, but
> not waiting for something.

Ok, I keep reporting the conflict but changed the wording for this
particular case.

Thanks

Bertrand


v1-0002-Add-Standby-Recovery-Conflicts-BlockedOnly-Info.patch (15K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Sat, 11 Jul 2020 at 21:56, Drouvot, Bertrand <[hidden email]> wrote:

>
>
> On 7/7/20 9:43 AM, Masahiko Sawada wrote:
> > Fully makes sense, the new patch version attached is now producing:
> >> 2020-07-06 06:10:36.022 UTC [14035] LOG:  waiting for recovery conflict
> >> on snapshot
> > How about adding the subject? that is, "recovery is waiting for
> > recovery conflict on %s" or "recovery process <pid> is waiting for
> > conflict on %s".
>
>
> The subject is now added in the new attached patch (I did not include
> the pid as it is part of the log prefix).
>
> It now looks like:
>
> 2020-07-11 12:00:41.092 UTC [23217] LOG:  recovery is waiting for
> recovery conflict on snapshot
> 2020-07-11 12:00:41.092 UTC [23217] DETAIL:  There is 1 blocking
> connection(s).
> 2020-07-11 12:00:41.092 UTC [23217] CONTEXT:  WAL redo at 0/4A0A6BF0 for
> Heap2/CLEAN: remxid 1128
>          WAL record received at 2020-07-11 12:00:41.092231+00
>          tbs 1663 db 13586 rel 16805, fork main, blkno 0
> >
> >> 2020-07-06 06:10:36.022 UTC [14035] DETAIL:  WAL record received at
> >> 2020-07-06 06:10:36.021963+00.
> >>           Tablespace/database/relation are 1663/13586/16672, fork is main
> >> and block is 0.
> >>           There is 1 blocking connection(s).
> > To follow the existing log message, perhaps this can be something like
> > "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d
> > processes". But I'm not sure the errdetail is the best place to
> > display the WAL information as I mentioned in the latter part of this
> > email.
>
> moved to the context and formatted the same way as the current
> Standby/LOCK context.
>
>
> > Ok. Let's keep this thread for the new attached patch that focus on the
> >> recovery process waiting.
> > Thank you for updating the patch!
> >
> > I've tested the latest patch.
>
>
> Thank you for testing and reviewing!
>
>
> > On recovery conflict on lock and on
> > bufferpin, if max_standby_streaming_delay is disabled (set to -1), the
> > logs don't appear even if log_recovery_conflicts is true.
>
>
> Nice catch! it is fixed in the new attached patch (the log reporting has
> been moved out of StandbyTimeoutHandler()).
>
> >
> > Here is random comments on the code:
> >
> > +       recovery_conflict_main_message = psprintf("waiting for
> > recovery conflict on %s",
> > +
> > get_procsignal_reason_desc(reason));
> > :
> > +               ereport(LOG,
> > +                               (errmsg("%s", recovery_conflict_main_message),
> > +                               errdetail("%s\n" "There is %d blocking
> > connection(s).", wal_record_detail_str, num_waitlist_entries)));
> >
> > It's not translation-support-friendly. I think the message "waiting
> > for recovery conflict on %s" should be surrounded by _(). Or we can
> > just put it to ereport as follows:
> >
> > ereport(LOG,
> >      (errmsg("waiting for recovery conflicts on %s",
> > get_procsignal_reason_desc(reason))
> >      ...
>
>
> changed in the new attached patch.
>
>
> >
> > ---
> > +       oldcontext = MemoryContextSwitchTo(ErrorContext);
> > +       econtext = error_context_stack;
> > +
> > +       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
> >
> > I don't think it's a good idea to rely on error_context_stack because
> > other codes might set another error context before reaching here in
> > the future.
>
>
> right, changed in the new attached patch: this is now done in
> rm_redo_error_callback() and using the XLogReaderState passed as argument.
>
>
> >
> > ---
> > +       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
> > +               wal_record_detail_str = psprintf("WAL record received
> > at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and
> > block is %u.",
> > +                                               receipt_time_str,
> > rnode.spcNode, rnode.dbNode, rnode.relNode,
> > +                                               forkNames[forknum],
> > +                                               blknum);
> >
> > There might be a block tag in block ids other than 0.
>
>
> right, fixed in the new attached patch.
>
>
> > I'm not sure the
> > errdetail is the best place where we display WAL information.
>
>
> moved to context in the new attached patch.
>
>
> > For
> > instance, we display both the relation oid and block number depending
> > on RM as follows:
> >
> > 2020-07-07 15:50:30.733 JST [13344] LOG:  waiting for recovery conflict on lock
> > 2020-07-07 15:50:30.733 JST [13344] DETAIL:  WAL record received at
> > 2020-07-07 15:50:27.73378+09.
> >          There is 1 blocking connection(s).
> > 2020-07-07 15:50:30.733 JST [13344] CONTEXT:  WAL redo at 0/3000028
> > for Standby/LOCK: xid 506 db 13586 rel 16384
> >
> > I wonder if we can display the details of redo WAL information by
> > improving xlog_outdesc() or rm_redo_error_callback() so that it
> > displays relfilenode, forknum, and block number. What do you think?
>
>
> I think that fully make sense to move this to rm_redo_error_callback().
>
> This is where the information is now been displayed in the new attached
> patch.
>
>
> >
> > ---
> > +       /* display wal record information */
> > +       if (log_recovery_conflicts)
> > +               LogBlockedWalRecordInfo(num_waitlist_entries, reason);
> >
> > I'm concerned that we should log the recovery conflict information
> > every time when a recovery conflict happens from the perspective of
> > overheads and the amount of the logs. Can we logs that information
> > after waiting for deadlock_timeouts secs like log_lock_waits or
> > waiting for the fixed duration?
>
>
> The new attached patch is now waiting for deadlock_timeout duration.
>
> >
> > ---
> > @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void)
> >          /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */
> >          disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false);
> >
> > +       /* display wal record information */
> > +       if (log_recovery_conflicts)
> > +               LogBlockedWalRecordInfo(-1,
> > PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
> > +
> >          SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
> >   }
> >
> > ResolveRecoveryConflictWithBufferPin() which sets a timer to call
> > StandbyTimeoutHandler() can be called multiple times even if the
> > recovery is waiting for one buffer pin. I think we should avoid
> > logging the same contents multiple times.
>
>
> I do agree, only the first pass is now been logged.
>
>
> >
> > ---
> > -
> > +       {
> > +               {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT,
> > +                       gettext_noop("Logs standby recovery conflicts."),
> > +                       NULL
> > +               },
> > +               &log_recovery_conflicts,
> > +               true,
> > +               NULL, NULL, NULL
> > +       },
> >
> > Other logging parameters such as log_lock_waits is false by default. I
> > think this parameter can also be false by default but is there any
> > reason to enable it by default?
>
>
> now set to false by default.
>
>
> >
> > Since this parameter applies only to the startup process, I think it
> > should be PGC_SIGHUP.
>
>
> changed that way.
>
>
> >
> > ---
> > +       /* display wal record information */
> > +       if (log_recovery_conflicts)
> > +               LogBlockedWalRecordInfo(CountDBBackends(dbid),
> > PROCSIG_RECOVERY_CONFLICT_DATABASE);
> >
> > We log the recovery conflict into the server log but we don't update
> > the process title to append "waiting". While discussing the process
> > title update on recovery conflict, I got the review comment[1] that we
> > don't need to update the process title because no wait occurs when
> > recovery conflict with database happens. As the comment says, recovery
> > is canceling the existing processes on the database being removed, but
> > not waiting for something.
>
>
> Ok, I keep reporting the conflict but changed the wording for this
> particular case.

Thank you for updating the patch! Here is my comments on the latest
version patch:

+   /*
+    * Ensure we are in the startup process
+    * if we want to log standby recovery conflicts
+    */
+   if (AmStartupProcess() && log_recovery_conflicts)
+   {

This function must be used by only the startup process. Not sure but
if we need this check I think we should use an assertion rather than a
condition in if statement.

I think the information about relfilenode and forknumber is useful
even in a normal recovery case. I think we can always add this
information to errcontext. What do you think?

+       GetXLogReceiptTime(&rtime, &fromStream);
+       if (fromStream)
+       {
+           receipt_time_str = pstrdup(timestamptz_to_str(rtime));
+           appendStringInfo(&buf,"\nWAL record received at %s",
receipt_time_str);

Not sure showing the receipt time of WAL record is useful for users in
this case. IIUC that the receipt time doesn't correspond to individual
WAL records whereas the errcontext information is about the particular
WAL record.

+           for (block_id = 0; block_id <= record->max_block_id; block_id++)
+           {
+               if (XLogRecGetBlockTag(record, block_id, &rnode,
&forknum, &blknum))
+                   appendStringInfo(&buf,"\ntbs %u db %u rel %u, fork
%s, blkno %u",
+                               rnode.spcNode, rnode.dbNode, rnode.relNode,
+                               forkNames[forknum],
+                               blknum);

How about showing something like pg_waldump?

CONTEXT:  WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506, blkref
#0: rel 1000/20000/1234 fork main blk 10, blkref #1: rel
1000/20000/1234 fork main blk 11

or

CONTEXT:  WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506
    blkref #0: rel 1000/20000/1234 fork main blk 10
    blkref #1: rel 1000/20000/1234 fork main blk 11

But the latter format makes grepping difficult.

Also, I guess the changes in rm_redo_error_callback can also be a
separate patch.

+/*
+ * Display information about the wal record
+ * apply being blocked
+ */
+static void
+LogBlockedWalRecordInfo(int num_waitlist_entries, ProcSignalReason reason)

I think the function name needs to be updated. The function no longer
logs WAL record information.

+{
+   if (num_waitlist_entries > 0)
+       if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE)
+           ereport(LOG,
+               (errmsg("recovery is experiencing recovery conflict on
%s", get_procsignal_reason_desc(reason)),
+               errdetail("There is %d conflicting connection(s).",
num_waitlist_entries)));
+       else
+           ereport(LOG,
+               (errmsg("recovery is waiting for recovery conflict on
%s", get_procsignal_reason_desc(reason)),
+               errdetail("There is %d blocking connection(s).",
num_waitlist_entries)));
+   else
+       ereport(LOG,
+           (errmsg("recovery is waiting for recovery conflict on %s",
get_procsignal_reason_desc(reason))));
+}

How about displaying actual virtual transaction ids or process ids the
startup process is waiting for? For example, the log is going to be:

LOG:  recovery is waiting for recovery conflict on snapshot
DETAIL:  Conflicting virtual transaction ids: 100/101, 200/202, 300/303

or

LOG:  recovery is waiting for recovery conflict on snapshot
DETAIL:  Conflicting processes: 123, 456, 789

FYI, errdetail_plural() or errdetail_log_plural() can be used for pluralization.

+   tmpWaitlist = waitlist;
+   while (VirtualTransactionIdIsValid(*tmpWaitlist))
+   {
+       tmpWaitlist++;
+   }
+
+   num_waitlist_entries = (tmpWaitlist - waitlist);
+
+   /* display wal record information */
+   if (log_recovery_conflicts &&
(TimestampDifferenceExceeds(recovery_conflicts_log_time,
GetCurrentTimestamp(),
+                                   DeadlockTimeout))) {
+       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
+       recovery_conflicts_log_time = GetCurrentTimestamp();
+   }

recovery_conflicts_log_time is not initialized. And shouldn't we
compare the current timestamp to the timestamp when the startup
process started waiting?

I think we should call LogBlockedWalRecordInfo() inside of the inner
while loop rather than at the beginning of
ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the
startup process waits until 'ltime', then enters
ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
snapshot and tablespace conflict cases (i.g.
ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace()), it enters
ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
reaching ‘ltime’ inside of the inner while look. So the above
condition could always be false.

I wonder if we can have something like the following function:

bool
LogRecoveryConflict(VirtualTransactionId *waitlist, ProcSignalReason reason,
                                  TimestampTz wait_start)
{
    if (!TimestampDifferenceExceeds(wait_start, GetCurrentTimestamp(),

max_standby_streaming_delay))
        return false;

    if (waitlist)
    {
        char *buf;

        buf = construct a string containing all process ids (or
virtual transaction ids) to resolve from waitlist;

        ereport(LOG,
            (errmsg("recovery is waiting recovery conflict on %s",
get_procsignal_reason_desc(reason)),
            (errdetail_log_plural("Conflicting process : %s.",
              "Conflicting processes : %s.",
              the number of processes in *waitlist, buf))));
    }
    else
        ereport(LOG,
        (errmsg("recovery is resolving recovery conflict on %s",
get_procsignal_reason_desc(reason))));

    return true;
}

'wait_start' is the timestamp when the caller started waiting for the
recovery conflict. This function logs resolving recovery conflict with
detailed information if needed. The caller call this function when
log_recovery_conflicts is enabled as follows:

bool logged = false;
    :
if (log_recovery_conflicts && !logged)
    logged = LogRecoveryConflict(waitlist, reason, waitStart);

+     <varlistentry id="guc-log-recovery-conflicts"
xreflabel="log_recovery_conflicts">
+      <term><varname>log_recovery_conflicts</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>log_recovery_conflicts</varname>
configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Controls whether detailed information is produced when a conflict
+        occurs during standby recovery. The default is <literal>on</literal>.
+        Only superusers can change this setting.
+       </para>
+      </listitem>
+     </varlistentry>

I think the documentation needs to be updated.

IIUC this feature is to logging the startup process is waiting due to
recovery conflict or waiting for recovery conflict resolution. But it
logs only when the startup process waits for longer than a certain
time. I think we can improve the documentation in terms of that point.
Also, the last sentence is no longer true; log_recovery_conflicts is
now PGC_SIGHUP.

BTW 'log_recovery_conflict_waits' might be better name for consistency
with log_lock_waits?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand
Hi,

On 7/31/20 7:12 AM, Masahiko Sawada wrote:

> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Sat, 11 Jul 2020 at 21:56, Drouvot, Bertrand <[hidden email]> wrote:
>>
>> On 7/7/20 9:43 AM, Masahiko Sawada wrote:
>>> Fully makes sense, the new patch version attached is now producing:
>>>> 2020-07-06 06:10:36.022 UTC [14035] LOG:  waiting for recovery conflict
>>>> on snapshot
>>> How about adding the subject? that is, "recovery is waiting for
>>> recovery conflict on %s" or "recovery process <pid> is waiting for
>>> conflict on %s".
>>
>> The subject is now added in the new attached patch (I did not include
>> the pid as it is part of the log prefix).
>>
>> It now looks like:
>>
>> 2020-07-11 12:00:41.092 UTC [23217] LOG:  recovery is waiting for
>> recovery conflict on snapshot
>> 2020-07-11 12:00:41.092 UTC [23217] DETAIL:  There is 1 blocking
>> connection(s).
>> 2020-07-11 12:00:41.092 UTC [23217] CONTEXT:  WAL redo at 0/4A0A6BF0 for
>> Heap2/CLEAN: remxid 1128
>>           WAL record received at 2020-07-11 12:00:41.092231+00
>>           tbs 1663 db 13586 rel 16805, fork main, blkno 0
>>>> 2020-07-06 06:10:36.022 UTC [14035] DETAIL:  WAL record received at
>>>> 2020-07-06 06:10:36.021963+00.
>>>>            Tablespace/database/relation are 1663/13586/16672, fork is main
>>>> and block is 0.
>>>>            There is 1 blocking connection(s).
>>> To follow the existing log message, perhaps this can be something like
>>> "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d
>>> processes". But I'm not sure the errdetail is the best place to
>>> display the WAL information as I mentioned in the latter part of this
>>> email.
>> moved to the context and formatted the same way as the current
>> Standby/LOCK context.
>>
>>
>>> Ok. Let's keep this thread for the new attached patch that focus on the
>>>> recovery process waiting.
>>> Thank you for updating the patch!
>>>
>>> I've tested the latest patch.
>>
>> Thank you for testing and reviewing!
>>
>>
>>> On recovery conflict on lock and on
>>> bufferpin, if max_standby_streaming_delay is disabled (set to -1), the
>>> logs don't appear even if log_recovery_conflicts is true.
>>
>> Nice catch! it is fixed in the new attached patch (the log reporting has
>> been moved out of StandbyTimeoutHandler()).
>>
>>> Here is random comments on the code:
>>>
>>> +       recovery_conflict_main_message = psprintf("waiting for
>>> recovery conflict on %s",
>>> +
>>> get_procsignal_reason_desc(reason));
>>> :
>>> +               ereport(LOG,
>>> +                               (errmsg("%s", recovery_conflict_main_message),
>>> +                               errdetail("%s\n" "There is %d blocking
>>> connection(s).", wal_record_detail_str, num_waitlist_entries)));
>>>
>>> It's not translation-support-friendly. I think the message "waiting
>>> for recovery conflict on %s" should be surrounded by _(). Or we can
>>> just put it to ereport as follows:
>>>
>>> ereport(LOG,
>>>       (errmsg("waiting for recovery conflicts on %s",
>>> get_procsignal_reason_desc(reason))
>>>       ...
>>
>> changed in the new attached patch.
>>
>>
>>> ---
>>> +       oldcontext = MemoryContextSwitchTo(ErrorContext);
>>> +       econtext = error_context_stack;
>>> +
>>> +       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
>>>
>>> I don't think it's a good idea to rely on error_context_stack because
>>> other codes might set another error context before reaching here in
>>> the future.
>>
>> right, changed in the new attached patch: this is now done in
>> rm_redo_error_callback() and using the XLogReaderState passed as argument.
>>
>>
>>> ---
>>> +       if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum))
>>> +               wal_record_detail_str = psprintf("WAL record received
>>> at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and
>>> block is %u.",
>>> +                                               receipt_time_str,
>>> rnode.spcNode, rnode.dbNode, rnode.relNode,
>>> +                                               forkNames[forknum],
>>> +                                               blknum);
>>>
>>> There might be a block tag in block ids other than 0.
>>
>> right, fixed in the new attached patch.
>>
>>
>>> I'm not sure the
>>> errdetail is the best place where we display WAL information.
>>
>> moved to context in the new attached patch.
>>
>>
>>> For
>>> instance, we display both the relation oid and block number depending
>>> on RM as follows:
>>>
>>> 2020-07-07 15:50:30.733 JST [13344] LOG:  waiting for recovery conflict on lock
>>> 2020-07-07 15:50:30.733 JST [13344] DETAIL:  WAL record received at
>>> 2020-07-07 15:50:27.73378+09.
>>>           There is 1 blocking connection(s).
>>> 2020-07-07 15:50:30.733 JST [13344] CONTEXT:  WAL redo at 0/3000028
>>> for Standby/LOCK: xid 506 db 13586 rel 16384
>>>
>>> I wonder if we can display the details of redo WAL information by
>>> improving xlog_outdesc() or rm_redo_error_callback() so that it
>>> displays relfilenode, forknum, and block number. What do you think?
>>
>> I think that fully make sense to move this to rm_redo_error_callback().
>>
>> This is where the information is now been displayed in the new attached
>> patch.
>>
>>
>>> ---
>>> +       /* display wal record information */
>>> +       if (log_recovery_conflicts)
>>> +               LogBlockedWalRecordInfo(num_waitlist_entries, reason);
>>>
>>> I'm concerned that we should log the recovery conflict information
>>> every time when a recovery conflict happens from the perspective of
>>> overheads and the amount of the logs. Can we logs that information
>>> after waiting for deadlock_timeouts secs like log_lock_waits or
>>> waiting for the fixed duration?
>>
>> The new attached patch is now waiting for deadlock_timeout duration.
>>
>>> ---
>>> @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void)
>>>           /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */
>>>           disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false);
>>>
>>> +       /* display wal record information */
>>> +       if (log_recovery_conflicts)
>>> +               LogBlockedWalRecordInfo(-1,
>>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
>>> +
>>>           SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
>>>    }
>>>
>>> ResolveRecoveryConflictWithBufferPin() which sets a timer to call
>>> StandbyTimeoutHandler() can be called multiple times even if the
>>> recovery is waiting for one buffer pin. I think we should avoid
>>> logging the same contents multiple times.
>>
>> I do agree, only the first pass is now been logged.
>>
>>
>>> ---
>>> -
>>> +       {
>>> +               {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT,
>>> +                       gettext_noop("Logs standby recovery conflicts."),
>>> +                       NULL
>>> +               },
>>> +               &log_recovery_conflicts,
>>> +               true,
>>> +               NULL, NULL, NULL
>>> +       },
>>>
>>> Other logging parameters such as log_lock_waits is false by default. I
>>> think this parameter can also be false by default but is there any
>>> reason to enable it by default?
>>
>> now set to false by default.
>>
>>
>>> Since this parameter applies only to the startup process, I think it
>>> should be PGC_SIGHUP.
>>
>> changed that way.
>>
>>
>>> ---
>>> +       /* display wal record information */
>>> +       if (log_recovery_conflicts)
>>> +               LogBlockedWalRecordInfo(CountDBBackends(dbid),
>>> PROCSIG_RECOVERY_CONFLICT_DATABASE);
>>>
>>> We log the recovery conflict into the server log but we don't update
>>> the process title to append "waiting". While discussing the process
>>> title update on recovery conflict, I got the review comment[1] that we
>>> don't need to update the process title because no wait occurs when
>>> recovery conflict with database happens. As the comment says, recovery
>>> is canceling the existing processes on the database being removed, but
>>> not waiting for something.
>>
>> Ok, I keep reporting the conflict but changed the wording for this
>> particular case.
> Thank you for updating the patch! Here is my comments on the latest
> version patch:
>
> +   /*
> +    * Ensure we are in the startup process
> +    * if we want to log standby recovery conflicts
> +    */
> +   if (AmStartupProcess() && log_recovery_conflicts)
> +   {
>
> This function must be used by only the startup process. Not sure but
> if we need this check I think we should use an assertion rather than a
> condition in if statement.
>
> I think the information about relfilenode and forknumber is useful
> even in a normal recovery case. I think we can always add this
> information to errcontext. What do you think?
Fully agree, that it could be useful outside the context of this patch.

>
> +       GetXLogReceiptTime(&rtime, &fromStream);
> +       if (fromStream)
> +       {
> +           receipt_time_str = pstrdup(timestamptz_to_str(rtime));
> +           appendStringInfo(&buf,"\nWAL record received at %s",
> receipt_time_str);
>
> Not sure showing the receipt time of WAL record is useful for users in
> this case. IIUC that the receipt time doesn't correspond to individual
> WAL records whereas the errcontext information is about the particular
> WAL record.
>
> +           for (block_id = 0; block_id <= record->max_block_id; block_id++)
> +           {
> +               if (XLogRecGetBlockTag(record, block_id, &rnode,
> &forknum, &blknum))
> +                   appendStringInfo(&buf,"\ntbs %u db %u rel %u, fork
> %s, blkno %u",
> +                               rnode.spcNode, rnode.dbNode, rnode.relNode,
> +                               forkNames[forknum],
> +                               blknum);
>
> How about showing something like pg_waldump?
>
> CONTEXT:  WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506, blkref
> #0: rel 1000/20000/1234 fork main blk 10, blkref #1: rel
> 1000/20000/1234 fork main blk 11
>
> or
>
> CONTEXT:  WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506
>      blkref #0: rel 1000/20000/1234 fork main blk 10
>      blkref #1: rel 1000/20000/1234 fork main blk 11
>
> But the latter format makes grepping difficult.
>
> Also, I guess the changes in rm_redo_error_callback can also be a
> separate patch.
I fully agree with your comments.

As you have already seen I created a new patch dedicated to the
rm_redo_error_callback() changes:
https://commitfest.postgresql.org/29/2668/.

Then those changes are not part of the new attached patch related to
this thread anymore.

>
> +/*
> + * Display information about the wal record
> + * apply being blocked
> + */
> +static void
> +LogBlockedWalRecordInfo(int num_waitlist_entries, ProcSignalReason reason)
>
> I think the function name needs to be updated. The function no longer
> logs WAL record information.
right.

>
> +{
> +   if (num_waitlist_entries > 0)
> +       if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE)
> +           ereport(LOG,
> +               (errmsg("recovery is experiencing recovery conflict on
> %s", get_procsignal_reason_desc(reason)),
> +               errdetail("There is %d conflicting connection(s).",
> num_waitlist_entries)));
> +       else
> +           ereport(LOG,
> +               (errmsg("recovery is waiting for recovery conflict on
> %s", get_procsignal_reason_desc(reason)),
> +               errdetail("There is %d blocking connection(s).",
> num_waitlist_entries)));
> +   else
> +       ereport(LOG,
> +           (errmsg("recovery is waiting for recovery conflict on %s",
> get_procsignal_reason_desc(reason))));
> +}
>
> How about displaying actual virtual transaction ids or process ids the
> startup process is waiting for? For example, the log is going to be:
>
> LOG:  recovery is waiting for recovery conflict on snapshot
> DETAIL:  Conflicting virtual transaction ids: 100/101, 200/202, 300/303
The new attached patch adds the information related to the virtual
transaction ids, so that the outcome is now:

2020-08-10 13:31:00.485 UTC [1760] LOG:  recovery is resolving recovery
conflict on snapshot
2020-08-10 13:31:00.485 UTC [1760] DETAIL:  Conflicting virtual
transaction ids: 4/2, 3/2.

>
> or
>
> LOG:  recovery is waiting for recovery conflict on snapshot
> DETAIL:  Conflicting processes: 123, 456, 789
I think it makes more sense to display the conflicting virtual
transaction id(s), as the blocking process(es) will be known during the
statement cancellation.

>
> FYI, errdetail_plural() or errdetail_log_plural() can be used for pluralization.
>
> +   tmpWaitlist = waitlist;
> +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
> +   {
> +       tmpWaitlist++;
> +   }
> +
> +   num_waitlist_entries = (tmpWaitlist - waitlist);
> +
> +   /* display wal record information */
> +   if (log_recovery_conflicts &&
> (TimestampDifferenceExceeds(recovery_conflicts_log_time,
> GetCurrentTimestamp(),
> +                                   DeadlockTimeout))) {
> +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
> +       recovery_conflicts_log_time = GetCurrentTimestamp();
> +   }
>
> recovery_conflicts_log_time is not initialized. And shouldn't we
> compare the current timestamp to the timestamp when the startup
> process started waiting?
>
> I think we should call LogBlockedWalRecordInfo() inside of the inner
> while loop rather than at the beginning of
> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the
> startup process waits until 'ltime', then enters
> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
> snapshot and tablespace conflict cases (i.g.
> ResolveRecoveryConflictWithSnapshot() and
> ResolveRecoveryConflictWithTablespace()), it enters
> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
> reaching ‘ltime’ inside of the inner while look. So the above
> condition could always be false.
That would make the information being displayed after
max_standby_streaming_delay is reached for the multiple cases you just
described.

It indeed makes more sense to call the function in the inside of the
inner loop: this is done in the new attached patch.

>
> I wonder if we can have something like the following function:
>
> bool
> LogRecoveryConflict(VirtualTransactionId *waitlist, ProcSignalReason reason,
>                                    TimestampTz wait_start)
> {
>      if (!TimestampDifferenceExceeds(wait_start, GetCurrentTimestamp(),
>
> max_standby_streaming_delay))
>          return false;
>
>      if (waitlist)
>      {
>          char *buf;
>
>          buf = construct a string containing all process ids (or
> virtual transaction ids) to resolve from waitlist;
>
>          ereport(LOG,
>              (errmsg("recovery is waiting recovery conflict on %s",
> get_procsignal_reason_desc(reason)),
>              (errdetail_log_plural("Conflicting process : %s.",
>                "Conflicting processes : %s.",
>                the number of processes in *waitlist, buf))));
>      }
>      else
>          ereport(LOG,
>          (errmsg("recovery is resolving recovery conflict on %s",
> get_procsignal_reason_desc(reason))));
>
>      return true;
> }
>
> 'wait_start' is the timestamp when the caller started waiting for the
> recovery conflict. This function logs resolving recovery conflict with
> detailed information if needed. The caller call this function when
> log_recovery_conflicts is enabled as follows:
>
> bool logged = false;
>      :
> if (log_recovery_conflicts && !logged)
>      logged = LogRecoveryConflict(waitlist, reason, waitStart);
It's more or less what the new attached patch is doing, except that it
deals with the waitStart comparison in the while loop (to make the
changes to the LogRecoveryConflict() calls outside
ResolveRecoveryConflictWithVirtualXIDs() simpler).

For the buffer pin case, I moved the call to LogRecoveryConflict to
StandbyTimeoutHandler (instead of the beginning of
ResolveRecoveryConflictWithBufferPin).

That way we are consistent across all the cases: we are logging once the
conflict resolution is actually happening (means we are ready/about to
cancel what is needed).

Then, the LogRecoveryConflict() is now not reporting that "recovery is
waiting recovery conflict" anymore but that "recovery is resolving
recovery conflict" in all the cases.

>
> +     <varlistentry id="guc-log-recovery-conflicts"
> xreflabel="log_recovery_conflicts">
> +      <term><varname>log_recovery_conflicts</varname> (<type>boolean</type>)
> +      <indexterm>
> +       <primary><varname>log_recovery_conflicts</varname>
> configuration parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        Controls whether detailed information is produced when a conflict
> +        occurs during standby recovery. The default is <literal>on</literal>.
> +        Only superusers can change this setting.
> +       </para>
> +      </listitem>
> +     </varlistentry>
>
> I think the documentation needs to be updated.
Right, updated that way in the new attached patch.

>
> IIUC this feature is to logging the startup process is waiting due to
> recovery conflict or waiting for recovery conflict resolution. But it
> logs only when the startup process waits for longer than a certain
> time. I think we can improve the documentation in terms of that point.
> Also, the last sentence is no longer true; log_recovery_conflicts is
> now PGC_SIGHUP.
>
> BTW 'log_recovery_conflict_waits' might be better name for consistency
> with log_lock_waits?
As per the changes in this new patch, we are not logging when the
startup process starts waiting anymore, but only during the conflict
resolution.

I changed the GUC to "log_recovery_conflicts_resolution" instead (based
on my comment above).

Thanks,

Bertrand


v1-0003-Add-Standby-Recovery-Conflicts-BlockedOnly-Info.patch (11K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <[hidden email]> wrote:

>
> Hi,
>
> On 7/31/20 7:12 AM, Masahiko Sawada wrote:
> > +   tmpWaitlist = waitlist;
> > +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
> > +   {
> > +       tmpWaitlist++;
> > +   }
> > +
> > +   num_waitlist_entries = (tmpWaitlist - waitlist);
> > +
> > +   /* display wal record information */
> > +   if (log_recovery_conflicts &&
> > (TimestampDifferenceExceeds(recovery_conflicts_log_time,
> > GetCurrentTimestamp(),
> > +                                   DeadlockTimeout))) {
> > +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
> > +       recovery_conflicts_log_time = GetCurrentTimestamp();
> > +   }
> >
> > recovery_conflicts_log_time is not initialized. And shouldn't we
> > compare the current timestamp to the timestamp when the startup
> > process started waiting?
> >
> > I think we should call LogBlockedWalRecordInfo() inside of the inner
> > while loop rather than at the beginning of
> > ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the
> > startup process waits until 'ltime', then enters
> > ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
> > Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
> > beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
> > snapshot and tablespace conflict cases (i.g.
> > ResolveRecoveryConflictWithSnapshot() and
> > ResolveRecoveryConflictWithTablespace()), it enters
> > ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
> > reaching ‘ltime’ inside of the inner while look. So the above
> > condition could always be false.
>
> That would make the information being displayed after
> max_standby_streaming_delay is reached for the multiple cases you just
> described.

Sorry, it should be deadlock_timeout, not max_standby_streaming_delay.
Otherwise, the recovery conflict log message is printed when
resolution, which seems not to achieve the original purpose. Am I
missing something?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand

On 8/27/20 10:16 AM, Masahiko Sawada wrote

>
> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <[hidden email]> wrote:
>> Hi,
>>
>> On 7/31/20 7:12 AM, Masahiko Sawada wrote:
>>> +   tmpWaitlist = waitlist;
>>> +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
>>> +   {
>>> +       tmpWaitlist++;
>>> +   }
>>> +
>>> +   num_waitlist_entries = (tmpWaitlist - waitlist);
>>> +
>>> +   /* display wal record information */
>>> +   if (log_recovery_conflicts &&
>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time,
>>> GetCurrentTimestamp(),
>>> +                                   DeadlockTimeout))) {
>>> +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
>>> +       recovery_conflicts_log_time = GetCurrentTimestamp();
>>> +   }
>>>
>>> recovery_conflicts_log_time is not initialized. And shouldn't we
>>> compare the current timestamp to the timestamp when the startup
>>> process started waiting?
>>>
>>> I think we should call LogBlockedWalRecordInfo() inside of the inner
>>> while loop rather than at the beginning of
>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the
>>> startup process waits until 'ltime', then enters
>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
>>> snapshot and tablespace conflict cases (i.g.
>>> ResolveRecoveryConflictWithSnapshot() and
>>> ResolveRecoveryConflictWithTablespace()), it enters
>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
>>> reaching ‘ltime’ inside of the inner while look. So the above
>>> condition could always be false.
>> That would make the information being displayed after
>> max_standby_streaming_delay is reached for the multiple cases you just
>> described.
> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay.
> Otherwise, the recovery conflict log message is printed when
> resolution, which seems not to achieve the original purpose. Am I
> missing something?

Ok, I understand where the confusion is coming from.

Indeed the new version is now printing the recovery conflict log message
during the conflict resolution (while the initial intention was to be
warned as soon as the replay had to wait).

The advantage of the new version is that it would be consistent across
all the conflicts scenarios (if not, we would get messages during the
resolution or when the replay started waiting, depending of the conflict
scenario).

On the other hand, the cons of the new version is that we would miss
messages when no resolution is needed (replay wait duration <
max_standby_streaming_delay), but is that really annoying? (As no
cancellation would occur)

Thinking about it, i like the new version (being warned during the
resolution) as we would get messages only when cancelation will occur
(which is what the user might want to avoid, so the extra info would be
useful).

What do you think?

Bertrand



Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <[hidden email]> wrote:

>
>
> On 8/27/20 10:16 AM, Masahiko Sawada wrote
> >
> > On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <[hidden email]> wrote:
> >> Hi,
> >>
> >> On 7/31/20 7:12 AM, Masahiko Sawada wrote:
> >>> +   tmpWaitlist = waitlist;
> >>> +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
> >>> +   {
> >>> +       tmpWaitlist++;
> >>> +   }
> >>> +
> >>> +   num_waitlist_entries = (tmpWaitlist - waitlist);
> >>> +
> >>> +   /* display wal record information */
> >>> +   if (log_recovery_conflicts &&
> >>> (TimestampDifferenceExceeds(recovery_conflicts_log_time,
> >>> GetCurrentTimestamp(),
> >>> +                                   DeadlockTimeout))) {
> >>> +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
> >>> +       recovery_conflicts_log_time = GetCurrentTimestamp();
> >>> +   }
> >>>
> >>> recovery_conflicts_log_time is not initialized. And shouldn't we
> >>> compare the current timestamp to the timestamp when the startup
> >>> process started waiting?
> >>>
> >>> I think we should call LogBlockedWalRecordInfo() inside of the inner
> >>> while loop rather than at the beginning of
> >>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the
> >>> startup process waits until 'ltime', then enters
> >>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
> >>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
> >>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
> >>> snapshot and tablespace conflict cases (i.g.
> >>> ResolveRecoveryConflictWithSnapshot() and
> >>> ResolveRecoveryConflictWithTablespace()), it enters
> >>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
> >>> reaching ‘ltime’ inside of the inner while look. So the above
> >>> condition could always be false.
> >> That would make the information being displayed after
> >> max_standby_streaming_delay is reached for the multiple cases you just
> >> described.
> > Sorry, it should be deadlock_timeout, not max_standby_streaming_delay.
> > Otherwise, the recovery conflict log message is printed when
> > resolution, which seems not to achieve the original purpose. Am I
> > missing something?
>
> Ok, I understand where the confusion is coming from.
>
> Indeed the new version is now printing the recovery conflict log message
> during the conflict resolution (while the initial intention was to be
> warned as soon as the replay had to wait).
>
> The advantage of the new version is that it would be consistent across
> all the conflicts scenarios (if not, we would get messages during the
> resolution or when the replay started waiting, depending of the conflict
> scenario).
>
> On the other hand, the cons of the new version is that we would miss
> messages when no resolution is needed (replay wait duration <
> max_standby_streaming_delay), but is that really annoying? (As no
> cancellation would occur)
>
> Thinking about it, i like the new version (being warned during the
> resolution) as we would get messages only when cancelation will occur
> (which is what the user might want to avoid, so the extra info would be
> useful).
>
> What do you think?

Hmm, I think we print the reason why backends are canceled even of as
now by ProcessInterrupts(). With this patch and related patches you
proposed on another thread, the startup process reports virtual xids
being interrupted, the reason, and LSN of blocked WAL, then processes
will also report its virtual xid and reason. Therefore, the new
information added by these patches is only the LSN of blocked WAL.
Also, the blocked WAL would be unblocked just after the startup
process reports the resolution message. What use cases are you
assuming?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand

On 8/28/20 7:03 AM, Masahiko Sawada wrote:

> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <[hidden email]> wrote:
>>
>> On 8/27/20 10:16 AM, Masahiko Sawada wrote
>>> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <[hidden email]> wrote:
>>>> Hi,
>>>>
>>>> On 7/31/20 7:12 AM, Masahiko Sawada wrote:
>>>>> +   tmpWaitlist = waitlist;
>>>>> +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
>>>>> +   {
>>>>> +       tmpWaitlist++;
>>>>> +   }
>>>>> +
>>>>> +   num_waitlist_entries = (tmpWaitlist - waitlist);
>>>>> +
>>>>> +   /* display wal record information */
>>>>> +   if (log_recovery_conflicts &&
>>>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time,
>>>>> GetCurrentTimestamp(),
>>>>> +                                   DeadlockTimeout))) {
>>>>> +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
>>>>> +       recovery_conflicts_log_time = GetCurrentTimestamp();
>>>>> +   }
>>>>>
>>>>> recovery_conflicts_log_time is not initialized. And shouldn't we
>>>>> compare the current timestamp to the timestamp when the startup
>>>>> process started waiting?
>>>>>
>>>>> I think we should call LogBlockedWalRecordInfo() inside of the inner
>>>>> while loop rather than at the beginning of
>>>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the
>>>>> startup process waits until 'ltime', then enters
>>>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
>>>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
>>>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
>>>>> snapshot and tablespace conflict cases (i.g.
>>>>> ResolveRecoveryConflictWithSnapshot() and
>>>>> ResolveRecoveryConflictWithTablespace()), it enters
>>>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
>>>>> reaching ‘ltime’ inside of the inner while look. So the above
>>>>> condition could always be false.
>>>> That would make the information being displayed after
>>>> max_standby_streaming_delay is reached for the multiple cases you just
>>>> described.
>>> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay.
>>> Otherwise, the recovery conflict log message is printed when
>>> resolution, which seems not to achieve the original purpose. Am I
>>> missing something?
>> Ok, I understand where the confusion is coming from.
>>
>> Indeed the new version is now printing the recovery conflict log message
>> during the conflict resolution (while the initial intention was to be
>> warned as soon as the replay had to wait).
>>
>> The advantage of the new version is that it would be consistent across
>> all the conflicts scenarios (if not, we would get messages during the
>> resolution or when the replay started waiting, depending of the conflict
>> scenario).
>>
>> On the other hand, the cons of the new version is that we would miss
>> messages when no resolution is needed (replay wait duration <
>> max_standby_streaming_delay), but is that really annoying? (As no
>> cancellation would occur)
>>
>> Thinking about it, i like the new version (being warned during the
>> resolution) as we would get messages only when cancelation will occur
>> (which is what the user might want to avoid, so the extra info would be
>> useful).
>>
>> What do you think?
> Hmm, I think we print the reason why backends are canceled even of as
> now by ProcessInterrupts(). With this patch and related patches you
> proposed on another thread, the startup process reports virtual xids
> being interrupted, the reason, and LSN of blocked WAL, then processes
> will also report its virtual xid and reason. Therefore, the new
> information added by these patches is only the LSN of blocked WAL.

That's completely right, let's come back to the original intention of
this patch (means, don't wait for the conflict resolution to log messages).

I'll submit a new version once updated.

Thanks!

Bertrand



Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand

On 8/28/20 4:14 PM, Drouvot, Bertrand wrote:

>
> On 8/28/20 7:03 AM, Masahiko Sawada wrote:
>> CAUTION: This email originated from outside of the organization. Do
>> not click links or open attachments unless you can confirm the sender
>> and know the content is safe.
>>
>>
>>
>> On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <[hidden email]>
>> wrote:
>>>
>>> On 8/27/20 10:16 AM, Masahiko Sawada wrote
>>>> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand
>>>> <[hidden email]> wrote:
>>>>> Hi,
>>>>>
>>>>> On 7/31/20 7:12 AM, Masahiko Sawada wrote:
>>>>>> +   tmpWaitlist = waitlist;
>>>>>> +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
>>>>>> +   {
>>>>>> +       tmpWaitlist++;
>>>>>> +   }
>>>>>> +
>>>>>> +   num_waitlist_entries = (tmpWaitlist - waitlist);
>>>>>> +
>>>>>> +   /* display wal record information */
>>>>>> +   if (log_recovery_conflicts &&
>>>>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time,
>>>>>> GetCurrentTimestamp(),
>>>>>> +                                   DeadlockTimeout))) {
>>>>>> +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
>>>>>> +       recovery_conflicts_log_time = GetCurrentTimestamp();
>>>>>> +   }
>>>>>>
>>>>>> recovery_conflicts_log_time is not initialized. And shouldn't we
>>>>>> compare the current timestamp to the timestamp when the startup
>>>>>> process started waiting?
>>>>>>
>>>>>> I think we should call LogBlockedWalRecordInfo() inside of the inner
>>>>>> while loop rather than at the beginning of
>>>>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases,
>>>>>> the
>>>>>> startup process waits until 'ltime', then enters
>>>>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
>>>>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
>>>>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
>>>>>> snapshot and tablespace conflict cases (i.g.
>>>>>> ResolveRecoveryConflictWithSnapshot() and
>>>>>> ResolveRecoveryConflictWithTablespace()), it enters
>>>>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
>>>>>> reaching ‘ltime’ inside of the inner while look. So the above
>>>>>> condition could always be false.
>>>>> That would make the information being displayed after
>>>>> max_standby_streaming_delay is reached for the multiple cases you
>>>>> just
>>>>> described.
>>>> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay.
>>>> Otherwise, the recovery conflict log message is printed when
>>>> resolution, which seems not to achieve the original purpose. Am I
>>>> missing something?
>>> Ok, I understand where the confusion is coming from.
>>>
>>> Indeed the new version is now printing the recovery conflict log
>>> message
>>> during the conflict resolution (while the initial intention was to be
>>> warned as soon as the replay had to wait).
>>>
>>> The advantage of the new version is that it would be consistent across
>>> all the conflicts scenarios (if not, we would get messages during the
>>> resolution or when the replay started waiting, depending of the
>>> conflict
>>> scenario).
>>>
>>> On the other hand, the cons of the new version is that we would miss
>>> messages when no resolution is needed (replay wait duration <
>>> max_standby_streaming_delay), but is that really annoying? (As no
>>> cancellation would occur)
>>>
>>> Thinking about it, i like the new version (being warned during the
>>> resolution) as we would get messages only when cancelation will occur
>>> (which is what the user might want to avoid, so the extra info would be
>>> useful).
>>>
>>> What do you think?
>> Hmm, I think we print the reason why backends are canceled even of as
>> now by ProcessInterrupts(). With this patch and related patches you
>> proposed on another thread, the startup process reports virtual xids
>> being interrupted, the reason, and LSN of blocked WAL, then processes
>> will also report its virtual xid and reason. Therefore, the new
>> information added by these patches is only the LSN of blocked WAL.
>
> That's completely right, let's come back to the original intention of
> this patch (means, don't wait for the conflict resolution to log
> messages).
>
> I'll submit a new version once updated.
Please find attached the new patch.

It provides the following outcomes depending on the conflict:

2020-10-04 09:08:51.923 UTC [30788] LOG:  recovery is waiting recovery
conflict on buffer pin

OR

2020-10-04 09:52:25.832 UTC [1249] LOG:  recovery is waiting recovery
conflict on snapshot
2020-10-04 09:52:25.832 UTC [1249] DETAIL:  Conflicting virtual
transaction ids: 3/2, 2/4.

OR

2020-10-04 09:11:51.717 UTC [30788] LOG:  recovery is waiting recovery
conflict on lock
2020-10-04 09:11:51.717 UTC [30788] DETAIL:  Conflicting virtual
transaction id: 2/5.

OR

2020-10-04 09:13:04.104 UTC [30788] LOG:  recovery is resolving recovery
conflict on database

Thanks

Bertrand


v1-0004-Add-Standby-Recovery-Conflicts-BlockedOnly-Info.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Álvaro Herrera
> +extern const char *get_procsignal_reason_desc(ProcSignalReason reason)
> + {
> + const char *reasonDesc = "unknown reason";
> +
> + switch (reason)
> + {
> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
> + reasonDesc = "buffer pin";
> + break;

It doesn't work to construct sentences from pieces, for translatability
reasons.  Maybe you can return the whole errmsg sentence from this
routine instead.



Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Drouvot, Bertrand
Hi,

On 10/4/20 4:10 PM, Alvaro Herrera wrote

>> +extern const char *get_procsignal_reason_desc(ProcSignalReason reason)
>> +     {
>> +             const char *reasonDesc = "unknown reason";
>> +
>> +             switch (reason)
>> +             {
>> +                     case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
>> +                             reasonDesc = "buffer pin";
>> +                             break;
> It doesn't work to construct sentences from pieces, for translatability
> reasons.  Maybe you can return the whole errmsg sentence from this
> routine instead.
>
Thanks for the feedback!

Enclosed a new version that take care of it.

Bertrand


v1-0005-Add-Standby-Recovery-Conflicts-BlockedOnly-Info.patch (16K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
In reply to this post by Drouvot, Bertrand
On Sun, 4 Oct 2020 at 20:48, Drouvot, Bertrand <[hidden email]> wrote:

>
>
> On 8/28/20 4:14 PM, Drouvot, Bertrand wrote:
> >
> > On 8/28/20 7:03 AM, Masahiko Sawada wrote:
> >> CAUTION: This email originated from outside of the organization. Do
> >> not click links or open attachments unless you can confirm the sender
> >> and know the content is safe.
> >>
> >>
> >>
> >> On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <[hidden email]>
> >> wrote:
> >>>
> >>> On 8/27/20 10:16 AM, Masahiko Sawada wrote
> >>>> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand
> >>>> <[hidden email]> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 7/31/20 7:12 AM, Masahiko Sawada wrote:
> >>>>>> +   tmpWaitlist = waitlist;
> >>>>>> +   while (VirtualTransactionIdIsValid(*tmpWaitlist))
> >>>>>> +   {
> >>>>>> +       tmpWaitlist++;
> >>>>>> +   }
> >>>>>> +
> >>>>>> +   num_waitlist_entries = (tmpWaitlist - waitlist);
> >>>>>> +
> >>>>>> +   /* display wal record information */
> >>>>>> +   if (log_recovery_conflicts &&
> >>>>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time,
> >>>>>> GetCurrentTimestamp(),
> >>>>>> +                                   DeadlockTimeout))) {
> >>>>>> +       LogBlockedWalRecordInfo(num_waitlist_entries, reason);
> >>>>>> +       recovery_conflicts_log_time = GetCurrentTimestamp();
> >>>>>> +   }
> >>>>>>
> >>>>>> recovery_conflicts_log_time is not initialized. And shouldn't we
> >>>>>> compare the current timestamp to the timestamp when the startup
> >>>>>> process started waiting?
> >>>>>>
> >>>>>> I think we should call LogBlockedWalRecordInfo() inside of the inner
> >>>>>> while loop rather than at the beginning of
> >>>>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases,
> >>>>>> the
> >>>>>> startup process waits until 'ltime', then enters
> >>>>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'.
> >>>>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the
> >>>>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in
> >>>>>> snapshot and tablespace conflict cases (i.g.
> >>>>>> ResolveRecoveryConflictWithSnapshot() and
> >>>>>> ResolveRecoveryConflictWithTablespace()), it enters
> >>>>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for
> >>>>>> reaching ‘ltime’ inside of the inner while look. So the above
> >>>>>> condition could always be false.
> >>>>> That would make the information being displayed after
> >>>>> max_standby_streaming_delay is reached for the multiple cases you
> >>>>> just
> >>>>> described.
> >>>> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay.
> >>>> Otherwise, the recovery conflict log message is printed when
> >>>> resolution, which seems not to achieve the original purpose. Am I
> >>>> missing something?
> >>> Ok, I understand where the confusion is coming from.
> >>>
> >>> Indeed the new version is now printing the recovery conflict log
> >>> message
> >>> during the conflict resolution (while the initial intention was to be
> >>> warned as soon as the replay had to wait).
> >>>
> >>> The advantage of the new version is that it would be consistent across
> >>> all the conflicts scenarios (if not, we would get messages during the
> >>> resolution or when the replay started waiting, depending of the
> >>> conflict
> >>> scenario).
> >>>
> >>> On the other hand, the cons of the new version is that we would miss
> >>> messages when no resolution is needed (replay wait duration <
> >>> max_standby_streaming_delay), but is that really annoying? (As no
> >>> cancellation would occur)
> >>>
> >>> Thinking about it, i like the new version (being warned during the
> >>> resolution) as we would get messages only when cancelation will occur
> >>> (which is what the user might want to avoid, so the extra info would be
> >>> useful).
> >>>
> >>> What do you think?
> >> Hmm, I think we print the reason why backends are canceled even of as
> >> now by ProcessInterrupts(). With this patch and related patches you
> >> proposed on another thread, the startup process reports virtual xids
> >> being interrupted, the reason, and LSN of blocked WAL, then processes
> >> will also report its virtual xid and reason. Therefore, the new
> >> information added by these patches is only the LSN of blocked WAL.
> >
> > That's completely right, let's come back to the original intention of
> > this patch (means, don't wait for the conflict resolution to log
> > messages).
> >
> > I'll submit a new version once updated.
>
> Please find attached the new patch.
>
> It provides the following outcomes depending on the conflict:
>
> 2020-10-04 09:08:51.923 UTC [30788] LOG:  recovery is waiting recovery
> conflict on buffer pin
>
> OR
>
> 2020-10-04 09:52:25.832 UTC [1249] LOG:  recovery is waiting recovery
> conflict on snapshot
> 2020-10-04 09:52:25.832 UTC [1249] DETAIL:  Conflicting virtual
> transaction ids: 3/2, 2/4.
>
> OR
>
> 2020-10-04 09:11:51.717 UTC [30788] LOG:  recovery is waiting recovery
> conflict on lock
> 2020-10-04 09:11:51.717 UTC [30788] DETAIL:  Conflicting virtual
> transaction id: 2/5.
>
> OR
>
> 2020-10-04 09:13:04.104 UTC [30788] LOG:  recovery is resolving recovery
> conflict on database
>
Sorry I didn't realize I failed to send review comments. Here is my
comments on the previous version (v4) patch (attached to [1]):

---
@ -3828,6 +3830,8 @@ LockBufferForCleanup(Buffer buffer)
                        GetPrivateRefCount(buffer));

       bufHdr = GetBufferDescriptor(buffer - 1);
       waitStart = GetCurrentTimestamp();
       logged = false;

I think it's better to avoid calling GetCurrentTimestamp() when the
timestamp is not used. In the above case, we can call it only when
log_recovery_conflict_waits is true.

---
+void
+LogRecoveryConflict(VirtualTransactionId *waitlist, ProcSignalReason
reason, bool waiting)

It seems to me that having 'reason' at the first argument is natural.

---
+       /* Log the recovery conflict */
+       if (log_recovery_conflict_waits)
+               LogRecoveryConflict(NULL,
PROCSIG_RECOVERY_CONFLICT_DATABASE, false);

There seems no need to log recovery conflicts because no wait occurs
when recovery conflict with database happens.

---
if (InHotStandby)
                {
                        /* Set a timer and wait for that or for the
Lock to be granted */
+                       VirtualTransactionId *backends;
+                       backends =
GetLockConflicts(&locallock->tag.lock, AccessExclusiveLock, NULL);
+                       if (log_recovery_conflict_waits && !logged) {
+                               LogRecoveryConflict(backends,
PROCSIG_RECOVERY_CONFLICT_LOCK, true);
+                               logged = true;
+                       }
+
                        ResolveRecoveryConflictWithLock(locallock->tag.lock);
                }

We log the recovery conflicts without checking deadlock timeout here.
I think we should call LogRecoveryConflict() after
ResolveRecoveryConflictWithLock() with the check if we're waiting for
longer than deadlock_time sec. That way, we can also remove the
following description in the doc, which seems more appropriate
behavior:

+        For the lock conflict case, it does not wait for
<varname>deadlock_timeout</varname> to be reached.

Also, in ResolveRecoveryConflictWithVirtualXIDs() called from
ResolveRecoveryConflictWithLock(), we could log the same again here:

+                       /* Log the recovery conflict */
+                       if (log_recovery_conflict_waits && !logged
+                                       && TimestampDifferenceExceeds(waitStart,
+
GetCurrentTimestamp(), DeadlockTimeout))
+                       {
+                               LogRecoveryConflict(waitlist, reason,
report_waiting);
+                               logged = true;
+                       }

Note that since the 'report_waiting' is false in this case, waitStart
is not initialized.

I think we can use 'report_waiting' for this purpose too. The
'report_waiting' is false in the lock conflict cases where we also
want to skip logging a recovery conflict in
ResolveRecoveryConflictWithVirtualXIDs().

---
@@ -1069,6 +1069,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod
lockMethodTable)
       PGPROC     *proc;
        PGPROC     *leader = MyProc->lockGroupLeader;
        int                     i;
+       bool logged;

I guess it's better to rename 'logged' to 'logged_recovery_confclit'
or other to make the purpose clear. This function logs not only
recovery conflicts but also deadlock and waits for lock.

As an alternative idea, possibly we can allow it to report the log
multiple times with how long waiting for. We pass the time when the
wait started to LogRecoveryConflict() and show something like
"recovery is still waiting recovery conflict on buffer pin after 100.0
ms". But we can also leave it as the next improvement.

---
+extern const char *get_procsignal_reason_desc(ProcSignalReason reason)
+   {
+       const char *reasonDesc = "unknown reason";
+

I think we can move this function to standby.c.

---
The patch needs to run pgindent. For instance, the following change
doesn't follow the coding style:

+
+       if (waitlist && waiting) {
+               vxids = waitlist;
+               count = 0;
+               initStringInfo(&buf);

---
Currently, we report the list of the conflicted virtual transaction
ids but perhaps showing process ids instead seems better. What do you
think?

I've attached the patch as an idea of fixing the above comments as
well as the comment from Alvaro. I can be applied on top of v4 patch.

[1] https://www.postgresql.org/message-id/29da248a-e21c-a3eb-a051-f1ec79b13d31%40amazon.com

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

fix_masahiko.patch (16K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Álvaro Herrera
On 2020-Oct-14, Masahiko Sawada wrote:

> I've attached the patch as an idea of fixing the above comments as
> well as the comment from Alvaro. I can be applied on top of v4 patch.

One note about the translation stuff.  Currently you have _("...") where
the string is produced, and then ereport(.., errmsg("%s", str) where it
is used.  Both those things will attempt to translate the string, which
isn't great.  It is better if we only translate once.  You have two
options to fix this: one is to change _() to gettext_noop() (which marks
the string for translation so that it appears in the message catalog,
but it does not return the translation -- it returns the original, and
then errmsg() translates at run time).  The other is to change errmsg()
to errmsg_internal() .. so the function returns the translated message
and errmsg_internal() doesn't apply a translation.

I prefer the first option, because if we ever include a server feature
to log both the non-translated message alongside the translated one, we
will already have both in hand.


Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Masahiko Sawada-2
On Wed, 14 Oct 2020 at 07:44, Alvaro Herrera <[hidden email]> wrote:

>
> On 2020-Oct-14, Masahiko Sawada wrote:
>
> > I've attached the patch as an idea of fixing the above comments as
> > well as the comment from Alvaro. I can be applied on top of v4 patch.
>
> One note about the translation stuff.  Currently you have _("...") where
> the string is produced, and then ereport(.., errmsg("%s", str) where it
> is used.  Both those things will attempt to translate the string, which
> isn't great.  It is better if we only translate once.  You have two
> options to fix this: one is to change _() to gettext_noop() (which marks
> the string for translation so that it appears in the message catalog,
> but it does not return the translation -- it returns the original, and
> then errmsg() translates at run time).  The other is to change errmsg()
> to errmsg_internal() .. so the function returns the translated message
> and errmsg_internal() doesn't apply a translation.
>
> I prefer the first option, because if we ever include a server feature
> to log both the non-translated message alongside the translated one, we
> will already have both in hand.
Thanks, I didn't know that. So perhaps ATWrongRelkindError() has the
same translation problem? It uses _() when producing the message but
also uses errmsg().

I've attached the patch changed accordingly. I also fixed some bugs
around recovery conflicts on locks and changed the code so that the
log shows pids instead of virtual transaction ids since pids are much
easy to use for the users.

Regards,


--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

v6-0001-Log-the-standby-recovery-conflict-waits.patch (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add Information during standby recovery conflicts

Kyotaro Horiguchi-4
At Wed, 14 Oct 2020 17:39:20 +0900, Masahiko Sawada <[hidden email]> wrote in

> On Wed, 14 Oct 2020 at 07:44, Alvaro Herrera <[hidden email]> wrote:
> >
> > On 2020-Oct-14, Masahiko Sawada wrote:
> >
> > > I've attached the patch as an idea of fixing the above comments as
> > > well as the comment from Alvaro. I can be applied on top of v4 patch.
> >
> > One note about the translation stuff.  Currently you have _("...") where
> > the string is produced, and then ereport(.., errmsg("%s", str) where it
> > is used.  Both those things will attempt to translate the string, which
> > isn't great.  It is better if we only translate once.  You have two
> > options to fix this: one is to change _() to gettext_noop() (which marks
> > the string for translation so that it appears in the message catalog,
> > but it does not return the translation -- it returns the original, and
> > then errmsg() translates at run time).  The other is to change errmsg()
> > to errmsg_internal() .. so the function returns the translated message
> > and errmsg_internal() doesn't apply a translation.
> >
> > I prefer the first option, because if we ever include a server feature
> > to log both the non-translated message alongside the translated one, we
> > will already have both in hand.
>
> Thanks, I didn't know that. So perhaps ATWrongRelkindError() has the
> same translation problem? It uses _() when producing the message but
> also uses errmsg().
>
> I've attached the patch changed accordingly. I also fixed some bugs
> around recovery conflicts on locks and changed the code so that the
> log shows pids instead of virtual transaction ids since pids are much
> easy to use for the users.

You're misunderstanding.

ereport(..(errmsg("%s", _("hogehoge")))) results in
fprintf((translated("%s")), translate("hogehoge")).

So your change (errmsg("%s", gettext_noop("hogehoge")) results in

fprintf((translated("%s")), DONT_translate("hogehoge")).

which leads to a translation problem.

(errmsg(gettext_noop("hogehoge"))

works fine. You can see the instance in aclchk.c.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


123