improve wals replay on secondary

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

improve wals replay on secondary

Mariel Cherkassky
Hey,
PG 9.6, I have a standalone configured. I tried to start up a secondary, run standby clone (repmgr). The clone process took 3 hours and during that time wals were generated(mostly because of the checkpoint_timeout). As a result of that, when I start the secondary ,I see that the secondary keeps getting the wals but I dont see any messages that indicate that the secondary tried to replay the wals. 
messages that i see :
receiving incremental file list
000000010000377B000000DE

sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
total size is 4.15M  speedup is 1.01
2019-05-22 12:48:10 EEST  60942  LOG:  restored log file "000000010000377B000000DE" from archive
2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is starting up
2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is starting up 

I was hoping to see the following messages (taken from a different machine) : 
2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete: wrote 406 buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0 recycled; write=41.390 s, sync=0.001 s, total=41.582 s; sync file
s=128, longest=0.000 s, average=0.000 s; distance=2005 kB, estimate=2699 kB
2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at 4/D096C4F8

My primary settings(wals settings) : 
wal_buffers = 16MB
checkpoint_completion_target = 0.9
checkpoint_timeout = 30min

Any idea what can explain why the secondary doesnt replay the wals ?
Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Fabio Pardi
Hi Mariel,

if i m not wrong, on the secondary you will see the messages you
mentioned when a checkpoint happens.

What are checkpoint_timeout and max_wal_size on your standby?

Did you ever see this on your standby log?

"consistent recovery state reached at .."


Maybe you can post your whole configuration of your standby for easier
debug.

regards,

fabio pardi




On 5/27/19 10:49 AM, Mariel Cherkassky wrote:

> Hey,
> PG 9.6, I have a standalone configured. I tried to start up a secondary,
> run standby clone (repmgr). The clone process took 3 hours and during
> that time wals were generated(mostly because of the checkpoint_timeout).
> As a result of that, when I start the secondary ,I see that the
> secondary keeps getting the wals but I dont see any messages that
> indicate that the secondary tried to replay the wals. 
> messages that i see :
> receiving incremental file list
> 000000010000377B000000DE
>
> sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
> total size is 4.15M  speedup is 1.01
> 2019-05-22 12:48:10 EEST  60942  LOG:  restored log file
> "000000010000377B000000DE" from archive
> 2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is starting up
> 2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is
> starting up 
>
> I was hoping to see the following messages (taken from a different
> machine) : 
> 2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
> 2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete: wrote 406
> buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0 recycled;
> write=41.390 s, sync=0.001 s, total=41.582 s; sync file
> s=128, longest=0.000 s, average=0.000 s; distance=2005 kB, estimate=2699 kB
> 2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at 4/D096C4F8
>
> My primary settings(wals settings) : 
> wal_buffers = 16MB
> checkpoint_completion_target = 0.9
> checkpoint_timeout = 30min
>
> Any idea what can explain why the secondary doesnt replay the wals ?


Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Fabio Pardi
Hi Mariel,

let s keep the list in cc...

settings look ok.

what's in the recovery.conf file then?

regards,

fabio pardi

On 5/27/19 11:23 AM, Mariel Cherkassky wrote:

> Hey,
> the configuration is the same as in the primary : 
> max_wal_size = 2GB
> min_wal_size = 1GB
> wal_buffers = 16MB
> checkpoint_completion_target = 0.9
> checkpoint_timeout = 30min
>
> Regarding your question, I didnt see this message (consistent recovery
> state reached at), I guess thats why the secondary isnt avaialble yet..
>
> Maybe I'm wrong, but what I understood from the documentation- restart
> point is generated only after the secondary had a checkpoint wihch means
> only after 30 minutes or after max_wal_size is reached ?  But still, why
> wont the secondary reach a consisteny recovery state (does it requires a
> restart point to be generated ? )
>
>
> ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     if i m not wrong, on the secondary you will see the messages you
>     mentioned when a checkpoint happens.
>
>     What are checkpoint_timeout and max_wal_size on your standby?
>
>     Did you ever see this on your standby log?
>
>     "consistent recovery state reached at .."
>
>
>     Maybe you can post your whole configuration of your standby for easier
>     debug.
>
>     regards,
>
>     fabio pardi
>
>
>
>
>     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     > Hey,
>     > PG 9.6, I have a standalone configured. I tried to start up a
>     secondary,
>     > run standby clone (repmgr). The clone process took 3 hours and during
>     > that time wals were generated(mostly because of the
>     checkpoint_timeout).
>     > As a result of that, when I start the secondary ,I see that the
>     > secondary keeps getting the wals but I dont see any messages that
>     > indicate that the secondary tried to replay the wals. 
>     > messages that i see :
>     > receiving incremental file list
>     > 000000010000377B000000DE
>     >
>     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     > total size is 4.15M  speedup is 1.01
>     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored log file
>     > "000000010000377B000000DE" from archive
>     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is
>     starting up
>     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is
>     > starting up 
>     >
>     > I was hoping to see the following messages (taken from a different
>     > machine) : 
>     > 2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
>     > 2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete: wrote 406
>     > buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0
>     recycled;
>     > write=41.390 s, sync=0.001 s, total=41.582 s; sync file
>     > s=128, longest=0.000 s, average=0.000 s; distance=2005 kB,
>     estimate=2699 kB
>     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at
>     4/D096C4F8
>     >
>     > My primary settings(wals settings) : 
>     > wal_buffers = 16MB
>     > checkpoint_completion_target = 0.9
>     > checkpoint_timeout = 30min
>     >
>     > Any idea what can explain why the secondary doesnt replay the wals ?
>
>


Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Mariel Cherkassky
standby_mode = 'on'
primary_conninfo = 'host=X.X.X.X user=repmgr  connect_timeout=10 '
recovery_target_timeline = 'latest'
primary_slot_name = repmgr_slot_1
restore_command = 'rsync -avzhe ssh [hidden email]:/var/lib/pgsql/archive/%f /var/lib/pgsql/archive/%f ; gunzip < /var/lib/pgsql/archive/%f > %p'
archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup /var/lib/pgsql/archive %r'

‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏ <‪[hidden email]‬‏>:‬
Hi Mariel,

let s keep the list in cc...

settings look ok.

what's in the recovery.conf file then?

regards,

fabio pardi

On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
> Hey,
> the configuration is the same as in the primary : 
> max_wal_size = 2GB
> min_wal_size = 1GB
> wal_buffers = 16MB
> checkpoint_completion_target = 0.9
> checkpoint_timeout = 30min
>
> Regarding your question, I didnt see this message (consistent recovery
> state reached at), I guess thats why the secondary isnt avaialble yet..
>
> Maybe I'm wrong, but what I understood from the documentation- restart
> point is generated only after the secondary had a checkpoint wihch means
> only after 30 minutes or after max_wal_size is reached ?  But still, why
> wont the secondary reach a consisteny recovery state (does it requires a
> restart point to be generated ? )
>
>
> ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     if i m not wrong, on the secondary you will see the messages you
>     mentioned when a checkpoint happens.
>
>     What are checkpoint_timeout and max_wal_size on your standby?
>
>     Did you ever see this on your standby log?
>
>     "consistent recovery state reached at .."
>
>
>     Maybe you can post your whole configuration of your standby for easier
>     debug.
>
>     regards,
>
>     fabio pardi
>
>
>
>
>     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     > Hey,
>     > PG 9.6, I have a standalone configured. I tried to start up a
>     secondary,
>     > run standby clone (repmgr). The clone process took 3 hours and during
>     > that time wals were generated(mostly because of the
>     checkpoint_timeout).
>     > As a result of that, when I start the secondary ,I see that the
>     > secondary keeps getting the wals but I dont see any messages that
>     > indicate that the secondary tried to replay the wals. 
>     > messages that i see :
>     > receiving incremental file list
>     > 000000010000377B000000DE
>     >
>     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     > total size is 4.15M  speedup is 1.01
>     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored log file
>     > "000000010000377B000000DE" from archive
>     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is
>     starting up
>     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is
>     > starting up 
>     >
>     > I was hoping to see the following messages (taken from a different
>     > machine) : 
>     > 2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
>     > 2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete: wrote 406
>     > buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0
>     recycled;
>     > write=41.390 s, sync=0.001 s, total=41.582 s; sync file
>     > s=128, longest=0.000 s, average=0.000 s; distance=2005 kB,
>     estimate=2699 kB
>     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at
>     4/D096C4F8
>     >
>     > My primary settings(wals settings) : 
>     > wal_buffers = 16MB
>     > checkpoint_completion_target = 0.9
>     > checkpoint_timeout = 30min
>     >
>     > Any idea what can explain why the secondary doesnt replay the wals ?
>
>
Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Fabio Pardi
If you did not even see this messages on your standby logs:

restartpoint starting: xlog

then it means that the checkpoint was even never started.

In that case, I have no clue.

Try to describe step by step how to reproduce the problem together with
your setup and the version number of Postgres and repmgr, and i might be
able to help you further.

regards,

fabio pardi

On 5/27/19 12:17 PM, Mariel Cherkassky wrote:

> standby_mode = 'on'
> primary_conninfo = 'host=X.X.X.X user=repmgr  connect_timeout=10 '
> recovery_target_timeline = 'latest'
> primary_slot_name = repmgr_slot_1
> restore_command = 'rsync -avzhe ssh
> [hidden email]:/var/lib/pgsql/archive/%f /var/lib/pgsql/archive/%f ;
> gunzip < /var/lib/pgsql/archive/%f > %p'
> archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
> /var/lib/pgsql/archive %r'
>
> ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     let s keep the list in cc...
>
>     settings look ok.
>
>     what's in the recovery.conf file then?
>
>     regards,
>
>     fabio pardi
>
>     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     > Hey,
>     > the configuration is the same as in the primary : 
>     > max_wal_size = 2GB
>     > min_wal_size = 1GB
>     > wal_buffers = 16MB
>     > checkpoint_completion_target = 0.9
>     > checkpoint_timeout = 30min
>     >
>     > Regarding your question, I didnt see this message (consistent recovery
>     > state reached at), I guess thats why the secondary isnt avaialble
>     yet..
>     >
>     > Maybe I'm wrong, but what I understood from the documentation- restart
>     > point is generated only after the secondary had a checkpoint wihch
>     means
>     > only after 30 minutes or after max_wal_size is reached ?  But
>     still, why
>     > wont the secondary reach a consisteny recovery state (does it
>     requires a
>     > restart point to be generated ? )
>     >
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     Hi Mariel,
>     >
>     >     if i m not wrong, on the secondary you will see the messages you
>     >     mentioned when a checkpoint happens.
>     >
>     >     What are checkpoint_timeout and max_wal_size on your standby?
>     >
>     >     Did you ever see this on your standby log?
>     >
>     >     "consistent recovery state reached at .."
>     >
>     >
>     >     Maybe you can post your whole configuration of your standby
>     for easier
>     >     debug.
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >
>     >
>     >
>     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     > Hey,
>     >     > PG 9.6, I have a standalone configured. I tried to start up a
>     >     secondary,
>     >     > run standby clone (repmgr). The clone process took 3 hours
>     and during
>     >     > that time wals were generated(mostly because of the
>     >     checkpoint_timeout).
>     >     > As a result of that, when I start the secondary ,I see that the
>     >     > secondary keeps getting the wals but I dont see any messages
>     that
>     >     > indicate that the secondary tried to replay the wals. 
>     >     > messages that i see :
>     >     > receiving incremental file list
>     >     > 000000010000377B000000DE
>     >     >
>     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     > total size is 4.15M  speedup is 1.01
>     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored log file
>     >     > "000000010000377B000000DE" from archive
>     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is
>     >     starting up
>     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is
>     >     > starting up 
>     >     >
>     >     > I was hoping to see the following messages (taken from a
>     different
>     >     > machine) : 
>     >     > 2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
>     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete:
>     wrote 406
>     >     > buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0
>     >     recycled;
>     >     > write=41.390 s, sync=0.001 s, total=41.582 s; sync file
>     >     > s=128, longest=0.000 s, average=0.000 s; distance=2005 kB,
>     >     estimate=2699 kB
>     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at
>     >     4/D096C4F8
>     >     >
>     >     > My primary settings(wals settings) : 
>     >     > wal_buffers = 16MB
>     >     > checkpoint_completion_target = 0.9
>     >     > checkpoint_timeout = 30min
>     >     >
>     >     > Any idea what can explain why the secondary doesnt replay
>     the wals ?
>     >
>     >
>


Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Fabio Pardi
Hi Mariel,

please keep the list posted. When you reply, use 'reply all'. That will maybe help others in the community and you might also get more help from others.

answers are in line here below



On 28/05/2019 10:54, Mariel Cherkassky wrote:
> I have pg 9.6, repmgr version 4.3 .
> I see in the logs that wal files are restored : 
> 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file "000000010000377B000000DB" from archive
> that means that the restore_command worked right ? 
>

right

> According to the docs :
> "In standby mode, a restartpoint is also triggered if checkpoint_segments log segments have been replayed since last restartpoint and at least one checkpoint record has been replayed. Restartpoints can't be performed more frequently than checkpoints in the master because restartpoints can only be performed at checkpoint records"
> so maybe I should decrease max_wal_size or even checkpoint_timeout to force a restartpoint ? 
> During this gap (standby clone) 6-7 wals were generated on the primary
>


From what you posted earlier, you should in any case have hit a checkpoint every 30 minutes. (That was also the assumption in the previous messages. If that's not happening, then i would really investigate.)

That is, if during your cloning only a few WAL files were generated, then it is not enough to trigger a checkpoint and you fallback to the 30 minutes.

I would not be bothered if i were you, but can always force a checkpoint on the master issuing:

CHECKPOINT ;

at that stage, on the standby logs you will see the messages:

restartpoint starting: ..

restartpoint complete: ..



regards,

fabio pardi

>
> ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏ <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     If you did not even see this messages on your standby logs:
>
>     restartpoint starting: xlog
>
>     then it means that the checkpoint was even never started.
>
>     In that case, I have no clue.
>
>     Try to describe step by step how to reproduce the problem together with
>     your setup and the version number of Postgres and repmgr, and i might be
>     able to help you further.
>
>     regards,
>
>     fabio pardi
>
>     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     > standby_mode = 'on'
>     > primary_conninfo = 'host=X.X.X.X user=repmgr  connect_timeout=10 '
>     > recovery_target_timeline = 'latest'
>     > primary_slot_name = repmgr_slot_1
>     > restore_command = 'rsync -avzhe ssh
>     > [hidden email]:/var/lib/pgsql/archive/%f /var/lib/pgsql/archive/%f ;
>     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     > archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
>     > /var/lib/pgsql/archive %r'
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     > <‪[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     Hi Mariel,
>     >
>     >     let s keep the list in cc...
>     >
>     >     settings look ok.
>     >
>     >     what's in the recovery.conf file then?
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     > Hey,
>     >     > the configuration is the same as in the primary : 
>     >     > max_wal_size = 2GB
>     >     > min_wal_size = 1GB
>     >     > wal_buffers = 16MB
>     >     > checkpoint_completion_target = 0.9
>     >     > checkpoint_timeout = 30min
>     >     >
>     >     > Regarding your question, I didnt see this message (consistent recovery
>     >     > state reached at), I guess thats why the secondary isnt avaialble
>     >     yet..
>     >     >
>     >     > Maybe I'm wrong, but what I understood from the documentation- restart
>     >     > point is generated only after the secondary had a checkpoint wihch
>     >     means
>     >     > only after 30 minutes or after max_wal_size is reached ?  But
>     >     still, why
>     >     > wont the secondary reach a consisteny recovery state (does it
>     >     requires a
>     >     > restart point to be generated ? )
>     >     >
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     >     > <‪[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     Hi Mariel,
>     >     >
>     >     >     if i m not wrong, on the secondary you will see the messages you
>     >     >     mentioned when a checkpoint happens.
>     >     >
>     >     >     What are checkpoint_timeout and max_wal_size on your standby?
>     >     >
>     >     >     Did you ever see this on your standby log?
>     >     >
>     >     >     "consistent recovery state reached at .."
>     >     >
>     >     >
>     >     >     Maybe you can post your whole configuration of your standby
>     >     for easier
>     >     >     debug.
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >
>     >     >
>     >     >
>     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     > Hey,
>     >     >     > PG 9.6, I have a standalone configured. I tried to start up a
>     >     >     secondary,
>     >     >     > run standby clone (repmgr). The clone process took 3 hours
>     >     and during
>     >     >     > that time wals were generated(mostly because of the
>     >     >     checkpoint_timeout).
>     >     >     > As a result of that, when I start the secondary ,I see that the
>     >     >     > secondary keeps getting the wals but I dont see any messages
>     >     that
>     >     >     > indicate that the secondary tried to replay the wals. 
>     >     >     > messages that i see :
>     >     >     > receiving incremental file list
>     >     >     > 000000010000377B000000DE
>     >     >     >
>     >     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored log file
>     >     >     > "000000010000377B000000DE" from archive
>     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is
>     >     >     starting up
>     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is
>     >     >     > starting up 
>     >     >     >
>     >     >     > I was hoping to see the following messages (taken from a
>     >     different
>     >     >     > machine) : 
>     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
>     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete:
>     >     wrote 406
>     >     >     > buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0
>     >     >     recycled;
>     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s; sync file
>     >     >     > s=128, longest=0.000 s, average=0.000 s; distance=2005 kB,
>     >     >     estimate=2699 kB
>     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at
>     >     >     4/D096C4F8
>     >     >     >
>     >     >     > My primary settings(wals settings) : 
>     >     >     > wal_buffers = 16MB
>     >     >     > checkpoint_completion_target = 0.9
>     >     >     > checkpoint_timeout = 30min
>     >     >     >
>     >     >     > Any idea what can explain why the secondary doesnt replay
>     >     the wals ?
>     >     >
>     >     >
>     >
>


Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Mariel Cherkassky
First of all thanks Fabio.
I think that I'm missing something : 
In the next questionI'm not talking about streaming replication,rather on recovery : 

1.When the secondary get the wals from the primary it tries to replay them correct ? 

2. By replaying it just go over the wal records and run them in the secondary ?

3.All those changes are saved in the shared_buffer(secondary) or the changed are immediately done on the data files blocks ?

4.The secondary will need a checkpoint in order to flush those changes to the data files and in order to reach a restart point ?

So, basically If I had a checkpoint during the clone, the secondary should also have a checkpoint when I recover the secondary right ?


‫בתאריך יום ג׳, 28 במאי 2019 ב-13:54 מאת ‪Fabio Pardi‬‏ <‪[hidden email]‬‏>:‬
Hi Mariel,

please keep the list posted. When you reply, use 'reply all'. That will maybe help others in the community and you might also get more help from others.

answers are in line here below



On 28/05/2019 10:54, Mariel Cherkassky wrote:
> I have pg 9.6, repmgr version 4.3 .
> I see in the logs that wal files are restored : 
> 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file "000000010000377B000000DB" from archive
> that means that the restore_command worked right ? 
>

right

> According to the docs :
> "In standby mode, a restartpoint is also triggered if checkpoint_segments log segments have been replayed since last restartpoint and at least one checkpoint record has been replayed. Restartpoints can't be performed more frequently than checkpoints in the master because restartpoints can only be performed at checkpoint records"
> so maybe I should decrease max_wal_size or even checkpoint_timeout to force a restartpoint ? 
> During this gap (standby clone) 6-7 wals were generated on the primary
>


From what you posted earlier, you should in any case have hit a checkpoint every 30 minutes. (That was also the assumption in the previous messages. If that's not happening, then i would really investigate.)

That is, if during your cloning only a few WAL files were generated, then it is not enough to trigger a checkpoint and you fallback to the 30 minutes.

I would not be bothered if i were you, but can always force a checkpoint on the master issuing:

CHECKPOINT ;

at that stage, on the standby logs you will see the messages:

restartpoint starting: ..

restartpoint complete: ..



regards,

fabio pardi

>
> ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏ <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     If you did not even see this messages on your standby logs:
>
>     restartpoint starting: xlog
>
>     then it means that the checkpoint was even never started.
>
>     In that case, I have no clue.
>
>     Try to describe step by step how to reproduce the problem together with
>     your setup and the version number of Postgres and repmgr, and i might be
>     able to help you further.
>
>     regards,
>
>     fabio pardi
>
>     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     > standby_mode = 'on'
>     > primary_conninfo = 'host=X.X.X.X user=repmgr  connect_timeout=10 '
>     > recovery_target_timeline = 'latest'
>     > primary_slot_name = repmgr_slot_1
>     > restore_command = 'rsync -avzhe ssh
>     > [hidden email]:/var/lib/pgsql/archive/%f /var/lib/pgsql/archive/%f ;
>     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     > archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
>     > /var/lib/pgsql/archive %r'
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     > <‪[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     Hi Mariel,
>     >
>     >     let s keep the list in cc...
>     >
>     >     settings look ok.
>     >
>     >     what's in the recovery.conf file then?
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     > Hey,
>     >     > the configuration is the same as in the primary : 
>     >     > max_wal_size = 2GB
>     >     > min_wal_size = 1GB
>     >     > wal_buffers = 16MB
>     >     > checkpoint_completion_target = 0.9
>     >     > checkpoint_timeout = 30min
>     >     >
>     >     > Regarding your question, I didnt see this message (consistent recovery
>     >     > state reached at), I guess thats why the secondary isnt avaialble
>     >     yet..
>     >     >
>     >     > Maybe I'm wrong, but what I understood from the documentation- restart
>     >     > point is generated only after the secondary had a checkpoint wihch
>     >     means
>     >     > only after 30 minutes or after max_wal_size is reached ?  But
>     >     still, why
>     >     > wont the secondary reach a consisteny recovery state (does it
>     >     requires a
>     >     > restart point to be generated ? )
>     >     >
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     >     > <‪[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     Hi Mariel,
>     >     >
>     >     >     if i m not wrong, on the secondary you will see the messages you
>     >     >     mentioned when a checkpoint happens.
>     >     >
>     >     >     What are checkpoint_timeout and max_wal_size on your standby?
>     >     >
>     >     >     Did you ever see this on your standby log?
>     >     >
>     >     >     "consistent recovery state reached at .."
>     >     >
>     >     >
>     >     >     Maybe you can post your whole configuration of your standby
>     >     for easier
>     >     >     debug.
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >
>     >     >
>     >     >
>     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     > Hey,
>     >     >     > PG 9.6, I have a standalone configured. I tried to start up a
>     >     >     secondary,
>     >     >     > run standby clone (repmgr). The clone process took 3 hours
>     >     and during
>     >     >     > that time wals were generated(mostly because of the
>     >     >     checkpoint_timeout).
>     >     >     > As a result of that, when I start the secondary ,I see that the
>     >     >     > secondary keeps getting the wals but I dont see any messages
>     >     that
>     >     >     > indicate that the secondary tried to replay the wals. 
>     >     >     > messages that i see :
>     >     >     > receiving incremental file list
>     >     >     > 000000010000377B000000DE
>     >     >     >
>     >     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored log file
>     >     >     > "000000010000377B000000DE" from archive
>     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the database system is
>     >     >     starting up
>     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the database system is
>     >     >     > starting up 
>     >     >     >
>     >     >     > I was hoping to see the following messages (taken from a
>     >     different
>     >     >     > machine) : 
>     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:  restartpoint starting: time
>     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  restartpoint complete:
>     >     wrote 406
>     >     >     > buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0
>     >     >     recycled;
>     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s; sync file
>     >     >     > s=128, longest=0.000 s, average=0.000 s; distance=2005 kB,
>     >     >     estimate=2699 kB
>     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery restart point at
>     >     >     4/D096C4F8
>     >     >     >
>     >     >     > My primary settings(wals settings) : 
>     >     >     > wal_buffers = 16MB
>     >     >     > checkpoint_completion_target = 0.9
>     >     >     > checkpoint_timeout = 30min
>     >     >     >
>     >     >     > Any idea what can explain why the secondary doesnt replay
>     >     the wals ?
>     >     >
>     >     >
>     >
>
Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Fabio Pardi


On 5/29/19 9:20 AM, Mariel Cherkassky wrote:
> First of all thanks Fabio.
> I think that I'm missing something : 
> In the next questionI'm not talking about streaming replication,rather
> on recovery : 
>
> 1.When the secondary get the wals from the primary it tries to replay
> them correct ? 


correct

>
> 2. By replaying it just go over the wal records and run them in the
> secondary ?
>

correct

> 3.All those changes are saved in the shared_buffer(secondary) or the
> changed are immediately done on the data files blocks ?
>

the changes are not saved to your datafile yet. That happens at
checkpoint time.

> 4.The secondary will need a checkpoint in order to flush those changes
> to the data files and in order to reach a restart point ?
>

yes

> So, basically If I had a checkpoint during the clone, the secondary
> should also have a checkpoint when I recover the secondary right ?
>

correct. Even after being in sync with master, if you restart Postgres
on standby, it will then re-apply the WAL files from the last checkpoint.

In the logfile of the standby, you will see as many messages reporting
"restored log file" as many WAL files were produced since the last
checkpoint

Hope it helps to clarify.

regards,

fabio pardi

>
> ‫בתאריך יום ג׳, 28 במאי 2019 ב-13:54 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     please keep the list posted. When you reply, use 'reply all'. That
>     will maybe help others in the community and you might also get more
>     help from others.
>
>     answers are in line here below
>
>
>
>     On 28/05/2019 10:54, Mariel Cherkassky wrote:
>     > I have pg 9.6, repmgr version 4.3 .
>     > I see in the logs that wal files are restored : 
>     > 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file
>     "000000010000377B000000DB" from archive
>     > that means that the restore_command worked right ? 
>     >
>
>     right
>
>     > According to the docs :
>     > "In standby mode, a restartpoint is also triggered
>     if checkpoint_segments log segments have been replayed since last
>     restartpoint and at least one checkpoint record has been replayed.
>     Restartpoints can't be performed more frequently than checkpoints in
>     the master because restartpoints can only be performed at checkpoint
>     records"
>     > so maybe I should decrease max_wal_size or even checkpoint_timeout
>     to force a restartpoint ? 
>     > During this gap (standby clone) 6-7 wals were generated on the primary
>     >
>
>
>     From what you posted earlier, you should in any case have hit a
>     checkpoint every 30 minutes. (That was also the assumption in the
>     previous messages. If that's not happening, then i would really
>     investigate.)
>
>     That is, if during your cloning only a few WAL files were generated,
>     then it is not enough to trigger a checkpoint and you fallback to
>     the 30 minutes.
>
>     I would not be bothered if i were you, but can always force a
>     checkpoint on the master issuing:
>
>     CHECKPOINT ;
>
>     at that stage, on the standby logs you will see the messages:
>
>     restartpoint starting: ..
>
>     restartpoint complete: ..
>
>
>
>     regards,
>
>     fabio pardi
>
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏
>     <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     If you did not even see this messages on your standby logs:
>     >
>     >     restartpoint starting: xlog
>     >
>     >     then it means that the checkpoint was even never started.
>     >
>     >     In that case, I have no clue.
>     >
>     >     Try to describe step by step how to reproduce the problem
>     together with
>     >     your setup and the version number of Postgres and repmgr, and
>     i might be
>     >     able to help you further.
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     >     > standby_mode = 'on'
>     >     > primary_conninfo = 'host=X.X.X.X user=repmgr 
>     connect_timeout=10 '
>     >     > recovery_target_timeline = 'latest'
>     >     > primary_slot_name = repmgr_slot_1
>     >     > restore_command = 'rsync -avzhe ssh
>     >     > [hidden email]:/var/lib/pgsql/archive/%f
>     /var/lib/pgsql/archive/%f ;
>     >     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     >     > archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
>     >     > /var/lib/pgsql/archive %r'
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     Hi Mariel,
>     >     >
>     >     >     let s keep the list in cc...
>     >     >
>     >     >     settings look ok.
>     >     >
>     >     >     what's in the recovery.conf file then?
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     >     > Hey,
>     >     >     > the configuration is the same as in the primary : 
>     >     >     > max_wal_size = 2GB
>     >     >     > min_wal_size = 1GB
>     >     >     > wal_buffers = 16MB
>     >     >     > checkpoint_completion_target = 0.9
>     >     >     > checkpoint_timeout = 30min
>     >     >     >
>     >     >     > Regarding your question, I didnt see this message
>     (consistent recovery
>     >     >     > state reached at), I guess thats why the secondary
>     isnt avaialble
>     >     >     yet..
>     >     >     >
>     >     >     > Maybe I'm wrong, but what I understood from the
>     documentation- restart
>     >     >     > point is generated only after the secondary had a
>     checkpoint wihch
>     >     >     means
>     >     >     > only after 30 minutes or after max_wal_size is reached
>     ?  But
>     >     >     still, why
>     >     >     > wont the secondary reach a consisteny recovery state
>     (does it
>     >     >     requires a
>     >     >     > restart point to be generated ? )
>     >     >     >
>     >     >     >
>     >     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     >     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>
>     >     >     <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>> <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>>>>‬‏>:‬
>     >     >     >
>     >     >     >     Hi Mariel,
>     >     >     >
>     >     >     >     if i m not wrong, on the secondary you will see
>     the messages you
>     >     >     >     mentioned when a checkpoint happens.
>     >     >     >
>     >     >     >     What are checkpoint_timeout and max_wal_size on
>     your standby?
>     >     >     >
>     >     >     >     Did you ever see this on your standby log?
>     >     >     >
>     >     >     >     "consistent recovery state reached at .."
>     >     >     >
>     >     >     >
>     >     >     >     Maybe you can post your whole configuration of
>     your standby
>     >     >     for easier
>     >     >     >     debug.
>     >     >     >
>     >     >     >     regards,
>     >     >     >
>     >     >     >     fabio pardi
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     >     > Hey,
>     >     >     >     > PG 9.6, I have a standalone configured. I tried
>     to start up a
>     >     >     >     secondary,
>     >     >     >     > run standby clone (repmgr). The clone process
>     took 3 hours
>     >     >     and during
>     >     >     >     > that time wals were generated(mostly because of the
>     >     >     >     checkpoint_timeout).
>     >     >     >     > As a result of that, when I start the secondary
>     ,I see that the
>     >     >     >     > secondary keeps getting the wals but I dont see
>     any messages
>     >     >     that
>     >     >     >     > indicate that the secondary tried to replay the
>     wals. 
>     >     >     >     > messages that i see :
>     >     >     >     > receiving incremental file list
>     >     >     >     > 000000010000377B000000DE
>     >     >     >     >
>     >     >     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored
>     log file
>     >     >     >     > "000000010000377B000000DE" from archive
>     >     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the
>     database system is
>     >     >     >     starting up
>     >     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the
>     database system is
>     >     >     >     > starting up 
>     >     >     >     >
>     >     >     >     > I was hoping to see the following messages
>     (taken from a
>     >     >     different
>     >     >     >     > machine) : 
>     >     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:
>      restartpoint starting: time
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:
>      restartpoint complete:
>     >     >     wrote 406
>     >     >     >     > buffers (0.2%); 1 transaction log file(s) added,
>     0 removed, 0
>     >     >     >     recycled;
>     >     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s;
>     sync file
>     >     >     >     > s=128, longest=0.000 s, average=0.000 s;
>     distance=2005 kB,
>     >     >     >     estimate=2699 kB
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery
>     restart point at
>     >     >     >     4/D096C4F8
>     >     >     >     >
>     >     >     >     > My primary settings(wals settings) : 
>     >     >     >     > wal_buffers = 16MB
>     >     >     >     > checkpoint_completion_target = 0.9
>     >     >     >     > checkpoint_timeout = 30min
>     >     >     >     >
>     >     >     >     > Any idea what can explain why the secondary
>     doesnt replay
>     >     >     the wals ?
>     >     >     >
>     >     >     >
>     >     >
>     >
>


Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Mariel Cherkassky
Is there any messages that indicates that the secondary replayed a specific wal ? "restored 00000..." means that the restore_command succeeded but there isnt any proof that it replayed the wal.

My theory regarding the issue : 
 It seems, that my customer stopped the db 20 minutes after the clone have finished.  During those 20 minutes the secondary didnt get enough wal records (6 wal files) so it didnt reach the max_wal_size. My checkpoint_timeout is set to 30minutes, therefore there wasnt any checkpoint. As a result of that the secondary didnt reach a restart point. Does that sounds reasonable ?

So basically, if I clone a small primary db, the secondary would reach a restart point only if it reached a checkpoint (checkpoint_timeout or max_wal_size). However, I have cloned many small dbs and saw the it takes a sec to start the secondary (which means that restartpoint was reached). So what am I missing ?

‫בתאריך יום ד׳, 29 במאי 2019 ב-11:20 מאת ‪Fabio Pardi‬‏ <‪[hidden email]‬‏>:‬


On 5/29/19 9:20 AM, Mariel Cherkassky wrote:
> First of all thanks Fabio.
> I think that I'm missing something : 
> In the next questionI'm not talking about streaming replication,rather
> on recovery : 
>
> 1.When the secondary get the wals from the primary it tries to replay
> them correct ? 


correct

>
> 2. By replaying it just go over the wal records and run them in the
> secondary ?
>

correct

> 3.All those changes are saved in the shared_buffer(secondary) or the
> changed are immediately done on the data files blocks ?
>

the changes are not saved to your datafile yet. That happens at
checkpoint time.

> 4.The secondary will need a checkpoint in order to flush those changes
> to the data files and in order to reach a restart point ?
>

yes

> So, basically If I had a checkpoint during the clone, the secondary
> should also have a checkpoint when I recover the secondary right ?
>

correct. Even after being in sync with master, if you restart Postgres
on standby, it will then re-apply the WAL files from the last checkpoint.

In the logfile of the standby, you will see as many messages reporting
"restored log file" as many WAL files were produced since the last
checkpoint

Hope it helps to clarify.

regards,

fabio pardi
>
> ‫בתאריך יום ג׳, 28 במאי 2019 ב-13:54 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     please keep the list posted. When you reply, use 'reply all'. That
>     will maybe help others in the community and you might also get more
>     help from others.
>
>     answers are in line here below
>
>
>
>     On 28/05/2019 10:54, Mariel Cherkassky wrote:
>     > I have pg 9.6, repmgr version 4.3 .
>     > I see in the logs that wal files are restored : 
>     > 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file
>     "000000010000377B000000DB" from archive
>     > that means that the restore_command worked right ? 
>     >
>
>     right
>
>     > According to the docs :
>     > "In standby mode, a restartpoint is also triggered
>     if checkpoint_segments log segments have been replayed since last
>     restartpoint and at least one checkpoint record has been replayed.
>     Restartpoints can't be performed more frequently than checkpoints in
>     the master because restartpoints can only be performed at checkpoint
>     records"
>     > so maybe I should decrease max_wal_size or even checkpoint_timeout
>     to force a restartpoint ? 
>     > During this gap (standby clone) 6-7 wals were generated on the primary
>     >
>
>
>     From what you posted earlier, you should in any case have hit a
>     checkpoint every 30 minutes. (That was also the assumption in the
>     previous messages. If that's not happening, then i would really
>     investigate.)
>
>     That is, if during your cloning only a few WAL files were generated,
>     then it is not enough to trigger a checkpoint and you fallback to
>     the 30 minutes.
>
>     I would not be bothered if i were you, but can always force a
>     checkpoint on the master issuing:
>
>     CHECKPOINT ;
>
>     at that stage, on the standby logs you will see the messages:
>
>     restartpoint starting: ..
>
>     restartpoint complete: ..
>
>
>
>     regards,
>
>     fabio pardi
>
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏
>     <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     If you did not even see this messages on your standby logs:
>     >
>     >     restartpoint starting: xlog
>     >
>     >     then it means that the checkpoint was even never started.
>     >
>     >     In that case, I have no clue.
>     >
>     >     Try to describe step by step how to reproduce the problem
>     together with
>     >     your setup and the version number of Postgres and repmgr, and
>     i might be
>     >     able to help you further.
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     >     > standby_mode = 'on'
>     >     > primary_conninfo = 'host=X.X.X.X user=repmgr 
>     connect_timeout=10 '
>     >     > recovery_target_timeline = 'latest'
>     >     > primary_slot_name = repmgr_slot_1
>     >     > restore_command = 'rsync -avzhe ssh
>     >     > [hidden email]:/var/lib/pgsql/archive/%f
>     /var/lib/pgsql/archive/%f ;
>     >     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     >     > archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
>     >     > /var/lib/pgsql/archive %r'
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     Hi Mariel,
>     >     >
>     >     >     let s keep the list in cc...
>     >     >
>     >     >     settings look ok.
>     >     >
>     >     >     what's in the recovery.conf file then?
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     >     > Hey,
>     >     >     > the configuration is the same as in the primary : 
>     >     >     > max_wal_size = 2GB
>     >     >     > min_wal_size = 1GB
>     >     >     > wal_buffers = 16MB
>     >     >     > checkpoint_completion_target = 0.9
>     >     >     > checkpoint_timeout = 30min
>     >     >     >
>     >     >     > Regarding your question, I didnt see this message
>     (consistent recovery
>     >     >     > state reached at), I guess thats why the secondary
>     isnt avaialble
>     >     >     yet..
>     >     >     >
>     >     >     > Maybe I'm wrong, but what I understood from the
>     documentation- restart
>     >     >     > point is generated only after the secondary had a
>     checkpoint wihch
>     >     >     means
>     >     >     > only after 30 minutes or after max_wal_size is reached
>     ?  But
>     >     >     still, why
>     >     >     > wont the secondary reach a consisteny recovery state
>     (does it
>     >     >     requires a
>     >     >     > restart point to be generated ? )
>     >     >     >
>     >     >     >
>     >     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     >     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>
>     >     >     <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>> <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>>>>‬‏>:‬
>     >     >     >
>     >     >     >     Hi Mariel,
>     >     >     >
>     >     >     >     if i m not wrong, on the secondary you will see
>     the messages you
>     >     >     >     mentioned when a checkpoint happens.
>     >     >     >
>     >     >     >     What are checkpoint_timeout and max_wal_size on
>     your standby?
>     >     >     >
>     >     >     >     Did you ever see this on your standby log?
>     >     >     >
>     >     >     >     "consistent recovery state reached at .."
>     >     >     >
>     >     >     >
>     >     >     >     Maybe you can post your whole configuration of
>     your standby
>     >     >     for easier
>     >     >     >     debug.
>     >     >     >
>     >     >     >     regards,
>     >     >     >
>     >     >     >     fabio pardi
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     >     > Hey,
>     >     >     >     > PG 9.6, I have a standalone configured. I tried
>     to start up a
>     >     >     >     secondary,
>     >     >     >     > run standby clone (repmgr). The clone process
>     took 3 hours
>     >     >     and during
>     >     >     >     > that time wals were generated(mostly because of the
>     >     >     >     checkpoint_timeout).
>     >     >     >     > As a result of that, when I start the secondary
>     ,I see that the
>     >     >     >     > secondary keeps getting the wals but I dont see
>     any messages
>     >     >     that
>     >     >     >     > indicate that the secondary tried to replay the
>     wals. 
>     >     >     >     > messages that i see :
>     >     >     >     > receiving incremental file list
>     >     >     >     > 000000010000377B000000DE
>     >     >     >     >
>     >     >     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored
>     log file
>     >     >     >     > "000000010000377B000000DE" from archive
>     >     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the
>     database system is
>     >     >     >     starting up
>     >     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the
>     database system is
>     >     >     >     > starting up 
>     >     >     >     >
>     >     >     >     > I was hoping to see the following messages
>     (taken from a
>     >     >     different
>     >     >     >     > machine) : 
>     >     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:
>      restartpoint starting: time
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:
>      restartpoint complete:
>     >     >     wrote 406
>     >     >     >     > buffers (0.2%); 1 transaction log file(s) added,
>     0 removed, 0
>     >     >     >     recycled;
>     >     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s;
>     sync file
>     >     >     >     > s=128, longest=0.000 s, average=0.000 s;
>     distance=2005 kB,
>     >     >     >     estimate=2699 kB
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery
>     restart point at
>     >     >     >     4/D096C4F8
>     >     >     >     >
>     >     >     >     > My primary settings(wals settings) : 
>     >     >     >     > wal_buffers = 16MB
>     >     >     >     > checkpoint_completion_target = 0.9
>     >     >     >     > checkpoint_timeout = 30min
>     >     >     >     >
>     >     >     >     > Any idea what can explain why the secondary
>     doesnt replay
>     >     >     the wals ?
>     >     >     >
>     >     >     >
>     >     >
>     >
>
Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Nikolay Samokhvalov
Hello Mariel,

1) Have you tried to run “CHECKPOINT;” on the standby to perform restartpoint explicitly? It is possible.

2) If we talk about streaming replication, do you use replication slots? If so, have you checked pg_replication_slots and pg_stat_replication? They can help to troubleshoot streaming replication delays — see columns sent_location, write_location, flush_location, and replay_location in pg_stat_replication and restart_lsn in pg_replication_slots. If you have delay in replaying, it should be seen there.

On Wed, May 29, 2019 at 11:39 Mariel Cherkassky <[hidden email]> wrote:
Is there any messages that indicates that the secondary replayed a specific wal ? "restored 00000..." means that the restore_command succeeded but there isnt any proof that it replayed the wal.

My theory regarding the issue : 
 It seems, that my customer stopped the db 20 minutes after the clone have finished.  During those 20 minutes the secondary didnt get enough wal records (6 wal files) so it didnt reach the max_wal_size. My checkpoint_timeout is set to 30minutes, therefore there wasnt any checkpoint. As a result of that the secondary didnt reach a restart point. Does that sounds reasonable ?

So basically, if I clone a small primary db, the secondary would reach a restart point only if it reached a checkpoint (checkpoint_timeout or max_wal_size). However, I have cloned many small dbs and saw the it takes a sec to start the secondary (which means that restartpoint was reached). So what am I missing ?

‫בתאריך יום ד׳, 29 במאי 2019 ב-11:20 מאת ‪Fabio Pardi‬‏ <‪[hidden email]‬‏>:‬


On 5/29/19 9:20 AM, Mariel Cherkassky wrote:
> First of all thanks Fabio.
> I think that I'm missing something : 
> In the next questionI'm not talking about streaming replication,rather
> on recovery : 
>
> 1.When the secondary get the wals from the primary it tries to replay
> them correct ? 


correct

>
> 2. By replaying it just go over the wal records and run them in the
> secondary ?
>

correct

> 3.All those changes are saved in the shared_buffer(secondary) or the
> changed are immediately done on the data files blocks ?
>

the changes are not saved to your datafile yet. That happens at
checkpoint time.

> 4.The secondary will need a checkpoint in order to flush those changes
> to the data files and in order to reach a restart point ?
>

yes

> So, basically If I had a checkpoint during the clone, the secondary
> should also have a checkpoint when I recover the secondary right ?
>

correct. Even after being in sync with master, if you restart Postgres
on standby, it will then re-apply the WAL files from the last checkpoint.

In the logfile of the standby, you will see as many messages reporting
"restored log file" as many WAL files were produced since the last
checkpoint

Hope it helps to clarify.

regards,

fabio pardi
>
> ‫בתאריך יום ג׳, 28 במאי 2019 ב-13:54 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     please keep the list posted. When you reply, use 'reply all'. That
>     will maybe help others in the community and you might also get more
>     help from others.
>
>     answers are in line here below
>
>
>
>     On 28/05/2019 10:54, Mariel Cherkassky wrote:
>     > I have pg 9.6, repmgr version 4.3 .
>     > I see in the logs that wal files are restored : 
>     > 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file
>     "000000010000377B000000DB" from archive
>     > that means that the restore_command worked right ? 
>     >
>
>     right
>
>     > According to the docs :
>     > "In standby mode, a restartpoint is also triggered
>     if checkpoint_segments log segments have been replayed since last
>     restartpoint and at least one checkpoint record has been replayed.
>     Restartpoints can't be performed more frequently than checkpoints in
>     the master because restartpoints can only be performed at checkpoint
>     records"
>     > so maybe I should decrease max_wal_size or even checkpoint_timeout
>     to force a restartpoint ? 
>     > During this gap (standby clone) 6-7 wals were generated on the primary
>     >
>
>
>     From what you posted earlier, you should in any case have hit a
>     checkpoint every 30 minutes. (That was also the assumption in the
>     previous messages. If that's not happening, then i would really
>     investigate.)
>
>     That is, if during your cloning only a few WAL files were generated,
>     then it is not enough to trigger a checkpoint and you fallback to
>     the 30 minutes.
>
>     I would not be bothered if i were you, but can always force a
>     checkpoint on the master issuing:
>
>     CHECKPOINT ;
>
>     at that stage, on the standby logs you will see the messages:
>
>     restartpoint starting: ..
>
>     restartpoint complete: ..
>
>
>
>     regards,
>
>     fabio pardi
>
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏
>     <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     If you did not even see this messages on your standby logs:
>     >
>     >     restartpoint starting: xlog
>     >
>     >     then it means that the checkpoint was even never started.
>     >
>     >     In that case, I have no clue.
>     >
>     >     Try to describe step by step how to reproduce the problem
>     together with
>     >     your setup and the version number of Postgres and repmgr, and
>     i might be
>     >     able to help you further.
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     >     > standby_mode = 'on'
>     >     > primary_conninfo = 'host=X.X.X.X user=repmgr 
>     connect_timeout=10 '
>     >     > recovery_target_timeline = 'latest'
>     >     > primary_slot_name = repmgr_slot_1
>     >     > restore_command = 'rsync -avzhe ssh
>     >     > [hidden email]:/var/lib/pgsql/archive/%f
>     /var/lib/pgsql/archive/%f ;
>     >     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     >     > archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
>     >     > /var/lib/pgsql/archive %r'
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     Hi Mariel,
>     >     >
>     >     >     let s keep the list in cc...
>     >     >
>     >     >     settings look ok.
>     >     >
>     >     >     what's in the recovery.conf file then?
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     >     > Hey,
>     >     >     > the configuration is the same as in the primary : 
>     >     >     > max_wal_size = 2GB
>     >     >     > min_wal_size = 1GB
>     >     >     > wal_buffers = 16MB
>     >     >     > checkpoint_completion_target = 0.9
>     >     >     > checkpoint_timeout = 30min
>     >     >     >
>     >     >     > Regarding your question, I didnt see this message
>     (consistent recovery
>     >     >     > state reached at), I guess thats why the secondary
>     isnt avaialble
>     >     >     yet..
>     >     >     >
>     >     >     > Maybe I'm wrong, but what I understood from the
>     documentation- restart
>     >     >     > point is generated only after the secondary had a
>     checkpoint wihch
>     >     >     means
>     >     >     > only after 30 minutes or after max_wal_size is reached
>     ?  But
>     >     >     still, why
>     >     >     > wont the secondary reach a consisteny recovery state
>     (does it
>     >     >     requires a
>     >     >     > restart point to be generated ? )
>     >     >     >
>     >     >     >
>     >     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     >     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>
>     >     >     <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>> <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>>>>‬‏>:‬
>     >     >     >
>     >     >     >     Hi Mariel,
>     >     >     >
>     >     >     >     if i m not wrong, on the secondary you will see
>     the messages you
>     >     >     >     mentioned when a checkpoint happens.
>     >     >     >
>     >     >     >     What are checkpoint_timeout and max_wal_size on
>     your standby?
>     >     >     >
>     >     >     >     Did you ever see this on your standby log?
>     >     >     >
>     >     >     >     "consistent recovery state reached at .."
>     >     >     >
>     >     >     >
>     >     >     >     Maybe you can post your whole configuration of
>     your standby
>     >     >     for easier
>     >     >     >     debug.
>     >     >     >
>     >     >     >     regards,
>     >     >     >
>     >     >     >     fabio pardi
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     >     > Hey,
>     >     >     >     > PG 9.6, I have a standalone configured. I tried
>     to start up a
>     >     >     >     secondary,
>     >     >     >     > run standby clone (repmgr). The clone process
>     took 3 hours
>     >     >     and during
>     >     >     >     > that time wals were generated(mostly because of the
>     >     >     >     checkpoint_timeout).
>     >     >     >     > As a result of that, when I start the secondary
>     ,I see that the
>     >     >     >     > secondary keeps getting the wals but I dont see
>     any messages
>     >     >     that
>     >     >     >     > indicate that the secondary tried to replay the
>     wals. 
>     >     >     >     > messages that i see :
>     >     >     >     > receiving incremental file list
>     >     >     >     > 000000010000377B000000DE
>     >     >     >     >
>     >     >     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored
>     log file
>     >     >     >     > "000000010000377B000000DE" from archive
>     >     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the
>     database system is
>     >     >     >     starting up
>     >     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the
>     database system is
>     >     >     >     > starting up 
>     >     >     >     >
>     >     >     >     > I was hoping to see the following messages
>     (taken from a
>     >     >     different
>     >     >     >     > machine) : 
>     >     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:
>      restartpoint starting: time
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:
>      restartpoint complete:
>     >     >     wrote 406
>     >     >     >     > buffers (0.2%); 1 transaction log file(s) added,
>     0 removed, 0
>     >     >     >     recycled;
>     >     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s;
>     sync file
>     >     >     >     > s=128, longest=0.000 s, average=0.000 s;
>     distance=2005 kB,
>     >     >     >     estimate=2699 kB
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery
>     restart point at
>     >     >     >     4/D096C4F8
>     >     >     >     >
>     >     >     >     > My primary settings(wals settings) : 
>     >     >     >     > wal_buffers = 16MB
>     >     >     >     > checkpoint_completion_target = 0.9
>     >     >     >     > checkpoint_timeout = 30min
>     >     >     >     >
>     >     >     >     > Any idea what can explain why the secondary
>     doesnt replay
>     >     >     the wals ?
>     >     >     >
>     >     >     >
>     >     >
>     >
>
Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Mariel Cherkassky
In my case, I'm talking about creating a standby -> standby clone and then recover the wals . How can I run checkpoint in the secondary if it isnt up ?

‫בתאריך יום ד׳, 29 במאי 2019 ב-13:54 מאת ‪Nikolay Samokhvalov‬‏ <‪[hidden email]‬‏>:‬
Hello Mariel,

1) Have you tried to run “CHECKPOINT;” on the standby to perform restartpoint explicitly? It is possible.

2) If we talk about streaming replication, do you use replication slots? If so, have you checked pg_replication_slots and pg_stat_replication? They can help to troubleshoot streaming replication delays — see columns sent_location, write_location, flush_location, and replay_location in pg_stat_replication and restart_lsn in pg_replication_slots. If you have delay in replaying, it should be seen there.

On Wed, May 29, 2019 at 11:39 Mariel Cherkassky <[hidden email]> wrote:
Is there any messages that indicates that the secondary replayed a specific wal ? "restored 00000..." means that the restore_command succeeded but there isnt any proof that it replayed the wal.

My theory regarding the issue : 
 It seems, that my customer stopped the db 20 minutes after the clone have finished.  During those 20 minutes the secondary didnt get enough wal records (6 wal files) so it didnt reach the max_wal_size. My checkpoint_timeout is set to 30minutes, therefore there wasnt any checkpoint. As a result of that the secondary didnt reach a restart point. Does that sounds reasonable ?

So basically, if I clone a small primary db, the secondary would reach a restart point only if it reached a checkpoint (checkpoint_timeout or max_wal_size). However, I have cloned many small dbs and saw the it takes a sec to start the secondary (which means that restartpoint was reached). So what am I missing ?

‫בתאריך יום ד׳, 29 במאי 2019 ב-11:20 מאת ‪Fabio Pardi‬‏ <‪[hidden email]‬‏>:‬


On 5/29/19 9:20 AM, Mariel Cherkassky wrote:
> First of all thanks Fabio.
> I think that I'm missing something : 
> In the next questionI'm not talking about streaming replication,rather
> on recovery : 
>
> 1.When the secondary get the wals from the primary it tries to replay
> them correct ? 


correct

>
> 2. By replaying it just go over the wal records and run them in the
> secondary ?
>

correct

> 3.All those changes are saved in the shared_buffer(secondary) or the
> changed are immediately done on the data files blocks ?
>

the changes are not saved to your datafile yet. That happens at
checkpoint time.

> 4.The secondary will need a checkpoint in order to flush those changes
> to the data files and in order to reach a restart point ?
>

yes

> So, basically If I had a checkpoint during the clone, the secondary
> should also have a checkpoint when I recover the secondary right ?
>

correct. Even after being in sync with master, if you restart Postgres
on standby, it will then re-apply the WAL files from the last checkpoint.

In the logfile of the standby, you will see as many messages reporting
"restored log file" as many WAL files were produced since the last
checkpoint

Hope it helps to clarify.

regards,

fabio pardi
>
> ‫בתאריך יום ג׳, 28 במאי 2019 ב-13:54 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>     Hi Mariel,
>
>     please keep the list posted. When you reply, use 'reply all'. That
>     will maybe help others in the community and you might also get more
>     help from others.
>
>     answers are in line here below
>
>
>
>     On 28/05/2019 10:54, Mariel Cherkassky wrote:
>     > I have pg 9.6, repmgr version 4.3 .
>     > I see in the logs that wal files are restored : 
>     > 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file
>     "000000010000377B000000DB" from archive
>     > that means that the restore_command worked right ? 
>     >
>
>     right
>
>     > According to the docs :
>     > "In standby mode, a restartpoint is also triggered
>     if checkpoint_segments log segments have been replayed since last
>     restartpoint and at least one checkpoint record has been replayed.
>     Restartpoints can't be performed more frequently than checkpoints in
>     the master because restartpoints can only be performed at checkpoint
>     records"
>     > so maybe I should decrease max_wal_size or even checkpoint_timeout
>     to force a restartpoint ? 
>     > During this gap (standby clone) 6-7 wals were generated on the primary
>     >
>
>
>     From what you posted earlier, you should in any case have hit a
>     checkpoint every 30 minutes. (That was also the assumption in the
>     previous messages. If that's not happening, then i would really
>     investigate.)
>
>     That is, if during your cloning only a few WAL files were generated,
>     then it is not enough to trigger a checkpoint and you fallback to
>     the 30 minutes.
>
>     I would not be bothered if i were you, but can always force a
>     checkpoint on the master issuing:
>
>     CHECKPOINT ;
>
>     at that stage, on the standby logs you will see the messages:
>
>     restartpoint starting: ..
>
>     restartpoint complete: ..
>
>
>
>     regards,
>
>     fabio pardi
>
>     >
>     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏
>     <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     If you did not even see this messages on your standby logs:
>     >
>     >     restartpoint starting: xlog
>     >
>     >     then it means that the checkpoint was even never started.
>     >
>     >     In that case, I have no clue.
>     >
>     >     Try to describe step by step how to reproduce the problem
>     together with
>     >     your setup and the version number of Postgres and repmgr, and
>     i might be
>     >     able to help you further.
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     >     > standby_mode = 'on'
>     >     > primary_conninfo = 'host=X.X.X.X user=repmgr 
>     connect_timeout=10 '
>     >     > recovery_target_timeline = 'latest'
>     >     > primary_slot_name = repmgr_slot_1
>     >     > restore_command = 'rsync -avzhe ssh
>     >     > [hidden email]:/var/lib/pgsql/archive/%f
>     /var/lib/pgsql/archive/%f ;
>     >     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     >     > archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
>     >     > /var/lib/pgsql/archive %r'
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     Hi Mariel,
>     >     >
>     >     >     let s keep the list in cc...
>     >     >
>     >     >     settings look ok.
>     >     >
>     >     >     what's in the recovery.conf file then?
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     >     > Hey,
>     >     >     > the configuration is the same as in the primary : 
>     >     >     > max_wal_size = 2GB
>     >     >     > min_wal_size = 1GB
>     >     >     > wal_buffers = 16MB
>     >     >     > checkpoint_completion_target = 0.9
>     >     >     > checkpoint_timeout = 30min
>     >     >     >
>     >     >     > Regarding your question, I didnt see this message
>     (consistent recovery
>     >     >     > state reached at), I guess thats why the secondary
>     isnt avaialble
>     >     >     yet..
>     >     >     >
>     >     >     > Maybe I'm wrong, but what I understood from the
>     documentation- restart
>     >     >     > point is generated only after the secondary had a
>     checkpoint wihch
>     >     >     means
>     >     >     > only after 30 minutes or after max_wal_size is reached
>     ?  But
>     >     >     still, why
>     >     >     > wont the secondary reach a consisteny recovery state
>     (does it
>     >     >     requires a
>     >     >     > restart point to be generated ? )
>     >     >     >
>     >     >     >
>     >     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio Pardi‬‏
>     >     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>
>     >     >     <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>> <mailto:[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>>>>‬‏>:‬
>     >     >     >
>     >     >     >     Hi Mariel,
>     >     >     >
>     >     >     >     if i m not wrong, on the secondary you will see
>     the messages you
>     >     >     >     mentioned when a checkpoint happens.
>     >     >     >
>     >     >     >     What are checkpoint_timeout and max_wal_size on
>     your standby?
>     >     >     >
>     >     >     >     Did you ever see this on your standby log?
>     >     >     >
>     >     >     >     "consistent recovery state reached at .."
>     >     >     >
>     >     >     >
>     >     >     >     Maybe you can post your whole configuration of
>     your standby
>     >     >     for easier
>     >     >     >     debug.
>     >     >     >
>     >     >     >     regards,
>     >     >     >
>     >     >     >     fabio pardi
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     >     > Hey,
>     >     >     >     > PG 9.6, I have a standalone configured. I tried
>     to start up a
>     >     >     >     secondary,
>     >     >     >     > run standby clone (repmgr). The clone process
>     took 3 hours
>     >     >     and during
>     >     >     >     > that time wals were generated(mostly because of the
>     >     >     >     checkpoint_timeout).
>     >     >     >     > As a result of that, when I start the secondary
>     ,I see that the
>     >     >     >     > secondary keeps getting the wals but I dont see
>     any messages
>     >     >     that
>     >     >     >     > indicate that the secondary tried to replay the
>     wals. 
>     >     >     >     > messages that i see :
>     >     >     >     > receiving incremental file list
>     >     >     >     > 000000010000377B000000DE
>     >     >     >     >
>     >     >     >     > sent 30 bytes  received 4.11M bytes  8.22M bytes/sec
>     >     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:  restored
>     log file
>     >     >     >     > "000000010000377B000000DE" from archive
>     >     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the
>     database system is
>     >     >     >     starting up
>     >     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the
>     database system is
>     >     >     >     > starting up 
>     >     >     >     >
>     >     >     >     > I was hoping to see the following messages
>     (taken from a
>     >     >     different
>     >     >     >     > machine) : 
>     >     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:
>      restartpoint starting: time
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:
>      restartpoint complete:
>     >     >     wrote 406
>     >     >     >     > buffers (0.2%); 1 transaction log file(s) added,
>     0 removed, 0
>     >     >     >     recycled;
>     >     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s;
>     sync file
>     >     >     >     > s=128, longest=0.000 s, average=0.000 s;
>     distance=2005 kB,
>     >     >     >     estimate=2699 kB
>     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery
>     restart point at
>     >     >     >     4/D096C4F8
>     >     >     >     >
>     >     >     >     > My primary settings(wals settings) : 
>     >     >     >     > wal_buffers = 16MB
>     >     >     >     > checkpoint_completion_target = 0.9
>     >     >     >     > checkpoint_timeout = 30min
>     >     >     >     >
>     >     >     >     > Any idea what can explain why the secondary
>     doesnt replay
>     >     >     the wals ?
>     >     >     >
>     >     >     >
>     >     >
>     >
>
Reply | Threaded
Open this post in threaded view
|

Re: improve wals replay on secondary

Fabio Pardi
In reply to this post by Mariel Cherkassky


On 5/29/19 10:39 AM, Mariel Cherkassky wrote:
> Is there any messages that indicates that the secondary replayed a
> specific wal ? "restored 00000..." means that the restore_command
> succeeded but there isnt any proof that it replayed the wal.

I believe that the message "restored log file..' is an acknowledgement
that the file has been restored and the WAL file applied. At least
that's what i know and understand from the source code.



>
> My theory regarding the issue : 
>  It seems, that my customer stopped the db 20 minutes after the clone
> have finished.  During those 20 minutes the secondary didnt get enough
> wal records (6 wal files) so it didnt reach the max_wal_size. My
> checkpoint_timeout is set to 30minutes, therefore there wasnt any
> checkpoint. As a result of that the secondary didnt reach a restart
> point. Does that sounds reasonable ?
>

It could be, but i do not see the point here.

> So basically, if I clone a small primary db, the secondary would reach a
> restart point only if it reached a checkpoint (checkpoint_timeout or
> max_wal_size). However, I have cloned many small dbs and saw the it
> takes a sec to start the secondary (which means that restartpoint was
> reached). So what am I missing ?
>

The restart point is reached in relation to checkpoints.

But when all available WAL files have been applied, you should be able
to connect to your standby, regardless to when the last checkpoint
occurred, if your standby allows that (hot_standby = on).




regards,

fabio pardi

> ‫בתאריך יום ד׳, 29 במאי 2019 ב-11:20 מאת ‪Fabio Pardi‬‏
> <‪[hidden email] <mailto:[hidden email]>‬‏>:‬
>
>
>
>     On 5/29/19 9:20 AM, Mariel Cherkassky wrote:
>     > First of all thanks Fabio.
>     > I think that I'm missing something : 
>     > In the next questionI'm not talking about streaming replication,rather
>     > on recovery : 
>     >
>     > 1.When the secondary get the wals from the primary it tries to replay
>     > them correct ? 
>
>
>     correct
>
>     >
>     > 2. By replaying it just go over the wal records and run them in the
>     > secondary ?
>     >
>
>     correct
>
>     > 3.All those changes are saved in the shared_buffer(secondary) or the
>     > changed are immediately done on the data files blocks ?
>     >
>
>     the changes are not saved to your datafile yet. That happens at
>     checkpoint time.
>
>     > 4.The secondary will need a checkpoint in order to flush those changes
>     > to the data files and in order to reach a restart point ?
>     >
>
>     yes
>
>     > So, basically If I had a checkpoint during the clone, the secondary
>     > should also have a checkpoint when I recover the secondary right ?
>     >
>
>     correct. Even after being in sync with master, if you restart Postgres
>     on standby, it will then re-apply the WAL files from the last
>     checkpoint.
>
>     In the logfile of the standby, you will see as many messages reporting
>     "restored log file" as many WAL files were produced since the last
>     checkpoint
>
>     Hope it helps to clarify.
>
>     regards,
>
>     fabio pardi
>     >
>     > ‫בתאריך יום ג׳, 28 במאי 2019 ב-13:54 מאת ‪Fabio Pardi‬‏
>     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>‬‏>:‬
>     >
>     >     Hi Mariel,
>     >
>     >     please keep the list posted. When you reply, use 'reply all'. That
>     >     will maybe help others in the community and you might also get
>     more
>     >     help from others.
>     >
>     >     answers are in line here below
>     >
>     >
>     >
>     >     On 28/05/2019 10:54, Mariel Cherkassky wrote:
>     >     > I have pg 9.6, repmgr version 4.3 .
>     >     > I see in the logs that wal files are restored : 
>     >     > 2019-05-22 12:35:12 EEST  60942  LOG:  restored log file
>     >     "000000010000377B000000DB" from archive
>     >     > that means that the restore_command worked right ? 
>     >     >
>     >
>     >     right
>     >
>     >     > According to the docs :
>     >     > "In standby mode, a restartpoint is also triggered
>     >     if checkpoint_segments log segments have been replayed since last
>     >     restartpoint and at least one checkpoint record has been replayed.
>     >     Restartpoints can't be performed more frequently than
>     checkpoints in
>     >     the master because restartpoints can only be performed at
>     checkpoint
>     >     records"
>     >     > so maybe I should decrease max_wal_size or even
>     checkpoint_timeout
>     >     to force a restartpoint ? 
>     >     > During this gap (standby clone) 6-7 wals were generated on
>     the primary
>     >     >
>     >
>     >
>     >     From what you posted earlier, you should in any case have hit a
>     >     checkpoint every 30 minutes. (That was also the assumption in the
>     >     previous messages. If that's not happening, then i would really
>     >     investigate.)
>     >
>     >     That is, if during your cloning only a few WAL files were
>     generated,
>     >     then it is not enough to trigger a checkpoint and you fallback to
>     >     the 30 minutes.
>     >
>     >     I would not be bothered if i were you, but can always force a
>     >     checkpoint on the master issuing:
>     >
>     >     CHECKPOINT ;
>     >
>     >     at that stage, on the standby logs you will see the messages:
>     >
>     >     restartpoint starting: ..
>     >
>     >     restartpoint complete: ..
>     >
>     >
>     >
>     >     regards,
>     >
>     >     fabio pardi
>     >
>     >     >
>     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-14:04 מאת ‪Fabio Pardi‬‏
>     >     <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>‬‏>:‬
>     >     >
>     >     >     If you did not even see this messages on your standby logs:
>     >     >
>     >     >     restartpoint starting: xlog
>     >     >
>     >     >     then it means that the checkpoint was even never started.
>     >     >
>     >     >     In that case, I have no clue.
>     >     >
>     >     >     Try to describe step by step how to reproduce the problem
>     >     together with
>     >     >     your setup and the version number of Postgres and
>     repmgr, and
>     >     i might be
>     >     >     able to help you further.
>     >     >
>     >     >     regards,
>     >     >
>     >     >     fabio pardi
>     >     >
>     >     >     On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
>     >     >     > standby_mode = 'on'
>     >     >     > primary_conninfo = 'host=X.X.X.X user=repmgr 
>     >     connect_timeout=10 '
>     >     >     > recovery_target_timeline = 'latest'
>     >     >     > primary_slot_name = repmgr_slot_1
>     >     >     > restore_command = 'rsync -avzhe ssh
>     >     >     > [hidden email]:/var/lib/pgsql/archive/%f
>     >     /var/lib/pgsql/archive/%f ;
>     >     >     > gunzip < /var/lib/pgsql/archive/%f > %p'
>     >     >     > archive_cleanup_command =
>     '/usr/pgsql-9.6/bin/pg_archivecleanup
>     >     >     > /var/lib/pgsql/archive %r'
>     >     >     >
>     >     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:29 מאת ‪Fabio Pardi‬‏
>     >     >     > <‪[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>>‬‏>:‬
>     >     >     >
>     >     >     >     Hi Mariel,
>     >     >     >
>     >     >     >     let s keep the list in cc...
>     >     >     >
>     >     >     >     settings look ok.
>     >     >     >
>     >     >     >     what's in the recovery.conf file then?
>     >     >     >
>     >     >     >     regards,
>     >     >     >
>     >     >     >     fabio pardi
>     >     >     >
>     >     >     >     On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
>     >     >     >     > Hey,
>     >     >     >     > the configuration is the same as in the primary : 
>     >     >     >     > max_wal_size = 2GB
>     >     >     >     > min_wal_size = 1GB
>     >     >     >     > wal_buffers = 16MB
>     >     >     >     > checkpoint_completion_target = 0.9
>     >     >     >     > checkpoint_timeout = 30min
>     >     >     >     >
>     >     >     >     > Regarding your question, I didnt see this message
>     >     (consistent recovery
>     >     >     >     > state reached at), I guess thats why the secondary
>     >     isnt avaialble
>     >     >     >     yet..
>     >     >     >     >
>     >     >     >     > Maybe I'm wrong, but what I understood from the
>     >     documentation- restart
>     >     >     >     > point is generated only after the secondary had a
>     >     checkpoint wihch
>     >     >     >     means
>     >     >     >     > only after 30 minutes or after max_wal_size is
>     reached
>     >     ?  But
>     >     >     >     still, why
>     >     >     >     > wont the secondary reach a consisteny recovery state
>     >     (does it
>     >     >     >     requires a
>     >     >     >     > restart point to be generated ? )
>     >     >     >     >
>     >     >     >     >
>     >     >     >     > ‫בתאריך יום ב׳, 27 במאי 2019 ב-12:12 מאת ‪Fabio
>     Pardi‬‏
>     >     >     >     > <‪[hidden email]
>     <mailto:[hidden email]> <mailto:[hidden email]
>     <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     >     <mailto:[hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>>>
>     >     >     >     <mailto:[hidden email]
>     <mailto:[hidden email]>
>     >     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     >     <mailto:[hidden email] <mailto:[hidden email]>>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     >     <mailto:[hidden email] <mailto:[hidden email]>>
>     <mailto:[hidden email] <mailto:[hidden email]>
>     >     <mailto:[hidden email]
>     <mailto:[hidden email]>>>>>‬‏>:‬
>     >     >     >     >
>     >     >     >     >     Hi Mariel,
>     >     >     >     >
>     >     >     >     >     if i m not wrong, on the secondary you will see
>     >     the messages you
>     >     >     >     >     mentioned when a checkpoint happens.
>     >     >     >     >
>     >     >     >     >     What are checkpoint_timeout and max_wal_size on
>     >     your standby?
>     >     >     >     >
>     >     >     >     >     Did you ever see this on your standby log?
>     >     >     >     >
>     >     >     >     >     "consistent recovery state reached at .."
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >     Maybe you can post your whole configuration of
>     >     your standby
>     >     >     >     for easier
>     >     >     >     >     debug.
>     >     >     >     >
>     >     >     >     >     regards,
>     >     >     >     >
>     >     >     >     >     fabio pardi
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >     On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
>     >     >     >     >     > Hey,
>     >     >     >     >     > PG 9.6, I have a standalone configured. I
>     tried
>     >     to start up a
>     >     >     >     >     secondary,
>     >     >     >     >     > run standby clone (repmgr). The clone process
>     >     took 3 hours
>     >     >     >     and during
>     >     >     >     >     > that time wals were generated(mostly
>     because of the
>     >     >     >     >     checkpoint_timeout).
>     >     >     >     >     > As a result of that, when I start the
>     secondary
>     >     ,I see that the
>     >     >     >     >     > secondary keeps getting the wals but I
>     dont see
>     >     any messages
>     >     >     >     that
>     >     >     >     >     > indicate that the secondary tried to
>     replay the
>     >     wals. 
>     >     >     >     >     > messages that i see :
>     >     >     >     >     > receiving incremental file list
>     >     >     >     >     > 000000010000377B000000DE
>     >     >     >     >     >
>     >     >     >     >     > sent 30 bytes  received 4.11M bytes  8.22M
>     bytes/sec
>     >     >     >     >     > total size is 4.15M  speedup is 1.01
>     >     >     >     >     > 2019-05-22 12:48:10 EEST  60942  LOG:
>      restored
>     >     log file
>     >     >     >     >     > "000000010000377B000000DE" from archive
>     >     >     >     >     > 2019-05-22 12:48:11 EEST db63311  FATAL:  the
>     >     database system is
>     >     >     >     >     starting up
>     >     >     >     >     > 2019-05-22 12:48:12 EEST db63313  FATAL:  the
>     >     database system is
>     >     >     >     >     > starting up 
>     >     >     >     >     >
>     >     >     >     >     > I was hoping to see the following messages
>     >     (taken from a
>     >     >     >     different
>     >     >     >     >     > machine) : 
>     >     >     >     >     > 2019-05-27 01:15:37 EDT  7428  LOG:
>     >      restartpoint starting: time
>     >     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:
>     >      restartpoint complete:
>     >     >     >     wrote 406
>     >     >     >     >     > buffers (0.2%); 1 transaction log file(s)
>     added,
>     >     0 removed, 0
>     >     >     >     >     recycled;
>     >     >     >     >     > write=41.390 s, sync=0.001 s, total=41.582 s;
>     >     sync file
>     >     >     >     >     > s=128, longest=0.000 s, average=0.000 s;
>     >     distance=2005 kB,
>     >     >     >     >     estimate=2699 kB
>     >     >     >     >     > 2019-05-27 01:16:18 EDT  7428  LOG:  recovery
>     >     restart point at
>     >     >     >     >     4/D096C4F8
>     >     >     >     >     >
>     >     >     >     >     > My primary settings(wals settings) : 
>     >     >     >     >     > wal_buffers = 16MB
>     >     >     >     >     > checkpoint_completion_target = 0.9
>     >     >     >     >     > checkpoint_timeout = 30min
>     >     >     >     >     >
>     >     >     >     >     > Any idea what can explain why the secondary
>     >     doesnt replay
>     >     >     >     the wals ?
>     >     >     >     >
>     >     >     >     >
>     >     >     >
>     >     >
>     >
>