pgsql: Fix corner case failure of new standby to follow new primary.


Robert Haas
Fix corner case failure of new standby to follow new primary.

This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from
the archive but the WAL files for it are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming,
and (6) recovery_target_timeline='latest'.

Commit ee994272ca50f70b53074f0febaec97e28f83c4e introduced this
logic and was an improvement over the previous code, but it mishandled
this case. If recovery_target_timeline='latest' and restore_command is
set, validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.
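
As a rough illustration of the failure described above, here is a toy
model in Python. It is not PostgreSQL code: the HISTORY table, the
function names and the timeline numbers 1 and 2 are all assumptions made
up for the example. It only shows why a list built from the standby's
original timeline cannot lead to the target timeline, while a list built
from the target timeline can.

```python
# Toy model of timeline histories; hypothetical names, not PostgreSQL code.
# HISTORY maps a timeline ID to its ancestry, newest timeline first.
HISTORY = {
    1: [1],      # original timeline, no parent
    2: [2, 1],   # assumed: timeline 2 forked from timeline 1 at promotion
}


def read_history(tli):
    """Return the timelines that lead to `tli`, newest first."""
    return list(HISTORY[tli])


def can_follow(expected_tles, wal_tli):
    """WAL on `wal_tli` is usable only if that timeline is in the list."""
    return wal_tli in expected_tles


original_tli = 1  # timeline the new standby starts from
target_tli = 2    # what 'latest' resolves to once the history file is found

# Mishandled case: the list describes how to reach the original timeline.
wrong = read_history(original_tli)
# Intended case: the list describes how to reach the target timeline.
right = read_history(target_tli)

print("from original timeline:", wrong, "can follow TLI 2:", can_follow(wrong, 2))
print("from target timeline:  ", right, "can follow TLI 2:", can_follow(right, 2))
```

In this sketch the first list never mentions timeline 2, so WAL streamed
on the new timeline is never accepted. The second list covers both
timelines, so the standby can cross the promotion point.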

Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.

Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@...

Branch
------
REL_11_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/ca158c168ea381ac54aab3e7a3c1239747ee4a7f

Modified Files
--------------
src/backend/access/transam/xlog.c                | 10 ++-
src/test/recovery/t/025_stuck_on_old_timeline.pl | 96 ++++++++++++++++++++++++
src/test/recovery/t/cp_history_files             | 10 +++
3 files changed, 115 insertions(+), 1 deletion(-)