pg_rewind is not crash safe

pg_rewind is not crash safe

Heikki Linnakangas
A colleague of mine brought to my attention that pg_rewind is not crash
safe. If it is interrupted for any reason, it leaves behind a data
directory with a mix of data from the source and target images. If
you're "lucky", the server will start up, but it can be in an
inconsistent state. That's obviously not good. It would be nice to:

1. Detect the situation, and refuse to start up.

Or even better:

2. Make pg_rewind crash safe, so that you could safely restart it if
it's interrupted.

Has anyone else run into this? How did you work around it?

It doesn't seem hard to detect this. pg_rewind can somehow "poison" the
data directory just before it starts making irreversible changes. I'm
thinking of updating the 'state' in the control file to a new
PG_IN_REWIND value.
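
To make the "poison" idea concrete, here is a minimal toy sketch in Python. It is not pg_rewind code: a small JSON file stands in for the control file, and the names toy_control.json, DBState, write_state, read_state, check_startup_allowed and rewind are all hypothetical. The only point is the ordering: the marker is made durable before the first irreversible change and cleared only after the last one.

```python
import json
import os
import tempfile
from enum import Enum

CONTROL_FILE = "toy_control.json"  # hypothetical stand-in for the control file


class DBState(Enum):
    # Toy subset of cluster states; IN_REWIND models the proposed new value.
    SHUTDOWNED = "shutdowned"
    IN_PRODUCTION = "in_production"
    IN_REWIND = "in_rewind"


def write_state(datadir, state):
    """Durably record the state: write a temp file, fsync it, then rename."""
    fd, tmp_path = tempfile.mkstemp(dir=datadir)
    with os.fdopen(fd, "w") as f:
        json.dump({"state": state.value}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, os.path.join(datadir, CONTROL_FILE))
    # fsync the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(datadir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_state(datadir):
    with open(os.path.join(datadir, CONTROL_FILE)) as f:
        return DBState(json.load(f)["state"])


def check_startup_allowed(datadir):
    """What a server could do at startup: refuse a poisoned data directory."""
    if read_state(datadir) is DBState.IN_REWIND:
        raise RuntimeError("data directory was left behind by an interrupted rewind")


def rewind(datadir, copy_steps):
    # Poison first, before anything irreversible happens.
    write_state(datadir, DBState.IN_REWIND)
    for step in copy_steps:
        step()  # if this raises or the process dies, the poison stays in place
    # Clear the poison only after every change has been applied.
    write_state(datadir, DBState.SHUTDOWNED)
```

If any step raises or the process is killed, the state stays at IN_REWIND and check_startup_allowed refuses to proceed until a later run completes.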

It also doesn't seem too hard to make it restartable. As long as you
point it to the same source server, it is already almost safe to run
pg_rewind again. If we re-order the way it writes the control or backup
files and makes other changes, pg_rewind can verify that you pointed it
at the same or compatible primary as before.
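
One way to picture the "same source" check is sketched below in Python. This is only an illustration under assumptions: the marker file name and the functions begin_or_resume and finish are hypothetical, and comparing the source's system identifier plus timeline ID is just one possible reading of "same or compatible". The first run records which source it used before touching anything; a restarted run compares against that record.

```python
import json
import os

MARKER = "toy_rewind_in_progress.json"  # hypothetical marker file name


def begin_or_resume(datadir, source_sysid, source_tli):
    """Return True when resuming an interrupted run, False when starting fresh.

    Raises ValueError if an earlier, unfinished run used a different source.
    """
    path = os.path.join(datadir, MARKER)
    wanted = {"sysid": source_sysid, "tli": source_tli}
    if os.path.exists(path):
        with open(path) as f:
            recorded = json.load(f)
        if recorded != wanted:
            raise ValueError("interrupted rewind used a different source server")
        return True
    # Record the source durably before making any other change.
    with open(path, "w") as f:
        json.dump(wanted, f)
        f.flush()
        os.fsync(f.fileno())
    return False


def finish(datadir):
    """Remove the marker once the rewind has fully completed."""
    os.remove(os.path.join(datadir, MARKER))
```

A rerun against the same source resumes, a rerun against a different source is rejected, and a completed run leaves no marker behind.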

I think there's one corner case with truncated files: if pg_rewind has
extended a file by copying the missing "tail" from the source system, but
the system crashes before the new data is fsynced to disk. But I think we
can fix that too, by paying attention to SMGR_TRUNCATE records when
scanning the source WAL.

- Heikki


Re: pg_rewind is not crash safe

Andrey Borodin


> On 5 Aug 2020, at 23:13, Heikki Linnakangas <[hidden email]> wrote:
>
> A colleague of mine brought to my attention that pg_rewind is not crash safe. If it is interrupted for any reason, it leaves behind a data directory with a mix of data from the source and target images. If you're "lucky", the server will start up, but it can be in an inconsistent state.

FWIW, we routinely encounter cases where, after an unsuccessful pg_rewind, the database refuses to start with a "contrecord requested" message.
I have not investigated this in detail yet, but I think it is the result of a wrong redo record pointer being written to the control file (due to an interruption or to missing WAL segments).

Best regards, Andrey Borodin.