Offline enabling/disabling of data checksums

classic Classic list List threaded Threaded
149 messages Options
12345 ... 8
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
Hi,

Am Freitag, den 28.12.2018, 10:12 +0100 schrieb Magnus Hagander:

> On Fri, Dec 28, 2018 at 1:14 AM Tomas Vondra <[hidden email]> wrote:
> > On 12/28/18 12:25 AM, Michael Paquier wrote:
> > > On Thu, Dec 27, 2018 at 03:46:48PM +0100, Tomas Vondra wrote:
> > >> On 12/27/18 11:43 AM, Magnus Hagander wrote:
> > >>> Should we double-check with packagers that this won't cause a problem?
> > >>> Though the fact that it's done in a major release should make it
> > >>> perfectly fine I think -- and it's a smaller change than when we did all
> > >>> those xlog->wal changes...
> > >>>
> > >>
> > >> I think it makes little sense to not rename the tool now. I'm pretty
> > >> sure we'd end up doing that sooner or later anyway, and we'll just live
> > >> with a misnamed tool until then.
> > >
> > > Do you think that a thread Would on -packagers be more adapted then?
> >
> > I'm sorry, but I'm not sure I understand the question. Of course, asking
> > over at -packagers won't hurt, but my guess is the response will be it's
> > not a big deal from the packaging perspective.
>
> I think a heads- up in the way of "planning to change it, now's the
> time to yell" is the reasonable thing.

Renaming applications shouldn't be a problem unless they have to be
moved from one binary package to another. I assume all packagers ship
all client/server binaries in one package, respectively (and not e.g. a
dedicated postgresql-11-pg_test_fsync package), this should only be a
matter of updating package metadata.

In any case, it should be identical to the xlog->wal rename.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Paquier-2
On Sat, Dec 29, 2018 at 11:55:43AM +0100, Michael Banck wrote:
> Renaming applications shouldn't be a problem unless they have to be
> moved from one binary package to another. I assume all packagers ship
> all client/server binaries in one package, respectively (and not e.g. a
> dedicated postgresql-11-pg_test_fsync package), this should only be a
> matter of updating package metadata.
>
> In any case, it should be identical to the xlog->wal rename.

I have poked -packagers on the matter and I am seeing no complains, so
let's move forward with this stuff.  From the consensus I am seeing on
the thread, we have been discussing about the following points:
1) Rename pg_verify_checksums to pg_checksums.
2) Have separate switches for each action, aka --verify, --enable and
--disable, or a unified --action switch which can take different
values.
3) Do we want to imply --verify by default if no switch is specified?

About 2), folks who have expressed an opinion are:
- Multiple switches: Robert, Fabien, Magnus
- Single --action switch: Michael B, Michael P

About 3), aka --verify implied if no action is specified:
- In favor: Fabien C, Magnus
- Against: Michael P

If I missed what someone said, please feel free to complete with your
votes here.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
Hi,

Am Dienstag, den 01.01.2019, 11:38 +0900 schrieb Michael Paquier:

> On Sat, Dec 29, 2018 at 11:55:43AM +0100, Michael Banck wrote:
> > Renaming applications shouldn't be a problem unless they have to be
> > moved from one binary package to another. I assume all packagers ship
> > all client/server binaries in one package, respectively (and not e.g. a
> > dedicated postgresql-11-pg_test_fsync package), this should only be a
> > matter of updating package metadata.
> >
> > In any case, it should be identical to the xlog->wal rename.
>
> I have poked -packagers on the matter and I am seeing no complains, so
> let's move forward with this stuff.  From the consensus I am seeing on
> the thread, we have been discussing about the following points:
> 1) Rename pg_verify_checksums to pg_checksums.
> 2) Have separate switches for each action, aka --verify, --enable and
> --disable, or a unified --action switch which can take different
> values.
> 3) Do we want to imply --verify by default if no switch is specified?
>
> About 2), folks who have expressed an opinion are:
> - Multiple switches: Robert, Fabien, Magnus
> - Single --action switch: Michael B, Michael P

I implemented the multiple switches thing in my branch first anyway and
don't mind a lot either way; I think the consensus goes towards multiple
switches.

> About 3), aka --verify implied if no action is specified:
> - In favor: Fabien C, Magnus
> - Against: Michael P

I think I'm in favor as well.

I wonder whether we (or packagers) could then just ship a
pg_verify_checksums -> pg_checksums symlink for compatibility if we/they
want, as the behaviour would stay the same?


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Paquier-2
On Tue, Jan 01, 2019 at 11:42:49AM +0100, Michael Banck wrote:
> Am Dienstag, den 01.01.2019, 11:38 +0900 schrieb Michael Paquier:
>> About 3), aka --verify implied if no action is specified:
>> - In favor: Fabien C, Magnus
>> - Against: Michael P
>
> I think I'm in favor as well.

Okay, it looks to be the direction to take then.

> I wonder whether we (or packagers) could then just ship a
> pg_verify_checksums -> pg_checksums symlink for compatibility if we/they
> want, as the behaviour would stay the same?

In the v10 dev cycle this part has been discarded for the switch from
pg_xlogdump to pg_waldump.  I don't think that's worth bothering this
time either in the build.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Stephen Frost
In reply to this post by Tomas Vondra-4
Greetings,

* Tomas Vondra ([hidden email]) wrote:
> On 12/27/18 11:43 AM, Magnus Hagander wrote:
> > Plus, the majority of people *should* want them on :) We don't run with
> > say synchronous_commit=off by default either to make it easier on those
> > that don't want to pay the overhead of full data safety :P (I know it's
> > not a direct match, but you get the idea)

+1 to having them on by default, we should have done that a long time
ago.

> I don't know, TBH. I agree making the on/off change cheaper makes moves
> us closer to 'on' by default, because they may disable it if needed. But
> it's not the whole story.
>
> If we enable checksums by default, 99% users will have them enabled.

Yes, and they'll then be able to catch data corruption much earlier.
Today, 99% of our users don't have them enabled and have no clue if
their data has been corrupted on disk, or not.  That's not good.

> That means more people will actually observe data corruption cases that
> went unnoticed so far. What shall we do with that? We don't have very
> good answers to that (tooling, docs) and I'd say "disable checksums" is
> not a particularly amazing response in this case :-(

Now that we've got a number of tools available which will check the
checksums in a running system and throw up warnings when found
(pg_basebackup, pgBackRest and I think other backup tools,
pg_checksums...), users will see corruption and have the option to
restore from a backup before those backups expire out and they're left
with a corrupt database and backups which also have that corruption.

This ongoing call for specific tooling to do "something" about checksums
is certainly good, but it's not right to say that we don't have existing
documentation- we do, quite a bit of it, and it's all under the heading
of "Backup and Recovery".

> FWIW I don't know what to do about that. We certainly can't prevent the
> data corruption, but maybe we could help with fixing it (although that's
> bound to be low-level work).

There's been some effort to try and automagically correct corrupted
pages but it's certainly not something I'm ready to trust beyond a
"well, this is what it might have been" review.  The answer today is to
find a backup which isn't corrupt and restore from it on a known-good
system.  If adding explicit documentation to that effect would reduce
your level of concern when it comes to enabling checksums by default,
then I'm happy to do that.

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
In reply to this post by Fabien COELHO-3
Hi,

Am Donnerstag, den 27.12.2018, 12:26 +0100 schrieb Fabien COELHO:

> > > For enable/disable, while the command is running, it should mark the
> > > cluster as opened to prevent an unwanted database start. I do not see
> > > where this is done.
> > >
> > > You have pretty much the same class of problems if you attempt to
> > > start a cluster on which pg_rewind or the existing pg_verify_checksums
> > > is run after these have scanned the control file to make sure that
> > > they work on a cleanly-stopped instance. [...]
> >
> > I think it comes down to what the outcome is. If we're going to end up with
> > a corrupt database (e.g. one where checksums aren't set everywhere but they
> > are marked as such in pg_control) then it's not acceptable. If the only
> > outcome is the tool gives an error that's not an error and if re-run it's
> > fine, then it's a different story.
>
> ISTM that such an outcome is indeed a risk, as a starting postgres could
> update already checksummed pages without putting a checksum. It could be
> even worse, although with a (very) low probability, with updates
> overwritten on a race condition between the processes. In any case, no
> error would be reported before much later, with invalid checksums or
> inconsistent data, or undetected forgotten committed data.

One difference between pg_rewind and pg_checksums is that the latter
potentially runs for a longer time (or rather a non-trivial amount of
time, compared to pg_rewind), so the margin of error of another DBA
saying "oh, that DB is down, let me start it again" might be much
higher.

The question is how to reliably do this in an acceptable way? Just
faking a postmaster.pid sounds pretty hackish to me, do you have any
suggestions here?

The alternative would be to document that it needs to be made sure that
the database is not started up during enabling of checksums, yielding to
pilot error.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
In reply to this post by Fabien COELHO-3
Hi,

Am Mittwoch, den 26.12.2018, 19:43 +0100 schrieb Fabien COELHO:
> > It adds an (now mandatory) --action parameter that takes either verify,
> > enable or disable as argument.

> I'd rather have explicit switches for verify, enable & disable, and verify 
> would be the default if none is provided.

I changed that to the switches -c/--verify (-c for check as -v is taken,
should it be --check as well? I personally like verify better), 
-d/--disable and -e/--enable.

> About the patch: applies, compiles, "make check" ok.
>
> There is no documentation.

Yeah, I'll write that once the CLI is settled.

> In "scan_file", I would open RW only for enable, but keep RO for verify.

OK, I've changed that.

> Also, the full page is rewritten... would it make sense to only overwrite
> the checksum part itself?

So just writing the page header? I find that a bit scary and don't
expect much speedup as the OS would write the whole block anyway I
guess? I haven't touched that yet.

> It seems that the control file is unlinked and then rewritten. If the
> rewritting fails, or the command is interrupted, the user has a problem.
>
> Could the control file be simply opened RW? Else, I would suggest to
> rename (eg add .tmp), write the new one, then unlink the old one, so that
> recovering the old state in case of problem is possible.

I have mostly taken the pg_rewind code here; if there was a function
that allowed for safe offline changes of the control file, I'd be happy
to use it but I don't think it should be this patch to invent that.

In any case, I have removed the unlink() now (not sure where that came
from), and changed it to open(O_WRONLY) same as in Michael's code and
pg_rewind.

V2 attached.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

offline-activation-of-checksums_V2.patch (21K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Fabien COELHO-3
In reply to this post by Michael Banck-2



> One difference between pg_rewind and pg_checksums is that the latter
> potentially runs for a longer time (or rather a non-trivial amount of
> time, compared to pg_rewind), so the margin of error of another DBA
> saying "oh, that DB is down, let me start it again" might be much
> higher.
>
> The question is how to reliably do this in an acceptable way? Just
> faking a postmaster.pid sounds pretty hackish to me, do you have any
> suggestions here?

Adding a new state to ControlFileData which would prevent it from
starting?

--
Fabien.

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Bernd Helmle
Am Dienstag, den 08.01.2019, 15:09 +0100 schrieb Fabien COELHO:
> > The question is how to reliably do this in an acceptable way? Just
> > faking a postmaster.pid sounds pretty hackish to me, do you have
> > any
> > suggestions here?
>
> Adding a new state to ControlFileData which would prevent it from
> starting?

But then you have to make sure the control flag gets cleared in any
case pg_verify_checksums crashes somehow or gets SIGKILL'ed ...

Setting the checksum flag is done after having finished all blocks, so
there is no problem. But we need to set this new flag before and reset
it afterwards, so in between strange things can happen (as the various
calls to exit() within error handling illustrates).

Bernd.



Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
Am Dienstag, den 08.01.2019, 15:39 +0100 schrieb Bernd Helmle:

> Am Dienstag, den 08.01.2019, 15:09 +0100 schrieb Fabien COELHO:
> > > The question is how to reliably do this in an acceptable way? Just
> > > faking a postmaster.pid sounds pretty hackish to me, do you have
> > > any
> > > suggestions here?
> >
> > Adding a new state to ControlFileData which would prevent it from
> > starting?
>
> But then you have to make sure the control flag gets cleared in any
> case pg_verify_checksums crashes somehow or gets SIGKILL'ed ...
>
> Setting the checksum flag is done after having finished all blocks, so
> there is no problem. But we need to set this new flag before and reset
> it afterwards, so in between strange things can happen (as the various
> calls to exit() within error handling illustrates).

It seems writing a note like "pg_checksums is running" into the
postmaster.pid would work, and would give a hopefully useful hint to
somebody trying to start Postgres while pg_checksums is running:

postgres@kohn:~$ echo  "pg_checksums running with pid 1231, cluster disabled" > data/postmaster.pid 
postgres@kohn:~$ pg_ctl -D data -l logfile start
pg_ctl: invalid data in PID file "data/postmaster.pid"
postgres@kohn:~$ echo $?
1
postgres@kohn:~$ 

If the DBA then just simply deletes postmaster.pid and starts over, well
then I call pilot error; though we could in theory change pg_ctl (or
whatever checks postmaster.pid) to emit an even more useful error
message if it encounters a "cluster is locked" keyword in it.

Not sure whether everybody likes that (or is future-proof for that
matter), but I like it better than adding a new field to the control
file, for the reasons Bernd outlined above.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Fabien COELHO-3
In reply to this post by Bernd Helmle

>> Adding a new state to ControlFileData which would prevent it from
>> starting?
>
> But then you have to make sure the control flag gets cleared in any
> case pg_verify_checksums crashes somehow or gets SIGKILL'ed ...

The usual approach is a restart with some --force option?

> Setting the checksum flag is done after having finished all blocks, so
> there is no problem.

There is also a problem if the db is started while the checksum is being
enabled.

> But we need to set this new flag before and reset it afterwards, so in
> between strange things can happen (as the various calls to exit() within
> error handling illustrates).

Sure, there is some need for a backup plan if it fails and the control
file is let in a wrong state.

--
Fabien.

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Fabien COELHO-3
In reply to this post by Michael Banck-2

>> Setting the checksum flag is done after having finished all blocks, so
>> there is no problem. But we need to set this new flag before and reset
>> it afterwards, so in between strange things can happen (as the various
>> calls to exit() within error handling illustrates).
>
> It seems writing a note like "pg_checksums is running" into the
> postmaster.pid would work, and would give a hopefully useful hint to
> somebody trying to start Postgres while pg_checksums is running:
>
> postgres@kohn:~$ echo  "pg_checksums running with pid 1231, cluster disabled" > data/postmaster.pid 
> postgres@kohn:~$ pg_ctl -D data -l logfile start
> pg_ctl: invalid data in PID file "data/postmaster.pid"
> postgres@kohn:~$ echo $?
> 1
> postgres@kohn:~$ 
Looks ok, but I'm unsure how portable it is though. What if started with
"postmater" directly?

> If the DBA then just simply deletes postmaster.pid and starts over, well
> then I call pilot error; though we could in theory change pg_ctl (or
> whatever checks postmaster.pid) to emit an even more useful error
> message if it encounters a "cluster is locked" keyword in it.
>
> Not sure whether everybody likes that (or is future-proof for that
> matter), but I like it better than adding a new field to the control
> file, for the reasons Bernd outlined above.

ISTM that the point of the control file is exactly to tell what is current
the status of the cluster, so it is where this information really belongs?

AFAICS all commands take care of the status in some way to avoid
accidents.

--
Fabien.
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Bernd Helmle
In reply to this post by Fabien COELHO-3
Am Dienstag, den 08.01.2019, 16:17 +0100 schrieb Fabien COELHO:

> > > Adding a new state to ControlFileData which would prevent it from
> > > starting?
> >
> > But then you have to make sure the control flag gets cleared in any
> > case pg_verify_checksums crashes somehow or gets SIGKILL'ed ...
>
> The usual approach is a restart with some --force option?
>
> > Setting the checksum flag is done after having finished all blocks,
> > so
> > there is no problem.
>
> There is also a problem if the db is started while the checksum is
> being
> enabled.

What i mean is that interrupting pg_verify_checksums won't leave
pg_control in a state where starting the cluster won't work without any
further interaction.

Bernd.



Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Fabien COELHO-3

>>> But then you have to make sure the control flag gets cleared in any
>>> case pg_verify_checksums crashes somehow or gets SIGKILL'ed ...
>>
>> The usual approach is a restart with some --force option?
>>
>>> Setting the checksum flag is done after having finished all blocks, so
>>> there is no problem.
>>
>> There is also a problem if the db is started while the checksum is
>> being enabled.
>
> What i mean is that interrupting pg_verify_checksums won't leave
> pg_control in a state where starting the cluster won't work without any
> further interaction.

Yep, I understood that, and agree that a way out is needed, hence the
--force option suggestion.

--
Fabien.

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Paquier-2
In reply to this post by Michael Banck-2
On Tue, Jan 08, 2019 at 01:03:25PM +0100, Michael Banck wrote:
> I changed that to the switches -c/--verify (-c for check as -v is taken,
> should it be --check as well? I personally like verify better), 
> -d/--disable and -e/--enable.

Indeed we could use --check, pg_checksums --check looks repetitive
still that makes the short option more consistent with the rest.

+   printf(_("  -A, --action   action to take on the cluster, can be set as\n"));
+   printf(_("                 \"verify\", \"enable\" and \"disable\"\n"));
Not reflected yet in the --help portion.

>> Also, the full page is rewritten... would it make sense to only overwrite
>> the checksum part itself?
>
> So just writing the page header? I find that a bit scary and don't
> expect much speedup as the OS would write the whole block anyway I
> guess? I haven't touched that yet.

The OS would write blocks of 4kB out of the 8kB as that's the usual
page size, no?  So this could save a lot of I/O.

> I have mostly taken the pg_rewind code here; if there was a function
> that allowed for safe offline changes of the control file, I'd be happy
> to use it but I don't think it should be this patch to invent that.
>
> In any case, I have removed the unlink() now (not sure where that came
> from), and changed it to open(O_WRONLY) same as in Michael's code and
> pg_rewind.

My own stuff in pg_checksums.c does not have an unlink(), anyway...  I
think that there is room for improvement for both pg_rewind and
pg_checksums here.  What about refactoring updateControlFile() and
move it to controldata_utils.c()?  This centralizes the CRC check,
static assertions, file open and writes into a single place.  The
backend has a similar flavor with UpdateControlFile.  By combining
both we need some extra "ifdef FRONTEND" for BasicOpenFile and the
wait events which generates some noise, still both share a lot.  The
backend also includes a fsync() for the control file which happens
when the file is written, but for pg_checksums and pg_rewind we just
do it in one go at the end, so we would need an extra flag to decide
if fsync should happen or not.  pg_rewind has partially the right
interface by passing ControlFileData contents as an argument.

> V2 attached.

+/* Filename components */
+#define PG_TEMP_FILES_DIR "pgsql_tmp"
+#define PG_TEMP_FILE_PREFIX "pgsql_tmp"
This may look strange, but these are needed because pg_checksums
calls some of the sync-related routines which are defined in fd.c.
Amen.

+   if (fsync(fd) != 0)
+   {
+       fprintf(stderr, _("%s: fsync error: %s\n"), progname, strerror(errno));
+       exit(1);
+   }
No need for that as fsync_pgdata() gets called at the end.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Fabien COELHO-3
In reply to this post by Michael Banck-2

> I changed that to the switches -c/--verify (-c for check as -v is taken,
> should it be --check as well? I personally like verify better), 
> -d/--disable and -e/--enable.

I agree that checking the checksum sounds repetitive, but I think that for
consistency --check should be provided.

About the patch: applies, compiles, global & local "make check" are ok.

There is still no documentation.

I think that there is a consensus about renaming the command.

The --help string documents --action which does not exists anymore.

The code in "updateControlFile" seems to allow to create the file
(O_CREAT). I do not think that it should, it should only apply to an
existing file.

ISTM that some generalized version of this function should be in
"src/common/controldata_utils.c" instead of duplicating it from command to
command (as suggested by Michaël as well).

In "scan_file" verbose output, ISTM that the checksum is more computed
than enabled on the file. It is really enabled at the cluster level in the
end.

Maybe there could be only one open call with a ?: for RO vs RW.

Non root check: as files are only manipulated RW, ISTM that there is no
reason why the ownership would be changed, so I do not think that this
constraint is useful.

There is kind of a copy paste for enabling/disabling, I'd consider
skipping the scan when not necessary and merge both branches.

>> Also, the full page is rewritten... would it make sense to only overwrite
>> the checksum part itself?
>
> So just writing the page header? I find that a bit scary and don't
> expect much speedup as the OS would write the whole block anyway I
> guess? I haven't touched that yet.

Possibly the OS would write its block size, which is not necessary the
same as postgres page size?

>> It seems that the control file is unlinked and then rewritten. If the
>> rewritting fails, or the command is interrupted, the user has a problem.
>>
>> Could the control file be simply opened RW? Else, I would suggest to
>> rename (eg add .tmp), write the new one, then unlink the old one, so that
>> recovering the old state in case of problem is possible.
>
> I have mostly taken the pg_rewind code here; if there was a function
> that allowed for safe offline changes of the control file, I'd be happy
> to use it but I don't think it should be this patch to invent that.
It is reinventing it somehow by duplicating the stuff anyway. I'd suggest
a separate preparatory patch to do the cleanup.

--
Fabien.
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Andres Freund
Hi,

On 2019-01-09 07:07:17 +0100, Fabien COELHO wrote:
> There is still no documentation.

Michael, are you planning to address this?  It'd also be useful to state
when you just don't agree with things / don't plan to address them.

Given the docs piece hasn't been addressed, and seems uncontroversial,
I'm marking this patch as returned with feedback. Please resubmit once
ready.


> > > Also, the full page is rewritten... would it make sense to only overwrite
> > > the checksum part itself?
> >
> > So just writing the page header? I find that a bit scary and don't
> > expect much speedup as the OS would write the whole block anyway I
> > guess? I haven't touched that yet.
>
> Possibly the OS would write its block size, which is not necessary the same
> as postgres page size?

I think it'd be a bad idea to write more granular. Very commonly that'll
turn a write operation into a read-modify-write (although caching will
often prevent that from being a problem here), and it'll be bad for
flash translation layers.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
In reply to this post by Fabien COELHO-3
Hi,

sorry for letting this slack.

First off, thanks for the review!

Am Mittwoch, den 09.01.2019, 07:07 +0100 schrieb Fabien COELHO:
> > I changed that to the switches -c/--verify (-c for check as -v is taken,
> > should it be --check as well? I personally like verify better), 
> > -d/--disable and -e/--enable.
>
> I agree that checking the checksum sounds repetitive, but I think that for
> consistency --check should be provided.

Ok then. The enum is currently called PG_ACTION_VERIFY, I changed that
to PG_ACTION_CHECK as well.

> About the patch: applies, compiles, global & local "make check" are ok.
>
> There is still no documentation.

I've added that now, though I did that blindly and have not checked the
output yet.

> I think that there is a consensus about renaming the command.

I think so as well, but doing that right now will make the patch
difficult to review, so I'd prefer to leave it to the committer to do
that. 

I can submit a patch with the directory/file rename if that is
preferred.

> The --help string documents --action which does not exists anymore.

Fixed that.

> The code in "updateControlFile" seems to allow to create the file
> (O_CREAT). I do not think that it should, it should only apply to an
> existing file.

Removed that.

> ISTM that some generalized version of this function should be in
> "src/common/controldata_utils.c" instead of duplicating it from command to
> command (as suggested by Michaël as well).

Haven't done that yet.

> In "scan_file" verbose output, ISTM that the checksum is more computed
> than enabled on the file. It is really enabled at the cluster level in the
> end.

It's certainly not just computed but also written. It's true that it
will be only meaningful if the control file is updated accordingly at
the end, but I don't think that message is very incorrect, so left it
as-is for now.

> Maybe there could be only one open call with a ?: for RO vs RW.

Done that.

> Non root check: as files are only manipulated RW, ISTM that there is no
> reason why the ownership would be changed, so I do not think that this
> constraint is useful.

Now that we no longer unlink() pg_control, I believe you are right and I
have removed it.
`
> There is kind of a copy paste for enabling/disabling, I'd consider
> skipping the scan when not necessary and merge both branches.

Done so.

> > > Also, the full page is rewritten... would it make sense to only overwrite
> > > the checksum part itself?
> >
> > So just writing the page header? I find that a bit scary and don't
> > expect much speedup as the OS would write the whole block anyway I
> > guess? I haven't touched that yet.
>
> Possibly the OS would write its block size, which is not necessary the
> same as postgres page size?

I haven't changed that yet, I think Andres was also of the opinion that
this is not necessary?

> > > It seems that the control file is unlinked and then rewritten. If the
> > > rewritting fails, or the command is interrupted, the user has a problem.
> > >
> > > Could the control file be simply opened RW?

I've done that now.

New patch attached.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

offline-activation-of-checksums_V3.patch (26K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Paquier-2
On Sun, Feb 17, 2019 at 07:31:38PM +0100, Michael Banck wrote:
> New patch attached.

- * src/bin/pg_verify_checksums/pg_verify_checksums.c
+ * src/bin/pg_checksums/pg_checksums.c
That's lacking a rename, or this comment is incorrect.

+#if PG_VERSION_NUM >= 100000
+   StaticAssertStmt(sizeof(ControlFileData) <= PG_CONTROL_MAX_SAFE_SIZE,
+                    "pg_control is too large for atomic disk writes");
+#endif
This is compiled with only one version of the control file data, so
you don't need that.

Any reason why we don't refactor updateControlFile() into
controldata_utils.c?  This duplicates the code, at the exception of
some details.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Offline enabling/disabling of data checksums

Michael Banck-2
Hi,

Am Dienstag, den 19.02.2019, 14:02 +0900 schrieb Michael Paquier:
> On Sun, Feb 17, 2019 at 07:31:38PM +0100, Michael Banck wrote:
> > New patch attached.
>
> - * src/bin/pg_verify_checksums/pg_verify_checksums.c
> + * src/bin/pg_checksums/pg_checksums.c
> That's lacking a rename, or this comment is incorrect.

Right, I started the rename, but then backed off pending further
discussion whether I should submit that or whether the committer will
just do it.

I've backed those 4 in-file renames out for now.

> +#if PG_VERSION_NUM >= 100000
> +   StaticAssertStmt(sizeof(ControlFileData) <= PG_CONTROL_MAX_SAFE_SIZE,
> +                    "pg_control is too large for atomic disk writes");
> +#endif
> This is compiled with only one version of the control file data, so
> you don't need that.

Oops, yeah.

> Any reason why we don't refactor updateControlFile() into
> controldata_utils.c?  This duplicates the code, at the exception of
> some details.

Ok, I've done that now, and migrated pg_rewind as well, do you know of
any other programs that might benefit here?

This could/should probably be committed separately beforehand.

New patch attached.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: [hidden email]

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

offline-activation-of-checksums_V4.patch (31K) Download Attachment
12345 ... 8