Does PostgreSQL check database integrity at startup?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
55 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter

Recently I had a problem with a base file with size 0 in a standby server.

This raised one question: does PostgreSQL (9.6.6) check base integrity at startup?

At least if there are 0 byte size files in base dir? Or CRC? Something?


Regards,

--

Edson Carlos Ericksson Richter
SimKorp Ltda
Fone: (51) 3366-7964
Embedded Image
"A mente que se abre a uma nova ideia jamais voltará ao seu tamanho original"
- Albert Einstein

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Martín Marqués-3
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Pavel Stehule


2017-12-26 14:44 GMT+01:00 Martin Marques <[hidden email]>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


It has CRC check, but it is used in runtime - when data are necessary

So Postgres usually check nothing on start - few system tables and indexes

Regards

Pavel

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter
Em 26/12/2017 12:25, Pavel Stehule escreveu:


2017-12-26 14:44 GMT+01:00 Martin Marques <[hidden email]>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


It has CRC check, but it is used in runtime - when data are necessary

So Postgres usually check nothing on start - few system tables and indexes

Regards

Pavel

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Any tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively starting the database, what would be the recommendations?

One that seems obvious to me are empty data files (something like "find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests to check before startup.

Thanks,

Edson
Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Peter Geoghegan-4
Look into amcheck:

https://github.com/petergeoghegan/amcheck

--
Peter Geoghegan
(Sent from my phone)
Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Pavel Stehule
In reply to this post by Edson Carlos Ericksson Richter


2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter <[hidden email]>:
Em 26/12/2017 12:25, Pavel Stehule escreveu:


2017-12-26 14:44 GMT+01:00 Martin Marques <[hidden email]>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


It has CRC check, but it is used in runtime - when data are necessary

So Postgres usually check nothing on start - few system tables and indexes

Regards

Pavel

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Any tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively starting the database, what would be the recommendations?

One that seems obvious to me are empty data files (something like "find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests to check before startup.

I don't think so anybody does it. Reading 1TB database needs more then few hours.

Regards

 

Thanks,

Edson

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter
Em 26/12/2017 13:40, Pavel Stehule escreveu:


2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter <[hidden email]>:
Em 26/12/2017 12:25, Pavel Stehule escreveu:


2017-12-26 14:44 GMT+01:00 Martin Marques <[hidden email]>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


It has CRC check, but it is used in runtime - when data are necessary

So Postgres usually check nothing on start - few system tables and indexes

Regards

Pavel

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Any tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively starting the database, what would be the recommendations?

One that seems obvious to me are empty data files (something like "find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests to check before startup.

I don't think so anybody does it. Reading 1TB database needs more then few hours.

Regards

 

Thanks,

Edson

I'm rebuilding the standby server for two days already, with 23% of completion status...
If lost the database and backups because of that failure, it would be a giant disaster.
Few hours checking integrity would be acceptable... Specially if I can run it on standby only.

Regards,

Edson
Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Pavel Stehule


2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter <[hidden email]>:
Em 26/12/2017 13:40, Pavel Stehule escreveu:


2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter <[hidden email]>:
Em 26/12/2017 12:25, Pavel Stehule escreveu:


2017-12-26 14:44 GMT+01:00 Martin Marques <[hidden email]>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


It has CRC check, but it is used in runtime - when data are necessary

So Postgres usually check nothing on start - few system tables and indexes

Regards

Pavel

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Any tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively starting the database, what would be the recommendations?

One that seems obvious to me are empty data files (something like "find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests to check before startup.

I don't think so anybody does it. Reading 1TB database needs more then few hours.

Regards

 

Thanks,

Edson

I'm rebuilding the standby server for two days already, with 23% of completion status...
If lost the database and backups because of that failure, it would be a giant disaster.
Few hours checking integrity would be acceptable... Specially if I can run it on standby only.

very simple check

pgdumpall > /dev/null

but this doesn't check indexes.

Regards

Pavel




Regards,

Edson

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter

Em 26/12/2017 13:57, Pavel Stehule escreveu:


2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter <[hidden email]>:
Em 26/12/2017 13:40, Pavel Stehule escreveu:


2017-12-26 16:37 GMT+01:00 Edson Carlos Ericksson Richter <[hidden email]>:
Em 26/12/2017 12:25, Pavel Stehule escreveu:


2017-12-26 14:44 GMT+01:00 Martin Marques <[hidden email]>:
El 26/12/17 a las 09:52, Edson Carlos Ericksson Richter escribió:
> Recently I had a problem with a base file with size 0 in a standby server.
>
> This raised one question: does PostgreSQL (9.6.6) check base integrity
> at startup?
>
> At least if there are 0 byte size files in base dir? Or CRC? Something?

Yes it has CRC check, but only if you initialize the cluster with
--data-checksums, and there's a price to pay in performance.


It has CRC check, but it is used in runtime - when data are necessary

So Postgres usually check nothing on start - few system tables and indexes

Regards

Pavel

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Any tips to make database server don't start if corrupt?
If I can change the startup script to make some checks before effectively starting the database, what would be the recommendations?

One that seems obvious to me are empty data files (something like "find -size 0 $PG_DATA/base")...
But I'm sure that more experienced PostgreSQL DBA would have more tests to check before startup.

I don't think so anybody does it. Reading 1TB database needs more then few hours.

Regards

 

Thanks,

Edson

I'm rebuilding the standby server for two days already, with 23% of completion status...
If lost the database and backups because of that failure, it would be a giant disaster.
Few hours checking integrity would be acceptable... Specially if I can run it on standby only.

very simple check

pgdumpall > /dev/null

but this doesn't check indexes.

Regards

Pavel



I'll check it and make some tests.
Thanks!

Edson
Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Álvaro Herrera
In reply to this post by Edson Carlos Ericksson Richter
Edson Carlos Ericksson Richter wrote:

> I'm rebuilding the standby server for two days already, with 23% of
> completion status...

So how do you build your standbys, exactly?  Maybe there's a bug in your
procedure, rather than a bug in the software.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter

Em 26/12/2017 15:13, Alvaro Herrera escreveu:
> Edson Carlos Ericksson Richter wrote:
>
>> I'm rebuilding the standby server for two days already, with 23% of
>> completion status...
> So how do you build your standbys, exactly?  Maybe there's a bug in your
> procedure, rather than a bug in the software.
>
Usually, or by "pg_basebackup" or by "rsync" as described in PostgreSQL
Wiki.
Right now, I'm doing via pg_basebackup.
Nothing extraordinary.

When using rsync:

rsync -e "ssh -2 -C -p slave_ssh_port" --progress --partial -az
/pgsql/9.6/master_data_folder/*
superuser_name@$slave_host:/home/pgsql/9.6/slave_data_folder/ --exclude
postmaster.pid --exclude postgresql.conf --exclude pg_log

When using pg_basebackup:

/usr/pgsql-9.6/bin/pg_basebackup -h master_host -p master_port -D
/home/pgsql/9.6/master_data_folder -R -S slot_name -X stream -P -d "
user=superuser_name password=superuser_pass sslmode=require "


Of course, this is inside a bash script with variables at right places
to make script generic as needed to make it works with dozens of
databases...

Regards,

Edson

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

David Steele
In reply to this post by Pavel Stehule
On 12/26/17 10:57 AM, Pavel Stehule wrote:

> 2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter
>>
>     I'm rebuilding the standby server for two days already, with 23% of
>     completion status...
>     If lost the database and backups because of that failure, it would
>     be a giant disaster.
>     Few hours checking integrity would be acceptable... Specially if I
>     can run it on standby only.
>
>
> very simple check
>
> pgdumpall > /dev/null
>
> but this doesn't check indexes.

pgBackRest will validate all page checksums (including indexes, etc.) in
the cluster during backup.  Full backups check everything,
incr/differential backups check only the files that have changed.

--
-David
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Álvaro Herrera
David Steele wrote:

> pgBackRest will validate all page checksums (including indexes, etc.) in the
> cluster during backup.  Full backups check everything, incr/differential
> backups check only the files that have changed.

If a table or index file is of zero length when backed up, as in the
described case, nothing will be checked, right?  I mean, there is
nothing externally indicating that the file ought to be of a different
size.  Am I wrong?  So Edson's situation here would not raise any red
flags.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

rob stone-2
Hello,


On Tue, 2017-12-26 at 18:58 -0300, Alvaro Herrera wrote:Hello,

> David Steele wrote:
>
> > pgBackRest will validate all page checksums (including indexes,
> > etc.) in the
> > cluster during backup.  Full backups check everything,
> > incr/differential
> > backups check only the files that have changed.
>
> If a table or index file is of zero length when backed up, as in the
> described case, nothing will be checked, right?  I mean, there is
> nothing externally indicating that the file ought to be of a
> different
> size.  Am I wrong?  So Edson's situation here would not raise any red
> flags.
>


Could the following occur:-
1) Your app. issues a BEGIN followed by an INSERT.
2) Postgres decides to open a new file in order to store the new row.
3) Your app. then does a ROLLBACK.

Wouldn't that leave you with a zero length file on disk?

There's no reason for Postgres to delete the file just because a
rollback was issued. All it has to do is clear the buffer in memory.

My 2 cents.

Rob

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Martín Marqués-3
In reply to this post by Edson Carlos Ericksson Richter
El 26/12/17 a las 14:46, Edson Carlos Ericksson Richter escribió:

>
> Usually, or by "pg_basebackup" or by "rsync" as described in PostgreSQL
> Wiki.
> Right now, I'm doing via pg_basebackup.
> Nothing extraordinary.
>
> When using rsync:
>
> rsync -e "ssh -2 -C -p slave_ssh_port" --progress --partial -az
> /pgsql/9.6/master_data_folder/*
> superuser_name@$slave_host:/home/pgsql/9.6/slave_data_folder/ --exclude
> postmaster.pid --exclude postgresql.conf --exclude pg_log

You don't say so anywhere, but I suspect you run the rsync between a
pg_start_backup() and pg_stop_backup(), right?

That is the way it's described in the wiki.

Regards,

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter
Em 26/12/2017 20:42, Martin Marques escreveu:

> El 26/12/17 a las 14:46, Edson Carlos Ericksson Richter escribió:
>> Usually, or by "pg_basebackup" or by "rsync" as described in PostgreSQL
>> Wiki.
>> Right now, I'm doing via pg_basebackup.
>> Nothing extraordinary.
>>
>> When using rsync:
>>
>> rsync -e "ssh -2 -C -p slave_ssh_port" --progress --partial -az
>> /pgsql/9.6/master_data_folder/*
>> superuser_name@$slave_host:/home/pgsql/9.6/slave_data_folder/ --exclude
>> postmaster.pid --exclude postgresql.conf --exclude pg_log
> You don't say so anywhere, but I suspect you run the rsync between a
> pg_start_backup() and pg_stop_backup(), right?
>
> That is the way it's described in the wiki.
>
> Regards,
>
Yes, sure. When using wiki, I'm following all instructions. Process
revisited many times do check if something changed (since 9.0 days :-) )

Right now, I do prefer to use pg_basebackup instead - but sometimes
(when database is just too large), rsync seems more reliable (maybe it
is the cause of the problem).

Anyway, instead digging into rsync functionality (or bugs - I doubt, but
who knows?), I do prefer to have a script I can run to check if there is
obvious failures in standby servers.

Looking for empty files would be a start point.
I'm learning from experienced people from the list that are other points
I would like to check as well.

Regards,

Edson.

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter
In reply to this post by rob stone-2
Em 26/12/2017 20:11, rob stone escreveu:

> Hello,
>
>
> On Tue, 2017-12-26 at 18:58 -0300, Alvaro Herrera wrote:Hello,
>> David Steele wrote:
>>
>>> pgBackRest will validate all page checksums (including indexes,
>>> etc.) in the
>>> cluster during backup.  Full backups check everything,
>>> incr/differential
>>> backups check only the files that have changed.
>> If a table or index file is of zero length when backed up, as in the
>> described case, nothing will be checked, right?  I mean, there is
>> nothing externally indicating that the file ought to be of a
>> different
>> size.  Am I wrong?  So Edson's situation here would not raise any red
>> flags.
>>
>
> Could the following occur:-
> 1) Your app. issues a BEGIN followed by an INSERT.
> 2) Postgres decides to open a new file in order to store the new row.
> 3) Your app. then does a ROLLBACK.
>
> Wouldn't that leave you with a zero length file on disk?
>
> There's no reason for Postgres to delete the file just because a
> rollback was issued. All it has to do is clear the buffer in memory.
>
> My 2 cents.
>
> Rob
>
>

You are right, in several databases I've looked there are many files
with 0 byte size in ./base folder.
This is not a good quest.

Regards,

Edson

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Stephen Frost
In reply to this post by Edson Carlos Ericksson Richter
Greetings,

* Edson Carlos Ericksson Richter ([hidden email]) wrote:
> Right now, I do prefer to use pg_basebackup instead - but sometimes
> (when database is just too large), rsync seems more reliable (maybe
> it is the cause of the problem).

I don't generally recommend rsync for various reasons, but if you make
sure to call pg_start/stop_backup(), keep track of all your WAL and
verify everything ends up written to disk then it should work.  Don't
use it for incrementals though- there are cases where that can fail.

> Anyway, instead digging into rsync functionality (or bugs - I doubt,
> but who knows?), I do prefer to have a script I can run to check if
> there is obvious failures in standby servers.

As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
have page-level CRCs, if you initialized your database with them (which
I would strongly recommend).  There are also backup tools which will
verify those checksums when performing a backup of the system.  In
addition to that, you can do parallel backup and restore which can
reduce the downtime for doing restores quite a bit (though this will
depend on what you're bottleneck is, of course).  I'd suggest you take a
look at pgBackRest, though I think that other solutions now also have
all of these features (though they all have their own features too).

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Alvaro Herrera-9
Stephen Frost wrote:

> * Edson Carlos Ericksson Richter ([hidden email]) wrote:

> > Anyway, instead digging into rsync functionality (or bugs - I doubt,
> > but who knows?), I do prefer to have a script I can run to check if
> > there is obvious failures in standby servers.
>
> As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
> have page-level CRCs, if you initialized your database with them (which
> I would strongly recommend).

Page-level checksums would not detect the problem being complained in
this thread, however.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: Does PostgreSQL check database integrity at startup?

Stephen Frost
Alvaro,

* Alvaro Herrera ([hidden email]) wrote:

> Stephen Frost wrote:
>
> > * Edson Carlos Ericksson Richter ([hidden email]) wrote:
>
> > > Anyway, instead digging into rsync functionality (or bugs - I doubt,
> > > but who knows?), I do prefer to have a script I can run to check if
> > > there is obvious failures in standby servers.
> >
> > As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
> > have page-level CRCs, if you initialized your database with them (which
> > I would strongly recommend).
>
> Page-level checksums would not detect the problem being complained in
> this thread, however.
It's entirely unclear to me what the problem being complained about in
this thread actually is.  The complaint so far was about zero-byte
files, but those are entirely valid, so that isn't a problem that anyone
can solve..

Given the thread subject, if someone actually wanted to do a database
integrity check before startup, they could use pgBackRest to perform a
backup with a CRC-enabled database and at least verify that all of the
checksums are valid.

We could possibly look into adding some set of additional checks for
files which can't actually be zero-byte, perhaps..  I know we have some
other one-off checks already.

Thanks!

Stephen

signature.asc (836 bytes) Download Attachment
123