A customer reported a strange behaviour on a PITR restoration.
After a drop database, he tried to recover the data on the last inserted transaction by using the recovery_target_time.
The issue is the database is present in the system catalog but the directory was still deleted.
Here the technical information of the database
default postgresql.conf except for this options
wal_level = replica
archive_mode = on
archive_command = 'cp %p /tmp/wal_archive/%f '
log_statement = 'all'
log_min_messages = debug5
The following method was used
By analysing the wal file with pg_waldump
We notice that the following log
is executed between the last commit that we are interested in and the next record with a timestamp
We understand that the drop database command is not transactional but the drop dir is attached to the xact whose xid has a commit with timestime out of recovery_target_time bound.
On the other hand, DBA role is to determine which at which xact recovery should stop and define recovery_target_xid rather than recovery_target_time.
Humans are prone to use natural things such as time to define "when" to stop or start things.
We know that this rarely happens in production, because you can't drop a database if users are still connected. But with the new force drop database option, it might be a reasonable choice to improve the situation with that recovery_target_time directive.
It turns out there are two different choices we can make :
We are willing to help on this case either with code patching or documentation improvement.
On Mon, 18 Nov 2019 at 18:48, Nicolas Lutic <[hidden email]> wrote:
If this only happens when a DB is dropped under load with force, I lean toward just documenting it as a corner case.
On 11/19/19 1:40 AM, Craig Ringer wrote:
> On Mon, 18 Nov 2019 at 18:48, Nicolas Lutic <[hidden email]
> <mailto:[hidden email]>> wrote:
> Dear Hackers,
> After a drop database
> with FORCE?
No, we tested with PostgreSQL v 11 and we don't have this option.
> , he tried to recover the data on the last inserted transaction by
> using the recovery_target_time.
> The issue is the database is present in the system catalog but the
> directory was still deleted.
> Here the technical information of the database
> version 11
> default postgresql.conf except for this options
> wal_level = replica
> archive_mode = on
> archive_command = 'cp %p /tmp/wal_archive/%f '
> log_statement = 'all'
> log_min_messages = debug5
> The following method was used
> * create cluster
> * create database
> * create 1 table
> * create 1 index on 1 column
> * insert 1 rows
> * backup with pg_base_backup
> * insert 2 rows
Yes, I forgot to mention it.
> * drop database
> * Change recovery behaviour in that case to prevent all xact
> operation to perform until COMMIT timestamp is checked against
> recovery_time bound (but it seems to be difficult as
> state https://www.postgresql.org/message-id/flat/20141125160629.GC21475%40msg.df7cb.dewhich
> also identifies the problem and tries to give some solutions.
> Maybe another way, as a trivial guess (all apologises) is to
> buffer immediate xacts until we have the commit for each and
> apply the whole buffer xact once the timestamp known (and
> checked agains recovery_target_time value);
> * The other way to improve this is to update PostgreSQL
> documentation by specifying that recovery_target_time cannot be
> used in this case.There should be multiple places where it can
> be stated. The best one (if only one) seems to be in
> If this only happens when a DB is dropped under load with force, I lean
> toward just documenting it as a corner case.
This can happen in the case of a non-transactional instruction, DROP
DATABASE (with or without FORCE) is one case but there may be other cases ?
The documentation modification have to mention this case and list the
other most likely operations.
An idea, without insight knowledge of the code, in case of
recovery_target_time (only), would be to move forward each record for an
Each record that is «timestamped» can be applied but once we encounter a
non timestamped record we could buffer the following records for any
xaxts until a timestamped commit/rollback for the transaction where that
non transactionnal op appearsin. Once the commit/rollback records are
found, there's two options :
1) the commit/rollback timestamp is inside the "replay" bound, then the
whole buffer can be applied
2) the commit/rollback timestamp is beyond the upper time bound for
"replay", then the whole buffer for that transaction could be canceled.
This can only be done on DROP DATABASE "DELETE" operation ?
Maybe, this will lead to skewed pages and this is a wrong way to do such
Another assumption is that "DROP DATABASE" sequence can be changed for
this operation to perform correctly.
We are aware that this part is tricky and will have little effects on
normal operations, as best practices are to use xid_target or lsn_target.
> Craig Ringer http://www.2ndQuadrant.com/
> 2ndQuadrant - PostgreSQL Solutions for the Enterprise
Consulting - Training - Support
Le mar. 19 nov. 2019 à 16:15, Nicolas Lutic <[hidden email]> a écrit :
I'm working with Nicolas and we made some further testing. If we use xid target with inclusive to false at the next xid after the insert, we end up with the same DELETE/DROP directory behaviour which is quite confusing. One have to choose the xid-1 value with inclusive behaviour to lake it work.
I assume this is the right first thing to document the behaviour. And give some examples on this.
Maybe we could add some documentation in the xlog explanation and a warning in the recovery_target_time and xid in guc doc ?
If there are better places in the docs let us know.
|Free forum by Nabble||Edit this page|