automatic restore point

automatic restore point

Yotsunaga, Naoki
Hi, I'm new to the hackers list, but I'd like to propose an "automatic restore point" feature.
This feature automatically creates a restore point (a named recovery target) just before a large change is made to the database. It is useful when that change turns out to be accidental.

The following is a description of "automatic restore point".
【Background】
  When a DBA makes an operational mistake, for example accidentally dropping a table, the database is restored from a file system backup and recovered to a point specified by time or transaction ID. The transaction ID has to be identified from the WAL.
  However, using a time or a transaction ID has the following problems.
   -Time
   ・The DBA needs to remember when the mistaken operation happened.
     (The time can be identified from the WAL, but doing so takes time and effort.)
   ・It is difficult to specify the exact point.
   -Transaction ID
   ・Identifying the transaction ID takes time and effort.
 
  To solve these problems,
  I'd like to propose a feature that records recovery points automatically.
 
【Feature Description】
  PostgreSQL has a backup control function, "pg_create_restore_point()".
  A user can create a named restore point by calling "pg_create_restore_point()",
  and can later recover to that named point.
  The proposal is to call "pg_create_restore_point()" automatically just before any of the commands below is executed, so that a recovery point always exists.
  The name of the recovery point is the date and time at which the command was executed.
  The target resource (database name, table name) and the recovery point name are written as a message to the PostgreSQL server log.
 
  - Commands to which this feature would apply
   ・TRUNCATE
   ・DROP
   ・DELETE (without a WHERE clause)
   ・UPDATE (without a WHERE clause)
   ・COPY FROM
 
【How to use】
  1) After running one of the above commands by mistake, find the server log entry whose command and resource match the mistaken operation, and note its recovery point name.
     
     ex) Message for a TRUNCATE executed at 2018/6/1 12:30:30 (database name: testdb, table name: testtb)
        set recovery point. operation = 'truncate'
        database = 'testdb' relation = 'testtb' recovery_point_name = '2018-06-01-12:30:30'

  2) Follow the procedure in the PostgreSQL documentation, '25.3.4. Recovering Using a Continuous Archive Backup'.
     ※Set "recovery_target_name = 'recovery_point_name'" in recovery.conf.
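
For comparison, the manual equivalent of this workflow with the existing function might look like the sketch below. It assumes wal_level is replica or higher, a role that is allowed to call pg_create_restore_point() (superuser by default), and the example names from the log message above. The restore_command value is site-specific and is left elided.

```sql
-- Create a named restore point immediately before the destructive command.
-- The function returns the WAL location (pg_lsn) of the restore point.
SELECT pg_create_restore_point('2018-06-01-12:30:30');

TRUNCATE testtb;

-- Later, to recover to that point, recovery.conf on the restored
-- base backup would contain lines such as:
--   restore_command      = '...'
--   recovery_target_name = '2018-06-01-12:30:30'
```

The proposal amounts to having the server issue the first statement on the user's behalf.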

【Setting file】
  Set the following in postgresql.conf.
  auto_create_restore_point = on # Turns automatic recording of recovery points on or off. The default value is 'off'.

So what do you think about it? Do you think it is useful?

Also, with the current specification, recovery returns every table, not just the affected one, to its state at the specified recovery point.
So I’m looking for a way to recover only specific tables. Do you have any ideas?

------
Naoki Yotsunaga




Re: automatic restore point

David G Johnston
On Mon, Jun 25, 2018 at 6:17 PM, Yotsunaga, Naoki <[hidden email]> wrote:
> So what do you think about it? Do you think is it useful?

The cost/benefit ratio seems low...

> Also, when recovering with the current specification, tables other than the returned table also return to the state of the specified recovery point.
> So, I’m looking for ways to recover only specific tables. Do you have any ideas?

...and this lowers it even further.
I'd rather spend effort making the initial execution of said commands less likely.  Something like:

TRUNCATE table YES_I_REALLY_WANT_TO_DO_THIS;

which will fail if you don't add the keyword "YES_I..." to the end of the command and the system was set up to require it.

Or, less annoyingly:

BEGIN;
SET LOCAL perform_dangerous_action = true; --can we require local?
TRUNCATE table;
COMMIT;

David J.

Re: automatic restore point

Isaac Morland
On 25 June 2018 at 21:33, David G. Johnston <[hidden email]> wrote:
> On Mon, Jun 25, 2018 at 6:17 PM, Yotsunaga, Naoki <[hidden email]> wrote:
>> So what do you think about it? Do you think is it useful?
>
> I'd rather spend effort making the initial execution of said commands less likely.  Something like:
>
> TRUNCATE table YES_I_REALLY_WANT_TO_DO_THIS;
 
I think an optional setting making DELETE and UPDATE without a WHERE clause illegal would be handy. Obviously this would have to be optional for backward compatibility. Perhaps even just a GUC setting, with the intent being that one would set it in .psqlrc so that omitting the WHERE clause at the command line would just be a syntax error. If one actually does need to affect the whole table one can just say WHERE TRUE. For applications, which presumably have their SQL queries tightly controlled and pre-written anyway, this would most likely not be particularly useful.
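
As a small illustration of the WHERE TRUE convention, the statements below make the whole-table intent explicit. They are valid SQL today; the table testtb and the column processed are assumed names used only for this sketch, and the setting that would reject the WHERE-less forms is the hypothetical part.

```sql
-- Explicitly affect every row; a reviewer can see this was intended.
DELETE FROM testtb WHERE TRUE;
UPDATE testtb SET processed = false WHERE TRUE;

-- Under the suggested optional setting, the forms below would be rejected:
--   DELETE FROM testtb;
--   UPDATE testtb SET processed = false;
```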


Re: automatic restore point

Rui DeSousa
Why not turn autocommit off in the session or in the .psqlrc file (\set AUTOCOMMIT off), or use BEGIN and then ROLLBACK?

What would be nice is if a syntax error didn’t abort the transaction when autocommit is off; I'm a bad typist.
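
A minimal psql sketch of this approach, assuming a table named testtb. With AUTOCOMMIT off, psql opens a transaction implicitly before the first statement, so a mistake can still be undone.

```sql
-- In a psql session (the \set line could also live in ~/.psqlrc).
\set AUTOCOMMIT off

DELETE FROM testtb;   -- oops, forgot the WHERE clause
ROLLBACK;             -- nothing was committed, so the rows are still there
```

The cost is that every change now needs an explicit COMMIT, and an idle open transaction holds its locks until it ends.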







Re: automatic restore point

Justin Pryzby
On Tue, Jun 26, 2018 at 12:04:59AM -0400, Rui DeSousa wrote:
> Why not use auto commit off in the session or .psqlrc file or begin and then use rollback?  \set AUTOCOMMIT off
>
> What would be nice is if a syntax error didn’t abort the transaction when auto commit is off — being a bad typist.

I think you'll get that behavior with ON_ERROR_ROLLBACK.
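
A short psql sketch of that setting, assuming a table testtb with columns id and flag (names invented for illustration). When ON_ERROR_ROLLBACK is on or interactive, psql issues an implicit savepoint before each statement inside a transaction block and rolls back only the failed statement.

```sql
\set AUTOCOMMIT off
\set ON_ERROR_ROLLBACK interactive

UPDATE testtb SET flag = true WHERE id = 1;
SELEC 1;    -- syntax error: only this statement is rolled back
COMMIT;     -- the UPDATE above is still committed
```

The value interactive limits the behavior to interactive sessions, so script files keep the default abort-on-error semantics.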

Justin


Re: automatic restore point

Rui DeSousa

> On Jun 26, 2018, at 12:37 AM, Justin Pryzby <[hidden email]> wrote:
>
> I think you'll get that behavior with ON_ERROR_ROLLBACK.
>

Awesome. Thanks!


Re: automatic restore point

Michael Paquier-2
In reply to this post by Isaac Morland
On Mon, Jun 25, 2018 at 11:01:06PM -0400, Isaac Morland wrote:
> I think an optional setting making DELETE and UPDATE without a WHERE clause
> illegal would be handy. Obviously this would have to be optional for
> backward compatibility. Perhaps even just a GUC setting, with the intent
> being that one would set it in .psqlrc so that omitting the WHERE clause at
> the command line would just be a syntax error. If one actually does need to
> affect the whole table one can just say WHERE TRUE. For applications, which
> presumably have their SQL queries tightly controlled and pre-written
> anyway, this would most likely not be particularly useful.

There was a patch doing exactly that which was discussed last year:
https://commitfest.postgresql.org/13/948/
https://www.postgresql.org/message-id/20160721045746.GA25043@...
What was proposed was rather limiting though, see my messages on the
thread.  With a hook, it is simple enough to develop an extension
which does that.
--
Michael


Re: automatic restore point

Michael Paquier-2
In reply to this post by Yotsunaga, Naoki
On Tue, Jun 26, 2018 at 01:17:31AM +0000, Yotsunaga, Naoki wrote:

> The following is a description of "automatic restore point".
> 【Background】
>   When DBA's operation failure, for example DBA accidently drop table,
> the database is restored from the file system backup and recovered by
> using time or transaction ID. The transaction ID is identified from
> WAL.
>  
>   In order to solve the above problem,
>   I'd like propose a feature to implement automatic recording function
>   of recovery point.

There is also recovery_target_lsn, which is new as of v10.  This
parameter is way better than having to track down a time or an XID, which is
one reason why I developed it.  Please note that this is also one of the
reasons why it is possible to delay WAL replay on standbys, so that an
operator has room to fix such operator errors.  Of course, having cold
backups with a proper WAL archive and a correct retention policy never
hurts.
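
As a rough sketch of both ideas on PostgreSQL 10 or later: the first query records a WAL position and a timestamp before a risky change, and the comments show where a delayed standby would be configured. The one-hour delay is an arbitrary example value.

```sql
-- On the primary, just before a risky change:
SELECT now()                AS wall_clock_time,  -- a candidate for recovery_target_time
       pg_current_wal_lsn() AS wal_position;     -- a candidate for recovery_target_lsn

-- A delayed standby would have, in its recovery.conf:
--   standby_mode             = 'on'
--   recovery_min_apply_delay = '1h'

-- On that standby, as soon as the mistake is noticed, stop replay
-- before the bad change is applied:
SELECT pg_wal_replay_pause();
```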

> 【Setting file】
>   Set postgres.conf.
>   auto_create_restore_point = on # Switch on/off automatic recording
>   function of recovery point. The default value is 'off'.
>
> So what do you think about it? Do you think is it useful?

So basically what you are looking for here is a way to enforce a restore
point to be created depending on a set of pre-defined conditions?  How
would you define and choose those?

> Also, when recovering with the current specification, tables other
> than the returned table also return to the state of the specified
> recovery point.
> So, I’m looking for ways to recover only specific tables. Do you have
> any ideas?

Why not use the utility hook to filter out the commands you'd
like to forbid, in this case TRUNCATE or a DROP TABLE on a given
relation?  Or why not simply use an event trigger at your application
level so that you can actually *prevent* the error from happening in the
first place?  With the last option you don't have to write C code, but this
would not filter TRUNCATE.  In short, what you propose looks over-complicated to
me, and there are options on the table which keep the problem you are
trying to solve from happening at all.  You could also use the utility
hook to log or register somewhere the XID/time/LSN associated with a given
command and then use it as your restore point.  This could also happen
out of core.
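
A sketch of the event-trigger option, assuming a table public.user_logins that must never be dropped. The table, function, and trigger names are invented for illustration, and creating event triggers requires superuser rights.

```sql
-- Abort any command that would drop the protected table.
CREATE FUNCTION guard_protected_tables() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        IF obj.object_type = 'table'
           AND obj.object_identity = 'public.user_logins' THEN
            -- Raising an error rolls back the whole DROP.
            RAISE EXCEPTION 'dropping % is not allowed', obj.object_identity;
        END IF;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER guard_drop ON sql_drop
    EXECUTE PROCEDURE guard_protected_tables();
```

An administrator who really does need to drop the table can disable the trigger first with ALTER EVENT TRIGGER guard_drop DISABLE.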
--
Michael


RE: automatic restore point

Yotsunaga, Naoki
In reply to this post by Michael Paquier-2
Hi. Thanks for the comments.

My explanation of the background of the proposal was inadequate,
so let me explain again.

I assume the following situation.
A user needs to make a quick, seemingly simple fix to an important production database. The user composes the query, gives it a once-over, and lets it run. Seconds later the user realizes that they forgot the WHERE clause, dropped the wrong table, or made another serious mistake, and interrupts the query, but the damage has been done.
In addition, the user did not record the time and did not look at the LSN position.

Certainly, I thought about reducing the possibility of executing the wrong command, but I concluded that the possibility cannot be completely eliminated.
So I proposed the “automatic restore point”.
With this feature, the user can recover quickly and reliably even after a mistaken operation.

> I'd rather spend effort making the initial execution of said commands less likely.  
  I think that prohibiting DELETE and UPDATE without a WHERE clause, as suggested in a later response, is a good approach.
  But I think that it is impossible to completely eliminate mistakes with the other commands,
  for example dropping the wrong table.

-----
Naoki Yotsunaga




RE: automatic restore point

Yotsunaga, Naoki
In reply to this post by Michael Paquier-2
Hi. Thanks for the comments.

>There is also recovery_target_lsn which is new as of v10.
 With this method, the user has to look at the LSN position before the operation.
 But I am assuming a user who did not look at it beforehand.
 So I think that this method is not appropriate here.

> So basically what you are looking for here is a way to enforce a restore point to be created depending on a set of pre-defined conditions?  
>How would you define and choose those?
 I understand this as a question about how to configure which commands this feature applies to.
 Ex) DROP = on
     TRUNCATE = off
 Is my interpretation right?
 If so, my answer is that the feature applies to all of the commands listed above.
 When the feature is turned on, it works whenever any of those commands is executed.

-------
Naoki Yotsunaga

Re: automatic restore point

Michael Paquier-2
On Tue, Jul 03, 2018 at 01:07:41AM +0000, Yotsunaga, Naoki wrote:
>> There is also recovery_target_lsn which is new as of v10.
>  In this method, it is necessary to look at a lsn position before operating.
>  But I assume the user who did not look it before operating.
>  So I think that this method is not appropriate.

You should avoid top-posting on the mailing lists; it breaks the
consistency of the thread.

>> So basically what you are looking for here is a way to enforce a
>> restore point to be created depending on a set of pre-defined
>> conditions?  How would you define and choose those?
>
> I understand that I was asked how to set up a command to apply this function.
>  Ex) DROP = on
>      TRUNCATE = off
>  Is my interpretation right?
>  If my interpretation is correct, all the above commands will be
>  applied.
>  When this function is turned on, this function works when all the
>  above commands are executed.

Yeah, but based on which factors can you decide that such
conditions are enough to say that this feature fully meets
users' needs, and how can you be sure that this is not going to result in
an additional maintenance burden if you need to define a new set of
conditions in the future?  For example, an operator may issue a costly
ALTER TABLE which causes a full table rewrite, which could also be an
operation that you would like to prevent.  Having a set of GUCs which
define such low-level behavior is not really user-friendly.
--
Michael


Re: automatic restore point

Michael Paquier-2
In reply to this post by Yotsunaga, Naoki
On Tue, Jul 03, 2018 at 01:06:31AM +0000, Yotsunaga, Naoki wrote:
>> I'd rather spend effort making the initial execution of said commands
>> less likely.
>
> I think that the function to prohibit DELETE and UPDATE without a
> WHERE clause in the later response is good way.

This has popped up already in the lists in the past.

> But I think that it is impossible to completely eliminate the failure
> of the other commands.  For example, drop the wrong table.

This kind of thing is heavily application-dependent.  For example, you
would likely not care if an operator who has newly joined the team in
charge of maintaining this data unfortunately drops a table
which includes logs from 10 years back, but you would very likely care
about a dropped table which holds users' login data.  My point is that you
need to carefully design the shape of the configuration you would use,
so that any application's admin would be able to cope with it; for example,
allowing exclusion filters with regular expressions could be a good idea
to dig into.  You also need to design it so that it is backward
compatible.
--
Michael


Re: automatic restore point

Jaime Casanova-4
In reply to this post by Yotsunaga, Naoki
On Mon, 2 Jul 2018 at 20:07, Yotsunaga, Naoki
<[hidden email]> wrote:

>
> Hi. Thanks for comments.
>
> Explanation of the background of the function proposal was inadequate.
> So, I explain again.
>
> I assume the following situation.
> User needs to make a quick, seemingly simple fix to an important production database. User composes the query, gives it an once-over, and lets it run. Seconds later user realizes that user forgot the WHERE clause, dropped the wrong table, or made another serious mistake, and interrupts the query, but the damage has been done.
> Also user did not record the time and did not look at a lsn position.
>

Thinking about Michael's suggestion of using event triggers, you can
create an event trigger to run pg_create_restore_point() on DROP;
here's a simple example of what that could look like:
https://www.postgresql.org/docs/current/static/functions-event-triggers.html
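
A sketch of such an event trigger, with invented function and trigger names. It assumes wal_level is replica or higher, and that the role running DROP TABLE may execute pg_create_restore_point(), which is superuser-only unless EXECUTE has been granted.

```sql
-- Create a timestamped restore point before every DROP TABLE.
CREATE FUNCTION restore_point_before_drop() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Restore point names are limited to 63 characters; this one is 27.
    PERFORM pg_create_restore_point(
        'before_drop_' || to_char(clock_timestamp(), 'YYYYMMDD_HH24MISS'));
END;
$$;

CREATE EVENT TRIGGER restore_point_on_drop ON ddl_command_start
    WHEN TAG IN ('DROP TABLE')
    EXECUTE PROCEDURE restore_point_before_drop();
```

Because the trigger fires at ddl_command_start, the restore point is written to the WAL before the drop itself takes effect.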

You can also create a normal trigger BEFORE TRUNCATE to create a
restore point just before running the TRUNCATE command.
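
A sketch of the TRUNCATE variant for one table, assuming the table testtb from the original proposal. The function and trigger names are invented, and the same wal_level and privilege assumptions apply as for any call to pg_create_restore_point().

```sql
CREATE FUNCTION restore_point_before_truncate() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
    rp_name text;
BEGIN
    -- Timestamp first, so truncating to 63 characters never cuts it off.
    rp_name := left('trunc_' || to_char(clock_timestamp(), 'YYYYMMDD_HH24MISS')
                    || '_' || TG_TABLE_NAME, 63);
    PERFORM pg_create_restore_point(rp_name);
    -- Record the name in the server log so it can be found later.
    RAISE LOG 'restore point % created before TRUNCATE of %', rp_name, TG_TABLE_NAME;
    RETURN NULL;  -- the return value of a statement-level trigger is ignored
END;
$$;

CREATE TRIGGER testtb_restore_point
    BEFORE TRUNCATE ON testtb
    FOR EACH STATEMENT
    EXECUTE PROCEDURE restore_point_before_truncate();
```

Unlike the event trigger, this has to be created on each table that should be covered.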

Those would run *in the background* (you don't need to call them
manually), you can use them right now, and they won't affect performance for
people not wanting this "functionality".

BTW, Michael's suggestion also included the idea of recording
xid/time/lsn, which can be done through triggers too.

--
Jaime Casanova                      www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


RE: automatic restore point

Yotsunaga, Naoki
In reply to this post by Michael Paquier-2
>-----Original Message-----
>From: Michael Paquier [mailto:[hidden email]]
>Sent: Tuesday, July 3, 2018 10:22 AM

>This kind of thing is heavily application-dependent.  For example, you would likely not care if an operator, who has newly-joined the team in
>charge of the maintenance of this data, drops unfortunately a table which includes logs from 10 years back, and you would very likely care
>about a table dropped which has user's login data.  My point is that you need to carefully design the shape of the configuration you would use,
>so as any application's admin would be able to cope with it, for example allowing exclusion filters with regular expressions could be a good
>idea to dig into.  And also you need to think about it so as it is backward compatible.

Thanks for the comments.

Does that mean that it depends on the application (user) which tables matter?
For example, suppose there are two tables, A and B. It is fine if table A disappears, but it is a problem if table B disappears. So automatic restore point should work when table B is dropped, and should not work for table A.
That is why it is difficult to implement automatic restore point in PostgreSQL by default.
Is my interpretation right?

---
Naoki Yotsunaga



RE: automatic restore point

Yotsunaga, Naoki
In reply to this post by Jaime Casanova-4
>-----Original Message-----
>From: Jaime Casanova [mailto:[hidden email]]
>Sent: Tuesday, July 3, 2018 11:06 AM

>Thinking on Michael's suggestion of using event triggers, you can create an event
>trigger to run pg_create_restore_point() on DROP, here's a simple example of how
>that should like:
>https://www.postgresql.org/docs/current/static/functions-event-triggers.html

>You can also create a normal trigger BEFORE TRUNCATE to create a restore point just
>before running the TRUNCATE command.

Thanks for the comments.
I understand now.

---
Naoki Yotsunaga

RE: automatic restore point

Yotsunaga, Naoki
In reply to this post by Yotsunaga, Naoki
>-----Original Message-----
>From: Yotsunaga, Naoki [mailto:[hidden email]]
>Sent: Friday, July 6, 2018 5:05 PM

>Does that mean that the application (user) is interested in which table?
>For example, there are two tables A. It is ok even if one table disappears, but it is troubled if another table B disappears. So, when the table B is dropped, automatic restore point works. In the table A, automatic restore point does not work.
>So, it is difficult to implement that automatic restore point in postgresql by default.
>Is my interpretation right?

In addition to my previous comment, I would like to ask the following.
What would you do if your customer dropped a table and asked you to restore it?
Everyone is thinking about how to avoid operational mistakes, but I’m thinking about what happens after the user’s mistake.
What I mean is that not all users will set things up in advance.
For example, if you apply the settings described in the manual, you will not drop a table by mistake. However, not all users apply those settings.
For such users, I think a function is needed that makes it easy to restore data after a mistaken operation without configuring anything in advance.
That is why I proposed this function.

---
Naoki Yotsunaga






Re: automatic restore point

Michael Paquier-2
On Wed, Jul 11, 2018 at 06:11:01AM +0000, Yotsunaga, Naoki wrote:
> I want to hear about the following in addition to the previous
> comment.   What would you do if your customer dropped the table and asked you to
> restore it?

I can think of four answers off the top of my head:
1) Don't do that.
2) Implement safeguards using utility hooks or event triggers.
3) Have a delayed standby, just in case, if you don't believe that your
administrators are skilled enough.
4) Have backups and a WAL archive.

> Everyone is thinking what to do to avoid operation failure, but I’m
> thinking about after the user’s failure.
> What I mean is that not all users will set up in advance.
> For example, if you make the settings described in the manual, you
> will not drop the table by operation failure. However, not all users
> do that setting.
> For such users, I think that it is necessary to have a function to
> easily restore data after failing operation without setting anything
> in advance. So I proposed this function.

Well, if you put in place correct measures from the start you would not
have problems.  It seems to me that there is no point in implementing
something which is a solution for a very narrow case, where the user has
shot himself in the foot to begin with.  Having backups is mandatory anyway,
by the way; standby replicas are not backups.
--
Michael


RE: automatic restore point

Yotsunaga, Naoki
>-----Original Message-----
>From: Michael Paquier [mailto:[hidden email]]
>Sent: Wednesday, July 11, 2018 3:34 PM

>Well, if you put in place correct measures from the start you would not have problems.  
>It seems to me that there is no point in implementing something which is a solution for a very narrow case, where the user has shot his own foot to begin with.
>Having backups anyway is mandatory by the way, standby replicas are not backups.

I think that features such as the Undo function of AWS and Oracle's Flashback exist to rescue such users and to guard against human error.
So, how about implementing such a function in postgres?
 
Also, as another approach to achieving the goal, I thought about writing the LSN to the server log when a specific command is executed.
 
I do not think the postgres source code would become complicated by implementing this function.
Do you feel it is too complicated?

-------
Naoki Yotsunaga

Re: automatic restore point

Michael Paquier-2
On Fri, Jul 13, 2018 at 08:16:00AM +0000, Yotsunaga, Naoki wrote:
> Do you feel it is too complicated?

In short, yes.
--
Michael


RE: automatic restore point

Yotsunaga, Naoki
In reply to this post by Yotsunaga, Naoki
-----Original Message-----
From: Yotsunaga, Naoki [mailto:[hidden email]]
Sent: Tuesday, June 26, 2018 10:18 AM
To: Postgres hackers <[hidden email]>
Subject: automatic restore point

Hi, I have attached a patch that writes the LSN as of just before execution to the server log when specific commands are executed, to help in cases where those commands erase data by accident.

The detailed background has been presented before.
In short: after a DBA's operational mistake erases data, PITR must be performed immediately.
Currently the information needed for that PITR cannot be obtained easily, and I would like to solve that.

The specification has changed from the first proposal.
-Target commands
 DROP TABLE
 TRUNCATE

-Setting file
 postgresql.conf
 log_recovery_points = on # The default value is 'off'. When turned on, the LSN is written to the server log when DROP TABLE or TRUNCATE is executed.

-How to use
 1) After running one of the above commands by mistake, find the server log entry whose command and resource match the mistaken operation, and note its recovery point.
    ex) LOG:  recovery_point_lsn: 0/201BB70
        STATEMENT:  drop table test ;
 2) Follow the procedure in the PostgreSQL documentation, '25.3.4. Recovering Using a Continuous Archive Backup'.
    *Set "recovery_target_lsn = 'recovery_point_lsn'" in recovery.conf.
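
As a hedged sketch of step 2 on PostgreSQL 10 or 11, using the LSN from the example log line: the settings are shown as comments because they belong in recovery.conf, and the restore_command value is site-specific and left elided. The queries are run against the recovering server once it accepts connections.

```sql
-- recovery.conf on the restored base backup:
--   restore_command        = '...'
--   recovery_target_lsn    = '0/201BB70'  -- value copied from the server log
--   recovery_target_action = 'pause'      -- the default; needs hot_standby = on

-- After replay pauses at the target, inspect the data, then end recovery:
SELECT pg_is_in_recovery();        -- true while the server is still in recovery
SELECT pg_last_wal_replay_lsn();   -- how far WAL replay has progressed
SELECT pg_wal_replay_resume();     -- leaves the paused state and finishes recovery
```

Checking that the dropped table is present before resuming is what makes the pause worthwhile.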

Although it was pointed out earlier that the source code would become complicated, we were able to add the function with about 20 additional lines of code.

What do you think about it? Do you think it is useful?
------
Naoki Yotsunaga





automatic_restore_point.patch (4K)