PITR problems

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

PITR problems

Per Lauvås
PITR problems

Hi

PITR (Point in Time Recovery) seams like an excellent feature, but I have trouble getting the hang of it.

My environment:
PostgreSQL 8.0.3 on windows 2000.

My procedure:
- Set up archiving with the archive_command. The files are stored on a remote computer in a folder called wal_archive. Looks fine.

- Set up a scheduled task running a script every half hour. The script copies the last modified file in the pg_xlog dir (I call this file current_wal) to the same remote computer, but in a different folder. The script is described later.

- SELECT pg_start_backup('label')
- Copy the data directory to the remote computer. I remove the contents (in the copy) of the pg_xlog directory and archive_status directory. The pg_start_backup function produces a backup history file indicating where in the current WAL segment the backup took place.

Question 1: What do I need to save in my wal_archive. The backup file is automatically placed there. Do I need the WAL segment in which the backup file is based upon? It is not automatically copied to the wal_archive. In that case; what if the backup file is called .....39.00198218.backup and I have no ....39 WAL segment? Do I use the last modified segment, althoug it has a different file name, for example ....3F? Do I rename it???

The DOC tells me that the WAL segments numerically less than the current WAL segment are no longer needed. My base backup gave me a backup history file numbered 39 although all the WAL segments er numerically greater than 39!

Question 2: Should I delete the other WAL segments in the pg_xlog directory, although they are numerically greater?

So, after setting up the backup I now test the recovery process.
I think I'm well off when I have my backup history file, the archived WAL segments, the current_wal_segment on the data directory.

- I replace the data directory, fill the pg_xlog directory with the backup history file and the archived WAL segments. But what about the current_wal_segment produced by the scheduled task? It doesn't seem to fit in. The archived WAL segments are numbered in sequence based upon the backup history file. So the numbers do not match the numbers in the pg_xlog directory. The result of the copy is a current_wal_segment with a number not following the last archived wal.

The result is that I am able to backup to the last archived WAL segment, but I am not anywhere close to recover to the last half hour. Has anyone mastered this?

Per

The script:
@echo off

@setlocal

set PG_WAL_AREA="D:\PostgreSQL\8.0\data\pg_xlog"
set CUR_WAL_AREA="<path to current_wal>"

SET lastfil=
FOR /F %%f in ('DIR %PG_WAL_AREA% /AA /OD /B') do set lastfil=%%f
ECHO. last file in directory is %lastfil%
COPY /Y %PG_WAL_AREA%\%lastfil% %CUR_WAL_AREA%

@endlocal




Reply | Threaded
Open this post in threaded view
|

Re: PITR problems

Frank Finner
Hi,

I can tell you how I do this on Linux:

1. Set up the archiving command in postgresql.conf. Archiving of complete WAL files (that is, of WALs whose 16M space is completely filled) starts automatically after restarting the engine. No renaming, no nothing else with complete WALs. No deleting also. You__ll need all the WALs since the time of the full backup for recovery!
2. Copy the database with the pg_start_backup feature. Clean up pg_xlog in that copy.
3. Run a cronjob that finds the latest incomplete WAL and copy it also into the archive (but into a subdirectory, not to confuse with the complete WALs).

I copy the full backup to another machine once I do the backup, usually once at night. Also I copy the WAL archive including the incomplete WAL to that other machine once per minute while taking the incomplete WAL with rsync. Also no renaming.

To start the backup database on that other machine I have written a script that basically does the following:

0. Shutdown an eventually running postgresql engine. Shutdown the interface connecting to the "master" to prevent further copying of WALs.
1. Wipe the postgresql data directory and copy the full backup to that directory, not including any WALs that might be within the full backup. pg_xlog must be empty except an empty directory "archive_status".
2. Create a file "recovery.conf" in the data directory containing the single line 'cp /archivedirectory/%f %p'. /archivedirectory contains all the archived WALs including the incomplete last one. The oldest one must be at least as old as the start time of the full backup. I usually keep some even older ones, so I have at least one complete WAL, that is older than the full backup.
3. Start postgresql. It will recover with all the WALs from the archive directory, starting with the one the full backup requires and using all the WALs piece by piece up to and including the last incomplete one. Incomplete does NOT mean, that this WAL is shorter than 16M, but the 16M space is not completely filled.

If everything goes well, you have a running consistent database. You may now start the interface to the "master" again.
       
Recovery might take some minutes due to data masses of WALs. For example, I have about 160M of WALs per hour.

Regards, Frank.



On Tue, 5 Jul 2005 15:24:35 +0200 Per Lauvås <[hidden email]> thought long, then sat down and wrote:

> Hi
>
> PITR (Point in Time Recovery) seams like an excellent feature, but I have trouble getting the hang of it.
>
> My environment:
> PostgreSQL 8.0.3 on windows 2000.
>
> My procedure:
> - Set up archiving with the archive_command. The files are stored on a remote computer in a folder called wal_archive. Looks fine.
> - Set up a scheduled task running a script every half hour. The script copies the last modified file in the pg_xlog dir (I call this file current_wal) to the same remote computer, but in a different folder. The script is described later.
> - SELECT pg_start_backup('label')
> - Copy the data directory to the remote computer. I remove the contents (in the copy) of the pg_xlog directory and archive_status directory. The pg_start_backup function produces a backup history file indicating where in the current WAL segment the backup took place.
> Question 1: What do I need to save in my wal_archive. The backup file is automatically placed there. Do I need the WAL segment in which the backup file is based upon? It is not automatically copied to the wal_archive. In that case; what if the backup file is called .....39.00198218.backup and I have no ....39 WAL segment? Do I use the last modified segment, althoug it has a different file name, for example ....3F? Do I rename it???
>
> The DOC tells me that the WAL segments numerically less than the current WAL segment are no longer needed. My base backup gave me a backup history file numbered 39 although all the WAL segments er numerically greater than 39!
>
> Question 2: Should I delete the other WAL segments in the pg_xlog directory, although they are numerically greater?
>
> So, after setting up the backup I now test the recovery process.
> I think I'm well off when I have my backup history file, the archived WAL segments, the current_wal_segment on the data directory.
> - I replace the data directory, fill the pg_xlog directory with the backup history file and the archived WAL segments. But what about the current_wal_segment produced by the scheduled task? It doesn't seem to fit in. The archived WAL segments are numbered in sequence based upon the backup history file. So the numbers do not match the numbers in the pg_xlog directory. The result of the copy is a current_wal_segment with a number not following the last archived wal.
>
> The result is that I am able to backup to the last archived WAL segment, but I am not anywhere close to recover to the last half hour. Has anyone mastered this?
>
> Per
>
> The script:
> @echo off
>
> @setlocal
>
> set PG_WAL_AREA="D:\PostgreSQL\8.0\data\pg_xlog"
> set CUR_WAL_AREA="<path to current_wal>"
>
> SET lastfil=
> FOR /F %%f in ('DIR %PG_WAL_AREA% /AA /OD /B') do set lastfil=%%f
> ECHO. last file in directory is %lastfil%
> COPY /Y %PG_WAL_AREA%\%lastfil% %CUR_WAL_AREA%
>
> @endlocal
>
>
>
>
>

--
Frank Finner

Invenius - Lösungen mit Linux
Köpfchenstraße 36
57072 Siegen
Telefon: 0271 231 8606    Mail: [hidden email]
Telefax: 0271 231 8608    Web:  http://www.invenius.de
Key fingerprint = 90DF FF40 582E 6D6B BADF  6E6A A74E 67E4 E788 2651


attachment0 (196 bytes) Download Attachment